Обсуждение: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

Поиск
Список
Период
Сортировка

cryptohash: missing locking functions for OpenSSL <= 1.0.2?

От
Jacob Champion
Дата:
While reviewing the NSS patch [1], I noticed that the cryptohash
implementation for OpenSSL doesn't set up any locking callbacks in
frontend code. I think there has to be a call to
OPENSSL_set_locking_callback() before libpq starts reaching into the
EVP_* API, if ENABLE_THREAD_SAFETY and HAVE_CRYPTO_LOCK are both true.

This would only affect threaded libpq clients running OpenSSL 1.0.2 and
below, and it looks like the most likely code path to be affected is
the OpenSSL error stack. So if anything went wrong with one of those
hash calls, it's possible that libpq would crash (or worse, silently
misbehave somewhere in the TLS stack) instead of gracefully reporting
an error. [2] is an example of this in the wild.

--Jacob

[1] https://www.postgresql.org/message-id/40095f48c3c6d556293cb0ecf80ea10cdf7d26b3.camel%40vmware.com
[2] https://github.com/openssl/openssl/issues/4690

Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

От
Michael Paquier
Дата:
On Wed, Feb 17, 2021 at 06:34:36PM +0000, Jacob Champion wrote:
> This would only affect threaded libpq clients running OpenSSL 1.0.2 and
> below, and it looks like the most likely code path to be affected is
> the OpenSSL error stack. So if anything went wrong with one of those
> hash calls, it's possible that libpq would crash (or worse, silently
> misbehave somewhere in the TLS stack) instead of gracefully reporting
> an error. [2] is an example of this in the wild.

I have been discussing a bit this issue with Jacob, and that's a
problem we would need to address on HEAD.  First, I have been looking
at this stuff in older versions with MD5 and SHA256 used by SCRAM when
it comes to ~13 with libpq:
- MD5 is based on the internal implementation of Postgres even when
building libpq with OpenSSL, so that would not be an issue.
- SHA256 is a different story though, because when building with
OpenSSL we would go through SHA256_{Init,Update,Final} for SCRAM
authentication.  In the context of a SSL connection, the crypto part
is initialized.  But that would *not* be the case of a connection in a
non-SSL context.  Fortunately, and after looking at the OpenSSL code
(fips_md_init_ctx, HASH_UPDATE, etc.), there is no sign of locking
handling or errors, so I think that we are safe there.

Now comes the case of HEAD that uses EVP for MD5 and SHA256.  A SSL
connection would initialize the crypto part, but that does not happen
for a non-SSL connection.  So, logically, one could run into issues if
using MD5 or SCRAM with OpenSSL <= 1.0.2 (pgbench with a high number
of threads does not complain by the way), and we are not yet in a
stage where we should drop this version either, even if it has been
EOL'd by upstream at the end of 2019.

We have the code in place to properly initialize the crypto locking in
libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the
SSL and crypto initializations are grouped together.  What we need to
do here is to split those phases of the initialization so as non-SSL
connections can use the crypto part properly, as pqsecure_initialize
gets only called now when libpq negotiates SSL with the postmaster.
It seems to me that we should make sure of a proper reset of the
crypto part within pqDropConnection(), while the initialization needs
to happen in PQconnectPoll().
--
Michael

Вложения

Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

От
Michael Paquier
Дата:
On Thu, Feb 18, 2021 at 11:04:05AM +0900, Michael Paquier wrote:
> We have the code in place to properly initialize the crypto locking in
> libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the
> SSL and crypto initializations are grouped together.  What we need to
> do here is to split those phases of the initialization so as non-SSL
> connections can use the crypto part properly, as pqsecure_initialize
> gets only called now when libpq negotiates SSL with the postmaster.
> It seems to me that we should make sure of a proper reset of the
> crypto part within pqDropConnection(), while the initialization needs
> to happen in PQconnectPoll().

So, I have tried a couple of things with a debug build of OpenSSL
1.0.2 at hand (two locks for the crypto and SSL initializations but
SSL_new() grabs some random bytes that need the same callback to be
set or the state of the threads is messed up, some global states to
control the load), and the simplest solution I have come up with is to
control in each pg_conn if the crypto callbacks have been initialized
or not so as we avoid multiple inits and/or drops of the state for a
single connection.  I have arrived at this conclusion after hunting
down cases with pqDropConnection() which would could be called
multiple times, particularly if there are connection attempts to
multiple hosts.

The attached patch implements things this way, and initializes the
crypto callbacks before sending the startup packet, before deciding if
SSL needs to be requested or not.  I have played with several
threading scenarios with this patch, with and without OpenSSL, and the
numbers match in terms of callback loading and unloading (the global
counter used in fe-secure-openssl.c gets to zero).
--
Michael

Вложения