Обсуждение: cryptohash: missing locking functions for OpenSSL <= 1.0.2?
While reviewing the NSS patch [1], I noticed that the cryptohash implementation for OpenSSL doesn't set up any locking callbacks in frontend code. I think there has to be a call to OPENSSL_set_locking_callback() before libpq starts reaching into the EVP_* API, if ENABLE_THREAD_SAFETY and HAVE_CRYPTO_LOCK are both true. This would only affect threaded libpq clients running OpenSSL 1.0.2 and below, and it looks like the most likely code path to be affected is the OpenSSL error stack. So if anything went wrong with one of those hash calls, it's possible that libpq would crash (or worse, silently misbehave somewhere in the TLS stack) instead of gracefully reporting an error. [2] is an example of this in the wild. --Jacob [1] https://www.postgresql.org/message-id/40095f48c3c6d556293cb0ecf80ea10cdf7d26b3.camel%40vmware.com [2] https://github.com/openssl/openssl/issues/4690
On Wed, Feb 17, 2021 at 06:34:36PM +0000, Jacob Champion wrote: > This would only affect threaded libpq clients running OpenSSL 1.0.2 and > below, and it looks like the most likely code path to be affected is > the OpenSSL error stack. So if anything went wrong with one of those > hash calls, it's possible that libpq would crash (or worse, silently > misbehave somewhere in the TLS stack) instead of gracefully reporting > an error. [2] is an example of this in the wild. I have been discussing a bit this issue with Jacob, and that's a problem we would need to address on HEAD. First, I have been looking at this stuff in older versions with MD5 and SHA256 used by SCRAM when it comes to ~13 with libpq: - MD5 is based on the internal implementation of Postgres even when building libpq with OpenSSL, so that would not be an issue. - SHA256 is a different story though, because when building with OpenSSL we would go through SHA256_{Init,Update,Final} for SCRAM authentication. In the context of a SSL connection, the crypto part is initialized. But that would *not* be the case of a connection in a non-SSL context. Fortunately, and after looking at the OpenSSL code (fips_md_init_ctx, HASH_UPDATE, etc.), there is no sign of locking handling or errors, so I think that we are safe there. Now comes the case of HEAD that uses EVP for MD5 and SHA256. A SSL connection would initialize the crypto part, but that does not happen for a non-SSL connection. So, logically, one could run into issues if using MD5 or SCRAM with OpenSSL <= 1.0.2 (pgbench with a high number of threads does not complain by the way), and we are not yet in a stage where we should drop this version either, even if it has been EOL'd by upstream at the end of 2019. We have the code in place to properly initialize the crypto locking in libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the SSL and crypto initializations are grouped together. What we need to do here is to split those phases of the initialization so as non-SSL connections can use the crypto part properly, as pqsecure_initialize gets only called now when libpq negotiates SSL with the postmaster. It seems to me that we should make sure of a proper reset of the crypto part within pqDropConnection(), while the initialization needs to happen in PQconnectPoll(). -- Michael
Вложения
On Thu, Feb 18, 2021 at 11:04:05AM +0900, Michael Paquier wrote: > We have the code in place to properly initialize the crypto locking in > libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the > SSL and crypto initializations are grouped together. What we need to > do here is to split those phases of the initialization so as non-SSL > connections can use the crypto part properly, as pqsecure_initialize > gets only called now when libpq negotiates SSL with the postmaster. > It seems to me that we should make sure of a proper reset of the > crypto part within pqDropConnection(), while the initialization needs > to happen in PQconnectPoll(). So, I have tried a couple of things with a debug build of OpenSSL 1.0.2 at hand (two locks for the crypto and SSL initializations but SSL_new() grabs some random bytes that need the same callback to be set or the state of the threads is messed up, some global states to control the load), and the simplest solution I have come up with is to control in each pg_conn if the crypto callbacks have been initialized or not so as we avoid multiple inits and/or drops of the state for a single connection. I have arrived at this conclusion after hunting down cases with pqDropConnection() which would could be called multiple times, particularly if there are connection attempts to multiple hosts. The attached patch implements things this way, and initializes the crypto callbacks before sending the startup packet, before deciding if SSL needs to be requested or not. I have played with several threading scenarios with this patch, with and without OpenSSL, and the numbers match in terms of callback loading and unloading (the global counter used in fe-secure-openssl.c gets to zero). -- Michael