Discussion: Re: No error checking when reading from file using zstd in pg_dump
16.06.2025 14:25, Daniel Gustafsson writes:
>> On 16 Jun 2025, at 10:14, Evgeniy Gorbanev <gorbanyoves@basealt.ru> wrote:
>> In src/bin/pg_dump/compress_zstd.c, the Zstd_read function always
>> returns true. But if you look at the Zstd_gets and Zstd_getc functions,
>> where Zstd_read is called via CFH->read_func, it is expected that
>> the Zstd_read function can also return false. In case of
>> a read-from-file error, the process is expected to terminate, but
>> pg_dump will continue the process.
>> I assume that after checking
>> if (cnt == 0)
>> should be
>> return false;
> if (cnt == 0)
> - break;
> + return false;
>
> As cnt is coming from fread() returning false here would be wrong as you cannot
> from 0 alone know if it's EOF or an error.  Instead it needs to inspect the
> stream with feof() and ferror() to know how to proceed.
>
> --
> Daniel Gustafsson

The feof() check is in the calling functions, e.g. in the Zstd_getc
function.

Regards,
Evgeniy
> On 16 Jun 2025, at 10:52, Evgeniy Gorbanev <gorbanyoves@basealt.ru> wrote:
>
> 16.06.2025 14:25, Daniel Gustafsson writes:
>>> On 16 Jun 2025, at 10:14, Evgeniy Gorbanev <gorbanyoves@basealt.ru> wrote:
>>> In src/bin/pg_dump/compress_zstd.c, the Zstd_read function always
>>> returns true. But if you look at the Zstd_gets and Zstd_getc functions,
>>> where Zstd_read is called via CFH->read_func, it is expected that
>>> the Zstd_read function can also return false. In case of
>>> a read-from-file error, the process is expected to terminate, but
>>> pg_dump will continue the process.
>>> I assume that after checking
>>> if (cnt == 0)
>>> should be
>>> return false;
>> if (cnt == 0)
>> - break;
>> + return false;
>>
>> As cnt is coming from fread() returning false here would be wrong as you cannot
>> from 0 alone know if it's EOF or an error.  Instead it needs to inspect the
>> stream with feof() and ferror() to know how to proceed.
>
> The feof() check is in the calling functions, e.g. in the Zstd_getc
> function.

That function is using feof/ferror to log an appropriate error message; what
you are proposing is to consider all cases of EOF as an error.  If you test
that you will see lots of tests starting to fail.

--
Daniel Gustafsson
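For context, the stdio idiom under discussion: fread() returning fewer bytes
than requested can mean either end-of-file or an I/O error, and only
feof()/ferror() can tell the two apart.  A minimal illustrative sketch, not
taken from the thread's patch (the wrapper name and error handling are
hypothetical):

    #include <stdio.h>
    #include <stdlib.h>

    /* Return the byte count from fread(), dying only on a genuine I/O error. */
    static size_t
    read_or_die(void *buf, size_t size, FILE *fp)
    {
        size_t  cnt = fread(buf, 1, size, fp);

        if (cnt < size && ferror(fp))
        {
            /* a real read error, as opposed to simply hitting end-of-file */
            perror("could not read from input file");
            exit(1);
        }
        /* cnt < size without ferror() means EOF; feof(fp) would confirm it */
        return cnt;
    }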
> On 16 Jun 2025, at 11:26, Evgeniy Gorbanev <gorbanyoves@basealt.ru> wrote:
> I ran tests for pg_dump and they all passed. Logs attached.

Files=7, Tests=11918, 35 wallclock secs ( 0.59 usr  0.07 sys +  7.92 cusr  4.32 csys = 12.90 CPU)

That seems like a low number of tests executed, with ZStd enabled I see over
13200 tests in the log.  Are you sure you are building with ZStd compression
enabled?

--
Daniel Gustafsson
16.06.2025 15:43, Daniel Gustafsson writes:
>> On 16 Jun 2025, at 11:26, Evgeniy Gorbanev <gorbanyoves@basealt.ru> wrote:
>> I ran tests for pg_dump and they all passed. Logs attached.
> Files=7, Tests=11918, 35 wallclock secs ( 0.59 usr  0.07 sys +  7.92 cusr  4.32 csys = 12.90 CPU)
>
> That seems like a low number of tests executed, with ZStd enabled I see over
> 13200 tests in the log.  Are you sure you are building with ZStd compression
> enabled?
>
> --
> Daniel Gustafsson

Yes, you are right. Now I see the errors.
I think the real problem here is that what the code is doing is
precisely not what is documented in compress_io.h:

    /*
     * Read 'size' bytes of data from the file and store them into 'ptr'.
     * Optionally it will store the number of bytes read in 'rsize'.
     *
     * Returns true on success and throws an internal error otherwise.
     */
    bool        (*read_func) (void *ptr, size_t size, size_t *rsize,
                              CompressFileHandle *CFH);

I've not checked to see what the other users of this API do, but
if they're all like this then we need to fix that comment.
Otherwise we probably need to make them all alike; even if there's
not a bug today, one will soon appear from someone believing the
comment.

			regards, tom lane
> On 16 Jun 2025, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've not checked to see what the other users of this API do, but
> if they're all like this then we need to fix that comment.

AFAICT all other callers of this API are throwing an error with pg_fatal, and
so does the function in question for ZStd decompression errors.  If we handle
the case of fread() returning 0 to indicate an error like the below *untested
sketch* (with a better error message) this function is fully API compliant as
well.

 		/* If we have no more input to consume, we're done */
 		if (cnt == 0)
+		{
+			if (ferror(unconstify(void *, input->src)))
+				pg_fatal("could not read data to decompress: %m");
+
 			break;
+		}

If this seems like a good approach then Zstd_getc can be simplified as well,
as it no longer needs to call ferror; it still needs to check feof though.

--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
>> On 16 Jun 2025, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I've not checked to see what the other users of this API do, but
>> if they're all like this then we need to fix that comment.

> AFAICT all other callers of this API are throwing an error with pg_fatal, and
> so does the function in question for ZStd decompression errors.

I think I agree that we need to handle the ferror() case inside the
read_func for consistency.  But there is another problem, which is
that neither of its callers are paying attention to the API: as
defined, the read_func can never return anything but "true",
which is how Zstd_read behaves.  But both the callers in
compress_zstd.c seem to think they should test its result to
detect EOF.  Apparently, our tests do not cover the case of an
unexpected EOF.

This API is actually quite bizarrely defined, if you ask me.
Returning the byte count is optional, but if you don't pay
attention to the byte count you cannot know if you got any
data.  At best, that's bug-encouraging.

			regards, tom lane
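To see why the optional byte count is bug-encouraging, here is a simplified,
hypothetical rendering of the pre-patch contract (the struct below is a
stand-in, not the real pg_dump declaration): a getc-style caller that passes
NULL for rsize, as the zstd callers effectively did, cannot distinguish EOF
from a successful one-byte read when read_func always returns true.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    typedef struct CompressFileHandle CompressFileHandle;
    struct CompressFileHandle
    {
        bool    (*read_func) (void *ptr, size_t size, size_t *rsize,
                              CompressFileHandle *CFH);
        void     *private_data;
    };

    static int
    getc_like(CompressFileHandle *CFH)
    {
        unsigned char c = 0;

        if (CFH->read_func(&c, 1, NULL, CFH))
            return c;           /* may return stale data at EOF */
        return EOF;             /* unreachable if read_func never returns false */
    }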
Daniel Gustafsson <daniel@yesql.se> writes:
> On 16 Jun 2025, at 16:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This API is actually quite bizarrely defined, if you ask me.
>> Returning the byte count is optional, but if you don't pay
>> attention to the byte count you cannot know if you got any
>> data.  At best, that's bug-encouraging.

> Agreed.  Given that many of implementations in the gzip support code directly
> return the gzXXX function I suspect it was modeled around GZip but thats an
> unresearched guess.  The fact that this returns Z_NULL where the API defines
> NULL is unlikely to ever be an issue, but it also doesn't seem entirely neat..

After looking around a bit more I realized that this API is a complete
disaster: it's not only bug-prone, but there are actual bugs in
multiple callers, eg _ReadBuf() totally fails to detect early EOF as
it intends to.  We need to fix it, not slap some band-aids on.
Draft patch attached.

The write_func side is a bit of a mess too.  I fixed some obvious API
violations in the attached, but I think we need to do more, because
it seems that zero thought was given to reporting errors sanely.
The callers all assume that strerror() is what to use to report an
incomplete write, but this is not appropriate in every case, for
instance we need to pay attention to gzerror() in gzip cases.
I'm inclined to think that the best answer is to move error reporting
into the write_funcs, and just make the API spec be "write or die".
I've not tackled that here though.

I was depressed to find that Zstd_getc reads one byte into
an otherwise-uninitialized int and then returns the whole int.
Even if we're lucky enough for the variable to start off zero,
this would fail utterly on big-endian machines.  The only
reason we've not noticed is that the regression tests do not
reach Zstd_getc, nor most of the other getc_func implementations.

Now I'm afraid to look at the rest of the compress_io.h API.
If it's as badly written as these parts, there are more bugs
to be found.

			regards, tom lane

diff --git a/src/bin/pg_dump/compress_gzip.c b/src/bin/pg_dump/compress_gzip.c
index 5a30ebf9bf5..8db121b94a3 100644
--- a/src/bin/pg_dump/compress_gzip.c
+++ b/src/bin/pg_dump/compress_gzip.c
@@ -251,14 +251,14 @@ InitCompressorGzip(CompressorState *cs,
  *----------------------
  */
 
-static bool
-Gzip_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
+static size_t
+Gzip_read(void *ptr, size_t size, CompressFileHandle *CFH)
 {
 	gzFile		gzfp = (gzFile) CFH->private_data;
 	int			gzret;
 
 	gzret = gzread(gzfp, ptr, size);
-	if (gzret <= 0 && !gzeof(gzfp))
+	if (gzret < 0)
 	{
 		int			errnum;
 		const char *errmsg = gzerror(gzfp, &errnum);
@@ -267,10 +267,7 @@ Gzip_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
 					 errnum == Z_ERRNO ? strerror(errno) : errmsg);
 	}
 
-	if (rsize)
-		*rsize = (size_t) gzret;
-
-	return true;
+	return (size_t) gzret;
 }
 
 static bool
@@ -278,7 +275,7 @@ Gzip_write(const void *ptr, size_t size, CompressFileHandle *CFH)
 {
 	gzFile		gzfp = (gzFile) CFH->private_data;
 
-	return gzwrite(gzfp, ptr, size) > 0;
+	return gzwrite(gzfp, ptr, size) == size;
 }
 
 static int
diff --git a/src/bin/pg_dump/compress_io.h b/src/bin/pg_dump/compress_io.h
index db9b38744c8..dad0c91fec5 100644
--- a/src/bin/pg_dump/compress_io.h
+++ b/src/bin/pg_dump/compress_io.h
@@ -123,21 +123,23 @@ struct CompressFileHandle
 								  CompressFileHandle *CFH);
 
 	/*
-	 * Read 'size' bytes of data from the file and store them into 'ptr'.
-	 * Optionally it will store the number of bytes read in 'rsize'.
+	 * Read up to 'size' bytes of data from the file and store them into
+	 * 'ptr'.
 	 *
-	 * Returns true on success and throws an internal error otherwise.
+	 * Returns number of bytes read (this might be less than 'size' if EOF was
+	 * reached).  Throws error for error conditions other than EOF.
 	 */
-	bool		(*read_func) (void *ptr, size_t size, size_t *rsize,
+	size_t		(*read_func) (void *ptr, size_t size,
 							  CompressFileHandle *CFH);
 
 	/*
 	 * Write 'size' bytes of data into the file from 'ptr'.
 	 *
-	 * Returns true on success and false on error.
+	 * Returns true on success and false on error (it is caller's
+	 * responsibility to deal with error conditions).
 	 */
 	bool		(*write_func) (const void *ptr, size_t size,
-							   struct CompressFileHandle *CFH);
+							   CompressFileHandle *CFH);
 
 	/*
 	 * Read at most size - 1 characters from the compress file handle into
diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index e99f0cad71f..c94f9dcd4cf 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -583,6 +583,7 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
 			return false;
 		}
 
+		errno = 0;
 		if (fwrite(state->buffer, 1, status, state->fp) != status)
 		{
 			errno = (errno) ? errno : ENOSPC;
@@ -598,8 +599,8 @@ LZ4Stream_write(const void *ptr, size_t size, CompressFileHandle *CFH)
 /*
  * fread() equivalent implementation for LZ4 compressed files.
  */
-static bool
-LZ4Stream_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
+static size_t
+LZ4Stream_read(void *ptr, size_t size, CompressFileHandle *CFH)
 {
 	LZ4State   *state = (LZ4State *) CFH->private_data;
 	int			ret;
@@ -607,10 +608,7 @@ LZ4Stream_read(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
 	if ((ret = LZ4Stream_read_internal(state, ptr, size, false)) < 0)
 		pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
 
-	if (rsize)
-		*rsize = (size_t) ret;
-
-	return true;
+	return (size_t) ret;
 }
 
 /*
diff --git a/src/bin/pg_dump/compress_none.c b/src/bin/pg_dump/compress_none.c
index 3fc89c99854..4924935a5bd 100644
--- a/src/bin/pg_dump/compress_none.c
+++ b/src/bin/pg_dump/compress_none.c
@@ -83,23 +83,17 @@ InitCompressorNone(CompressorState *cs,
  * Private routines
  */
 
-static bool
-read_none(void *ptr, size_t size, size_t *rsize, CompressFileHandle *CFH)
+static size_t
+read_none(void *ptr, size_t size, CompressFileHandle *CFH)
 {
 	FILE	   *fp = (FILE *) CFH->private_data;
 	size_t		ret;
 
-	if (size == 0)
-		return true;
-
 	ret = fread(ptr, 1, size, fp);
-	if (ret != size && !feof(fp))
+	if (ferror(fp))
 		pg_fatal("could not read from input file: %m");
 
-	if (rsize)
-		*rsize = ret;
-
-	return true;
+	return ret;
 }
 
 static bool
diff --git a/src/bin/pg_dump/compress_zstd.c b/src/bin/pg_dump/compress_zstd.c
index cb595b10c2d..8f718212d9f 100644
--- a/src/bin/pg_dump/compress_zstd.c
+++ b/src/bin/pg_dump/compress_zstd.c
@@ -258,8 +258,8 @@ InitCompressorZstd(CompressorState *cs,
  * Compressed stream API
  */
 
-static bool
-Zstd_read(void *ptr, size_t size, size_t *rdsize, CompressFileHandle *CFH)
+static size_t
+Zstd_read(void *ptr, size_t size, CompressFileHandle *CFH)
 {
 	ZstdCompressorState *zstdcs = (ZstdCompressorState *) CFH->private_data;
 	ZSTD_inBuffer *input = &zstdcs->input;
@@ -292,6 +292,9 @@ Zstd_read(void *ptr, size_t size, size_t *rdsize, CompressFileHandle *CFH)
 		if (input->pos == input->size)
 		{
 			cnt = fread(unconstify(void *, input->src), 1, input_allocated_size, zstdcs->fp);
+			if (ferror(zstdcs->fp))
+				pg_fatal("could not read from input file: %m");
+
 			input->size = cnt;
 
 			Assert(cnt <= input_allocated_size);
@@ -320,10 +323,7 @@ Zstd_read(void *ptr, size_t size, size_t *rdsize, CompressFileHandle *CFH)
 			break;				/* We read all the data that fits */
 	}
 
-	if (rdsize != NULL)
-		*rdsize = output->pos;
-
-	return true;
+	return output->pos;
 }
 
 static bool
@@ -358,22 +358,16 @@ Zstd_write(const void *ptr, size_t size, CompressFileHandle *CFH)
 		}
 	}
 
-	return size;
+	return true;
 }
 
 static int
 Zstd_getc(CompressFileHandle *CFH)
 {
-	ZstdCompressorState *zstdcs = (ZstdCompressorState *) CFH->private_data;
-	int			ret;
+	unsigned char ret;
 
-	if (CFH->read_func(&ret, 1, NULL, CFH) != 1)
-	{
-		if (feof(zstdcs->fp))
-			pg_fatal("could not read from input file: end of file");
-		else
-			pg_fatal("could not read from input file: %m");
-	}
+	if (CFH->read_func(&ret, 1, CFH) != 1)
+		pg_fatal("could not read from input file: end of file");
 
 	return ret;
 }
@@ -390,11 +384,7 @@ Zstd_gets(char *buf, int len, CompressFileHandle *CFH)
 	 */
 	for (i = 0; i < len - 1; ++i)
 	{
-		size_t		readsz;
-
-		if (!CFH->read_func(&buf[i], 1, &readsz, CFH))
-			break;
-		if (readsz != 1)
+		if (CFH->read_func(&buf[i], 1, CFH) != 1)
 			break;
 		if (buf[i] == '\n')
 		{
diff --git a/src/bin/pg_dump/pg_backup_directory.c b/src/bin/pg_dump/pg_backup_directory.c
index bc2a2fb4797..eb2ce8d5a6c 100644
--- a/src/bin/pg_dump/pg_backup_directory.c
+++ b/src/bin/pg_dump/pg_backup_directory.c
@@ -351,7 +351,7 @@ _EndData(ArchiveHandle *AH, TocEntry *te)
 static void
 _PrintFileData(ArchiveHandle *AH, char *filename)
 {
-	size_t		cnt = 0;
+	size_t		cnt;
 	char	   *buf;
 	size_t		buflen;
 	CompressFileHandle *CFH;
@@ -366,7 +366,7 @@ _PrintFileData(ArchiveHandle *AH, char *filename)
 	buflen = DEFAULT_IO_BUFFER_SIZE;
 	buf = pg_malloc(buflen);
 
-	while (CFH->read_func(buf, buflen, &cnt, CFH) && cnt > 0)
+	while ((cnt = CFH->read_func(buf, buflen, CFH)) > 0)
 	{
 		ahwrite(buf, 1, cnt, AH);
 	}
@@ -531,10 +531,10 @@ _ReadBuf(ArchiveHandle *AH, void *buf, size_t len)
 	CompressFileHandle *CFH = ctx->dataFH;
 
 	/*
-	 * If there was an I/O error, we already exited in readF(), so here we
-	 * exit on short reads.
+	 * We do not expect a short read, so fail if we get one.  The read_func
+	 * already dealt with any outright I/O error.
 	 */
-	if (!CFH->read_func(buf, len, NULL, CFH))
+	if (CFH->read_func(buf, len, CFH) != len)
 		pg_fatal("could not read from input file: end of file");
 }
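The Zstd_getc hazard Tom describes can be seen with a short self-contained
demonstration (illustrative only, not code from the patch): storing one byte
at the address of an int only yields that byte's value when the rest of the
int is zero and the machine is little-endian.

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        int           value = 0;    /* stand-in for the uninitialized int */
        unsigned char byte = 0x41;  /* the single byte "read" from the file */

        /* the old pattern: write one byte into the first byte of an int */
        memcpy(&value, &byte, 1);

        /*
         * Prints 65 on a little-endian machine, but 1090519040 (0x41000000)
         * on a big-endian one, because the byte lands in the most significant
         * position -- hence the patch reads into an unsigned char instead.
         */
        printf("%d\n", value);
        return 0;
    }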
On 6/16/25 17:41, Daniel Gustafsson wrote:
>> On 16 Jun 2025, at 16:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Daniel Gustafsson <daniel@yesql.se> writes:
>>>> On 16 Jun 2025, at 15:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I've not checked to see what the other users of this API do, but
>>>> if they're all like this then we need to fix that comment.
>>
>>> AFAICT all other callers of this API are throwing an error with pg_fatal, and
>>> so does the function in question for ZStd decompression errors.
>>
>> I think I agree that we need to handle the ferror() case inside the
>> read_func for consistency.  But there is another problem, which is
>> that neither of its callers are paying attention to the API: as
>> defined, the read_func can never return anything but "true",
>> which is how Zstd_read behaves.  But both the callers in
>> compress_zstd.c seem to think they should test its result to
>> detect EOF.
>
> Attached is a stab at fixing both the error handling in read_func as well as
> the callers misuse of the API.
>
>> Apparently, our tests do not cover the case of an
>> unexpected EOF.
>
> I admittedly ran out of steam when looking at adding something like this to our
> pg_dump tests.
>
>> This API is actually quite bizarrely defined, if you ask me.
>> Returning the byte count is optional, but if you don't pay
>> attention to the byte count you cannot know if you got any
>> data.  At best, that's bug-encouraging.
>
> Agreed.  Given that many of implementations in the gzip support code directly
> return the gzXXX function I suspect it was modeled around GZip but thats an
> unresearched guess.  The fact that this returns Z_NULL where the API defines
> NULL is unlikely to ever be an issue, but it also doesn't seem entirely neat..
>

Yes. It's been a while since this commit, but I recall we started with a
gzip-only implementation. PG 16 introduced this API, but it's definitely the
case it was based on the initial gzip code.

Regarding the Z_NULL, I believe it has always been ignored like this, at
least since 9.1. The code simply returns what gzgets() returns, and then
compares that to NULL, etc. Is there a better way to deal with Z_NULL?
I suppose we could explicitly check/translate Z_NULL to NULL, although
Z_NULL is simply defined as 0. I don't recall if NULL has some additional
magic.

regards

--
Tomas Vondra
On 6/16/25 19:56, Tom Lane wrote:
> ...
>
> After looking around a bit more I realized that this API is a complete
> disaster: it's not only bug-prone, but there are actual bugs in
> multiple callers, eg _ReadBuf() totally fails to detect early EOF as
> it intends to.  We need to fix it, not slap some band-aids on.
> Draft patch attached.
>
> The write_func side is a bit of a mess too.  I fixed some obvious API
> violations in the attached, but I think we need to do more, because
> it seems that zero thought was given to reporting errors sanely.
> The callers all assume that strerror() is what to use to report an
> incomplete write, but this is not appropriate in every case, for
> instance we need to pay attention to gzerror() in gzip cases.
> I'm inclined to think that the best answer is to move error reporting
> into the write_funcs, and just make the API spec be "write or die".
> I've not tackled that here though.
>
> I was depressed to find that Zstd_getc reads one byte into
> an otherwise-uninitialized int and then returns the whole int.
> Even if we're lucky enough for the variable to start off zero,
> this would fail utterly on big-endian machines.  The only
> reason we've not noticed is that the regression tests do not
> reach Zstd_getc, nor most of the other getc_func implementations.
>
> Now I'm afraid to look at the rest of the compress_io.h API.
> If it's as badly written as these parts, there are more bugs
> to be found.
>

Well, that doesn't sound great :-( Most of this is likely my fault, as
the one who pushed e9960732a961 (and some commits that built on it).
Sorry about that. I'll take a fresh look at the commits this week, but
considering I missed the issues before commit ...

For a moment I was worried about breaking ABI when fixing this in the
backbranches, but I guess that's not an issue for tools like pg_dump.

regards

--
Tomas Vondra
Tomas Vondra <tomas@vondra.me> writes:
> For a moment I was worried about breaking ABI when fixing this in the
> backbranches, but I guess that's not an issue for tools like pg_dump.

Yeah, I think it'd be okay to change compress_io.h APIs in the back
branches; it's hard to see how anything outside pg_dump+pg_restore
would be depending on that.

			regards, tom lane
> On 16 Jun 2025, at 21:45, Tomas Vondra <tomas@vondra.me> wrote:
> Regarding the Z_NULL, I believe it has always been ignored like this, at
> least since 9.1. The code simply returns what gzgets() returns, and then
> compares that to NULL, etc. Is there a better way to deal with
> Z_NULL? I suppose we could explicitly check/translate Z_NULL to NULL,
> although Z_NULL is simply defined as 0. I don't recall if NULL has some
> additional magic.

Right, to be clear, I don't think there is a bug here (or the risk of one
going forward).  It's just my own preference of not mixing API concepts.

--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
> I spent a little bit of time reading over all the implementations and cross
> referencing the API for conformity, and came up with the attached.  The 0001
> patch is the one from upstream, and each subsequent commit is fixing one
> function for all the implementations.  Before pushing it should all be squashed
> into a single commit IMHO.

Thanks for tackling this!  I looked over this patchset briefly,
and found a couple of nits:

v5-0002, in compress_io.h:

+	 * Returns true on success and throws error for all error conditions.

It doesn't return true anymore.  Should be more like

+	 * Returns nothing.  Exits via pg_fatal for all error conditions.

In LZ4Stream_write: you dropped the bit about

-		errno = (errno) ? errno : ENOSPC;

but I think that's still necessary: we must assume ENOSPC if fwrite
doesn't set errno.  Other fwrite callers (write_none, Zstd_write)
need this too.  v5-0004 has an instance too, in Zstd_close.

I did not check to see if other fwrite calls are OK, but it'd be
good to verify that they all follow the pattern of presetting
errno to 0 and then replacing that with ENOSPC.

			regards, tom lane
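The convention Tom refers to looks roughly like the following sketch (a
hypothetical wrapper; pg_dump itself reports through pg_fatal with %m rather
than perror): clear errno before calling fwrite(), and if the write comes up
short while errno is still zero, assume ENOSPC.

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void
    write_or_die(const void *buf, size_t len, FILE *fp)
    {
        errno = 0;              /* fwrite() may fail without setting errno */
        if (fwrite(buf, 1, len, fp) != len)
        {
            /* assume out-of-disk-space when libc left errno untouched */
            errno = errno ? errno : ENOSPC;
            perror("could not write to output file");
            exit(1);
        }
    }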
Daniel Gustafsson <daniel@yesql.se> writes:
> On 25 Jun 2025, at 17:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It doesn't return true anymore.  Should be more like
>> + * Returns nothing.  Exits via pg_fatal for all error conditions.

> Instead of this I changed the write_func signature to return the number of
> bytes written as size_t.  The API is loosely modelled around the libc stream API so
> going to void seemed less justifiable than size_t, even if the actual
> returnvalue is fairly useless due to erroring out via pg_fatal.

Hm.  My one concern about that is that using "void" ensures that the
compiler will catch any write_funcs or callsites that we missed
updating, whereas replacing bool with size_t probably prevents that
detection.

Of course this is now moot for the in-core code since we presumably
caught everything already.  But I wonder about patches (say to support
an additional compression method) that might be in the pipeline, or in
use in some local fork somewhere.  There's no certainty that they'd get
the word, especially since any tests that didn't exercise failure
conditions would still work.

So on the whole I prefer the "void" approach.  I'm not dead
set on that though, it's just a niggling worry.

			regards, tom lane
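Tom's compile-time point, sketched with hypothetical signatures: an un-updated
caller written against the old bool contract fails loudly against a
void-returning write_func but keeps compiling, with changed meaning, against a
size_t-returning one.

    #include <stddef.h>

    void   write_v(const void *ptr, size_t size);   /* proposed: void        */
    size_t write_s(const void *ptr, size_t size);   /* alternative: size_t   */

    /*
     * Old-style caller:
     *   if (!write_v(buf, len)) ...  -- compile error: void value not ignored
     *   if (!write_s(buf, len)) ...  -- compiles, but now tests "wrote 0 bytes"
     */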
> On 26 Jun 2025, at 20:01, Daniel Gustafsson <daniel@yesql.se> wrote:
>
>> On 26 Jun 2025, at 15:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> So on the whole I prefer the "void" approach.  I'm not dead
>> set on that though, it's just a niggling worry.
>
> I think the likelyhood of it being a problem in practice is pretty slim, but
> it's still a stronger argument than my "match an API we're still not aligned
> with".  The attached v7 reverts back to void return.

In preparing for concluding this I've attached a v8 which is the patchset in v7
squashed into a single commit with an attempt at a commit message.

This version has been tested against v17 and v16 where it applies and passes
all tests (the latter isn't as assuring as it should be since there is a lack
of testcoverage).

--
Daniel Gustafsson
On 7/1/25 16:24, Daniel Gustafsson wrote:
>> On 26 Jun 2025, at 20:01, Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>>> On 26 Jun 2025, at 15:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>> So on the whole I prefer the "void" approach.  I'm not dead
>>> set on that though, it's just a niggling worry.
>>
>> I think the likelyhood of it being a problem in practice is pretty slim, but
>> it's still a stronger argument than my "match an API we're still not aligned
>> with".  The attached v7 reverts back to void return.
>
> In preparing for concluding this I've attached a v8 which is the patchset in v7
> squashed into a single commit with an attempt at a commit message.
>

Thanks!

> This version has been tested against v17 and v16 where it applies and passes
> all tests (the latter isn't as assuring as it should be since there is a lack
> of testcoverage).
>

Could you elaborate what you mean by lack of test coverage? Doesn't
pg_dump have TAP tests exercising all compression methods? Perhaps it
does not exercise all parts of the code, and we could improve that?

regards

--
Tomas Vondra
> On 1 Jul 2025, at 17:11, Tomas Vondra <tomas@vondra.me> wrote:
> On 7/1/25 16:24, Daniel Gustafsson wrote:
>> This version has been tested against v17 and v16 where it applies and passes
>> all tests (the latter isn't as assuring as it should be since there is a lack
>> of testcoverage).
>
> Could you elaborate what you mean by lack of test coverage? Doesn't
> pg_dump have TAP tests exercising all compression methods? Perhaps it
> does not exercise all parts of the code, and we could improve that?

Sorry, I was unclear.  There are indeed lots of pg_dump tests and all
compression methods are tested (the 0% for Zstd on coverage.pg.org is due to
that instance not being compiled with zstd support) but there are functions
which evade testing, like getc_func.  It's also quite hard to test all the
error paths as that would require some sort of fault injection.

I don't think the current coverage is cause for holding back this patch, but
there is room for improvement to be made in the coming cycle.

--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
> In preparing for concluding this I've attached a v8 which is the patchset in v7
> squashed into a single commit with an attempt at a commit message.

Glancing through this, I observe a couple of minor typos:

+	 * Returns number of bytes read (this might be less than 'size' if EOF was
+	 * reached). Exits via pg_fatal all for error conditions.

s/all for/for all/

+			pg_fatal("coud not write to file: %m");

s/coud/could/

There are some minor typos in the proposed commit message, too.

			regards, tom lane
Daniel Gustafsson <daniel@yesql.se> writes:
>> On 1 Jul 2025, at 17:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are some minor typos in the proposed commit message, too.

> It seems that my grasp of the english language went on holiday before I did.
> The attached v9 is hopefully less terrible.

This thread seems to have faded away, which I fear is my fault.
I took another look through v9, and it seems fine.  Do you want
to deal with pushing it, or shall I?

			regards, tom lane
> On 24 Aug 2025, at 23:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>>> On 1 Jul 2025, at 17:42, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> There are some minor typos in the proposed commit message, too.
>
>> It seems that my grasp of the english language went on holiday before I did.
>> The attached v9 is hopefully less terrible.
>
> This thread seems to have faded away, which I fear is my fault.
> I took another look through v9, and it seems fine.  Do you want
> to deal with pushing it, or shall I?

Thanks for the review, this one had admittedly slipped my mind.  I'll work on
getting it in at the start of the week.

--
Daniel Gustafsson
> On 24 Aug 2025, at 23:23, Daniel Gustafsson <daniel@yesql.se> wrote:
> I'll work on getting it in at the start of the week.

While not exactly the start of the week, this is now done.

--
Daniel Gustafsson