Re: Streaming I/O, vectored I/O (WIP)

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Streaming I/O, vectored I/O (WIP)
Дата
Msg-id 4533e76e-9519-4715-acd0-d4fa552619b0@iki.fi
обсуждение исходный текст
Ответ на Re: Streaming I/O, vectored I/O (WIP)  (Thomas Munro <thomas.munro@gmail.com>)
Ответы Re: Streaming I/O, vectored I/O (WIP)  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-hackers
On 09/12/2023 02:41, Thomas Munro wrote:
> On Sat, Dec 9, 2023 at 7:25 AM Andres Freund <andres@anarazel.de> wrote:
>> On 2023-11-30 13:01:46 +1300, Thomas Munro wrote:
>>> On Thu, Nov 30, 2023 at 12:16 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>>> Maybe we should bite the bullet and always retry short writes in
>>>> FileWriteV(). Is that what you meant by "handling them"?
>>>> If the total size is expensive to calculate, how about passing it as an
>>>> extra argument? Presumably it is cheap for the callers to calculate at
>>>> the same time that they build the iovec array?
> 
>>> There is another problem with pushing it down to fd.c, though.
>>> Suppose you try to write 8192 bytes, and the kernel says "you wrote
>>> 4096 bytes" so your loop goes around again with the second half the
>>> data and now the kernel says "-1, ENOSPC".  What are you going to do?
>>> fd.c doesn't raise errors for I/O failure, it fails with -1 and errno,
>>> so you'd either have to return -1, ENOSPC (converting short writes
>>> into actual errors, a lie because you did write some data), or return
>>> 4096 (and possibly also set errno = ENOSPC as we have always done).
>>> So you can't really handle this problem at this level, can you?
>>> Unless you decide that fd.c should get into the business of raising
>>> errors for I/O failures, which would be a bit of a departure.
>>>
>>> That's why I did the retry higher up in md.c.
>>
>> I think that's the right call. I think for AIO we can't do retry handling
>> purely in fd.c, or at least it'd be quite awkward. It doesn't seem like it'd
>> buy us that much in md.c anyway, we still need to handle the cross segment
>> case and such, from what I can tell?
> 
> Heikki, what do you think about this:  we could go with the v3 fd.c
> and md.c patches, but move adjust_iovec_for_partial_transfer() into
> src/common/file_utils.c, so that at least that slightly annoying part
> of the job is available for re-use by future code that faces the same
> problem?

Ok, works for me.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Junwang Zhao
Дата:
Сообщение: Re: Make COPY format extendable: Extract COPY TO format implementations
Следующее
От: Hannu Krosing
Дата:
Сообщение: Why are wal_keep_size, max_slot_wal_keep_size requiring server restart?