Re: [HACKERS] fatal copy in/out error (6.5.3)
From | Tatsuo Ishii |
---|---|
Subject | Re: [HACKERS] fatal copy in/out error (6.5.3) |
Date | |
Msg-id | 20000126102351B.t-ishii@sra.co.jp |
In reply to | Re: [HACKERS] fatal copy in/out error (6.5.3) (Michael Robinson <robinson@netrinsics.com>) |
List | pgsql-hackers |
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> >Yes, it's not a PostgreSQL's business but is a really big problem in
> >the real world. Maybe some HTML gurus might have good suggestions on
> >these issues (something like using a language tag?)
>
> The only solution is defensive programming. Even if there were a standard
> that everyone followed, if malicious people could break things by not
> following the standard, then you can be certain that somebody would do so.

Defensive programming saves the system but does not save the user. Once
corrupted data is stored in the system, it's totally useless for the user
anyway.

What about validating data *before* inserting it into a table? You expect
EUC_CN data, and in most cases it should be possible to determine whether
the data is valid by doing some simple checking. Maybe I could provide a
new libpq function, something like:

    bool pg_validate_euc_cn(const unsigned char *euc_str);

If it returns false, then euc_str is not valid EUC_CN. So you show a
message: "Sorry, but we only accept EUC_CN data. Please try another input
method..." or jump to other pages for EUC_TW or Big5 or whatever...

Of course the function does not guarantee that the string is 100% correct
EUC_CN (on the other hand, it can tell that a string is not valid) because:

1) there is a chance that, for example, an EUC_CN string and an EUC_JP
   string have the same bit patterns by accident.

2) I do not have enough information to implement it perfectly. At this
   point I could only perform minimal checking. However, it can be a good
   starting point for someone who has more knowledge (on the other hand, I
   could implement pg_validate_euc_jp in a much better way, since I have
   precise info for EUC_JP).

> >Here it is. With this patch, copy out should be happy even with the
> >wrong data. I'm not sure if it could be displayed correctly, though.
>
> Thank you very much. However, I think even this is too optimistic:
>
> >! if (*s & 0x80)
>
> Shouldn't it be something like:
>
> if ((*s & 0x80) && (*(s+1) & 0x80))
>
> Even though "\242\242\242\0" is an invalid EUC sequence, it still shouldn't be
> allowed to break the software.

Thanks for the suggestion. More robust code is always good.
--
Tatsuo Ishii
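
For illustration only, here is a minimal sketch of what the proposed pg_validate_euc_cn might look like. It is not an existing libpq function, and the helper is_euc_cn_byte is a hypothetical name. The sketch assumes the usual EUC_CN layout for GB 2312: 7-bit ASCII as single bytes, and every other character as two bytes that both fall in the range 0xA1 to 0xFE. As the message says, passing this check does not prove the text is really EUC_CN; it only rejects byte sequences that cannot be.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if c can be either byte of a 2-byte EUC_CN char. */
static bool is_euc_cn_byte(unsigned char c)
{
    return c >= 0xA1 && c <= 0xFE;
}

/*
 * Sketch of the proposed validator (an assumption, not real libpq API).
 * Returns false as soon as a byte sequence cannot be EUC_CN.
 */
bool pg_validate_euc_cn(const unsigned char *euc_str)
{
    const unsigned char *s = euc_str;

    if (s == NULL)
        return false;

    while (*s != '\0')
    {
        if ((*s & 0x80) == 0)
        {
            s++;                /* plain 7-bit ASCII byte */
            continue;
        }

        /*
         * High bit set: this must be the lead byte of a 2-byte character.
         * The trail byte is checked before it is consumed, so a dangling
         * lead byte at the end of the string (as in "\242\242\242") is
         * rejected instead of stepping over the terminating NUL.
         */
        if (!is_euc_cn_byte(s[0]) || !is_euc_cn_byte(s[1]))
            return false;

        s += 2;
    }

    return true;
}
```

Checking the trail byte before advancing is the same idea as the `(*s & 0x80) && (*(s+1) & 0x80)` test suggested in the quoted text, tightened to the assumed EUC_CN byte range.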