Re: Storing large documents - one table or partition by doc?
От | Jim Nasby |
---|---|
Тема | Re: Storing large documents - one table or partition by doc? |
Дата | |
Msg-id | 649c3d87-0df1-a8e2-49da-bc200bc16e86@BlueTreble.com обсуждение исходный текст |
Ответ на | Re: Storing large documents - one table or partition by doc? (Dev Nop <devnop0@gmail.com>) |
Список | pgsql-performance |
On 9/24/16 6:33 AM, Dev Nop wrote: > This means that the applications are sensitive to the size of ids. A > previous incarnation used GUIDs which was a brutal overhead for large > documents. If GUIDs *stored in a binary format* were too large, then you won't be terribly happy with the 24 byte per-row overhead in Postgres. What I would look into at this point is using int ranges and arrays to greatly reduce your overhead: CREATE TABLE ...( document_version_id int NOT NULL REFERENCES document_version , document_line_range int4range NOT NULL , document_lines text[] NOT NULL , EXCLUDE USING gist( document_version_id =, document_line_range && ) ); That allows you to store the lines of a document as an array of values, ie: INSERT INTO ... VALUES( 1 , '[11-15]' , '[11:15]={line11,line12,line13,line14,line15}' ); Note that I'm using explicit array bounds syntax to make the array bounds match the line numbers. I'm not sure that's a great idea, but it is possible. > My nightmares are of a future filled with hours of down-time caused by > struggling to restore a gargantuan table from a backup due to a problem > with just one tiny document or schema changes that require disconnecting > all clients for hours when instead I could ignore best practice, create > 10k tables and process them iteratively and live in a utopia where I > never have 100% downtime only per document unavailability. At some size you'd certainly want partitioning. The good news is that you can mostly hide partitioning from the application and other database logic, so there's not a lot of incentive to set it up immediately. You can always do that after the fact. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
В списке pgsql-performance по дате отправления: