Re: Storing files: 2.3TBytes, 17M file count
| From | Thomas Güttler |
|---|---|
| Subject | Re: Storing files: 2.3TBytes, 17M file count |
| Date | |
| Msg-id | f3f83bb0-1031-ace9-b6ed-bec74099a793@thomas-guettler.de |
| In response to | Re: Storing files: 2.3TBytes, 17M file count ("Mike Sofen" <msofen@runbox.com>) |
| Responses | Re: Storing files: 2.3TBytes, 17M file count (3 replies) |
| List | pgsql-general |
On 29.11.2016 at 01:52, Mike Sofen wrote:

> From: Thomas Güttler, Sent: Monday, November 28, 2016 6:28 AM
>
> ...I have 2.3TBytes of files. File count is 17M.
>
> Since we already store our structured data in Postgres, I think about storing the files in PostgreSQL, too.
>
> Is it feasible to store files in PostgreSQL?
>
> -------
>
> I am doing something similar, but in reverse. The legacy mysql databases I'm converting into a modern Postgres data model have very large genomic strings stored in 3 separate columns. Out of the 25 TB of legacy data storage (in 800 dbs across 4 servers, about 22 billion rows), those 3 columns consume 90% of the total space, and they are only used for reference, never in searches or calculations. They range from 1 kB to several MB.
>
> Since I am collapsing all 800 dbs into a single PG db, being very smart about storage was critical. Since we're also migrating everything to AWS, we're placing those 3 strings (per row) into a single JSON document and storing the document in S3 bins, with the pointer to the file being the globally unique PK for the row... super simple. The app tier knows to fetch the row from the db and the large-string JSON from S3. Retrieval is surprisingly fast, and this is all real-time web app stuff.
>
> This is a model that could work for anyone dealing with large objects (text or binary). The nice part is that the original 25 TB of data storage drops to 5 TB, a much more manageable number that leaves room for the significant growth which is on the horizon.

Thank you Mike for your feedback. Yes, I think I will drop my idea. Encoding binary (the file content) to text and decoding it back to binary again makes no sense; I was not aware that this would be needed.

I guess I will use some key-to-blob store like S3. AFAIK there are open source S3 implementations available.

Thank you all for your feedback!

Regards,
Thomas

--
Thomas Guettler http://www.thomas-guettler.de/
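For readers finding this thread later: below is a minimal sketch of the pattern Mike describes, with the row's globally unique PK doubling as the S3 object key and the large payload kept out of Postgres. This is not Mike's actual code; the `samples` table, `genomic-strings` bucket, and column names are hypothetical, and it assumes the boto3 and psycopg2 libraries.

```python
# Sketch of the "pointer in Postgres, payload in S3" pattern.
# Table/bucket names are hypothetical; assumes boto3 and psycopg2.
import json
import uuid

import boto3
import psycopg2

s3 = boto3.client("s3")
BUCKET = "genomic-strings"  # hypothetical bucket name

conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN

def store_row(seq_a, seq_b, seq_c):
    """Write the large strings to S3 as one JSON document,
    keyed by the new row's PK, then insert the row itself."""
    pk = str(uuid.uuid4())  # globally unique PK, as in Mike's scheme
    doc = json.dumps({"seq_a": seq_a, "seq_b": seq_b, "seq_c": seq_c})
    s3.put_object(Bucket=BUCKET, Key=pk, Body=doc.encode("utf-8"))
    with conn, conn.cursor() as cur:
        cur.execute("INSERT INTO samples (id) VALUES (%s)", (pk,))
    return pk

def fetch_row(pk):
    """Fetch the row from Postgres, then the payload from S3
    using the PK as the object key."""
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id FROM samples WHERE id = %s", (pk,))
        row = cur.fetchone()
    obj = s3.get_object(Bucket=BUCKET, Key=pk)
    return row, json.loads(obj["Body"].read())
```

The same client code works against self-hosted S3-compatible stores (the "open source S3 implementations" mentioned above) by pointing `boto3.client` at a different endpoint URL.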