Re: announce: spark-postgres 3 released
| From | Adrian Klaver |
|---|---|
| Subject | Re: announce: spark-postgres 3 released |
| Date | |
| Msg-id | fe7e6d7b-00f8-3de6-8eec-231932277179@aklaver.com |
| In response to | announce: spark-postgres 3 released (Nicolas Paris <nicolas.paris@riseup.net>) |
| List | pgsql-general |
On 11/10/19 4:05 PM, Nicolas Paris wrote:
> Hello postgres users,

Interesting. FYI, the announcement list is:

https://www.postgresql.org/list/pgsql-announce/

>
> Spark-postgres is designed for reliable and performant ETL in big-data
> workload and offers read/write/scd capability to better bridge spark and
> postgres. The version 3 introduces a datasource API. It outperforms
> sqoop by factor 8 and the apache spark core jdbc by infinity.
>
> Features:
> - use of pg COPY statements
> - parallel reads/writes
> - use of hdfs to store intermediary csv
> - reindex after bulk-loading
> - SCD1 computations done on the spark side
> - use unlogged tables when needed
> - handle arrays and multiline string columns
> - useful jdbc functions (ddl, updates...)
>
> The official repository:
> https://framagit.org/parisni/spark-etl/tree/master/spark-postgres
>
> And its mirror on microsoft github:
> https://github.com/EDS-APHP/spark-etl/tree/master/spark-postgres
>

--
Adrian Klaver
adrian.klaver@aklaver.com