ParaText: CSV parsing at 2.5 GB per second
For almost 50 years, CSV has been the format of choice for tabular data. Given the ubiquity of CSV and the pervasive need to handle it in real workflows, where speed, accuracy, and fault tolerance are a must, we decided to build a CSV reader that runs in parallel. We benchmarked ParaText extensively against 7 CSV readers and 5 binary readers; please refer to our benchmarking whitepaper for details. In our tests, ParaText loads a CSV file from a cold disk at 2.5 GB/second, and out-of-core from a warm disk at 4.2 GB/second.
Jun-8-2016, 03:20:16 GMT