[Discussion] An OpenWebText equivalent for papers on arXiv and other pre-print websites?
I was wondering whether there are any ongoing efforts to build a dataset like OpenWebText, but for academic pre-print repositories such as arXiv and SSRN. Of course, having the final versions of papers as published in journals or conferences would be best, but those may prove harder to get. Then again, most authors post the final version of their paper on the pre-print sites anyway. I was thinking about how a model like GPT-3, trained on domain knowledge from these pre-print sites, could help surface deep connections during the writing process. Imagine giving GPT-4 the LaTeX code for your table and having it produce a discussion of the results, drawing connections between your numbers and similar numbers reported in the literature.
Jul-27-2020, 18:01:30 GMT