[Discussion] An OpenWebText equivalent for papers on arXiv and other pre-print websites?

Jul-27-2020, 18:01:30 GMT–#artificialintelligence

I was wondering whether there are any ongoing efforts to build a dataset like OpenWebText, but for academic pre-print repositories such as arXiv and SSRN. Of course, having the final versions of papers as published in journals or conferences would be best, but those may prove harder to get. Then again, most authors put the final version of their paper on the pre-print sites anyway. I was thinking about how a model like GPT-3, trained on domain knowledge from these pre-print sites, could help surface deep connections during the writing process. Imagine giving GPT-4 the LaTeX code for your table and having it produce a discussion of the results and draw insights by comparing your numbers with similar numbers reported in the literature.



