

AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine

Siebenschuh, Carlo, Hippe, Kyle, Gokdemir, Ozan, Brace, Alexander, Khan, Arham, Hossain, Khalid, Babuji, Yadu, Chia, Nicholas, Vishwanath, Venkatram, Stevens, Rick, Ramanathan, Arvind, Foster, Ian, Underwood, Robert

arXiv.org Artificial Intelligence

Language models for scientific tasks are trained on text from scientific publications, most distributed as PDFs that require parsing. PDF parsing approaches range from inexpensive heuristics (for simple documents) to computationally intensive ML-driven systems (for complex or degraded ones). The choice of the "best" parser for a particular document depends on its computational cost and the accuracy of its output. To address these issues, we introduce an Adaptive Parallel PDF Parsing and Resource Scaling Engine (AdaParse), a data-driven strategy for assigning an appropriate parser to each document. We enlist scientists to select preferred parser outputs and incorporate this information through direct preference optimization (DPO) into AdaParse, thereby aligning its selection process with human judgment. AdaParse then incorporates hardware requirements and predicted accuracy of each parser to orchestrate computational resources efficiently for large-scale parsing campaigns. We demonstrate that AdaParse, when compared to state-of-the-art parsers, improves throughput by $17\times$ while still achieving comparable accuracy (0.2 percent better) on a benchmark set of 1000 scientific documents. AdaParse's combination of high accuracy and parallel scalability makes it feasible to parse large-scale scientific document corpora to support the development of high-quality, trillion-token-scale text datasets. The implementation is available at https://github.com/7shoe/AdaParse/
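The core idea above, assigning each document the cheapest parser whose predicted accuracy is good enough, can be sketched as a simple utility maximization. This is an illustrative sketch only: the parser names, costs, and accuracy numbers below are made-up placeholders, and AdaParse's actual DPO-trained selector is far more sophisticated.

```python
# Illustrative sketch of cost-aware parser selection in the spirit of
# AdaParse: per document, pick the parser whose predicted accuracy best
# justifies its compute cost. All names and numbers are placeholders,
# not values from the paper.

PARSERS = {
    # name: (relative compute cost, accuracy on clean born-digital PDFs)
    "cheap_heuristic": (1.0, 0.95),
    "layout_parser": (5.0, 0.93),
    "ml_ocr": (40.0, 0.97),
}

def select_parser(difficulty: float, lam: float = 0.002) -> str:
    """Pick the parser maximizing predicted accuracy minus a cost penalty.

    difficulty in [0, 1]: 0 = clean PDF, 1 = scanned/degraded document.
    lam trades off accuracy against relative compute cost.
    """
    best, best_utility = None, float("-inf")
    for name, (cost, base_acc) in PARSERS.items():
        # Cheap parsers degrade sharply on difficult documents, while
        # heavyweight ML parsers degrade only mildly (illustrative model).
        drop = 0.30 if cost < 10 else 0.05
        utility = (base_acc - difficulty * drop) - lam * cost
        if utility > best_utility:
            best, best_utility = name, utility
    return best
```

Under this toy model, an easy document is routed to the cheap heuristic, while a degraded scan justifies the expensive ML parser, which is the throughput-versus-accuracy tradeoff the abstract describes.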


LOCR: Location-Guided Transformer for Optical Character Recognition

Sun, Yu, Zhou, Dongzhan, Lin, Chen, He, Conghui, Ouyang, Wanli, Zhong, Han-Sen

arXiv.org Artificial Intelligence

Academic documents are packed with texts, equations, tables, and figures, requiring comprehensive understanding for accurate Optical Character Recognition (OCR). While end-to-end OCR methods offer improved accuracy over layout-based approaches, they often grapple with significant repetition issues, especially with complex layouts in Out-Of-Domain (OOD) documents. To tackle this issue, we propose LOCR, a model that integrates location guiding into the transformer architecture during autoregression. We train the model on a dataset comprising over 77M text-location pairs from 125K academic document pages, including bounding boxes for words, tables and mathematical symbols. LOCR adeptly handles various formatting elements and generates content in Markdown language. It outperforms all existing methods in our test set constructed from arXiv, as measured by edit distance, BLEU, METEOR and F-measure. LOCR also reduces repetition frequency from 4.4% of pages to 0.5% in the arXiv dataset, from 13.2% to 1.3% in OOD quantum physics documents and from 8.1% to 1.8% in OOD marketing documents. Additionally, LOCR features an interactive OCR mode, facilitating the generation of complex documents through a few location prompts from humans.
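The arXiv comparison above relies on string-similarity metrics between OCR output and ground truth. A minimal sketch of one of them, character-level normalized edit (Levenshtein) distance via dynamic programming, is shown below; this is a generic implementation, not the paper's evaluation code.

```python
# Generic normalized edit-distance metric of the kind used to score OCR
# output against ground-truth text (lower is better).

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row-by-row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def normalized_edit_distance(pred: str, truth: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    if not pred and not truth:
        return 0.0
    return edit_distance(pred, truth) / max(len(pred), len(truth))
```

For example, a perfect transcription scores 0.0, while `edit_distance("kitten", "sitting")` is 3 (two substitutions and one insertion).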


Online Centralized Non-parametric Change-point Detection via Graph-based Likelihood-ratio Estimation

de la Concha, Alejandro, Kalogeratos, Argyris, Vayatis, Nicolas

arXiv.org Artificial Intelligence

Consider each node of a graph to be generating a data stream that is synchronized and observed in near real-time. At a change-point $\tau$, a change occurs at a subset of nodes $C$, which affects the probability distribution of their associated node streams. In this paper, we propose a novel kernel-based method to both detect $\tau$ and localize $C$, based on the direct estimation of the likelihood-ratio between the post-change and the pre-change distributions of the node streams. Our main working hypothesis is the smoothness of the likelihood-ratio estimates over the graph, i.e., connected nodes are expected to have similar likelihood-ratios. The quality of the proposed method is demonstrated through extensive experiments on synthetic scenarios.
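The general idea of scoring candidate change-points with a likelihood ratio can be illustrated on a single stream. The sketch below is not the paper's estimator: it substitutes a simple parametric Gaussian mean-shift model for the kernel-based non-parametric likelihood-ratio estimation, purely to show the scan-and-score structure.

```python
# Toy single-stream change-point scan: score every candidate split tau
# with a plug-in log-likelihood-ratio under an i.i.d. Gaussian model
# (unit variance, possibly different means before and after tau).
# This parametric model is an illustrative stand-in, not the paper's
# kernel-based likelihood-ratio estimator.

def gaussian_llr(stream, tau):
    """Log-likelihood gain from fitting two means (split at tau)
    instead of one mean to the whole stream."""
    pre, post = stream[:tau], stream[tau:]
    m_all = sum(stream) / len(stream)
    m_pre = sum(pre) / len(pre)
    m_post = sum(post) / len(post)

    def sse(xs, m):
        return sum((x - m) ** 2 for x in xs)

    return 0.5 * (sse(stream, m_all) - sse(pre, m_pre) - sse(post, m_post))

def detect_change(stream):
    """Return the tau maximizing the split score (min segment length 2)."""
    taus = range(2, len(stream) - 1)
    return max(taus, key=lambda t: gaussian_llr(stream, t))
```

In the graph setting described above, such per-node scores would additionally be smoothed across connected nodes, reflecting the working hypothesis that neighboring nodes share similar likelihood-ratios.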


Samsung's Bixby Works On Galaxy S7, Older Devices Running Android 7.0 Nougat

International Business Times

When Samsung unveiled the Galaxy S8 and S8+ to the public last Wednesday, it also introduced its new AI assistant, called Bixby. The Siri rival basked in the spotlight as Samsung officials talked about what makes it stand out and differ from other smart assistants during the Unpacked event. Bixby is rumored to come to other devices, but it's hard to imagine how it would be implemented on older devices that do not have the dedicated hardware button found on the new flagship phones. Interestingly, some developers have successfully deployed Bixby on older Galaxy smartphones, especially on the 1-year-old Galaxy S7. On the Android website XDA-Developers, software developers shared that they got Bixby working on their Android smartphones with the help of Bixby's Android Package Kit (APK), which was shared by user takerhbk.


Google Assistant Isn't Coming To Android Tablets Anytime Soon

International Business Times

Google Assistant, Google's own AI digital assistant, is now rolling out to several smartphones running Android Marshmallow and Nougat. However, it looks like the company's digital assistant won't be making its way to tablets. Earlier this month, Google Assistant started rolling out to older, non-Pixel smartphones. The company also made it clear on its blog that the service will be available on "Android phones" and "smartphones running Android 7.0 Nougat and Android 6.0 Marshmallow." Android Police has now reached out to Google regarding the availability of Assistant on tablets, and this was the company's response: "The Assistant will be available on Android Marshmallow and Nougat phones with Google Play Services, this does not include tablets."


Android Circuit: New Galaxy S8 Leaks, Android Biggest Success In 2016, New Google Pixel Problem

Forbes - Tech

Taking a look back at seven days of news and headlines across the world of Android, this week's Android Circuit includes a new voice for the Galaxy S8, the return of the S-Pen, Pixel power problems, Android's battery win, the shutdown of Cyanogen, WileyFox's quick change to Nougat, a North Korean Android tablet's spyware, and Super Mario Run preparing for its Android arrival. Android Circuit is here to remind you of a few of the many things that have happened around Android in the last week (and you can find the weekly Apple news digest here). The Samsung Galaxy S8 could be picking up a new tool named Bixby, a voice-powered digital assistant along the lines of Siri and Google Assistant. Viv Labs is the company behind the technology, and Samsung recently acquired it, so it makes sense for the South Korean firm to stake its claim in this space. But will that upset Google?


Google AI's photo recognition achieves 94 percent accuracy

#artificialintelligence

We've all enjoyed the simple benefits of Google's artificial intelligence photo recognition. Google Photos employs a very stripped down version of the algorithm to identify pictures as containing cats, dogs, food, or specific people. However, the search giant has been working on much more advanced photo recognition capabilities, and today they've released their progress to developers. The Google Research Blog reports that the Google Brain team's AI image captioning system has achieved a 93.9 percent accuracy rating. Their results in 2014 used the Inception V1 image classification model and achieved 89.6 percent accuracy.
