Belyi, Anton
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
McKinzie, Brandon, Gan, Zhe, Fauconnier, Jean-Philippe, Dodge, Sam, Zhang, Bowen, Dufter, Philipp, Shah, Dhruti, Du, Xianzhi, Peng, Futang, Weers, Floris, Belyi, Anton, Zhang, Haotian, Singh, Karanjeet, Kang, Doug, Jain, Ankur, Hè, Hongyu, Schwarzer, Max, Gunter, Tom, Kong, Xiang, Zhang, Aonan, Wang, Jianyu, Wang, Chong, Du, Nan, Lei, Tao, Wiseman, Sam, Yin, Guoli, Lee, Mark, Wang, Zirui, Pang, Ruoming, Grasch, Peter, Toshev, Alexander, Yang, Yinfei
In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision-language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that, for large-scale multimodal pre-training, using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published multimodal pre-training results. Further, we show that the image encoder, together with the image resolution and the image token count, has a substantial impact, while the vision-language connector design is of comparatively negligible importance. By scaling up the presented recipe, we build MM1, a family of multimodal models, including both dense variants up to 30B and mixture-of-experts (MoE) variants up to 64B, that are SOTA in pre-training metrics and achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks. Thanks to large-scale pre-training, MM1 enjoys appealing properties such as enhanced in-context learning and multi-image reasoning, enabling few-shot chain-of-thought prompting.
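To make the data-mixing idea concrete, the following is a minimal, hypothetical sketch of how a pre-training batch sampler might combine the three data sources named in the abstract. The mixing weights, class names, and function names are illustrative assumptions for this sketch, not the recipe reported in the MM1 paper.

```python
import random

# Illustrative mixing weights for the three pre-training data sources
# discussed in the abstract. These numbers are assumptions for the sketch,
# not ratios taken from the MM1 paper.
MIXTURE_WEIGHTS = {
    "image_caption": 0.45,            # paired image-caption data
    "interleaved_image_text": 0.45,   # documents with interleaved images and text
    "text_only": 0.10,                # plain text, to preserve language-only ability
}

def sample_source(weights: dict[str, float]) -> str:
    """Pick a data source for the next example, proportional to its weight."""
    names, probs = zip(*weights.items())
    return random.choices(names, weights=probs, k=1)[0]

def build_batch(datasets: dict[str, list], batch_size: int) -> list:
    """Assemble one mixed batch by sampling a source, then an example from it."""
    batch = []
    for _ in range(batch_size):
        source = sample_source(MIXTURE_WEIGHTS)
        batch.append(random.choice(datasets[source]))
    return batch
```

In practice the mixture would be tuned by ablation, which is exactly the kind of study the abstract describes; the sketch only shows where such ratios would plug in.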
Open Domain Knowledge Extraction for Knowledge Graphs
Qian, Kun, Belyi, Anton, Wu, Fei, Khorshidi, Samira, Nikfarjam, Azadeh, Khot, Rahul, Sang, Yisi, Luna, Katherine, Chu, Xianqi, Choi, Eric, Govind, Yash, Seivwright, Chloe, Sun, Yiwen, Fakhry, Ahmed, Rekatsinas, Theo, Ilyas, Ihab, Qi, Xiaoguang, Li, Yunyao
The quality of a knowledge graph directly impacts the quality of downstream applications (e.g., the number of questions answerable using the graph). One ongoing challenge when building a knowledge graph is ensuring the completeness and freshness of the graph's entities and facts. In this paper, we introduce ODKE, a scalable and extensible framework that sources high-quality entities and facts from the open web at scale. ODKE utilizes a wide range of extraction models and supports both streaming and batch processing at different latencies. We reflect on the challenges and design decisions made, and share lessons learned when building and deploying ODKE to grow an industry-scale open-domain knowledge graph.
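As a rough illustration of the "extensible framework with pluggable extraction models" described above, here is a hypothetical sketch of an extractor interface and a batch-mode driver over a simple fact-triple representation. The class and method names, the confidence threshold, and the triple schema are assumptions for this sketch and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol

@dataclass
class Fact:
    """A candidate (subject, predicate, object) triple with provenance and confidence."""
    subject: str
    predicate: str
    obj: str
    source_url: str
    confidence: float

class Extractor(Protocol):
    """Interface a concrete extraction model would implement (names are illustrative)."""
    def extract(self, document: str, source_url: str) -> Iterable[Fact]: ...

def run_batch(extractors: list[Extractor],
              documents: list[tuple[str, str]],
              min_confidence: float = 0.8) -> list[Fact]:
    """Batch mode: run every registered extractor over a corpus of (text, url)
    pairs and keep only high-confidence facts for downstream ingestion."""
    accepted = []
    for text, url in documents:
        for extractor in extractors:
            for fact in extractor.extract(text, url):
                if fact.confidence >= min_confidence:
                    accepted.append(fact)
    return accepted
```

A streaming deployment would apply the same extractor interface to documents as they arrive rather than over a stored corpus; the interface itself is what makes the framework extensible.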
FLEEK: Factual Error Detection and Correction with Evidence Retrieved from External Knowledge
Bayat, Farima Fatahi, Qian, Kun, Han, Benjamin, Sang, Yisi, Belyi, Anton, Khorshidi, Samira, Wu, Fei, Ilyas, Ihab F., Li, Yunyao
Detecting factual errors in textual information, whether generated by large language models (LLMs) or curated by humans, is crucial for making informed decisions. LLMs' inability to attribute their claims to external knowledge and their tendency to hallucinate make it difficult to rely on their responses. Humans, too, are prone to factual errors in their writing. Since manual detection and correction of factual errors is labor-intensive, developing an automatic approach can greatly reduce human effort. We present FLEEK, a prototype tool that automatically extracts factual claims from text, gathers evidence from external knowledge sources, evaluates the factuality of each claim, and suggests revisions for identified errors using the collected evidence. Initial empirical evaluation on fact error detection (77-85% F1) shows the potential of FLEEK. A video demo of FLEEK can be found at https://youtu.be/NapJFUlkPdQ.
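To clarify the pipeline stages named in the abstract (claim extraction, evidence retrieval, factuality evaluation, revision suggestion), here is a hypothetical end-to-end skeleton of a fact-checking pipeline in that style. The function names, verdict labels, and stubbed stages are assumptions for this sketch, not FLEEK's actual components or API.

```python
from dataclasses import dataclass, field

@dataclass
class ClaimCheck:
    """One factual claim together with retrieved evidence and a verdict."""
    claim: str
    evidence: list[str] = field(default_factory=list)
    verdict: str = "unverified"        # e.g. "supported", "refuted", "unverified"
    suggested_revision: str | None = None

def extract_claims(text: str) -> list[str]:
    """Stage 1: split the input into individual factual claims (placeholder: one per sentence)."""
    return [s.strip() for s in text.split(".") if s.strip()]

def retrieve_evidence(claim: str) -> list[str]:
    """Stage 2: query external knowledge sources for evidence.
    Stubbed here; a real system would call a knowledge graph or web retriever."""
    return []

def verify(claim: str, evidence: list[str]) -> str:
    """Stage 3: compare the claim against the evidence and return a verdict (stubbed)."""
    return "unverified" if not evidence else "supported"

def fact_check(text: str) -> list[ClaimCheck]:
    """Run the full pipeline over a piece of text, one ClaimCheck per extracted claim."""
    results = []
    for claim in extract_claims(text):
        evidence = retrieve_evidence(claim)
        results.append(ClaimCheck(claim=claim,
                                  evidence=evidence,
                                  verdict=verify(claim, evidence)))
    return results
```

Stage 4 (suggesting a revision from the collected evidence) would populate `suggested_revision` for claims whose verdict indicates an error; it is omitted here because it depends on the retrieval and verification back ends.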