Chevalier, Alexis
TEDDY: A Family Of Foundation Models For Understanding Single Cell Biology
Chevalier, Alexis, Ghosh, Soumya, Awasthi, Urvi, Watkins, James, Bieniewska, Julia, Mitrea, Nichita, Kotova, Olga, Shkura, Kirill, Noble, Andrew, Steinbaugh, Michael, Delile, Julien, Meier, Christoph, Zhukov, Leonid, Khalil, Iya, Mukherjee, Srayanta, Mueller, Judith
The complexity of cell biology and the mechanisms of disease pathogenesis are driven by an intricate regulatory network of genes [Chatterjee and Ahituv, 2017, Theodoris et al., 2015, 2021]. A higher-resolution view of this complex interactome would enhance our ability to design drugs that target the causal mechanisms of disease rather than interventions that merely modulate downstream effects [Ding et al., 2022]. However, accurate inference of gene regulatory networks is challenging. The space of possible genetic interactions is vast [Bunne et al., 2024], and the networks to be inferred are highly context-dependent: different cell types and tissue types exhibit different regulatory networks, with significant variation across donors [Chen and Dahl, 2024]. Moreover, the data required to study gene regulatory networks for a specific disease are usually limited and highly specialized, and often plagued by experimental artifacts [Hicks et al., 2018]. Yet a confluence of recent technological advances promises to make this challenging problem more tractable. Accurate single-cell sequencing technologies remove the artifacts of bulk-cell data, better reflect natural variability, and provide signals at higher resolution. This, together with the increasing availability of atlas-scale scRNAseq datasets spanning an extensive range of diseases, cell types, tissue types, and donors, provides an unprecedented opportunity for studying disease mechanisms at scale.
Language Models as Science Tutors
Chevalier, Alexis, Geng, Jiayi, Wettig, Alexander, Chen, Howard, Mizera, Sebastian, Annala, Toni, Aragon, Max Jameson, Fanlo, Arturo Rodríguez, Frieder, Simon, Machado, Simon, Prabhakar, Akshara, Thieu, Ellie, Wang, Jiachen T., Wang, Zirui, Wu, Xindi, Xia, Mengzhou, Jia, Wenhan, Yu, Jiatong, Zhu, Jun-Jie, Ren, Zhiyong Jason, Arora, Sanjeev, Chen, Danqi
NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use cases of LMs for science, including applications in education that require processing long scientific documents. To address this, we introduce TutorEval and TutorChat. TutorEval is a diverse question-answering benchmark consisting of expert-written questions about long chapters from STEM textbooks. TutorEval helps measure the real-life usability of LMs as scientific assistants, and it is the first benchmark combining long contexts, free-form generation, and multi-disciplinary scientific knowledge. Moreover, we show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval. Therefore, we create TutorChat, a dataset of 80,000 long synthetic dialogues about textbooks. We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These math-specialized LM tutors have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH. Our datasets build on open-source materials, and we release our models, data, and evaluations.
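To make the intended use concrete, here is a minimal sketch of how a long-context LM tutor of this kind might be queried on a TutorEval-style example: the full textbook chapter is placed in the 32K-token context window, followed by the question, and the model generates a free-form answer. The model identifier, prompt template, and generation settings below are illustrative assumptions, not the paper's released configuration.

```python
# Minimal sketch of querying a long-context LM tutor on a TutorEval-style
# example. MODEL_ID and the prompt template are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/lm-tutor-7b-32k"  # hypothetical model identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def tutor_answer(chapter: str, question: str, max_new_tokens: int = 512) -> str:
    # The entire chapter fits in the long context window, so no retrieval
    # or chunking is needed; the question refers back to the chapter.
    prompt = (
        "You are a knowledgeable science tutor. Read the textbook chapter "
        "below, then answer the student's question.\n\n"
        f"### Chapter\n{chapter}\n\n### Question\n{question}\n\n### Answer\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Return only the newly generated tokens, decoded as the free-form answer.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Grading the resulting free-form answers against expert-written reference material is a separate evaluation step, omitted here.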
Adapting Language Models to Compress Contexts
Chevalier, Alexis, Wettig, Alexander, Ajith, Anirudh, Chen, Danqi
Transformer-based language models (LMs) are powerful and widely applicable tools, but their usefulness is constrained by a finite context window and the high computational cost of processing long text documents. We propose to adapt pre-trained LMs into AutoCompressors. These language models are capable of compressing long contexts into compact summary vectors, which are then accessible to the model as soft prompts. Summary vectors are trained with an unsupervised objective, whereby long documents are processed in segments and summary vectors from all previous segments are used in language modeling. We fine-tune OPT and Llama-2 models on sequences of up to 30,720 tokens and show that AutoCompressors can utilize long contexts to improve perplexity. We evaluate AutoCompressors on in-context learning by compressing task demonstrations and find that summary vectors are good substitutes for plain-text demonstrations, increasing accuracy while reducing inference costs. Finally, we explore the benefits of pre-computing summary vectors for large corpora by applying summary vectors to retrieval-augmented language modeling and a passage re-ranking task. Overall, AutoCompressors emerge as a simple and inexpensive solution for extending the context window of LMs while speeding up inference over long contexts.
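A minimal sketch may make the summary-vector recurrence concrete. It assumes a HuggingFace-style causal LM that accepts inputs_embeds and labels; the class name, the number of summary tokens, and the gradient handling are illustrative assumptions rather than the paper's exact training recipe.

```python
import torch
import torch.nn as nn

class AutoCompressorSketch(nn.Module):
    """Illustrative summary-vector recurrence; names and defaults are assumptions."""

    def __init__(self, base_lm, num_summary=50):
        super().__init__()
        self.lm = base_lm  # a HuggingFace-style causal LM (e.g., OPT or Llama-2)
        hidden = base_lm.get_input_embeddings().embedding_dim
        # Learned embeddings marking the positions whose final hidden states
        # become the summary vectors for the current segment.
        self.summary_embeds = nn.Parameter(0.02 * torch.randn(num_summary, hidden))

    def forward(self, segments):
        """segments: list of (batch, seg_len) LongTensors covering one long document."""
        summaries = None   # summary vectors accumulated from previous segments
        total_loss = 0.0
        for seg in segments:
            tok_embeds = self.lm.get_input_embeddings()(seg)  # (B, T, H)
            batch, seg_len = seg.shape
            placeholders = self.summary_embeds.unsqueeze(0).expand(batch, -1, -1)
            parts, n_prefix = [tok_embeds, placeholders], 0
            if summaries is not None:
                # Summary vectors from all previous segments act as soft prompts.
                parts, n_prefix = [summaries] + parts, summaries.size(1)
            inputs = torch.cat(parts, dim=1)
            # Unsupervised objective: language modeling on the real tokens only.
            labels = torch.full(inputs.shape[:2], -100, dtype=torch.long, device=seg.device)
            labels[:, n_prefix:n_prefix + seg_len] = seg
            out = self.lm(inputs_embeds=inputs, labels=labels, output_hidden_states=True)
            total_loss = total_loss + out.loss
            # Read off the new summary vectors at the placeholder positions.
            new_summ = out.hidden_states[-1][:, -placeholders.size(1):, :]
            # The paper limits backpropagation to a few compression steps;
            # this sketch simply keeps the full graph for clarity.
            summaries = new_summ if summaries is None else torch.cat([summaries, new_summ], dim=1)
        return total_loss / len(segments)
```

At inference time, summary vectors can be pre-computed once for a corpus and prepended as soft prompts, which is what makes the retrieval-augmented language modeling and re-ranking applications described above inexpensive.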
Mathematical Capabilities of ChatGPT
Frieder, Simon, Pinchetti, Luca, Chevalier, Alexis, Griffiths, Ryan-Rhys, Salvatori, Tommaso, Lukasiewicz, Thomas, Petersen, Philipp Christian, Berner, Julius
We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-language mathematics, used to benchmark language models, either cover only elementary mathematics or are very small. We address this by publicly releasing two new datasets: GHOSTS and miniGHOSTS. These are the first natural-language datasets curated by working researchers in mathematics that (1) aim to cover graduate-level mathematics, (2) provide a holistic overview of the mathematical capabilities of language models, and (3) distinguish multiple dimensions of mathematical reasoning. These datasets also test whether ChatGPT and GPT-4 can be helpful assistants to professional mathematicians by emulating use cases that arise in the daily professional activities of mathematicians. We benchmark the models on a range of fine-grained performance metrics. For advanced mathematics, this is the most detailed evaluation effort to date. We find that ChatGPT can be used most successfully as a mathematical assistant for querying facts, acting as a mathematical search engine and knowledge base interface. GPT-4 can additionally be used for undergraduate-level mathematics but fails at graduate-level difficulty. Contrary to many positive reports in the media about GPT-4 and ChatGPT's exam-solving abilities (a potential case of selection bias), their overall mathematical performance is well below the level of a graduate student. Hence, if your goal is to use ChatGPT to pass a graduate-level math exam, you would be better off copying from your average peer!