AITopics

Technology:

Information Technology > Game Theory (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.81)

arXiv.org Artificial Intelligence - Feb-4-2025

Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning

Yan, Yibo, Wang, Shen, Huo, Jiahao, Ye, Jingheng, Chu, Zhendong, Hu, Xuming, Yu, Philip S., Gomes, Carla, Selman, Bart, Wen, Qingsong

Scientific reasoning, the process through which humans apply logic, evidence, and critical thinking to explore and interpret scientific phenomena, is essential in advancing knowledge reasoning across diverse fields. However, despite significant progress, current scientific reasoning models still struggle with generalization across domains and often fall short of multimodal perception. Multimodal Large Language Models (MLLMs), which integrate text, images, and other modalities, present an exciting opportunity to overcome these limitations and enhance scientific reasoning. Therefore, this position paper argues that MLLMs can significantly advance scientific reasoning across disciplines such as mathematics, physics, chemistry, and biology. First, we propose a four-stage research roadmap of scientific reasoning capabilities, and highlight the current state of MLLM applications in scientific reasoning, noting their ability to integrate and reason over diverse data types. Second, we summarize the key challenges that remain obstacles to achieving MLLMs' full potential. To address these challenges, we propose actionable insights and suggestions for the future. Overall, our work offers a novel perspective on MLLM integration with scientific reasoning, providing the LLM community with a valuable vision for achieving Artificial General Intelligence (AGI).

large language model, machine learning, natural language, (18 more...)

2502.02871

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
Oceania > Australia > New South Wales (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Education > Educational Setting (0.46)
Education > Curriculum (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence - Feb-1-2025

The Query/Hit Model for Sequential Hypothesis Testing

Shariatnasab, Mahshad, Rini, Stefano, Shirani, Farhad, Iyengar, S. Sitharama

This work introduces the Query/Hit (Q/H) learning model. The setup consists of two agents. One agent, Alice, has access to a streaming source, while the other, Bob, does not have direct access to the source. Communication occurs through sequential Q/H pairs: Bob sends a sequence of source symbols (queries), and Alice responds with the waiting time until each query appears in the source stream (hits). This model is motivated by scenarios with communication, computation, and privacy constraints that limit real-time access to the source. The error exponent for sequential hypothesis testing under the Q/H model is characterized, and a querying strategy, the Dynamic Scout-Sentinel Algorithm (DSSA), is proposed. The strategy employs a mutual information neural estimator to compute the error exponent associated with each query and to select the query with the highest efficiency. Extensive empirical evaluations on both synthetic and real-world datasets -- including mouse movement trajectories, typesetting patterns, and touch-based user interactions -- are provided to evaluate the performance of the proposed strategy in comparison with baselines, in terms of probability of error, query choice, and time-to-detection.
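The query/hit exchange described in the abstract can be illustrated with a minimal sketch. The function names `hit_time` and `run_queries` are hypothetical, and two conventions are assumptions not stated in the abstract: the waiting time is counted from 1, and each new query resumes from the position right after the previous hit. The DSSA strategy and the neural estimator are not reproduced here.

```python
def hit_time(stream, start, query):
    """Waiting time (counted from 1) until `query` first appears in
    `stream` at or after index `start`; None if it never appears."""
    for offset, symbol in enumerate(stream[start:], start=1):
        if symbol == query:
            return offset
    return None


def run_queries(stream, queries):
    """Alice's side of the exchange: answer each of Bob's queries with a hit.

    Assumption: the search for each query resumes right after the
    position where the previous query was found.
    """
    position = 0
    hits = []
    for query in queries:
        wait = hit_time(stream, position, query)
        if wait is None:
            break  # the finite stream ended before the query appeared
        hits.append(wait)
        position += wait
    return hits


if __name__ == "__main__":
    # Bob only ever sees the waiting times, never the stream itself.
    print(run_queries("abcabca", ["c", "a", "a"]))
```

Under these assumptions, Bob would have to infer which hypothesis generated the stream from the returned waiting times alone.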

artificial intelligence, machine learning, query, (18 more...)

2502.00605

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
North America > United States > California (0.04)
(2 more...)

Genre: Research Report (0.65)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.62)

arXiv.org Artificial Intelligence - Jan-29-2025

International AI Safety Report

Bengio, Yoshua, Mindermann, Sören, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Fox, Philip, Garfinkel, Ben, Goldfarb, Danielle, Heidari, Hoda, Ho, Anson, Kapoor, Sayash, Khalatbari, Leila, Longpre, Shayne, Manning, Sam, Mavroudis, Vasilios, Mazeika, Mantas, Michael, Julian, Newman, Jessica, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Sastry, Girish, Seger, Elizabeth, Skeadas, Theodora, South, Tobin, Strubell, Emma, Tramèr, Florian, Velasco, Lucia, Wheeler, Nicole, Acemoglu, Daron, Adekanmbi, Olubayo, Dalrymple, David, Dietterich, Thomas G., Felten, Edward W., Fung, Pascale, Gourinchas, Pierre-Olivier, Heintz, Fredrik, Hinton, Geoffrey, Jennings, Nick, Krause, Andreas, Leavy, Susan, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John, Munga, Jane, Narayanan, Arvind, Nelson, Alondra, Neppel, Clara, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Schölkopf, Bernhard, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaël, Yao, Andrew, Zhang, Ya-Qin, Albalawi, Fahad, Alserkal, Marwan, Ajala, Olubunmi, Avrin, Guillaume, Busch, Christian, de Carvalho, André Carlos Ponce de Leon Ferreira, Fox, Bronwyn, Gill, Amandeep Singh, Hatip, Ahmet Halit, Heikkilä, Juha, Jolly, Gill, Katzir, Ziv, Kitano, Hiroaki, Krüger, Antonio, Johnson, Chris, Khan, Saif M., Lee, Kyoung Mu, Ligot, Dominic Vincent, Molchanovskyi, Oleksii, Monti, Andrea, Mwamanzi, Nusu, Nemer, Mona, Oliver, Nuria, Portillo, José Ramón López, Ravindran, Balaraman, Rivera, Raquel Pezoa, Riza, Hammam, Rugege, Crystal, Seoighe, Ciarán, Sheehan, Jerry, Sheikh, Haroon, Wong, Denise, Zeng, Yi

I am honoured to present the International AI Safety Report. It is the work of 96 international AI experts who collaborated in an unprecedented effort to establish an internationally shared scientific understanding of risks from advanced AI and methods for managing them. We embarked on this journey just over a year ago, shortly after the countries present at the Bletchley Park AI Safety Summit agreed to support the creation of this report. Since then, we published an Interim Report in May 2024, which was presented at the AI Seoul Summit. We are now pleased to publish the present, full report ahead of the AI Action Summit in Paris in February 2025. Since the Bletchley Summit, the capabilities of general-purpose AI, the type of AI this report focuses on, have increased further. For example, new models have shown markedly better performance at tests of programming and scientific reasoning.

data mining, large language model, machine learning, (27 more...)

2501.17805

Country:

South America (1.00)
North America > Canada (1.00)
Asia > Middle East (1.00)
(7 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(5 more...)

Industry:

Transportation > Air (1.00)
Social Sector (1.00)
Media > News (1.00)
(30 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Quality (1.00)
(21 more...)

Neural Information Processing Systems - Jan-20-2025, 16:06:57 GMT

Reviews: Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer's Disease

The paper presents an interesting and smart way of correcting for covariate shift by aiming to make the two distributions indistinguishable by minimizing MMD. The paper, however, could benefit from more clarity and completeness so that it can make an impact. In terms of the applicability of this approach, the authors talk about the importance of being able to perform statistical tests and not just optimize the performance of a classifier. The bounds that they derive for their statistical test are useful for knowing how large the sample size should be to perform an appropriate shift. However, this doesn't say much about how it affects the scientific questions asked in the experiment, which need different statistical tests.
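For readers unfamiliar with MMD (maximum mean discrepancy), the quantity the review refers to can be sketched as follows. This is a minimal illustration of the standard biased estimate of squared MMD, not the reviewed paper's method: the Gaussian (RBF) kernel, the `bandwidth` parameter, and the function names are assumptions chosen for the example.

```python
import math


def rbf(u, v, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


if __name__ == "__main__":
    source = [(0.0,), (1.0,), (2.0,)]
    target = [(10.0,), (11.0,), (12.0,)]
    print(mmd2_biased(source, source))  # identical samples: close to zero
    print(mmd2_biased(source, target))  # well-separated samples: clearly positive
```

A domain adaptation method of the kind discussed would transform one sample so that this quantity becomes small; the transformation itself is outside the scope of this sketch.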

application, hypothesis testing, unsupervised domain adaptation, (6 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)

Neural Information Processing Systems - Jan-20-2025, 04:25:42 GMT

Reviews: Robust Hypothesis Testing Using Wasserstein Uncertainty Sets

The rebuttal addressed my technical concerns, and I also seem to have misjudged the size of the contributions at first. My score has been updated. This paper studies the two-sample non-parametric hypothesis testing problem. Given two collections of probability distributions, the paper studies approximating the best detector against the worst distributions from both collections. A standard surrogate loss approximation is used to upper bound the worst-case risk (the maximum of the type I and type II errors) with a convex surrogate function, which is known to yield a good solution.

empirical distribution, robust hypothesis testing, wasserstein uncertainty set, (9 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.66)

arXiv.org Artificial Intelligence - Jan-18-2025

Computational Discovery of Chiasmus in Ancient Religious Text

McGovern, Hope, Sirin, Hale, Lippincott, Tom

Chiasmus, a debated literary device in Biblical texts, has captivated mystics while sparking ongoing scholarly discussion. In this paper, we introduce the first computational approach to systematically detect chiasmus within Biblical passages. Our method leverages neural embeddings to capture lexical and semantic patterns associated with chiasmus, applied at multiple levels of textual granularity (half-verses, verses). We also involve expert annotators to review a subset of the detected patterns. Despite its computational efficiency, our method achieves robust results, with high inter-annotator agreement and system precision@k of 0.80 at the verse level and 0.60 at the half-verse level. We further provide a qualitative analysis of the distribution of detected chiasmi, along with selected examples that highlight the effectiveness of our approach.
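The precision@k figures quoted in the abstract follow a standard ranking metric, sketched below. The function name and the toy relevance judgments are illustrative assumptions, not data from the paper; the sketch also assumes the common convention of dividing by k even when fewer than k items are returned.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top-k ranked items that were judged relevant.

    `ranked_relevance` is a list of booleans ordered from the system's
    highest-scoring candidate to its lowest.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_relevance[:k]
    return sum(1 for relevant in top_k if relevant) / k


if __name__ == "__main__":
    # Hypothetical annotator judgments for a system's top five candidates.
    judgments = [True, True, False, True, True]
    print(precision_at_k(judgments, 5))
```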

artificial intelligence, machine learning, natural language, (20 more...)

2501.10739

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.40)

Neural Information Processing Systems - Jan-15-2025, 16:35:39 GMT

A unified framework for bandit multiple testing

In bandit multiple hypothesis testing, each arm corresponds to a different null hypothesis that we wish to test, and the goal is to design adaptive algorithms that correctly identify a large set of interesting arms (true discoveries), while only mistakenly identifying a few uninteresting ones (false discoveries). One common metric in non-bandit multiple testing is the false discovery rate (FDR). We propose a unified, modular framework for bandit FDR control that emphasizes the decoupling of exploration and summarization of evidence. We utilize the powerful martingale-based concept of "e-processes" to ensure FDR control for arbitrary composite nulls, exploration rules and stopping times in generic problem settings. In particular, valid FDR control holds even if the reward distributions of the arms could be dependent, multiple arms may be queried simultaneously, and multiple (cooperating or competing) agents may be querying arms, covering combinatorial semi-bandit type settings as well.
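One standard way to turn per-arm e-values into a rejection set with FDR control is the e-BH procedure, sketched below. The abstract does not say which rule the framework uses, so treating e-BH as the summarization step is an assumption, and the function name `e_bh` and the toy e-values are hypothetical. The rule rejects the k* hypotheses with the largest e-values, where k* is the largest k such that the k-th largest e-value is at least K / (k * alpha) for K hypotheses.

```python
def e_bh(e_values, alpha):
    """Return sorted indices rejected by the e-BH procedure at level alpha."""
    n = len(e_values)
    # Indices ordered from largest e-value to smallest.
    order = sorted(range(n), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Equivalent to: k-th largest e-value >= n / (k * alpha).
        if e_values[idx] * k / n >= 1.0 / alpha:
            k_star = k
    return sorted(order[:k_star])


if __name__ == "__main__":
    # Hypothetical e-values for four arms at the time of stopping.
    print(e_bh([50.0, 1.0, 30.0, 0.5], alpha=0.1))
```

In a bandit setting, each e-value would be the current value of an arm's e-process, and the rule could be re-applied as evidence accumulates.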

artificial intelligence, machine learning, multiple testing, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)

Mother Jones - Dec-31-2024, 11:00:00 GMT

Blob-Headed Fish, Meat-Eating Squirrels, and Other Fascinating Science Stories From 2024

So much of this year felt like a fever dream: The attempted assassination of Donald Trump. Which is why, this year, I'm leaning into my nerdish tendencies and rounding up some good, interesting, or inspiring news stories from the science world--promising discoveries, exciting new data, historic events, and unsung heroes. In the hope of providing relief from the hell that has been 2024, here's a non-comprehensive list of the year's coolest science stories, both big and small: Wildlife filmmaker Carlos Gauna and University of California, Riverside, PhD student Phillip Sternes spotted what appears to be a baby great white shark off the coast of California last year. In January, the team published the photos in the journal Environmental Biology of Fishes. "Where white sharks give birth is one of the holy grails of shark science. No one has ever been able to pinpoint where they are born, nor has anyone seen a newborn baby shark alive," Gauna said in a UC Riverside press release.

fascinating science story, meat-eating squirrel, scientist, (11 more...)

Country:

North America > United States > California > Riverside County > Riverside (0.25)
North America > United States > Illinois (0.06)
South America > Peru (0.05)
(6 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.32)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.40)

arXiv.org Artificial Intelligence - Dec-27-2024

Towards Strong AI: Transformational Beliefs and Scientific Creativity

Eschker, Samuel J., Liu, Chuanhai

Strong artificial intelligence (AI) is envisioned to possess general cognitive abilities and scientific creativity comparable to human intelligence, encompassing both knowledge acquisition and problem-solving. While remarkable progress has been made in weak AI, the realization of strong AI remains a topic of intense debate and critical examination. In this paper, we explore pivotal innovations in the history of astronomy and physics, focusing on the discovery of Neptune and the concept of scientific revolutions as perceived by philosophers of science. Building on these insights, we introduce a simple theoretical and statistical framework of weak beliefs, termed the Transformational Belief (TB) framework, designed as a foundation for modeling scientific creativity. Through selected illustrative examples in statistical science, we demonstrate the TB framework's potential as a promising foundation for understanding, analyzing, and even fostering creativity -- paving the way toward the development of strong AI. We conclude with reflections on future research directions and potential advancements.

artificial intelligence, machine learning, natural language, (18 more...)

2412.19938

Country:

North America > United States > New York (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report > Promising Solution (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)