AITopics | Scientific Discovery

Collaborating Authors

Scientific Discovery

"The problem of giving rules for producing true scientific statements has been replaced by the problem of finding efficient heuristic rules for culling the reasonable candidates for an explanation from an appropriate set of possible candidates [and finding methods for constructing the candidates]."
– B. Buchanan, quoted in Lindley Darden. Recent Work in Computational Scientific Discovery.

News Overviews Instructional Materials AI-Alerts Classics

Kernel-Based Tests for Likelihood-Free Hypothesis Testing

Neural Information Processing SystemsOct-11-2024, 01:21:36 GMT

Given n observations from two balanced classes, consider the task of labeling an additional m inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with completeknowledge of class distributions ( n \infty) theproblem is solved optimally by the likelihood-ratio test; when m 1 it corresponds to binary classification; and when m\approx n it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-offbetween m and n: increasing the data sample m reduces the amount n of training/simulationdata needed. In this work we (a) introduce a generalization where unlabeled samples come from a mixture of the two classes -- a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under \textit{maximum meandiscrepancy} (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detectionof the Higgs boson and detection of planted DDPM generated images amidstCIFAR-10 images.

kernel-based test, likelihood-free hypothesis testing

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)

Add feedback

Greedy Approximation Algorithms for Active Sequential Hypothesis Testing

Neural Information Processing SystemsOct-9-2024, 20:34:07 GMT

In the problem of \emph{active sequential hypothesis testing} (ASHT), a learner seeks to identify the \emph{true} hypothesis from among a known set of hypotheses. The learner is given a set of actions and knows the random distribution of the outcome of any action under any true hypothesis. Given a target error \delta 0, the goal is to sequentially select the fewest number of actions so as to identify the true hypothesis with probability at least 1 - \delta . Motivated by applications in which the number of hypotheses or actions is massive (e.g., genomics-based cancer detection), we propose efficient (greedy, in fact) algorithms and provide the first approximation guarantees for ASHT, under two types of adaptivity. Both of our guarantees are independent of the number of actions and logarithmic in the number of hypotheses.

active sequential hypothesis testing, greedy approximation algorithm, hypothesis, (3 more...)

Neural Information Processing Systems

Industry: Health & Medicine (0.90)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.65)

Add feedback

Reviews: Adaptive Active Hypothesis Testing under Limited Information

Neural Information Processing SystemsOct-8-2024, 05:12:43 GMT

The paper studies the problem of active hypothesis testing with limited information. More specifically, denoting the probability of'correct indication' by q(j,w) if the true hypothesis is j and the chosen action is w, they assume that q(j,w) is bounded below by \alpha(w) for all j, and this quantity is available to the controller. They first study the incomplete-bayesian update rule, where all q's are replaced with alpha's, hence approximate, and show that with this IB update rule, one needs at least O(log(1/\delta)) to achieve (1-\delta) posterior error probability. Then, they show that their gradient-based policy achieves the order-wise optimal sample complexity. I have a few technical questions.

adaptive active hypothesis testing, artificial intelligence, machine learning, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.64)

Add feedback

Reasoning with Natural Language Explanations

Valentino, Marco, Freitas, André

arXiv.org Artificial IntelligenceOct-5-2024

Explanation constitutes an archetypal feature of human rationality, underpinning learning and generalisation, and representing one of the media supporting scientific discovery and communication. Due to the importance of explanations in human reasoning, an increasing amount of research in Natural Language Inference (NLI) has started reconsidering the role that explanations play in learning and inference, attempting to build explanation-based NLI models that can effectively encode and use natural language explanations on downstream tasks. Research in explanation-based NLI, however, presents specific challenges and opportunities, as explanatory reasoning reflects aspects of both material and formal inference, making it a particularly rich setting to model and deliver complex reasoning. In this tutorial, we provide a comprehensive introduction to the field of explanation-based NLI, grounding this discussion on the epistemological-linguistic foundations of explanations, systematically describing the main architectural trends and evaluation methodologies that can be used to build systems capable of explanatory reasoning.

explanation, large language model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.04148

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
North America > Mexico > Mexico City > Mexico City (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
(9 more...)

Genre:

Instructional Material (1.00)
Overview (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.46)
(2 more...)

Add feedback

Adaptive Active Hypothesis Testing under Limited Information

Fabio Cecchi, Nidhi Hegde

Neural Information Processing SystemsOct-4-2024, 00:30:42 GMT

We consider the problem of active sequential hypothesis testing where a Bayesian decision maker must infer the true hypothesis from a set of hypotheses. The decision maker may choose for a set of actions, where the outcome of an action is corrupted by independent noise. In this paper we consider a special case where the decision maker has limited knowledge about the distribution of observations for each action, in that only a binary value is observed. Our objective is to infer the true hypothesis with low error, while minimizing the number of action sampled. Our main results include the derivation of a lower bound on sample size for our system under limited knowledge and the design of an active learning policy that matches this lower bound and outperforms similar known algorithms.

algorithm, hypothesis, true hypothesis, (16 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.63)

Add feedback

A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem

Su, Weijie J.

arXiv.org Machine LearningSep-14-2024

Differential privacy is widely considered the formal privacy for privacy-preserving data analysis due to its robust and rigorous guarantees, with increasingly broad adoption in public services, academia, and industry. Despite originating in the cryptographic context, in this review paper we argue that, fundamentally, differential privacy can be considered a \textit{pure} statistical concept. By leveraging a theorem due to David Blackwell, our focus is to demonstrate that the definition of differential privacy can be formally motivated from a hypothesis testing perspective, thereby showing that hypothesis testing is not merely convenient but also the right language for reasoning about differential privacy. This insight leads to the definition of $f$-differential privacy, which extends other differential privacy definitions through a representation theorem. We review techniques that render $f$-differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning. Applications of this differential privacy definition to private deep learning, private convex optimization, shuffled mechanisms, and U.S.~Census data are discussed to highlight the benefits of analyzing privacy bounds under this framework compared to existing alternatives.

differential privacy, f-differential privacy, privacy, (13 more...)

arXiv.org Machine Learning

2409.09558

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Statistically Valid Information Bottleneck via Multiple Hypothesis Testing

Farzaneh, Amirmohammad, Simeone, Osvaldo

arXiv.org Artificial IntelligenceSep-11-2024

The information bottleneck (IB) problem is a widely studied framework in machine learning for extracting compressed features that are informative for downstream tasks. However, current approaches to solving the IB problem rely on a heuristic tuning of hyperparameters, offering no guarantees that the learned features satisfy information-theoretic constraints. In this work, we introduce a statistically valid solution to this problem, referred to as IB via multiple hypothesis testing (IB-MHT), which ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset. The proposed methodology builds on Pareto testing and learn-then-test (LTT), and it wraps around existing IB solvers to provide statistical guarantees on the IB constraints. We demonstrate the performance of IB-MHT on classical and deterministic IB formulations, validating the effectiveness of IB-MHT in outperforming conventional methods in terms of statistical robustness and reliability.

constraint, ib-mht, information, (12 more...)

arXiv.org Artificial Intelligence

2409.07325

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

A new paradigm for global sensitivity analysis

Mazo, Gildas

arXiv.org Machine LearningSep-10-2024

Current theory of global sensitivity analysis, based on a nonlinear functional ANOVA decomposition of the random output, is limited in scope-for instance, the analysis is limited to the output's variance and the inputs have to be mutually independent-and leads to sensitivity indices the interpretation of which is not fully clear, especially interaction effects. Alternatively, sensitivity indices built for arbitrary user-defined importance measures have been proposed but a theory to define interactions in a systematic fashion and/or establish a decomposition of the total importance measure is still missing. It is shown that these important problems are solved all at once by adopting a new paradigm. By partitioning the inputs into those causing the change in the output and those which do not, arbitrary user-defined variability measures are identified with the outcomes of a factorial experiment at two levels, leading to all factorial effects without assuming any functional decomposition. To link various well-known sensitivity indices of the literature (Sobol indices and Shapley effects), weighted factorial effects are studied and utilized.

decomposition, global sensitivity analysis, sensitivity analysis, (16 more...)

arXiv.org Machine Learning

2409.06271

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
Europe > France (0.04)
Asia > Japan (0.04)

Genre:

Research Report > Strength High (0.36)
Research Report > Experimental Study (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.61)

Add feedback

A Chance Discovery on Vacation Changed My Whole Perspective on Wine

SlateSep-8-2024, 15:00:00 GMT

During a recent stay in London, my wife and I took an early evening stroll down Lamb's Conduit Street, in the West End district of Bloomsbury, a mere 15-minute walk from the din of Piccadilly Circus but a world away: peaceful, elegant, a medley of quaint shops and stately houses, tucked amid which was a restaurant called Noble Rot. The atmosphere was convivial, the food delicious, the wine exquisitely paired. On the bar counter were copies of a small but thick magazine, also called Noble Rot, adorned with a bright hipster-cartoon cover. Out of curiosity, I bought one--and, as sometimes happens in random moments, my take on a whole slice of life began to change. In this case, what changed was my attitude toward wine and its seriously playful possibilities.

artificial intelligence, data mining, keeling, (16 more...)

Slate

Country:

Europe > France (0.05)
North America > United States > California (0.05)
Europe > Switzerland (0.05)
(3 more...)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)

Add feedback

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Lu, Chris, Lu, Cong, Lange, Robert Tjarko, Foerster, Jakob, Clune, Jeff, Ha, David

arXiv.org Artificial IntelligenceAug-31-2024

One of the grand challenges of artificial general intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used as aides to human scientists, e.g. for brainstorming ideas, writing code, or prediction tasks, they still conduct only a small part of the scientific process. This paper presents the first comprehensive framework for fully automatic scientific discovery, enabling frontier large language models to perform research independently and communicate their findings. We introduce The AI Scientist, which generates novel research ideas, writes code, executes experiments, visualizes results, describes its findings by writing a full scientific paper, and then runs a simulated review process for evaluation. In principle, this process can be repeated to iteratively develop ideas in an open-ended fashion, acting like the human scientific community. We demonstrate its versatility by applying it to three distinct subfields of machine learning: diffusion modeling, transformer-based language modeling, and learning dynamics. Each idea is implemented and developed into a full paper at a cost of less than $15 per paper. To evaluate the generated papers, we design and validate an automated reviewer, which we show achieves near-human performance in evaluating paper scores. The AI Scientist can produce papers that exceed the acceptance threshold at a top machine learning conference as judged by our automated reviewer. This approach signifies the beginning of a new era in scientific discovery in machine learning: bringing the transformative benefits of AI agents to the entire research process of AI itself, and taking us closer to a world where endless affordable creativity and innovation can be unleashed on the world's most challenging problems. Our code is open-sourced at https://github.com/SakanaAI/AI-Scientist

adaptation capability and computational efficiency, result demonstrate significant improvement, sophisticated style representation technique, (16 more...)

arXiv.org Artificial Intelligence

2408.06292

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.92)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(3 more...)

Add feedback