AITopics | statistical reasoning

statistical reasoning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Rarity Blind Spot: A Framework for Evaluating Statistical Reasoning in LLMs

Maekawa, Seiji, Iso, Hayate, Bhutani, Nikita

arXiv.org Artificial Intelligence, Oct-2-2025

Effective decision-making often relies on identifying what makes each candidate distinctive. While existing benchmarks for LLMs emphasize retrieving or summarizing information relevant to a given query, they do not evaluate a model's ability to identify globally distinctive features across a set of documents. We introduce Distinctive Feature Mining (DFM), a new task that challenges models to analyze a small-to-medium collection (10-40 documents) and surface features that are rare in the global context (e.g., appearing in less than 10% of documents). This setting mirrors real-world scenarios such as candidate selection or product differentiation, where statistical reasoning, not retrieval, is key. To enable systematic evaluation of this capability, we present DiFBench, a configurable benchmark creation framework with controllable parameters such as document set size and distinctiveness thresholds. Using DiFBench, we perform a large-scale assessment of distinctive feature mining across ten state-of-the-art LLMs. Our findings reveal a significant performance gap between general-purpose and reasoning-enhanced models. All models, however, substantially degrade as the task complexity and document count increase. We also find that a common failure mode is misidentifying frequent features as distinctive. These insights reveal core limitations in contemporary LLMs' abilities to perform fine-grained, statistical reasoning and rarity detection.
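The rarity criterion described in the abstract can be made concrete with a small sketch. It assumes each document has already been reduced to a set of feature strings; the extraction step, the function name `distinctive_features`, and the toy candidate data are hypothetical and are not part of DiFBench. A feature is treated as distinctive when the fraction of documents containing it falls below a threshold such as 10%.

```python
from collections import Counter

def distinctive_features(docs, threshold=0.10):
    """Return, per document, the features whose document frequency is below threshold.

    docs: list of sets of feature strings (hypothetical pre-extracted features).
    threshold: a feature is rare if it appears in fewer than this fraction of docs.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each feature.
    doc_freq = Counter(feature for doc in docs for feature in doc)
    rare = {f for f, count in doc_freq.items() if count / n_docs < threshold}
    return [sorted(doc & rare) for doc in docs]

# Toy collection of 20 candidates: all list Python, only one lists Fortran.
docs = [{"python", "sql"} for _ in range(20)]
docs[3] = {"python", "fortran"}
result = distinctive_features(docs)
print(result[3])  # ['fortran']
```

Frequent features such as "python" are never reported here, which is the opposite of the failure mode the abstract describes, where models misidentify frequent features as distinctive.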

distinctive feature, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.00245

Country:

North America > United States (0.93)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ClimateViz: A Benchmark for Statistical Reasoning and Fact Verification on Scientific Charts

Su, Ruiran, Si, Jiasheng, Guo, Zhijiang, Pierrehumbert, Janet B.

arXiv.org Artificial Intelligence, Jun-12-2025

Scientific fact-checking has mostly focused on text and tables, overlooking scientific charts, which are key for presenting quantitative evidence and statistical reasoning. We introduce ClimateViz, the first large-scale benchmark for scientific fact-checking using expert-curated scientific charts. ClimateViz contains 49,862 claims linked to 2,896 visualizations, each labeled as support, refute, or not enough information. To improve interpretability, each example includes structured knowledge graph explanations covering trends, comparisons, and causal relations. We evaluate state-of-the-art multimodal language models, including both proprietary and open-source systems, in zero-shot and few-shot settings. Results show that current models struggle with chart-based reasoning: even the best systems, such as Gemini 2.5 and InternVL 2.5, reach only 76.2 to 77.8 percent accuracy in label-only settings, far below human performance (89.3 and 92.7 percent). Explanation-augmented outputs improve performance in some models. We released our dataset and code alongside the paper.
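As a rough illustration of the setup described above, the sketch below models one example as a claim tied to a chart, a three-way label, and an explanation stored as triples, then computes label-only accuracy. The class name `ChartClaim`, its field names, and the sample claims are invented for illustration; they are not the released ClimateViz schema or data.

```python
from dataclasses import dataclass, field

LABELS = ("support", "refute", "not enough information")

@dataclass
class ChartClaim:
    """One benchmark example; field names are hypothetical, not the released schema."""
    chart_id: str
    claim: str
    label: str
    # Explanation as (subject, relation, object) triples, e.g. a trend or comparison.
    explanation: list = field(default_factory=list)

def label_accuracy(gold, predicted):
    """Fraction of examples whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(g.label == p for g, p in zip(gold, predicted))
    return correct / len(gold)

# Invented examples, for illustration only.
gold = [
    ChartClaim("chart-1", "Temperature rose over the period.", "support",
               [("temperature", "trend", "increasing")]),
    ChartClaim("chart-1", "Temperature fell over the period.", "refute"),
    ChartClaim("chart-2", "Rainfall caused the decline.", "not enough information"),
    ChartClaim("chart-2", "Series A exceeds series B.", "support"),
]
predicted = ["support", "support", "not enough information", "support"]
accuracy = label_accuracy(gold, predicted)
print(f"label-only accuracy: {accuracy:.2f}")
```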

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.087

Country:

North America > United States (0.93)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.47)
Government > Regional Government > North America Government > United States Government (0.46)
Energy (0.46)

Technology:

Information Technology > Visualization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

GPT's Judgements Under Uncertainty

Saeedi, Payam, Goodarzi, Mahsa

arXiv.org Artificial Intelligence, Sep-26-2024

We investigate the presence of cognitive biases in three large language models (LLMs): GPT-4o, Gemma 2, and Llama 3.1. The study uses 1,500 experiments across nine established cognitive biases to evaluate the responses and consistency of the models. GPT-4o demonstrated the strongest overall performance. Gemma 2 showed strengths in addressing the sunk cost fallacy and prospect theory; however, its performance varied across different biases. Llama 3.1 consistently underperformed, relying on heuristics and exhibiting frequent inconsistencies and contradictions. The findings highlight the challenges of achieving robust and generalizable reasoning in LLMs, and underscore the need for further development to mitigate biases in artificial general intelligence (AGI). The study emphasizes the importance of integrating statistical reasoning and ethical considerations in future AI development. Cognitive biases and heuristics are well-established phenomena of the human mind, shaping how individuals process information, make judgments, and make decisions. These biases emerge from heuristics, which are mental shortcuts that simplify complex tasks by substituting them with cognitively easier alternatives [1]. While heuristics enable quick and efficient reasoning, they also introduce systematic errors that impact judgment and decision-making [2]-[4]. Understanding whether such biases, embedded in the data and interactions that shape Large Language Models (LLMs), are reflected in their outputs is not only critical for evaluating their alignment with human cognition but also vital for the development of Artificial General Intelligence (AGI). AGI, envisioned as systems capable of performing any intellectual task a human can, must navigate the intricacies of human-like reasoning while avoiding harmful or irresponsible biases.

gemma 2, llama 3, reasoning, (14 more...)

arXiv.org Artificial Intelligence

2410.0282

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Statistical Reasoning for Public Health 2: Regression Methods | Coursera

@machinelearnbot, May-27-2018, 21:16:39 GMT

Structure: Good structure; the course went through all the basic principles of statistics in detail. I appreciated that it did not go through the methodology of each method, but instead taught us how to appreciate it and understand the data as presented in the literature. I liked how John went through examples from the literature, so it was good to see how the methods are used in practice. I wish there were a separate course to teach us how to use these methods with sample data; perhaps a taster of this would have been good to include, though I understand that would be challenging for some. Some in-video questions would also have been good for checking learning progress.

artificial intelligence, machine learning, regression method coursera, (4 more...)

@machinelearnbot

Industry:

Health & Medicine > Public Health (0.40)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Statistical Reasoning for Public Health 2: Regression Methods | Coursera

#artificialintelligence, May-6-2018, 12:26:43 GMT

This module, along with module 2B, introduces two key concepts in statistics and epidemiology: confounding and effect modification. A relation between an outcome and an exposure of interest can be confounded if another variable (or variables) is associated with both the outcome and the exposure. In such cases, the crude outcome/exposure association may over- or underestimate the association of interest. Confounding is an ever-present threat in non-randomized studies, but results of interest can be adjusted for potential confounders.
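A small worked example may help show how a crude association can overestimate the association of interest. The sketch below uses invented counts in which age is associated with both exposure and outcome, and compares the crude risk ratio with a Mantel-Haenszel risk ratio pooled over age strata. The Mantel-Haenszel estimator is one common adjustment method, chosen here for illustration; the course excerpt does not say which method it uses, and the function names are hypothetical.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a, b: exposed cases and non-cases; c, d: unexposed cases and non-cases.
    """
    return (a / (a + b)) / (c / (c + d))

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio pooled over strata of the third variable."""
    num = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    den = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Invented counts per stratum:
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = [
    (40, 40, 10, 10),  # older group: risk 0.5 whether exposed or not
    (2, 18, 8, 72),    # younger group: risk 0.1 whether exposed or not
]
crude_table = [sum(column) for column in zip(*strata)]  # collapse over age
crude_rr = risk_ratio(*crude_table)
adjusted_rr = mantel_haenszel_rr(strata)
print(f"crude RR = {crude_rr:.2f}, adjusted RR = {adjusted_rr:.2f}")
# prints: crude RR = 2.33, adjusted RR = 1.00
```

In this toy data the exposure has no effect within either age group, yet the crude risk ratio is about 2.33 because most exposed people are in the higher-risk older group. Adjusting for age brings the estimate back to 1.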

artificial intelligence, machine learning, regression method coursera, (3 more...)

#artificialintelligence

Industry:

Health & Medicine > Public Health (0.40)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Intelligent Things It's all about machine learning

#artificialintelligence, Dec-25-2016, 11:45:15 GMT

Evolving from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores software algorithms that can learn from, and make predictions on, volumes of data. Simply stated, machine learning helps humans make data-driven decisions. Machine learning offers practical solutions that can maximize resource utilization, prolong the lifespan of IoT sensors, platforms and networks, and enable dynamic services architecture. Our connected world is increasingly dependent on big data (at rest) and, in years to come, streaming fast data (in motion). With real-time predictive models, once a streaming fast data point has been observed, it might never be seen again.
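The remark that a streamed data point "might never be seen again" implies that any statistic or model must be updated in a single pass, without storing the data. As a minimal illustration that is not taken from the article, the sketch below keeps a running mean and variance using Welford's algorithm; the class name `RunningStats` and the sensor readings are hypothetical.

```python
class RunningStats:
    """Single-pass mean and variance (Welford's algorithm); no data point is stored."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; None until at least two points have been seen."""
        return self._m2 / (self.count - 1) if self.count > 1 else None

stats = RunningStats()
for reading in [21.0, 22.0, 23.0, 24.0]:  # invented sensor readings
    stats.update(reading)  # each reading is used once, then discarded
print(stats.mean, stats.variance)
```

Memory use stays constant no matter how many readings arrive, which is the property a resource-constrained IoT sensor or platform would need.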

artificial intelligence, machine learning, prediction, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.56)

Intelligent Things It's all about machine learning

#artificialintelligence, Dec-11-2016, 17:20:07 GMT

Machine learning is increasingly being employed as a tool to help companies collect billions of data points, boil them down to what is actually meaningful, and predict what is likely to happen in the future. Simply stated, machine learning helps make data-driven decisions. Machine learning offers practical solutions that can maximize resource utilization, prolong the lifespan of IoT sensors, platforms and networks, and enable dynamic services architecture. Our connected world is increasingly dependent on big data (at rest) and, in years to come, streaming fast data (in motion). With real-time predictive models, once a streaming fast data point has been observed, it might never be seen again.

application, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Artificial Intelligence: Logic Reasoning v. Statistical Reasoning - DATAVERSITY

#artificialintelligence, Apr-6-2016, 20:10:02 GMT

Mitch De Felice recently wrote in CIO.com, "As a technology decision maker, all the vocabulary of artificial intelligence might be a bit overwhelming. In Figure 1 [to the left], starting from the bottom going up illustrates knowledge acquisition capabilities from a data usage perspective. By no means does this represent all the approaches to achieving an AI solution, but rather it illustrates how big data fits into the AI picture. Machine learning is represented by the right side of the above diagram, labeled, 'Statistical Reasoning.' There are two types of machine learning, unsupervised and supervised. When big data vendors speak of machine learning, they are usually speaking of supervised machine learning that has existed since the 1950s."

artificial intelligence, machine learning, statistical reasoning, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
