AITopics | evidence model

Collaborating Authors

evidence model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Aggregating empirical evidence from data strategy studies: a case on model quantization

del Rey, Santiago, Santos, Paulo Sérgio Medeiros dos, Travassos, Guilherme Horta, Franch, Xavier, Martínez-Fernández, Silverio

arXiv.org Artificial IntelligenceMay-5-2025

--Background: As empirical software engineering evolves, more studies adopt data strategies--approaches that investigate digital artifacts such as models, source code, or system logs rather than relying on human subjects. Synthesizing results from such studies introduces new methodological challenges. Aims: This study assesses the effects of model quantization on correctness and resource efficiency in deep learning (DL) systems. Additionally, it explores the methodological implications of aggregating evidence from empirical studies that adopt data strategies. Method: We conducted a research synthesis of six primary studies that empirically evaluate model quantization. We applied the Structured Synthesis Method (SSM) to aggregate the findings, which combines qualitative and quantitative evidence through diagrammatic modeling. A total of 19 evidence models were extracted and aggregated. Results: The aggregated evidence indicates that model quantization weakly negatively affects correctness metrics while consistently improving resource efficiency metrics, including storage size, inference latency, and GPU energy consumption--a manageable trade-off for many DL deployment contexts. Evidence across quantization techniques remains fragmented, underscoring the need for more focused empirical studies per technique. Conclusions: Model quantization offers substantial efficiency benefits with minor trade-offs in correctness, making it a suitable optimization strategy for resource-constrained environments. This study also demonstrates the feasibility of using SSM to synthesize findings from data strategy-based research. Software engineering (SE) increasingly relies on data strategy studies [1] to understand and improve software development and deployment practices. Data strategies refer to "empirical studies that rely primarily on archival, generated or simulated data" [1], using a wide range of specific methods, including experiments and data mining studies. It is also partially funded by the Joan Or o pre-doctoral support program (BDNS 657443), co-funded by the European Union. Although these studies provide valuable information, they remain largely disconnected, with findings often limited to specific contexts and lacking broader theoretical integration. Therefore, the SE field struggles with few theories and needs more structured syntheses of existing research to guide future advancements.

data mining, machine learning, quantization, (17 more...)

arXiv.org Artificial Intelligence

2505.00816

Country:

Europe (0.48)
South America > Brazil (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models

Sun, Guangzhi, Manakul, Potsawee, Liusie, Adian, Pipatanakul, Kunat, Zhang, Chao, Woodland, Phil, Gales, Mark

arXiv.org Artificial IntelligenceMay-22-2024

Multimodal foundation models are prone to hallucination, generating outputs that either contradict the input or are not grounded by factual information. Given the diversity in architectures, training data and instruction tuning techniques, there can be large variations in systems' susceptibility to hallucinations. To assess system hallucination robustness, hallucination ranking approaches have been developed for specific tasks such as image captioning, question answering, summarization, or biography generation. However, these approaches typically compare model outputs to gold-standard references or labels, limiting hallucination benchmarking for new domains. This work proposes "CrossCheckGPT", a reference-free universal hallucination ranking for multimodal foundation models. The core idea of CrossCheckGPT is that the same hallucinated content is unlikely to be generated by different independent systems, hence cross-system consistency can provide meaningful and accurate hallucination assessment scores. CrossCheckGPT can be applied to any model or task, provided that the information consistency between outputs can be measured through an appropriate distance metric. Focusing on multimodal large language models that generate text, we explore two information consistency measures: CrossCheck-explicit and CrossCheck-implicit. We showcase the applicability of our method for hallucination ranking across various modalities, namely the text, image, and audio-visual domains. Further, we propose the first audio-visual hallucination benchmark, "AVHalluBench", and illustrate the effectiveness of CrossCheckGPT, achieving correlations of 98% and 89% with human judgements on MHaluBench and AVHalluBench, respectively.

correlation, crosscheckgpt, hallucination, (16 more...)

arXiv.org Artificial Intelligence

2405.13684

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Singapore (0.05)
Asia > Indonesia > Bali (0.05)
(6 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evidence and plausibility in neighborhood structures

van Benthem, Johan, Fernández-Duque, David, Pacuit, Eric

arXiv.org Artificial IntelligenceJul-4-2013

The intuitive notion of evidence has both semantic and syntactic features. In this paper, we develop an {\em evidence logic} for epistemic agents faced with possibly contradictory evidence from different sources. The logic is based on a neighborhood semantics, where a neighborhood $N$ indicates that the agent has reason to believe that the true state of the world lies in $N$. Further notions of relative plausibility between worlds and beliefs based on the latter ordering are then defined in terms of this evidence structure, yielding our intended models for evidence-based beliefs. In addition, we also consider a second more general flavor, where belief and plausibility are modeled using additional primitive relations, and we prove a representation theorem showing that each such general model is a $p$-morphic image of an intended one. This semantics invites a number of natural special cases, depending on how uniform we make the evidence sets, and how coherent their total structure. We give a structural study of the resulting `uniform' and `flat' models. Our main result are sound and complete axiomatizations for the logics of all four major model classes with respect to the modal language of evidence, belief and safe belief. We conclude with an outlook toward logics for the dynamics of changing evidence, and the resulting language extensions and connections with logics of plausibility change.

artificial intelligence, belief revision, evidence model, (18 more...)

arXiv.org Artificial Intelligence

1307.1277

Country:

Europe (0.46)
North America (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.93)

Add feedback

Bayes Nets in Educational Assessment: Where Do the Numbers Come From?

Mislevy, Robert, Almond, Russell, Yan, Duanli, Steinberg, Linda S.

arXiv.org Artificial IntelligenceJan-23-2013

As observations and student models become complex, educational assessments that exploit advances in technology and cognitive psychology can outstrip familiar testing models and analytic methods. Within the Portal conceptual framework for assessment design, Bayesian inference networks (BINs) record beliefs about students' knowledge and skills, in light of what they say and do. Joining evidence model BIN fragments- which contain observable variables and pointers to student model variables - to the student model allows one to update belief about knowledge and skills as observations arrive. Markov Chain Monte Carlo (MCMC) techniques can estimate the required conditional probabilities from empirical data, supplemented by expert judgment or substantive theory. Details for the special cases of item response theory (IRT) and multivariate latent class modeling are given, with a numerical example of the latter.

artificial intelligence, examinee, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1301.6722

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Education > Educational Technology > Educational Software (0.91)
Education > Assessment & Standards (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Exploiting Functional Dependence in Bayesian Network Inference

Vomlel, Jirka

arXiv.org Artificial IntelligenceDec-12-2012

We propose an efficient method for Bayesian network inference in models with functional dependence. We generalize the multiplicative factorization method originally designed by Takikawa and D Ambrosio(1999) FOR models WITH independence OF causal influence.Using a hidden variable, we transform a probability potential INTO a product OF two - dimensional potentials.The multiplicative factorization yields more efficient inference. FOR example, IN junction tree propagation it helps TO avoid large cliques. IN ORDER TO keep potentials small, the number OF states OF the hidden variable should be minimized.We transform this problem INTO a combinatorial problem OF minimal base IN a particular space.We present an example OF a computerized adaptive test, IN which the factorization method IS significantly more efficient than previous inference methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1301.0609

Country:

Europe > Denmark (0.14)
Europe > Czechia (0.14)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Add feedback