AITopics | ret

Collaborating Authors

ret

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

4d13b2d99519c5415661dad44ab7edcd-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 19:45:04 GMT

Choose Option Aor Option B Option A: Exactly 200 peoplewillbesaved. Choose Option Aor Option B Option A: Exactly 400 peoplewilldie.

artificial intelligence, inthissection, section 3, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology: Information Technology > Artificial Intelligence (0.64)

Add feedback

Establishing Reliability Metrics for Reward Models in Large Language Models

Chen, Yizhou, Liu, Yawen, Wang, Xuesi, Yu, Qingtao, Huzhang, Guangda, Zeng, Anxiang, Yu, Han, Zhou, Zhiming

arXiv.org Artificial IntelligenceApr-22-2025

The reward model (RM) that represents human preferences plays a crucial role in optimizing the outputs of large language models (LLMs), e.g., through reinforcement learning from human feedback (RLHF) or rejection sampling. However, a long challenge for RM is its uncertain reliability, i.e., LLM outputs with higher rewards may not align with actual human preferences. Currently, there is a lack of a convincing metric to quantify the reliability of RMs. To bridge this gap, we propose the \textit{\underline{R}eliable at \underline{$η$}} (RETA) metric, which directly measures the reliability of an RM by evaluating the average quality (scored by an oracle) of the top $η$ quantile responses assessed by an RM. On top of RETA, we present an integrated benchmarking pipeline that allows anyone to evaluate their own RM without incurring additional Oracle labeling costs. Extensive experimental studies demonstrate the superior stability of RETA metric, providing solid evaluations of the reliability of various publicly available and proprietary RMs. When dealing with an unreliable RM, we can use the RETA metric to identify the optimal quantile from which to select the responses.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.14838

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Caffagni, Davide, Sarto, Sara, Cornia, Marcella, Baraldi, Lorenzo, Cucchiara, Rita

arXiv.org Artificial IntelligenceMar-3-2025

Cross-modal retrieval is gaining increasing efficacy and interest from the research community, thanks to large-scale training, novel architectural and learning designs, and its application in LLMs and multimodal LLMs. In this paper, we move a step forward and design an approach that allows for multimodal queries, composed of both an image and a text, and can search within collections of multimodal documents, where images and text are interleaved. Our model, ReT, employs multi-level representations extracted from different layers of both visual and textual backbones, both at the query and document side. To allow for multi-level and cross-modal understanding and feature extraction, ReT employs a novel Transformer-based recurrent cell that integrates both textual and visual features at different layers, and leverages sigmoidal gates inspired by the classical design of LSTMs. Extensive experiments on M2KR and M-BEIR benchmarks show that ReT achieves state-of-the-art performance across diverse settings. Our source code and trained models are publicly available at https://github.com/aimagelab/ReT.

dataset, representation, ret, (16 more...)

arXiv.org Artificial Intelligence

2503.0198

Country:

Asia > China (0.04)
North America > United States > Nebraska (0.04)
North America > United States > Illinois (0.04)
(8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Oracle Computability and Turing Reducibility in the Calculus of Inductive Constructions

Forster, Yannick, Kirst, Dominik, Mück, Niklas

arXiv.org Artificial IntelligenceJul-28-2023

We develop synthetic notions of oracle computability and Turing reducibility in the Calculus of Inductive Constructions (CIC), the constructive type theory underlying the Coq proof assistant. As usual in synthetic approaches, we employ a definition of oracle computations based on meta-level functions rather than object-level models of computation, relying on the fact that in constructive systems such as CIC all definable functions are computable by construction. Such an approach lends itself well to machine-checked proofs, which we carry out in Coq. There is a tension in finding a good synthetic rendering of the higher-order notion of oracle computability. On the one hand, it has to be informative enough to prove central results, ensuring that all notions are faithfully captured. On the other hand, it has to be restricted enough to benefit from axioms for synthetic computability, which usually concern first-order objects. Drawing inspiration from a definition by Andrej Bauer based on continuous functions in the effective topos, we use a notion of sequential continuity to characterise valid oracle computations. As main technical results, we show that Turing reducibility forms an upper semilattice, transports decidability, and is strictly more expressive than truth-table reducibility, and prove that whenever both a predicate $p$ and its complement are semi-decidable relative to an oracle $q$, then $p$ Turing-reduces to $q$.

artificial intelligence, logic & formal reasoning, logic programming, (14 more...)

arXiv.org Artificial Intelligence

2307.15543

Country:

North America > United States > Wisconsin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.49)

Add feedback

TRAC: Trustworthy Retrieval Augmented Chatbot

Li, Shuo, Park, Sangdon, Lee, Insup, Bastani, Osbert

arXiv.org Artificial IntelligenceJul-6-2023

Although conversational AIs have demonstrated fantastic performance, they often generate incorrect information, or hallucinations. Retrieval augmented generation has emerged as a promising solution to reduce these hallucinations. However, these techniques still cannot guarantee correctness. Focusing on question answering, we propose a framework that can provide statistical guarantees for the retrieval augmented question answering system by combining conformal prediction and global testing. In addition, we use Bayesian optimization to choose hyperparameters of the global test to maximize the performance of the system. Our empirical results on the Natural Questions dataset demonstrate that our method can provide the desired coverage guarantee while minimizing the average prediction set size.

large language model, machine learning, question answering, (21 more...)

arXiv.org Artificial Intelligence

2307.04642

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.78)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation

Patel, Gaurav, Mopuri, Konda Reddy, Qiu, Qiang

arXiv.org Artificial IntelligenceFeb-27-2023

Data-free Knowledge Distillation (DFKD) has gained popularity recently, with the fundamental idea of carrying out knowledge transfer from a Teacher neural network to a Student neural network in the absence of training data. However, in the Adversarial DFKD framework, the student network's accuracy, suffers due to the non-stationary distribution of the pseudo-samples under multiple generator updates. To this end, at every generator update, we aim to maintain the student's performance on previously encountered examples while acquiring knowledge from samples of the current distribution. Thus, we propose a meta-learning inspired framework by treating the task of Knowledge-Acquisition (learning from newly generated samples) and Knowledge-Retention (retaining knowledge on previously met samples) as meta-train and meta-test, respectively. Hence, we dub our method as Learning to Retain while Acquiring. Moreover, we identify an implicit aligning factor between the Knowledge-Retention and Knowledge-Acquisition tasks indicating that the proposed student update strategy enforces a common gradient direction for both tasks, alleviating interference between the two objectives. Finally, we support our hypothesis by exhibiting extensive evaluation and comparison of our method with prior arts on multiple datasets.

artificial intelligence, knowledge management, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2302.1429

Genre: Research Report (0.40)

Industry: Education (1.00)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Linking Sketch Patches by Learning Synonymous Proximity for Graphic Sketch Representation

Zang, Sicong, Tu, Shikui, Xu, Lei

arXiv.org Artificial IntelligenceNov-30-2022

Graphic sketch representations are effective for representing sketches. Existing methods take the patches cropped from sketches as the graph nodes, and construct the edges based on sketch's drawing order or Euclidean distances on the canvas. However, the drawing order of a sketch may not be unique, while the patches from semantically related parts of a sketch may be far away from each other on the canvas. In this paper, we propose an order-invariant, semantics-aware method for graphic sketch representations. The cropped sketch patches are linked according to their global semantics or local geometric shapes, namely the synonymous proximity, by computing the cosine similarity between the captured patch embeddings. Such constructed edges are learnable to adapt to the variation of sketch drawings, which enable the message passing among synonymous patches. Aggregating the messages from synonymous patches by graph convolutional networks plays a role of denoising, which is beneficial to produce robust patch embeddings and accurate sketch representations. Furthermore, we enforce a clustering constraint over the embeddings jointly with the network learning. The synonymous patches are self-organized as compact clusters, and their embeddings are guided to move towards their assigned cluster centroids. It raises the accuracy of the computed synonymous proximity. Experimental results show that our method significantly improves the performance on both controllable sketch synthesis and sketch healing.

artificial intelligence, machine learning, sketch, (18 more...)

arXiv.org Artificial Intelligence

2211.16841

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Challenges and Pitfalls of Bayesian Unlearning

Rawat, Ambrish, Requeima, James, Bruinsma, Wessel, Turner, Richard

arXiv.org Artificial IntelligenceSep-13-2022

Machine unlearning refers to the task of removing a subset of training data, thereby removing its contributions to a trained model. Approximate unlearning are one class of methods for this task which avoid the need to retrain the model from scratch on the retained data. Bayes' rule can be used to cast approximate unlearning as an inference problem where the objective is to obtain the updated posterior by dividing out the likelihood of deleted data. However this has its own set of challenges as one often doesn't have access to the exact posterior of the model parameters. In this work we examine the use of the Laplace approximation and Variational Inference to obtain the updated posterior. With a neural network trained for a regression task as the guiding example, we draw insights on the applicability of Bayesian unlearning in practical scenarios.

approximation, challenge and pitfall, posterior, (14 more...)

arXiv.org Artificial Intelligence

2207.03227

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Cell Mechanics Based Computational Classification of Red Blood Cells Via Machine Intelligence Applied to Morpho-Rheological Markers

Ge, Yan, Rosendahl, Philipp, Durán, Claudio, Töpfner, Nicole, Ciucci, Sara, Guck, Jochen, Cannistraci, Carlo Vittorio

arXiv.org Machine LearningMar-2-2020

Despite fluorescent cell-labelling being widely employed in biomedical studies, some of its drawbacks are inevitable, with unsuitable fluorescent probes or probes inducing a functional change being the main limitations. Consequently, the demand for and development of label-free methodologies to classify cells is strong and its impact on precision medicine is relevant. Towards this end, high-throughput techniques for cell mechanical phenotyping have been proposed to get a multidimensional biophysical characterization of single cells. With this motivation, our goal here is to investigate the extent to which an unsupervised machine learning methodology, which is applied exclusively on morpho-rheological markers obtained by real-time deformability and fluorescence cytometry (RT-FDC), can address the difficult task of providing label-free discrimination of reticulocytes from mature red blood cells. We focused on this problem, since the characterization of reticulocytes (their percentage and cellular features) in the blood is vital in multiple human disease conditions, especially bone-marrow disorders such as anemia and leukemia. Our approach reports promising label-free results in the classification of reticulocytes from mature red blood cells, and it represents a step forward in the development of high-throughput morpho-rheological-based methodologies for the computational categorization of single cells. Besides, our methodology can be an alternative but also a complementary method to integrate with existing cell-labelling techniques.

artificial intelligence, information, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1109/TCBB.2019.2945762

2003.00009

Country:

Europe > Germany (0.69)
North America > United States (0.67)
Europe > Italy (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.86)
Health & Medicine > Therapeutic Area > Immunology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Relevant Ensemble of Trees

Dawer, Gitesh, Barbu, Adrian

arXiv.org Machine LearningFeb-5-2018

Tree ensembles are flexible predictive models that can capture relevant variables and to some extent their interactions in a compact and interpretable manner. Most algorithms for obtaining tree ensembles are based on versions of boosting or Random Forest. Previous work showed that boosting algorithms exhibit a cyclic behavior of selecting the same tree again and again due to the way the loss is optimized. At the same time, Random Forest is not based on loss optimization and obtains a more complex and less interpretable model. In this paper we present a novel method for obtaining compact tree ensembles by growing a large pool of trees in parallel with many independent boosting threads and then selecting a small subset and updating their leaf weights by loss optimization. We allow for the trees in the initial pool to have different depths which further helps with generalization. Experiments on real datasets show that the obtained model has usually a smaller loss than boosting, which is also reflected in a lower misclassification error on the test set.

artificial intelligence, ensemble, machine learning, (18 more...)

arXiv.org Machine Learning

1709.05545

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.70)

Add feedback