AITopics | simple 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models

Merler, Matteo, Dainese, Nicola, Alakuijala, Minttu, Bonetta, Giovanni, Ferrazzi, Pietro, Tian, Yu, Magnini, Bernardo, Marttinen, Pekka

arXiv.org Artificial Intelligence, May-20-2025

Integrating Large Language Models with symbolic planners is a promising direction for obtaining verifiable and grounded plans compared to planning in natural language, with recent works extending this idea to visual domains using Vision-Language Models (VLMs). However, rigorous comparison between VLM-grounded symbolic approaches and methods that plan directly with a VLM has been hindered by a lack of common environments, evaluation protocols and model coverage. We introduce ViPlan, the first open-source benchmark for Visual Planning with symbolic predicates and VLMs. ViPlan features a series of increasingly challenging tasks in two domains: a visual variant of the classic Blocksworld planning problem and a simulated household robotics environment. We benchmark nine open-source VLM families across multiple sizes, along with selected closed models, evaluating both VLM-grounded symbolic planning and using the models directly to propose actions. We find symbolic planning to outperform direct VLM planning in Blocksworld, where accurate image grounding is crucial, whereas the opposite is true in the household robotics tasks, where commonsense knowledge and the ability to recover from errors are beneficial. Finally, we show that across most models and methods, there is no significant benefit to using Chain-of-Thought prompting, suggesting that current VLMs still struggle with visual reasoning.

large language model, machine learning, simple 0

2505.1318

Country: North America > Mexico (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Bayesian vs. PAC-Bayesian Deep Neural Network Ensembles

Hauptvogel, Nick, Igel, Christian

arXiv.org Artificial Intelligence, Jun-8-2024

Bayesian neural networks address epistemic uncertainty by learning a posterior distribution over model parameters. Sampling and weighting networks according to this posterior yields an ensemble model referred to as Bayes ensemble. Ensembles of neural networks (deep ensembles) can profit from the cancellation of errors effect: Errors by ensemble members may average out and the deep ensemble achieves better predictive performance than each individual network. We argue that neither the sampling nor the weighting in a Bayes ensemble are particularly well-suited for increasing generalization performance, as they do not support the cancellation of errors effect, which is evident in the limit from the Bernstein-von Mises theorem for misspecified models. In contrast, a weighted average of models where the weights are optimized by minimizing a PAC-Bayesian generalization bound can improve generalization performance. This requires that the optimization takes correlations between models into account, which can be achieved by minimizing the tandem loss at the cost that hold-out data for estimating error correlations need to be available. The PAC-Bayesian weighting increases the robustness against correlated models and models with lower performance in an ensemble. This allows us to safely add several models from the same learning process to an ensemble, instead of using early-stopping for selecting a single weight configuration. Our study presents empirical results supporting these conceptual considerations on four different classification datasets. We show that state-of-the-art Bayes ensembles from the literature, despite being computationally demanding, do not improve over simple uniformly weighted deep ensembles and cannot match the performance of deep ensembles weighted by optimizing the tandem loss, which additionally come with non-vacuous generalization guarantees.
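The cancellation of errors effect mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the weights are supplied by hand rather than optimized against a PAC-Bayesian bound or the tandem loss, and the names `ensemble_predict`, `accuracy`, `MEMBERS`, and `LABELS`, as well as all probability values, are hypothetical toy choices.

```python
from typing import List, Sequence

Probs = List[List[float]]  # per-sample class probabilities of one member


def ensemble_predict(member_probs: Sequence[Probs],
                     weights: Sequence[float]) -> List[int]:
    """Weighted average of member class probabilities, then argmax per sample."""
    total = sum(weights)
    n_samples = len(member_probs[0])
    n_classes = len(member_probs[0][0])
    preds = []
    for i in range(n_samples):
        avg = [
            sum(w * m[i][c] for w, m in zip(weights, member_probs)) / total
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (made up): three members, each wrong on a different sample.
LABELS = [0, 1, 1]
MEMBERS = [
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],  # wrong on the third sample
    [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7]],  # wrong on the first sample
    [[0.8, 0.2], [0.6, 0.4], [0.2, 0.8]],  # wrong on the second sample
]

for m in MEMBERS:
    print("member accuracy:", accuracy(ensemble_predict([m], [1.0]), LABELS))
print("uniform ensemble accuracy:",
      accuracy(ensemble_predict(MEMBERS, [1.0, 1.0, 1.0]), LABELS))
```

In this toy setup the members' errors are uncorrelated, so the uniform average corrects all of them. With strongly correlated members the averaging would help far less, which is the situation the abstract's correlation-aware weighting is meant to address.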

deep ensemble, ensemble, international conference

2406.05469

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Value Prediction for Spatiotemporal Gait Data Using Deep Learning

Cavanagh, Ryan, Trajkovic, Jelena, Zhang, Wenlu, Khoo, I-Hung, Krishnan, Vennila

arXiv.org Artificial Intelligence, Feb-29-2024

Human gait has been commonly used for the diagnosis and evaluation of medical conditions and for monitoring progress during treatment and rehabilitation. The use of wearable sensors that capture pressure or motion has yielded techniques that analyze the gait data to aid recovery, identify the activity performed, or identify individuals. Deep learning, usually employing classification, has been successfully utilized in a variety of applications such as computer vision, biomedical imaging analysis, and natural language processing. We expand the application of deep learning to value prediction of time series of spatiotemporal gait data. Moreover, we explore several deep learning architectures (Recurrent Neural Networks (RNN) and RNN combined with Convolutional Neural Networks (CNN)) to make short- and long-distance predictions using two different experimental setups. Our results show that short-distance prediction has an RMSE as low as 0.060675, and long-distance prediction an RMSE as low as 0.106365. Additionally, the results show that the proposed deep learning models are capable of predicting the entire trial when trained and validated using the trials from the same participant. The proposed customized models, used with value prediction, open possibilities for additional applications, such as fall prediction, in-home progress monitoring, aiding of exoskeleton movement, and authentication.
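For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a root-mean-square error (RMSE) between a predicted and an observed series is computed. The function name `rmse` and the sample values are hypothetical; the abstract does not state how the gait signals were scaled, so the numbers here are not comparable to the reported results.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between two equally long, non-empty series."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sensor readings and model predictions (made up for illustration).
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.10, 0.25, 0.30, 0.35]
print("RMSE:", rmse(predicted, observed))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, which makes it a strict measure for value prediction over long horizons.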

participant, prediction, sensor

2403.07926

Country:

North America > United States > California (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > Scotland (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Overview of the TREC 2023 Product Product Search Track

Campos, Daniel, Kallumadi, Surya, Rosset, Corby, Zhai, Cheng Xiang, Magnani, Alessandro

arXiv.org Artificial Intelligence, Nov-15-2023

At TREC 2023, we hosted the first TREC Product Search Track, looking to create a reusable general benchmark for evaluating the performance of retrieval methods in the product search domain. We focus on providing a benchmark similar in scale and format to NQ (Kwiatkowski et al., 2019) or the Deep Learning Track (Craswell et al., 2021), but focused on product search. In providing a simple-to-use dataset, we believe broad experimentation using popular retrieval libraries (Lin et al., 2021; Gao et al., 2022) can lead to broad improvements in retrieval performance. In this first year of the track, we created a novel collection based on the ESCI Product Re-ranking dataset (Reddy et al., 2022), sampled novel queries, and created enriched metadata in the form of additional text and images. We also seeded evaluation results with a broad range of baseline runs to aid collection reusability and to allow iteration and experimentation on the use of additional context. Unlike previous product search corpora, the Product Search Track is multi-modal and has a large enough scale to explore the usage of neural retrieval methods.

baseline 0, metadata 0, query

2311.07861

Country:

Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > Japan > Hokkaidō (0.04)
Africa > Madagascar (0.04)

Genre: Research Report (0.40)

Industry:

Materials (0.68)
Leisure & Entertainment (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)
