AITopics

2410.11545

Country:

North America > Haiti (0.14)
North America > United States > New York (0.04)
Europe > Italy > Umbria (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Bersier, Stephane, Chen-Lin, Xinyi

Encoding architecture algebra

arXiv.org Artificial Intelligence Oct-15-2024

There is growing awareness of the importance of designing model architectures that capture and respect the distinct structure of input data. Many successful deep learning architectures, such as transformers [1], convolutional neural networks (CNNs) [2], graph neural networks (GNNs) [3], and recurrent neural networks (RNNs) [4], inherently incorporate aspects of data structure. Ongoing research focuses on refining existing architectures, as well as designing new ones for other types of structured data. For instance, DeepSets [5] are tailored to process sets, group and gauge equivariant CNNs [6][7] respect both global and local symmetries in the data, and strongly-typed RNNs [8] incorporate explicit types within recurrent networks. By accounting for the structure of the input data, these model architectures exhibit improved performance, better generalization with fewer parameters, and enhanced interpretability.
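As a small illustration of what "respecting the structure of the input" can mean for sets, the sketch below follows the sum-pooling form associated with DeepSets, f(X) = rho(sum of phi(x) over x in X). The functions `phi` and `rho` are toy hand-written stand-ins for learned networks and are assumptions made for this example, not taken from the paper. Because addition is commutative, the result does not depend on the order of the elements (up to floating-point rounding for non-integer inputs).

```python
from itertools import permutations


def phi(x):
    """Toy per-element encoder (hypothetical stand-in for a learned network)."""
    return (x, x * x)


def rho(pooled):
    """Toy readout applied to the pooled set representation."""
    total, total_sq = pooled
    return 2.0 * total - 0.5 * total_sq


def deep_set(xs):
    """Permutation-invariant map: rho(sum_i phi(x_i))."""
    encoded = [phi(x) for x in xs]
    pooled = (sum(e[0] for e in encoded), sum(e[1] for e in encoded))
    return rho(pooled)


# Every ordering of the same elements gives the same output.
outputs = {deep_set(p) for p in permutations([1, 2, 3])}
print(outputs)
```

Swapping the sum for an order-sensitive operation (for example, a running weighted sum) would break this invariance, which is the design point the pooling step is meant to show.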

artificial intelligence, deep learning, machine learning

2410.11776

Genre:

Overview (0.54)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Oct-15-2024

HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Lamba, Hemank, Abilov, Anton, Zhang, Ke, Olson, Elizabeth M., Dambanemuya, Henry K., Bárcia, João C., Batista, David S., Wille, Christina, Cahill, Aoife, Tetreault, Joel, Jaimes, Alex

Humanitarian organizations can enhance their effectiveness by analyzing data to discover trends, gather aggregated insights, manage their security risks, support decision-making, and inform advocacy and funding proposals. However, data about violent incidents with direct impact and relevance for humanitarian aid operations is not readily available. An automatic data collection and NLP-backed classification framework aligned with humanitarian perspectives can help bridge this gap. In this paper, we present HumVI, a dataset comprising news articles in three languages (English, French, Arabic) containing instances of different types of violent incidents categorized by the humanitarian sector they impact, e.g., aid security, education, food security, health, and protection. Reliable labels were obtained for the dataset by partnering with a data-backed humanitarian organization, Insecurity Insight. We provide multiple benchmarks for the dataset, employing various deep learning architectures and techniques, including data augmentation and mask loss, to address different task-related challenges, e.g., domain expansion. The dataset is publicly available at https://github.com/dataminr-ai/humvi-dataset.

category, dataset, insecurity insight

2410.0637

Country:

North America > United States (0.46)
North America > Haiti (0.14)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.04)

Genre:

Overview (0.68)
Research Report (0.64)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning Oct-15-2024

DODT: Enhanced Online Decision Transformer Learning through Dreamer's Actor-Critic Trajectory Forecasting

Jiang, Eric Hanchen, Zhang, Zhi, Zhang, Dinghuai, Lizarraga, Andrew, Xu, Chenheng, Zhang, Yasi, Zhao, Siyan, Xu, Zhengjie, Yu, Peiyu, Tang, Yuer, Kong, Deqian, Wu, Ying Nian

Advancements in reinforcement learning have led to the development of sophisticated models capable of learning complex decision-making tasks. However, efficiently integrating world models with decision transformers remains a challenge. In this paper, we introduce a novel approach that combines the Dreamer algorithm's ability to generate anticipatory trajectories with the adaptive learning strengths of the Online Decision Transformer. Our methodology enables parallel training where Dreamer-produced trajectories enhance the contextual decision-making of the transformer, creating a bidirectional enhancement loop. We empirically demonstrate the efficacy of our approach on a suite of challenging benchmarks, achieving notable improvements in sample efficiency and reward maximization over existing methods. Our results indicate that the proposed integrated framework not only accelerates learning but also showcases robustness in diverse and dynamic scenarios, marking a significant step forward in model-based reinforcement learning.

decision transformer, dreamer, transformer

arXiv.org Machine Learning

2410.11359

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Portugal (0.04)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)
Overview > Innovation (0.34)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Coello, Omar, Coronel, Moisés, Carpio, Darío, Vintimilla, Boris, Chuquimarca, Luis

Enhancing Apple's Defect Classification: Insights from Visible Spectrum and Narrow Spectral Band Imaging

This study addresses the classification of defects in apples as a crucial measure to mitigate economic losses and optimize the food supply chain. An innovative approach is employed that integrates images from the visible spectrum and the 660 nm spectral wavelength to enhance accuracy and efficiency in defect classification. The methodology is based on the use of Single-Input and Multi-Input convolutional neural networks (CNNs) to validate the proposed strategies. Steps include image acquisition and preprocessing, classification model training, and performance evaluation. Results demonstrate that defect classification using the 660 nm spectral wavelength reveals details not visible in the entire visible spectrum. Using the appropriate spectral range in the classification process gives slightly better results than using the entire visible spectrum: the MobileNetV1 model achieves an accuracy of 98.80% on the validation dataset versus the 98.26% achieved using the entire visible spectrum. Conclusions highlight the potential to enhance the method by capturing images with specific spectral ranges using filters, enabling more effective network training for the classification task. These improvements could further enhance the system's capability to identify and classify defects in apples.

artificial intelligence, deep learning, machine learning

doi: 10.1109/ICPRS62101.2024.10677803

2410.19784

Country:

South America > Ecuador > Guayas Province > Guayaquil (0.05)
Asia > China (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.49)
Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (0.68)
Food & Agriculture (0.47)
Education (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Srivastava, Aviral, Panda, Sourav

A Formal Framework for Assessing and Mitigating Emergent Security Risks in Generative AI Models: Bridging Theory and Dynamic Risk Mitigation

As generative AI systems, including large language models (LLMs) and diffusion models, advance rapidly, their growing adoption has led to new and complex security risks often overlooked in traditional AI risk assessment frameworks. This paper introduces a novel formal framework for categorizing and mitigating these emergent security risks by integrating adaptive, real-time monitoring, and dynamic risk mitigation strategies tailored to generative models' unique vulnerabilities. We identify previously under-explored risks, including latent space exploitation, multi-modal cross-attack vectors, and feedback-loop-induced model degradation. Our framework employs a layered approach, incorporating anomaly detection, continuous red-teaming, and real-time adversarial simulation to mitigate these risks. We focus on formal verification methods to ensure model robustness and scalability in the face of evolving threats. Though theoretical, this work sets the stage for future empirical validation by establishing a detailed methodology and metrics for evaluating the performance of risk mitigation strategies in generative AI systems.

machine learning, natural language, vulnerability

2410.13897

Country:

North America > United States > Pennsylvania > Centre County > State College (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre:

Overview (0.93)
Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

Tepper, Mariano, Bhati, Ishwar Singh, Aguerrebere, Cecilia, Willke, Ted

GleanVec: Accelerating vector search with minimalist nonlinear dimensionality reduction

Embedding models can generate high-dimensional vectors whose similarity reflects semantic affinities. Thus, accurately and quickly retrieving the vectors in a large collection that are similar to a given query has become a critical component of a wide range of applications. In particular, cross-modal retrieval (e.g., where a text query is used to find images) is gaining momentum rapidly. Here, it is challenging to achieve high accuracy as the queries often have different statistical distributions than the database vectors. Moreover, the high vector dimensionality puts these search systems under compute and memory pressure, leading to subpar performance. In this work, we present new linear and nonlinear methods for dimensionality reduction to accelerate high-dimensional vector search while maintaining accuracy in settings with in-distribution (ID) and out-of-distribution (OOD) queries. The linear LeanVec-Sphering outperforms other linear methods, trains faster, comes with no hyperparameters, and allows the target dimensionality to be set more flexibly. The nonlinear Generalized LeanVec (GleanVec) uses a piecewise linear scheme to further improve the search accuracy while remaining computationally nimble. Initial experimental results show that LeanVec-Sphering and GleanVec push the state of the art for vector search.

data mining, machine learning, natural language

2410.22347

Genre:

Research Report (0.70)
Overview (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Words to Wheels: Vision-Based Autonomous Driving Understanding Human Language Instructions Using Foundation Models

Ryu, Chanhoe, Seong, Hyunki, Lee, Daegyu, Moon, Seongwoo, Min, Sungjae, Shim, D. Hyunchul

This paper introduces an innovative application of foundation models, enabling Unmanned Ground Vehicles (UGVs) equipped with an RGB-D camera to navigate to designated destinations based on human language instructions. Unlike learning-based methods, this approach does not require prior training but instead leverages existing foundation models, thus facilitating generalization to novel environments. Upon receiving human language instructions, a large language model (LLM) transforms them into a 'cognitive route description', a detailed navigation route expressed in human language. The vehicle then decomposes this description into landmarks and navigation maneuvers. The vehicle also determines elevation costs and identifies navigability levels of different regions through a terrain segmentation model, GANav, trained on open datasets. Semantic elevation costs, which take both elevation and navigability levels into account, are estimated and provided to the Model Predictive Path Integral (MPPI) planner, responsible for local path planning. Concurrently, the vehicle searches for target landmarks using foundation models, including YOLO-World and EfficientViT-SAM. Ultimately, the vehicle executes the navigation commands to reach the designated destination, the final landmark. Our experiments demonstrate that this application successfully guides UGVs to their destinations following human language instructions in novel environments, such as unfamiliar terrain or urban settings.
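To make the role of the MPPI planner more concrete, here is a minimal sketch of the generic MPPI update: sampled control perturbations are weighted by the exponentiated negative cost of their rollouts, and the nominal control sequence is shifted by the weighted average perturbation. This is the textbook form of the update, not the authors' planner. The function names, the `temperature` parameter, and the cost values are assumptions for illustration; in the paper's setting the rollout costs would come from the semantic elevation costs.

```python
import math


def mppi_weights(costs, temperature=1.0):
    """Softmin weights over rollout costs: lower cost means higher weight."""
    best = min(costs)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mppi_update(nominal, perturbations, costs, temperature=1.0):
    """Shift each nominal control by the cost-weighted average perturbation.

    nominal:        list of controls, one per time step
    perturbations:  list of sampled perturbation sequences (same horizon)
    costs:          total rollout cost of each perturbed sequence
    """
    weights = mppi_weights(costs, temperature)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t, u in enumerate(nominal)
    ]


# Two candidate perturbations over a two-step horizon; the first is far cheaper.
new_controls = mppi_update(
    nominal=[0.0, 0.0],
    perturbations=[[1.0, 1.0], [-1.0, -1.0]],
    costs=[0.0, 50.0],
)
print(new_controls)
```

A smaller `temperature` concentrates the weight on the cheapest rollouts, while a larger one averages more evenly across samples.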

large language model, machine learning, natural language

2410.10577

Country:

North America > United States (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry:

Transportation > Ground > Road (0.51)
Information Technology > Robotics & Automation (0.51)
Automobiles & Trucks (0.51)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation

Ouyang, Siru, Wang, Shuohang, Jiang, Minhao, Zhong, Ming, Yu, Donghan, Han, Jiawei, Shen, Yelong

Speculative decoding stands as a pivotal technique to expedite inference in autoregressive (large) language models. This method employs a smaller draft model to speculate a block of tokens, which the target model then evaluates for acceptance. Despite a wealth of studies aimed at increasing the efficiency of speculative decoding, the influence of generation configurations on the decoding process remains poorly understood, especially concerning decoding temperatures. This paper delves into the effects of decoding temperatures on speculative decoding's efficacy. Beginning with knowledge distillation (KD), we first highlight the challenge of decoding at higher temperatures, and demonstrate that KD in a consistent temperature setting could be a remedy. We also investigate the effects of out-of-domain testing sets with out-of-range temperatures. Building upon these findings, we take an initial step to further the speedup for speculative decoding, particularly in a high-temperature generation setting. Our work offers new insights into how generation configurations drastically affect the performance of speculative decoding, and underscores the need for developing methods that focus on diverse decoding configurations. Code is publicly available at https://github.com/ozyyshr/TempSpec.
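The draft-then-verify step described above can be sketched with the standard speculative sampling acceptance rule: a token drawn from the draft distribution q is accepted with probability min(1, p(x)/q(x)), and on rejection a replacement is drawn from the normalized residual max(0, p - q). This is a single-token sketch under assumed toy logits, not the code from the linked repository; the helper names and the choice to apply the same temperature to both models are assumptions for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [l / temperature for l in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def residual(p, q):
    """Normalized max(0, p - q), used to resample after a rejection."""
    diff = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(diff)
    return [d / total for d in diff] if total > 0 else list(p)


def speculative_step(p, q, rng):
    """Draw one token from draft q, then accept or resample so the result follows p."""
    token = rng.choices(range(len(q)), weights=q)[0]
    if rng.random() < min(1.0, p[token] / q[token]):
        return token, True
    return rng.choices(range(len(p)), weights=residual(p, q))[0], False


def expected_acceptance(p, q):
    """Probability that a single draft token is accepted: sum_i min(p_i, q_i)."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


# Hypothetical logits for a target and a draft model over a 3-token vocabulary.
target_logits = [2.0, 1.0, 0.0]
draft_logits = [1.0, 1.5, 0.5]
for temperature in (0.5, 1.0, 2.0):
    p = softmax(target_logits, temperature)
    q = softmax(draft_logits, temperature)
    print(temperature, round(expected_acceptance(p, q), 3))
```

The loop only shows how the acceptance probability can be computed at different temperatures for these toy logits; it says nothing about the trends reported in the paper.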

distillation, large language model, machine learning

2410.10141

Country:

Europe > Austria > Vienna (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Texas (0.04)

Genre:

Overview (1.00)
Research Report > New Finding (0.68)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs

Wang, Haochuan, Feng, Xiachong, Li, Lei, Qin, Zhanyue, Sui, Dianbo, Kong, Lingpeng

The rapid advancement of large language models (LLMs) has accelerated their application in reasoning, with strategic reasoning drawing increasing attention. To evaluate the strategic reasoning capabilities of LLMs, game theory, with its concise structure, has become the preferred approach for many researchers. However, current research typically focuses on a limited selection of games, resulting in low coverage of game types. Additionally, classic game scenarios carry risks of data leakage, and the benchmarks used often lack extensibility, rendering them inadequate for evaluating state-of-the-art models. Specifically, we incorporate all 144 game types summarized by the Robinson-Goforth topology of 2×2 games, which are constructed as classic games in our benchmark. Furthermore, we employ synthetic data generation techniques to create diverse, higher-quality game scenarios through topic guidance and human inspection for each classic game, which we refer to as story-based games. Lastly, to provide a sustainable evaluation framework adaptable to increasingly powerful LLMs, we treat the aforementioned games as atomic units and organize them into more complex forms through sequential, parallel, and nested structures. We conducted a comprehensive evaluation of mainstream LLMs, covering tests on rational reasoning, reasoning robustness, Theory-of-Mind capabilities, and reasoning in complex game forms. The results revealed that LLMs still have flaws in the accuracy and consistency of strategic reasoning processes, and their levels of mastery over Theory-of-Mind also vary.

These achievements are largely attributed to LLMs' ability to assimilate vast amounts of knowledge during training, emerging with the capacity to organize information at a coarse level and link knowledge at a fine-grained level through their internal representations (Min et al., 2023; Zhao et al., 2023). These core capabilities have driven the success of LLMs in numerous reasoning tasks, including mathematical reasoning (Hendrycks et al., 2021; Zhang et al., 2023), commonsense reasoning (Sap et al., 2019; Bisk et al., 2020), logical reasoning (Lei et al., 2023), and strategic reasoning (Lorè & Heydari). Among these, strategic reasoning has attracted considerable attention due to its multi-agent nature and close association with social intelligence (Gandhi et al., 2023). Strategic reasoning refers to the cognitive process of anticipating, planning, and responding to others' actions to achieve specific objectives within competitive or cooperative contexts (Zhang et al., 2024a).

Work done during an internship at the University of Hong Kong. The dataset and evaluation codes will be available at https://github.com/PinkEx/TMGBench.
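The figure of 144 game types can be reproduced with a short enumeration, sketched here under an assumed counting convention: each player ranks the four outcomes of a 2×2 game strictly from 1 to 4, and two games count as the same type when they differ only by relabeling the rows or the columns. The helper names are hypothetical, and `pure_nash` is included only to show how a single game type can be analyzed; it is not part of the benchmark's code.

```python
from itertools import permutations


def canonical(game):
    """Smallest representative of a game under row and column swaps.

    game is a tuple of four (row_payoff, col_payoff) cells ordered as
    top-left, top-right, bottom-left, bottom-right.
    """
    tl, tr, bl, br = game
    return min(
        (tl, tr, bl, br),  # as given
        (bl, br, tl, tr),  # rows swapped
        (tr, tl, br, bl),  # columns swapped
        (br, bl, tr, tl),  # rows and columns swapped
    )


def strict_ordinal_game_types():
    """Distinct 2x2 games with strict ordinal payoffs 1..4 for each player."""
    ranks = (1, 2, 3, 4)
    return {
        canonical(tuple(zip(row, col)))
        for row in permutations(ranks)
        for col in permutations(ranks)
    }


def pure_nash(game):
    """Cells (row, col) where neither player gains by deviating alone."""
    cells = dict(zip([(0, 0), (0, 1), (1, 0), (1, 1)], game))
    return [
        (i, j)
        for (i, j), (r, c) in cells.items()
        if r >= cells[(1 - i, j)][0] and c >= cells[(i, 1 - j)][1]
    ]


print(len(strict_ordinal_game_types()))  # 576 labeled games / 4 relabelings

# Prisoner's Dilemma with ordinal payoffs: mutual defection is the only equilibrium.
prisoners_dilemma = ((3, 3), (1, 4), (4, 1), (2, 2))
print(pure_nash(prisoners_dilemma))
```

Note that this convention does not identify games that differ only by swapping the two players; doing so would merge further types and give a smaller count.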

large language model, machine learning, natural language

2410.10479

Country:

Asia > China > Hong Kong (0.24)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)