AITopics

Explainable machine learning models have recently emerged as an important part of the research in artificial intelligence and aim at devising methods and techniques that are understandable for humans [26, 9, 18]. In this field, the use of concepts from topology and geometry has enabled developments that promise to make machine learning more easily interpretable, as required in many critical applications, where security and reliability are crucial elements. The research about group equivariant non-expansive operators (GENEOs) fits into this scientific context, offering the possibility of building small networks of operators that process the available information in a transparent and easily controllable way [5, 24, 7]. GENEOs have their roots in Topological Data Analysis and make available a mathematical theory for the approximation of observers, including their symmetries and shifting the attention from the data alone to the pairs (data, observer), seen as the main object of study. This change of perspective is justified by the fact that in many applications, the interest is not directly focused on data, but on approximating the experts' behavior in the presence of some given information [12]. It is indeed well known that different agents can react in completely different ways to the presence of the same data, and this implies that data comparison cannot be separated from the problem of understanding observers' characteristics and preferences.

geneo, graph, permutant, (14 more...)

2406.08045

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
North America > United States > Nevada > Clark County > Las Vegas (0.04)

Genre:

Research Report > Promising Solution (0.50)
Overview > Innovation (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Jaganathan, Vishnu, Huang, Hannah Hanyun, Irshad, Muhammad Zubair, Jampani, Varun, Raj, Amit, Kira, Zsolt

ICE-G: Image Conditional Editing of 3D Gaussian Splats

Recently many techniques have emerged to create high quality 3D assets and scenes. When it comes to editing of these objects, however, existing approaches are either slow, compromise on quality, or do not provide enough customization. We introduce a novel approach to quickly edit a 3D model from a single reference view. Our technique first segments the edit image, and then matches semantically corresponding regions across chosen segmented dataset views using DINO features. A color or texture change from a particular region of the edit image can then be applied to other views automatically in a semantically sensible manner. These edited views act as an updated dataset to further train and re-style the 3D scene. The end-result is therefore an edited 3D model. Our framework enables a wide variety of editing tasks such as manual local edits, correspondence based style transfer from any example image, and a combination of different styles from multiple example images. We use Gaussian Splats as our primary 3D representation due to their speed and ease of local editing, but our technique works for other methods such as NeRFs as well. We show through multiple examples that our method produces higher quality results while offering fine-grained control of editing. Project page: ice-gaussian.github.io

editing, gaussian splat, texture, (15 more...)

2406.08488

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre:

Research Report (0.70)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Mbata, Anthony, Sripada, Yaji, Zhong, Mingjun

A Survey of Pipeline Tools for Data Engineering

Currently, a variety of pipeline tools are available for use in data engineering. Data scientists can use these tools to resolve data wrangling issues associated with data and accomplish some data engineering tasks from data ingestion through data preparation to utilization as input for machine learning (ML). Some of these tools have essential built-in components or can be combined with other tools to perform desired data engineering operations. While some tools are wholly or partly commercial, several open-source tools are available to perform expert-level data engineering tasks. This survey examines the broad categories and examples of pipeline tools based on their design and data engineering intentions. These categories are Extract Transform Load/Extract Load Transform (ETL/ELT), pipelines for Data Integration, Ingestion, and Transformation, Data Pipeline Orchestration and Workflow Management, and Machine Learning Pipelines. The survey also provides a broad outline of the utilization with examples within these broad groups and finally, a discussion is presented with case studies indicating the usage of pipeline tools for data engineering. The studies present some first-user application experiences with sample data, some complexities of the applied pipeline, and a summary note of approaches to using these tools to prepare data for machine learning.

integration, pipeline, python, (14 more...)

2406.08335

Country: Europe > United Kingdom (0.04)

Genre: Overview (1.00)

Industry:

Information Technology > Services (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Quality (1.00)
(4 more...)

Applications of Explainable artificial intelligence in Earth system science

Huang, Feini, Jiang, Shijie, Li, Lu, Zhang, Yongkun, Zhang, Ye, Zhang, Ruqing, Li, Qingliang, Li, Danxi, Shangguan, Wei, Dai, Yongjiu

In recent years, artificial intelligence (AI) rapidly accelerated its influence and is expected to promote the development of Earth system science (ESS) if properly harnessed. In application of AI to ESS, a significant hurdle lies in the interpretability conundrum, an inherent problem of black-box nature arising from the complexity of AI algorithms. To address this, explainable AI (XAI) offers a set of powerful tools that make the models more transparent. The purpose of this review is twofold: First, to provide ESS scholars, especially newcomers, with a foundational understanding of XAI, serving as a primer to inspire future research advances; second, to encourage ESS professionals to embrace the benefits of AI, free from preconceived biases due to its lack of interpretability. We begin with elucidating the concept of XAI, along with typical methods. We then delve into a review of XAI applications in the ESS literature, highlighting the important role that XAI has played in facilitating communication with AI model decisions, improving model diagnosis, and uncovering scientific insights. We identify four significant challenges that XAI faces within the ESS, and propose solutions. Furthermore, we provide a comprehensive illustration of multifaceted perspectives. Given the unique challenges in ESS, an interpretable hybrid approach that seamlessly integrates AI with domain-specific knowledge appears to be a promising way to enhance the utility of AI in ESS. A visionary outlook for ESS envisions a harmonious blend where process-based models govern the known, AI models explore the unknown, and XAI bridges the gap by providing explanations.

application, explanation, xai, (14 more...)

2406.11882

Country:

Europe > Austria > Vienna (0.14)
Europe > Switzerland (0.04)
Europe > Monaco (0.04)
(26 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry:

Food & Agriculture > Agriculture (1.00)
Information Technology > Security & Privacy (0.92)
Health & Medicine > Therapeutic Area (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Kang, Hao, Xiong, Chenyan

ResearchArena: Benchmarking LLMs' Ability to Collect and Organize Information as Research Agents

Large language models (LLMs) have exhibited remarkable performance across various tasks in natural language processing. Nevertheless, challenges still arise when these tasks demand domain-specific expertise and advanced analytical skills, such as conducting research surveys on a designated topic. In this research, we develop ResearchArena, a benchmark that measures LLM agents' ability to conduct academic surveys, an initial step of academic research process. Specifically, we deconstructs the surveying process into three stages 1) information discovery: locating relevant papers, 2) information selection: assessing papers' importance to the topic, and 3) information organization: organizing papers into meaningful structures. In particular, we establish an offline environment comprising 12.0M full-text academic papers and 7.9K survey papers, which evaluates agents' ability to locate supporting materials for composing the survey on a topic, rank the located papers based on their impact, and organize these into a hierarchical knowledge mind-map. With this benchmark, we conduct preliminary evaluations of existing techniques and find that all LLM-based methods under-performing when compared to basic keyword-based retrieval techniques, highlighting substantial opportunities for future research.

arxiv preprint arxiv, death penalty, survey paper, (14 more...)

2406.10291

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Yemen > Amran Governorate > Amran (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Overview (1.00)
Research Report (0.82)

Industry: Law > Criminal Law (0.32)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey

Yang, Hao, Zhao, Yanyan, Wu, Yang, Wang, Shilong, Zheng, Tian, Zhang, Hongbo, Che, Wanxiang, Qin, Bing

Compared to traditional sentiment analysis, which only considers text, multimodal sentiment analysis needs to consider emotional signals from multimodal sources simultaneously and is therefore more consistent with the way how humans process sentiment in real-world scenarios. It involves processing emotional information from various sources such as natural language, images, videos, audio, physiological signals, etc. However, although other modalities also contain diverse emotional cues, natural language usually contains richer contextual information and therefore always occupies a crucial position in multimodal sentiment analysis. The emergence of ChatGPT has opened up immense potential for applying large language models (LLMs) to text-centric multimodal tasks. However, it is still unclear how existing LLMs can adapt better to text-centric multimodal sentiment analysis tasks. This survey aims to (1) present a comprehensive review of recent research in text-centric multimodal sentiment analysis tasks, (2) examine the potential of LLMs for text-centric multimodal sentiment analysis, outlining their approaches, advantages, and limitations, (3) summarize the application scenarios of LLM-based multimodal sentiment analysis technology, and (4) explore the challenges and potential research directions for multimodal sentiment analysis in the future.

multimodal sentiment analysis, representation, sentiment analysis, (14 more...)

2406.08068

Country:

North America > United States > Washington > King County > Seattle (0.14)
Asia > China > Heilongjiang Province > Harbin (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(8 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Media (1.00)
Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

Sundar, Anirudh, Xu, Jin, Gay, William, Richardson, Christopher, Heck, Larry

An emerging area of research in situated and multimodal interactive conversations (SIMMC) includes interactions in scientific papers. Since scientific papers are primarily composed of text, equations, figures, and tables, SIMMC methods must be developed specifically for each component to support the depth of inquiry and interactions required by research scientists. This work introduces Conversational Papers (cPAPERS), a dataset of conversational question-answer pairs from reviews of academic papers grounded in these paper components and their associated references from scientific documents available on arXiv. We present a data collection strategy to collect these question-answer pairs from OpenReview and associate them with contextual information from LaTeX source files. Additionally, we present a series of baseline approaches utilizing Large Language Models (LLMs) in both zero-shot and fine-tuned configurations to address the cPAPERS dataset.

computational linguistic, dataset, equation, (13 more...)

2406.08398

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
(13 more...)

Genre:

Overview (0.67)
Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Vitorino, João, Maia, Eva, Praça, Isabel

Adversarial Evasion Attack Efficiency against Large Language Models

Large Language Models (LLMs) are valuable for text classification, but their vulnerabilities must not be disregarded. They lack robustness against adversarial examples, so it is pertinent to understand the impacts of different types of perturbations, and assess if those attacks could be replicated by common users with a small amount of perturbations and a small number of queries to a deployed LLM. This work presents an analysis of the effectiveness, efficiency, and practicality of three different types of adversarial attacks against five different LLMs in a sentiment classification task. The obtained results demonstrated the very distinct impacts of the word-level and character-level attacks. The word attacks were more effective, but the character and more constrained attacks were more practical and required a reduced number of perturbations and queries. These differences need to be considered during the development of adversarial defense strategies to train more robust LLMs for intelligent text classification applications.

llm, perturbation, text sample, (13 more...)

2406.0805

Country:

Europe > Portugal > Porto > Porto (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments

Wang, Maonan, Pang, Aoyu, Kan, Yuheng, Pun, Man-On, Chen, Chung Shue, Huang, Bo

Traffic congestion in metropolitan areas presents a formidable challenge with far-reaching economic, environmental, and societal ramifications. Therefore, effective congestion management is imperative, with traffic signal control (TSC) systems being pivotal in this endeavor. Conventional TSC systems, designed upon rule-based algorithms or reinforcement learning (RL), frequently exhibit deficiencies in managing the complexities and variabilities of urban traffic flows, constrained by their limited capacity for adaptation to unfamiliar scenarios. In response to these limitations, this work introduces an innovative approach that integrates Large Language Models (LLMs) into TSC, harnessing their advanced reasoning and decision-making faculties. Specifically, a hybrid framework that augments LLMs with a suite of perception and decision-making tools is proposed, facilitating the interrogation of both the static and dynamic traffic information. This design places the LLM at the center of the decision-making process, combining external traffic data with established TSC methods. Moreover, a simulation platform is developed to corroborate the efficacy of the proposed framework. The findings from our simulations attest to the system's adeptness in adjusting to a multiplicity of traffic environments without the need for additional training. Notably, in cases of Sensor Outage (SO), our approach surpasses conventional RL-based systems by reducing the average waiting time by $20.4\%$. This research signifies a notable advance in TSC strategies and paves the way for the integration of LLMs into real-world, dynamic scenarios, highlighting their potential to revolutionize traffic management. The related code is available at https://github.com/Traffic-Alpha/LLM-Assisted-Light.

intersection, la-light, scenario, (14 more...)

2403.08337

Country:

North America > United States (0.14)
Asia > China > Shanghai > Shanghai (0.06)
Asia > China > Hong Kong (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Chen, Daiwei, Chen, Yi, Rege, Aniket, Vinayak, Ramya Korlakai

PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences

Large foundation models pretrained on raw web-scale data are not readily deployable without additional step of extensive alignment to human preferences. Such alignment is typically done by collecting large amounts of pairwise comparisons from humans ("Do you prefer output A or B?") and learning a reward model or a policy with the Bradley-Terry-Luce (BTL) model as a proxy for a human's underlying implicit preferences. These methods generally suffer from assuming a universal preference shared by all humans, which lacks the flexibility of adapting to plurality of opinions and preferences. In this work, we propose PAL, a framework to model human preference complementary to existing pretraining strategies, which incorporates plurality from the ground up. We propose using the ideal point model as a lens to view alignment using preference comparisons. Together with our novel reformulation and using mixture modeling, our framework captures the plurality of population preferences while simultaneously learning a common preference latent space across different preferences, which can few-shot generalize to new, unseen users. Our approach enables us to use the penultimate-layer representation of large foundation models and simple MLP layers to learn reward functions that are on-par with the existing large state-of-the-art reward models, thereby enhancing efficiency of reward modeling significantly. We show that PAL achieves competitive reward model accuracy compared to strong baselines on 1) Language models with Summary dataset ; 2) Image Generative models with Pick-a-Pic dataset ; 3) A new semisynthetic heterogeneous dataset generated using Anthropic Personas. Finally, our experiments also highlight the shortcoming of current preference datasets that are created using rigid rubrics which wash away heterogeneity, and call for more nuanced data collection approaches.

arxiv preprint arxiv, dataset, pluralistic alignment framework, (11 more...)

2406.08469

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)