AITopics | Antarctica

Antarctica

Fast Ergodic Search with Kernel Functions

Sun, Muchen, Gaggar, Ayush, Trautman, Peter, Murphey, Todd

arXiv.org Artificial Intelligence, Mar-3-2024

Ergodic search enables optimal exploration of an information distribution while guaranteeing the asymptotic coverage of the search space. However, current methods typically have exponential computational complexity in the search space dimension and are restricted to Euclidean space. We introduce a computationally efficient ergodic search method. Our contributions are two-fold. First, we develop a kernel-based ergodic metric and generalize it from Euclidean space to Lie groups. We formally prove the proposed metric is consistent with the standard ergodic metric while guaranteeing linear complexity in the search space dimension. Second, we derive the first-order optimality condition of the kernel ergodic metric for nonlinear systems, which enables efficient trajectory optimization. Comprehensive numerical benchmarks show that the proposed method is at least two orders of magnitude faster than the state-of-the-art algorithm. Finally, we demonstrate the proposed algorithm with a peg-in-hole insertion task. We formulate the problem as a coverage task in the space of SE(3) and use a 30-second-long human demonstration as the prior distribution for ergodic coverage. Ergodicity guarantees the asymptotic solution of the peg-in-hole problem so long as the solution resides within the prior information distribution, which is seen in the 100% success rate.

ergodic metric, metric, trajectory, (16 more...)

arXiv.org Artificial Intelligence

2403.01536

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

Rakotonirina, Nathanaël Carraz, Baroni, Marco

arXiv.org Artificial Intelligence, Feb-23-2024

We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes information to the LM by prefixing its regular input with a sequence of vectors, akin to soft prompts, without requiring LM finetuning. Tested on a task designed to probe an LM's ability to keep track of multiple fact updates, a MemoryPrompt-augmented LM outperforms much larger LMs that have access to the full input history. We also test MemoryPrompt on a long-distance dialogue dataset, where its performance is comparable to that of a model conditioned on the entire conversation history. In both experiments we also observe that, unlike full-finetuning approaches, MemoryPrompt does not suffer from catastrophic forgetting when adapted to new tasks, and thus does not disrupt the generalist capabilities of the underlying LM.

dataset, memory vector, memoryprompt, (16 more...)

arXiv.org Artificial Intelligence

2402.15268

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Dominican Republic (0.04)
(3 more...)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

Pan, Zizheng, Zhuang, Bohan, Huang, De-An, Nie, Weili, Yu, Zhiding, Xiao, Chaowei, Cai, Jianfei, Anandkumar, Anima

arXiv.org Artificial Intelligence, Feb-21-2024

Sampling from diffusion probabilistic models (DPMs) is often expensive for high-quality image generation and typically requires many steps with a large model. In this paper, we introduce sampling Trajectory Stitching (T-Stitch), a simple yet efficient technique to improve the sampling efficiency with little or no generation degradation. Instead of solely using a large DPM for the entire sampling trajectory, T-Stitch first leverages a smaller DPM in the initial steps as a cheap drop-in replacement of the larger DPM and switches to the larger DPM at a later stage. Our key insight is that different diffusion models learn similar encodings under the same training data distribution and smaller models are capable of generating good global structures in the early steps. Extensive experiments demonstrate that T-Stitch is training-free, generally applicable for different architectures, and complements most existing fast sampling techniques with flexible speed and quality trade-offs. On DiT-XL, for example, 40% of the early timesteps can be safely replaced with a 10x faster DiT-S without performance drop on class-conditional ImageNet generation. We further show that our method can also be used as a drop-in technique to not only accelerate the popular pretrained Stable Diffusion (SD) models but also improve the prompt alignment of stylized SD models from the public model zoo. Code is released at https://github.com/NVlabs/T-Stitch
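
The switching idea in this abstract can be illustrated with a minimal sketch. This is not the code from the linked repository: the function name `stitched_sample`, the step callables `small_step` and `large_step`, and the `small_fraction` parameter are all hypothetical names chosen for illustration. The sketch only shows the control flow in which a cheaper denoiser handles the early part of the sampling trajectory and a larger one takes over afterwards.

```python
from typing import Callable, List, Sequence, Tuple

# A "step" takes the current sample and a timestep and returns the next sample.
# Here samples are plain floats; a real sampler would use tensors (assumption).
Step = Callable[[float, int], float]


def stitched_sample(
    x: float,
    timesteps: Sequence[int],
    small_step: Step,
    large_step: Step,
    small_fraction: float = 0.4,
) -> Tuple[float, List[str]]:
    """Run `small_step` for the first `small_fraction` of the trajectory,
    then switch to `large_step` for the remaining timesteps."""
    if not 0.0 <= small_fraction <= 1.0:
        raise ValueError("small_fraction must lie in [0, 1]")

    # Index of the first timestep handled by the large model.
    switch_index = round(small_fraction * len(timesteps))

    used: List[str] = []  # records which model handled each step
    for i, t in enumerate(timesteps):
        if i < switch_index:
            x = small_step(x, t)
            used.append("small")
        else:
            x = large_step(x, t)
            used.append("large")
    return x, used


if __name__ == "__main__":
    # Toy stand-ins for the two denoisers (hypothetical, for illustration only).
    steps = list(range(10, 0, -1))
    final, used = stitched_sample(
        0.0, steps, lambda x, t: x + 1.0, lambda x, t: x + 100.0, small_fraction=0.5
    )
    print(final, used)
```

Setting `small_fraction` to 0 recovers sampling with the large model alone, so the fraction acts as the speed and quality knob the abstract refers to.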

accelerating sampling, pre-trained diffusion model, t-stitch, (14 more...)

arXiv.org Artificial Intelligence

2402.14167

Country:

North America > United States > New York (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Antarctica (0.04)

Genre:

Research Report (0.64)
Workflow (0.48)

Industry:

Media > Photography (0.46)
Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Survey on Fairness for Machine Learning on Graphs

Laclau, Charlotte, Largeron, Christine, Choudhary, Manvi

arXiv.org Artificial Intelligence, Feb-21-2024

Nowadays, the analysis of complex phenomena modeled by graphs plays a crucial role in many real-world application domains where decisions can have a strong societal impact. However, numerous studies have recently revealed that machine learning models can lead to disparate treatment between individuals and to unfair outcomes. In that context, algorithmic contributions for graph mining are not spared by the problem of fairness and present some specific challenges related to the intrinsic nature of graphs: (1) graph data is non-IID, which may invalidate many existing studies in fair machine learning; (2) suitable metric definitions are needed to assess the different types of fairness with relational data; and (3) it is algorithmically difficult to find a good trade-off between model accuracy and fairness. This survey is the first one dedicated to fairness for relational data. It aims to present a comprehensive review of state-of-the-art techniques in fairness on graph mining and identify the open challenges and future trends. In particular, we start by presenting several sensitive application domains and the associated graph mining tasks, with a focus on edge prediction and node classification. We also recall the different metrics proposed to evaluate potential bias at different levels of the graph mining process; then we provide a comprehensive overview of recent contributions in the domain of fair machine learning for graphs, which we classify into pre-processing, in-processing and post-processing models. We also describe existing graph data, including synthetic and real-world benchmarks. Finally, we present in detail five promising directions to advance research on algorithmic fairness on graphs.

fairness, graph, node, (12 more...)

arXiv.org Artificial Intelligence

2205.05396

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(7 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.48)

Industry:

Information Technology > Security & Privacy (0.93)
Government (0.92)
Law > Statutes (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Quantitative causality, causality-guided scientific discovery, and causal machine learning

Liang, X. San, Chen, Dake, Zhang, Renhe

arXiv.org Artificial Intelligence, Feb-20-2024

It has been said, arguably, that causality analysis should pave a promising way to interpretable deep learning and generalization. Incorporation of causality into artificial intelligence (AI) algorithms, however, is challenged by its vagueness, non-quantitativeness, computational inefficiency, etc. During the past 18 years, these challenges have been essentially resolved, with the establishment of a rigorous formalism of causality analysis initially motivated from atmospheric predictability. This not only opens a new field in the atmosphere-ocean science, namely, information flow, but also has led to scientific discoveries in other disciplines, such as quantum mechanics, neuroscience, financial economics, etc., through various applications. This note provides a brief review of the decade-long effort, including a list of major theoretical results, a sketch of the causal deep learning framework, and some representative real-world applications in geoscience pertaining to this journal, such as those on the anthropogenic cause of global warming, the decadal prediction of El Niño Modoki, the forecasting of an extreme drought in China, among others.

Keywords: Causality, Liang-Kleeman information flow, Causal artificial intelligence, Fuzzy cognitive map, Interpretability, Frobenius-Perron operator, Weather/Climate forecasting

1. Introduction

Causality analysis is a fundamental problem in scientific research, as commented by Einstein in 1953 in response to a question on the status quo of science in China at that time (cf. the historical record in Hu, 2005). The recent rush in artificial intelligence (AI) has stimulated enormous interest in causal inference, partly due to the realization that it may take the field to the next level to approach human intelligence (see Pearl, 2018; Bengio, 2019; Schölkopf, 2022). In the fields pertaining to this journal, assessment of the cause-effect relations between dynamic events makes a natural objective for the corresponding research.

causality, information flow, liang, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.34133/olar.0026

2402.13427

Country:

Pacific Ocean > North Pacific Ocean > South China Sea (0.05)
Antarctica (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)

Creating a Fine Grained Entity Type Taxonomy Using LLMs

Gunn, Michael, Park, Dohyun, Kamath, Nidhish

arXiv.org Artificial Intelligence, Feb-19-2024

In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy. Our objective is to construct a comprehensive taxonomy, starting from a broad classification of entity types - including objects, time, locations, organizations, events, actions, and subjects - similar to existing manually curated taxonomies. This classification is then progressively refined through iterative prompting techniques, leveraging GPT-4's internal knowledge base. The result is an extensive taxonomy comprising over 5000 nuanced entity types, which demonstrates remarkable quality upon subjective evaluation. We employed a straightforward yet effective prompting strategy, enabling the taxonomy to be dynamically expanded. The practical applications of this detailed taxonomy are diverse and significant. It facilitates the creation of new, more intricate branches through pattern-based combinations and notably enhances information extraction tasks, such as relation extraction and event argument extraction. Our methodology not only introduces an innovative approach to taxonomy creation but also opens new avenues for applying such taxonomies in various computational linguistics and AI-related fields.

gpt-4, ontology, taxonomy, (14 more...)

arXiv.org Artificial Intelligence

2402.12557

Country:

Africa (0.14)
North America > United States > Illinois (0.05)
Antarctica (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Graph-based Virtual Sensing from Sparse and Partial Multivariate Observations

De Felice, Giovanni, Cini, Andrea, Zambon, Daniele, Gusev, Vladimir V., Alippi, Cesare

arXiv.org Artificial Intelligence, Feb-19-2024

Virtual sensing techniques allow for inferring signals at new unmonitored locations by exploiting spatio-temporal measurements coming from physical sensors at different locations. However, as the sensor coverage becomes sparse due to costs or other constraints, physical proximity cannot be used to support interpolation. In this paper, we overcome this challenge by leveraging dependencies between the target variable and a set of correlated variables (covariates) that can frequently be associated with each location of interest. From this viewpoint, covariates provide partial observability, and the problem consists of inferring values for unobserved channels by exploiting observations at other locations to learn how such variables can correlate. We introduce a novel graph-based methodology to exploit such relationships and design a graph deep learning architecture, named GgNet, implementing the framework. The proposed approach relies on propagating information over a nested graph structure that is used to learn dependencies between variables as well as locations. GgNet is extensively evaluated under different virtual sensing scenarios, demonstrating higher reconstruction accuracy compared to the state-of-the-art.

conference paper, dataset, dependency, (17 more...)

arXiv.org Artificial Intelligence

2402.12598

Country:

North America > United States (0.67)
Indian Ocean (0.04)
Antarctica > French Southern Territories > Kerguelen > Port-aux-Francais (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Solar (0.69)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Knowledge Editing on Black-box Large Language Models

Song, Xiaoshuai, Wang, Zhengyang, He, Keqing, Dong, Guanting, Mou, Yutao, Zhao, Jinxu, Xu, Weiran

arXiv.org Artificial Intelligence, Feb-17-2024

Knowledge editing (KE) aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge without negatively influencing other knowledge. Current research primarily focuses on white-box LLM editing, overlooking an important scenario: black-box LLM editing, where LLMs are accessed through interfaces and only textual output is available. In this paper, we first officially introduce KE on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations, which are not applicable to black-box LLM editing and lack comprehensiveness. To tackle privacy leaks of editing data and style over-editing in current methods, we introduce a novel postEdit framework, resolving privacy concerns through downstream post-processing and maintaining textual style consistency via fine-grained editing of original responses. Experiments and analysis on two benchmarks demonstrate that postEdit outperforms all baselines and achieves strong generalization, especially with large improvements in style retention (+20.82% on average).

editing, knowledge, llm, (14 more...)

arXiv.org Artificial Intelligence

2402.08631

Country:

North America > United States (0.94)
Asia > Singapore (0.04)
Asia > China > Beijing > Beijing (0.04)
(9 more...)

Genre: Research Report (0.82)

Industry:

Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Energy-aware Multi-UAV Coverage Mission Planning with Optimal Speed of Flight

Datsko, Denys, Nekovar, Frantisek, Penicka, Robert, Saska, Martin

arXiv.org Artificial Intelligence, Feb-16-2024

This paper tackles the problem of planning minimum-energy coverage paths for multiple UAVs. The addressed Multi-UAV Coverage Path Planning (mCPP) is a crucial problem for many UAV applications such as inspection and aerial survey. However, the typical path-length objective of existing approaches does not directly minimize the energy consumption, nor does it allow the energy of individual paths to be constrained by the battery capacity. To this end, we propose a novel mCPP method that uses the optimal flight speed for minimizing energy consumption per traveled distance and a simple yet precise energy consumption estimation algorithm that is utilized during the mCPP planning phase. The method decomposes a given area with boustrophedon decomposition and represents the mCPP as an instance of the Multiple Set Traveling Salesman Problem with a minimum energy objective and an energy consumption constraint. The proposed method is shown to outperform state-of-the-art methods in terms of computational time and energy efficiency of produced paths. The experimental results show that the accuracy of the energy consumption estimation is on average 97% compared to real flight consumption. The feasibility of the proposed method was verified in a real-world coverage experiment with two UAVs.
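
The phrase "optimal flight speed for minimizing energy consumption per traveled distance" can be unpacked with a short worked condition. The abstract does not give the power model, so the symbols here are assumptions for illustration: $P(v)$ is a generic, differentiable power draw at steady speed $v > 0$, and $e(v)$ is the resulting energy per unit distance.

```latex
e(v) = \frac{P(v)}{v}, \qquad
\frac{\mathrm{d}e}{\mathrm{d}v} = \frac{P'(v)\,v - P(v)}{v^{2}} = 0
\;\Longrightarrow\;
P'(v^{*}) = \frac{P(v^{*})}{v^{*}}
```

Under these assumptions, the stationary speed $v^{*}$ is the one at which the marginal power equals the average power per unit speed. Whether this stationary point is a minimum depends on the shape of $P$, which the abstract does not specify.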

algorithm, consumption, energy consumption, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2024.3358581

2402.10529

Country:

Antarctica (0.04)
Europe > Czechia > Prague (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Energy > Energy Storage (0.48)
Government > Military (0.41)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Physics-informed machine learning as a kernel method

Doumèche, Nathan, Bach, Francis, Boyer, Claire, Biau, Gérard

arXiv.org Artificial Intelligence, Feb-12-2024

Physics-informed machine learning combines the expressiveness of data-based approaches with the interpretability of physical models. In this context, we consider a general regression problem where the empirical risk is regularized by a partial differential equation that quantifies the physical inconsistency. We prove that for linear differential priors, the problem can be formulated as a kernel regression task. Taking advantage of kernel theory, we derive convergence rates for the minimizer of the regularized risk and show that it converges at least at the Sobolev minimax rate. However, faster rates can be achieved, depending on the physical error. This principle is illustrated with a one-dimensional example, supporting the claim that regularizing the empirical risk with physical information can be beneficial to the statistical performance of estimators.

eigenvalue, informed machine learning, operator, (12 more...)

arXiv.org Artificial Intelligence

2402.07514

Country:

North America > United States > New York (0.04)
Europe > France (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Antarctica (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
