AITopics | usr

usr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LUNE: Efficient LLM Unlearning via LoRA Fine-Tuning with Negative Examples

Liu, Yezi, Chen, Hanning, Huang, Wenjun, Ni, Yang, Imani, Mohsen

arXiv.org Artificial Intelligence, Dec-9-2025

Large language models (LLMs) possess vast knowledge acquired from extensive training corpora, but they often cannot remove specific pieces of information when needed, which makes it hard to handle privacy, bias mitigation, and knowledge correction. Traditional model unlearning approaches require computationally expensive fine-tuning or direct weight editing, making them impractical for real-world deployment. In this work, we introduce LoRA-based Unlearning with Negative Examples (LUNE), a lightweight framework that performs negative-only unlearning by updating only low-rank adapters while freezing the backbone, thereby localizing edits and avoiding disruptive global changes. Leveraging Low-Rank Adaptation (LoRA), LUNE targets intermediate representations to suppress (or replace) requested knowledge with an order-of-magnitude lower compute and memory than full fine-tuning or direct weight editing. Extensive experiments on multiple factual unlearning tasks show that LUNE: (I) achieves effectiveness comparable to full fine-tuning and memory-editing methods, and (II) reduces computational cost by about an order of magnitude.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.07375

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
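
The LUNE abstract above describes updating only low-rank adapters while the backbone stays frozen. The following is a minimal sketch of that generic LoRA parameterization, not the authors' implementation: it does not include the negative-example unlearning objective, and the class name `LoRALinear`, the `alpha / rank` scaling, and the zero initialization of `B` are assumptions taken from common LoRA practice rather than from this abstract.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Linear layer y = (W + scale * B @ A) x with W frozen (hypothetical helper)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen backbone weight, shape d_out x d_in
        d_out, d_in = len(weight), len(weight[0])
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until it is trained.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def effective_weight(self):
        delta = matmul(self.B, self.A)  # low-rank update, shape d_out x d_in
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_params(self):
        # Only A and B would receive gradient updates; W is never modified.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


weight = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 0, 2, 0], [1, 1, 1, 1]]
layer = LoRALinear(weight, rank=1)
x = [1, 0, -1, 2]
base_output = layer.forward(x)  # identical to the frozen layer while B is zero

# Simulate a trained adapter by setting A and B by hand.
layer.A = [[1.0, 0.0, 0.0, 0.0]]
layer.B[0][0] = 1.0
edited_output = layer.forward(x)
```

In this toy setting the adapter holds 8 trainable values against 16 in the frozen weight; the gap widens quickly as the layer dimensions grow relative to the rank.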

Optimal Sequential Recommendations: Exploiting User and Item Structure

Karzand, Mina, Bresler, Guy

arXiv.org Machine Learning, Apr-28-2025

Given the importance of these recommendation algorithms, it makes sense to try to design optimal ones. A basic criterion for optimality, one that captures the first-order experience of users in a recommendation system, is to maximize the proportion of recommendations that are liked, similar to [11, 23]. The goal of this paper is to gain insight into the design of recommendation algorithms by finding a statistically optimal algorithm within the context of a natural model for recommendation systems. One of our findings is that the best way to obtain information about users and items in order to make good recommendations depends on the time horizon and its relation to various system parameters, including the number of users, the diversity of users, and the richness of the items; there are a number of operating regimes depending on these parameters. It goes without saying that the nature of any insight obtained is intertwined with the choice of model. We use the same model as [11], closely related to those studied in [10, 12]. The model is different from those in other papers on the topic; we now motivate its key features.

artificial intelligence, machine learning, recommendation, (16 more...)

arXiv.org Machine Learning

2504.19476

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Yolo County > Davis (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

STARS: Sensor-agnostic Transformer Architecture for Remote Sensing

King, Ethan, Rodriguez, Jaime, Llanes, Diego, Doster, Timothy, Emerson, Tegan, Koch, James

arXiv.org Artificial Intelligence, Nov-8-2024

We present a sensor-agnostic spectral transformer as the basis for spectral foundation models. To that end, we introduce a Universal Spectral Representation (USR) that leverages sensor meta-data, such as sensing kernel specifications and sensing wavelengths, to encode spectra obtained from any spectral instrument into a common representation, such that a single model can ingest data from any sensor. Furthermore, we develop a methodology for pre-training such models in a self-supervised manner using a novel random sensor-augmentation and reconstruction pipeline to learn spectral features independent of the sensing paradigm. We demonstrate that our architecture can learn sensor independent spectral features that generalize effectively to sensors not seen during training. This work sets the stage for training foundation models that can both leverage and be effective for the growing diversity of spectral data.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.05714

Country:

North America > United States > Washington > Benton County > Richland (0.04)
North America > United States > Texas > Harris County > Houston (0.04)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

SurroFlow: A Flow-Based Surrogate Model for Parameter Space Exploration and Uncertainty Quantification

Shen, Jingyi, Duan, Yuhan, Shen, Han-Wei

arXiv.org Artificial Intelligence, Jul-16-2024

Existing deep learning-based surrogate models facilitate efficient data generation, but fall short in uncertainty quantification, efficient parameter space exploration, and reverse prediction. In our work, we introduce SurroFlow, a novel normalizing flow-based surrogate model, to learn the invertible transformation between simulation parameters and simulation outputs. The model not only allows accurate predictions of simulation outcomes for a given simulation parameter but also supports uncertainty quantification in the data generation process. Additionally, it enables efficient simulation parameter recommendation and exploration. We integrate SurroFlow and a genetic algorithm as the backend of a visual interface to support effective user-guided ensemble simulation exploration and visualization. Our framework significantly reduces the computational costs while enhancing the reliability and exploration capabilities of scientific surrogate models.

scientist, simulation parameter, surroflow, (12 more...)

arXiv.org Artificial Intelligence

2407.12884

Country:

Pacific Ocean (0.04)
North America > United States > Ohio (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Atlantic Ocean > North Atlantic Ocean > Baltic Sea (0.04)

Genre: Research Report (0.50)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Domain Private Transformers for Multi-Domain Dialog Systems

Kabra, Anmol, Elenberg, Ethan R.

arXiv.org Artificial Intelligence, Dec-7-2023

Large, general-purpose language models have demonstrated impressive performance across many different conversational domains. While multi-domain language models achieve low overall perplexity, their outputs are not guaranteed to stay within the domain of a given input prompt. This paper proposes domain privacy as a novel way to quantify how likely a conditional language model is to leak across domains. We also develop policy functions based on token-level domain classification, and propose an efficient fine-tuning method to improve the trained model's domain privacy. Experiments on membership inference attacks show that our proposed method has comparable resiliency to methods adapted from recent literature on differentially private language models.

privacy, redaction schedule, usr, (14 more...)

arXiv.org Artificial Intelligence

2305.14208

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Virginia (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.68)
Transportation > Passenger (0.46)
Transportation > Air (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges

Chiyah-Garcia, Javier, Suglia, Alessandro, Eshghi, Arash, Hastie, Helen

arXiv.org Artificial Intelligence, Jul-28-2023

Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee. Addressees usually detect such ambiguities immediately and work with the speaker to repair them using meta-communicative Clarificational Exchanges (CE): a Clarification Request (CR) and a response. Here, we argue that the ability to generate and respond to CRs imposes specific constraints on the architecture and objective functions of multi-modal, visually grounded dialogue models. We use the SIMMC 2.0 dataset to evaluate the ability of different state-of-the-art model architectures to process CEs, with a metric that probes the contextual updates that arise from them in the model. We find that language-based models are able to encode simple multi-modal semantic information and process some CEs, excelling with those related to the dialogue history, whilst multi-modal models can use additional learning objectives to obtain disentangled object representations, which become crucial to handle complex referential ambiguities across modalities overall.

machine learning, natural language, object-oriented architecture, (19 more...)

arXiv.org Artificial Intelligence

2307.15554

Country:

North America > Dominican Republic (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(9 more...)

Genre:

Personal > Interview (0.83)
Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.49)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)

Revisiting the DARPA Communicator Data using Conversation Analysis

Wallis, Peter

arXiv.org Artificial Intelligence, Jul-13-2023

The state of the art in human-computer conversation leaves something to be desired and, indeed, talking to a computer can be downright annoying. This paper describes an approach to identifying "opportunities for improvement" in these systems by looking for abuse in the form of swear words. The premise is that humans swear at computers as a sanction and, as such, swear words represent a point of failure where the system did not behave as it should. Having identified where things went wrong, we can work backward through the transcripts and, using conversation analysis (CA), work out how things went wrong. Conversation analysis is a qualitative methodology and can appear quite alien - indeed unscientific - to those of us from a quantitative background. The paper starts with a description of conversation analysis in its modern form, and then goes on to apply the methodology to transcripts of frustrated and annoyed users in the DARPA Communicator project. The conclusion is that there is at least one species of failure caused by the inability of the Communicator systems to handle mixed initiative at the discourse structure level. Along the way, I hope to demonstrate that there is an alternative future for computational linguistics that does not rely on larger and larger text corpora.

artificial intelligence, chatbot, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1075/is.9.3.05wal

2307.06982

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)
North America > Canada > Ontario > Toronto (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
(2 more...)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration

Shi, Weiyan, Li, Yu, Sahay, Saurav, Yu, Zhou

arXiv.org Artificial Intelligence, Dec-30-2020

Despite the recent success of large-scale language models on various downstream NLP tasks, the repetition and inconsistency problems still persist in dialogue response generation. Previous approaches have attempted to avoid repetition by penalizing the language model's undesirable behaviors in the loss function. However, these methods focus on token-level information and can lead to incoherent responses and uninterpretable behaviors. To alleviate these issues, we propose to apply reinforcement learning to refine an MLE-based language model without user simulators, and distill sentence-level information about repetition, inconsistency and task relevance through rewards. In addition, to better accomplish the dialogue task, the model learns from human demonstration to imitate intellectual activities such as persuasion, and selects the most persuasive responses. Experiments show that our model outperforms previous state-of-the-art dialogue models on both automatic metrics and human evaluation results on a donation persuasion task, and generates more diverse, consistent and persuasive conversations according to the user feedback.

dialogue, donation, language model, (15 more...)

arXiv.org Artificial Intelligence

2012.15375

Country:

North America > Puerto Rico (0.04)
North America > United States > California > Yolo County > Davis (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.46)

Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming

Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz

arXiv.org Artificial Intelligence, Nov-10-2020

Motivated by the needs of resource constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99% accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90% inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.

arxiv preprint arxiv, dilog, food pref, (13 more...)

arXiv.org Artificial Intelligence

2011.05457

Genre: Research Report (0.50)

Industry: Consumer Products & Services > Restaurants (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System

Wang, Jianhong, Zhang, Yuan, Kim, Tae-Kyun, Gu, Yunjie

arXiv.org Artificial Intelligence, Jul-23-2020

Designing task-oriented dialogue systems is a challenging research topic, since such systems need not only to generate utterances that fulfill user requests but also to guarantee their comprehensibility. Many previous works trained end-to-end (E2E) models with supervised learning (SL); however, the bias in annotated system utterances remains a bottleneck. Reinforcement learning (RL) deals with the problem by using non-differentiable evaluation metrics (e.g., the success rate) as rewards. Nonetheless, existing works with RL showed that the comprehensibility of generated system utterances could be corrupted when improving the performance on fulfilling user requests. In our work, we (1) propose modelling the hierarchical structure between dialogue policy and natural language generator (NLG) with the option framework, called HDNO; (2) train HDNO with hierarchical reinforcement learning (HRL), as well as suggest alternating updates between dialogue policy and NLG during HRL inspired by fictitious play, to preserve the comprehensibility of generated system utterances while improving the fulfillment of user requests; and (3) propose using a discriminator modelled with language models as an additional reward to further improve the comprehensibility. We test HDNO on MultiWoz 2.0 and MultiWoz 2.1, the datasets on multi-domain dialogues, in comparison with a word-level E2E model trained with RL, LaRL and HDSA, showing a significant improvement on the total performance evaluated with automatic metrics.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2006.06814

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
(2 more...)
