AITopics | ree

Collaborating Authors

ree

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sparse Activation Editing for Reliable Instruction Following in Narratives

Zhao, Runcong, Cao, Chengyu, Zhu, Qinglin, Lv, Xiucheng, Shao, Shun, Gui, Lin, Xu, Ruifeng, He, Yulan

arXiv.org Artificial IntelligenceMay-23-2025

Complex narrative contexts often challenge language models' ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly evaluate our method, we introduce FreeInstruct, a diverse and realistic benchmark of 1,212 examples that highlights the challenges of instruction following in narrative-rich settings. While initially motivated by complex narratives, Concise-SAE demonstrates state-of-the-art instruction adherence across varied tasks without compromising generation quality.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.16505

Country:

Europe > Austria > Vienna (0.14)
Asia > Thailand > Bangkok > Bangkok (0.04)
North America > Canada > Ontario > Toronto (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry: Education > Curriculum > Subject-Specific Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Provably Overwhelming Transformer Models with Designed Inputs

Stambler, Lev, Nezhadi, Seyed Sajjad, Coudron, Matthew

arXiv.org Artificial IntelligenceFeb-9-2025

We develop an algorithm which, given a trained transformer model $\mathcal{M}$ as input, as well as a string of tokens $s$ of length $n_{fix}$ and an integer $n_{free}$, can generate a mathematical proof that $\mathcal{M}$ is ``overwhelmed'' by $s$, in time and space $\widetilde{O}(n_{fix}^2 + n_{free}^3)$. We say that $\mathcal{M}$ is ``overwhelmed'' by $s$ when the output of the model evaluated on this string plus any additional string $t$, $\mathcal{M}(s + t)$, is completely insensitive to the value of the string $t$ whenever length($t$) $\leq n_{free}$. Along the way, we prove a particularly strong worst-case form of ``over-squashing'', which we use to bound the model's behavior. Our technique uses computer-aided proofs to establish this type of operationally relevant guarantee about transformer models. We empirically test our algorithm on a single layer transformer complete with an attention head, layer-norm, MLP/ReLU layers, and RoPE positional encoding. We believe that this work is a stepping stone towards the difficult task of obtaining useful guarantees for trained transformer models.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.06038

Country:

North America > United States > Maryland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings

Deiseroth, Björn, Brack, Manuel, Schramowski, Patrick, Kersting, Kristian, Weinbach, Samuel

arXiv.org Artificial IntelligenceJun-27-2024

Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses. Major limitations include computational overhead, ineffective vocabulary use, and unnecessarily large embedding and head layers. Additionally, their performance is biased towards a reference corpus, leading to reduced effectiveness for underrepresented languages. To remedy these issues, we propose T-FREE, which directly embeds words through sparse activation patterns over character triplets, and does not require a reference corpus. T-FREE inherently exploits morphological similarities and allows for strong compression of embedding layers. In our exhaustive experimental evaluation, we achieve competitive downstream performance with a parameter reduction of more than 85% on these layers. Further, T-FREE shows significant improvements in cross-lingual transfer learning.

ree, tokenizer, whitespace, (16 more...)

arXiv.org Artificial Intelligence

2406.19223

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Yang, Ruihan, Chen, Jiangjie, Zhang, Yikai, Yuan, Siyu, Chen, Aili, Richardson, Kyle, Xiao, Yanghua, Yang, Deqing

arXiv.org Artificial IntelligenceJun-7-2024

Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming. However, these agents often face challenges in achieving high-level goals without detailed instructions and in adapting to environments where feedback is delayed. In this paper, we present SelfGoal, a novel automatic approach designed to enhance agents' capabilities to achieve high-level goals with limited human prior and environmental feedback. The core concept of SelfGoal involves adaptively breaking down a high-level goal into a tree structure of more practical subgoals during the interaction with environments while identifying the most useful subgoals and progressively updating this structure. Experimental results demonstrate that SelfGoal significantly enhances the performance of language agents across various tasks, including competitive, cooperative, and deferred feedback environments. Project page: https://selfgoal-agent.github.io.

agent, high-level goal, subgoal, (16 more...)

arXiv.org Artificial Intelligence

2406.04784

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

TBNet: A Neural Architectural Defense Framework Facilitating DNN Model Protection in Trusted Execution Environments

Liu, Ziyu, Zhou, Tong, Luo, Yukui, Xu, Xiaolin

arXiv.org Artificial IntelligenceMay-6-2024

Trusted Execution Environments (TEEs) have become a promising solution to secure DNN models on edge devices. However, the existing solutions either provide inadequate protection or introduce large performance overhead. Taking both security and performance into consideration, this paper presents TBNet, a TEE-based defense framework that protects DNN model from a neural architectural perspective. Specifically, TBNet generates a novel Two-Branch substitution model, to respectively exploit (1) the computational resources in the untrusted Rich Execution Environment (REE) for latency reduction and (2) the physically-isolated TEE for model protection. Experimental results on a Raspberry Pi across diverse DNN model architectures and datasets demonstrate that TBNet achieves efficient model protection at a low cost.

tbnet, tee, victim model, (13 more...)

arXiv.org Artificial Intelligence

2405.03974

Country:

North America > United States > California > San Francisco County > San Francisco (0.16)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Bristol County > Dartmouth (0.04)
Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Hardware (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices

Xie, Xueshuo, Wang, Haoxu, Jian, Zhaolong, Li, Tao, Wang, Wei, Xu, Zhiwei, Wang, Guiling

arXiv.org Artificial IntelligenceMar-19-2024

Edge intelligence enables resource-demanding Deep Neural Network (DNN) inference without transferring original data, addressing concerns about data privacy in consumer Internet of Things (IoT) devices. For privacy-sensitive applications, deploying models in hardware-isolated trusted execution environments (TEEs) becomes essential. However, the limited secure memory in TEEs poses challenges for deploying DNN inference, and alternative techniques like model partitioning and offloading introduce performance degradation and security issues. In this paper, we present a novel approach for advanced model deployment in TrustZone that ensures comprehensive privacy preservation during model inference. We design a memory-efficient management method to support memory-demanding inference in TEEs. By adjusting the memory priority, we effectively mitigate memory leakage risks and memory overlap conflicts, resulting in 32 lines of code alterations in the trusted operating system. Additionally, we leverage two tiny libraries: S-Tinylib (2,538 LoCs), a tiny deep learning library, and Tinylibm (827 LoCs), a tiny math library, to support efficient inference in TEEs. We implemented a prototype on Raspberry Pi 3B+ and evaluated it using three well-known lightweight DNN models. The experimental results demonstrate that our design significantly improves inference speed by 3.13 times and reduces power consumption by over 66.5% compared to non-memory optimization method in TEEs.

inference, memory size, tee, (15 more...)

arXiv.org Artificial Intelligence

2403.12568

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New Jersey (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Minimizing Uncertainty in Pipelines

Neural Information Processing SystemsMar-14-2024, 13:48:32 GMT

In this paper, we consider the problem of debugging large pipelines by human labeling.

algorithm, node, probability, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy

Ippolito, Daphne, Tramèr, Florian, Nasr, Milad, Zhang, Chiyuan, Jagielski, Matthew, Lee, Katherine, Choquette-Choo, Christopher A., Carlini, Nicholas

arXiv.org Artificial IntelligenceSep-11-2023

Studying data memorization in neural language models helps us understand the risks (e.g., to privacy or copyright) associated with models regurgitating training data and aids in the development of countermeasures. Many prior works -- and some recently deployed defenses -- focus on "verbatim memorization", defined as a model generation that exactly matches a substring from the training set. We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization. Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization. And yet, we demonstrate that this "perfect" filter does not prevent the leakage of training data. Indeed, it is easily circumvented by plausible and minimally modified "style-transfer" prompts -- and in some cases even the non-modified original prompts -- to extract memorized information. We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2210.17546

Country:

North America > United States > New York (0.04)
North America > United States > California > Sacramento County > Sacramento (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Sports (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

Add feedback

Parameter-free projected gradient descent

Chzhen, Evgenii, Giraud, Christophe, Stoltz, Gilles

arXiv.org Machine LearningMay-31-2023

We consider the problem of minimizing a convex function over a closed convex set, with Projected Gradient Descent (PGD). We propose a fully parameter-free version of AdaGrad, which is adaptive to the distance between the initialization and the optimum, and to the sum of the square norm of the subgradients. Our algorithm is able to handle projection steps, does not involve restarts, reweighing along the trajectory or additional gradient evaluations compared to the classical PGD. It also fulfills optimal rates of convergence for cumulative regret up to logarithmic factors. We provide an extension of our approach to stochastic optimization and conduct numerical experiments supporting the developed theory.

artificial intelligence, log 2, machine learning, (18 more...)

arXiv.org Machine Learning

2305.19605

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Add feedback

DiversiTree: A New Method to Efficiently Compute Diverse Sets of Near-Optimal Solutions to Mixed-Integer Optimization Problems

Ahanor, Izuwa, Medal, Hugh, Trapp, Andrew C.

arXiv.org Artificial IntelligenceFeb-8-2023

While most methods for solving mixed-integer optimization problems compute a single optimal solution, a diverse set of near-optimal solutions can often lead to improved outcomes. We present a new method for finding a set of diverse solutions by emphasizing diversity within the search for near-optimal solutions. Specifically, within a branch-and-bound framework, we investigated parameterized node selection rules that explicitly consider diversity. Our results indicate that our approach significantly increases the diversity of the final solution set. When compared with two existing methods, our method runs with similar runtime as regular node selection methods and gives a diversity improvement between 12% and 190%. In contrast, popular node selection rules, such as best-first search, in some instances performed worse than state-of-the-art methods by more than 35% and gave an improvement of no more than 130%. Further, we find that our method is most effective when diversity in node selection is continuously emphasized after reaching a minimal depth in the tree and when the solution set has grown sufficiently large. Our method can be easily incorporated into integer programming solvers and has the potential to significantly increase the diversity of solution sets.

artificial intelligence, diversity, optimization problem, (14 more...)

arXiv.org Artificial Intelligence

2204.03822

Country:

North America > United States > Tennessee > Knox County > Knoxville (0.04)
Europe > France (0.04)
Europe > Belgium > Wallonia > Walloon Brabant > Louvain-la-Neuve (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation (0.93)
Health & Medicine (0.67)
Government > Military (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback