agile
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
We introduce a novel reinforcement learning framework of LLM agents named AGILE (AGent that Interacts and Learns from Environments) designed to perform complex conversational tasks with users, leveraging LLMs, memory, tools, and interactions with experts. The agent possesses capabilities beyond conversation, including reflection, tool usage, and expert consultation. We formulate the construction of such an LLM agent as a reinforcement learning (RL) problem, in which the LLM serves as the policy model. We fine-tune the LLM using labeled data of actions and the PPO algorithm. We focus on question answering and release a dataset for agents called ProductQA, comprising challenging questions in online shopping. Our extensive experiments on ProductQA, MedMCQA and HotPotQA show that AGILE agents based on 7B and 13B LLMs trained with PPO can outperform GPT-4 agents. Our ablation study highlights the indispensability of memory, tools, consultation, reflection, and reinforcement learning in achieving the agent's strong performance.
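The abstract names PPO as the RL algorithm, with the LLM acting as the policy. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective over a batch of sampled actions. It is not the AGILE training code: the function name, the per-action log-probability inputs, and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    new_logps / old_logps: log-probabilities of the sampled actions under the
    current policy and the policy that collected the data (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the data-collecting policy in a single update, which is the usual motivation for using PPO when fine-tuning a large model.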
Activation-Guided Local Editing for Jailbreaking Attacks
Wang, Jiecong, Li, Haoran, Peng, Hao, Zeng, Ziqian, Wang, Zihao, Du, Haohua, Yu, Zhengtao
Jailbreaking is an essential adversarial technique for red-teaming large language models (LLMs) to uncover and patch security flaws. However, existing jailbreak methods face significant drawbacks. Token-level jailbreak attacks often produce incoherent or unreadable inputs and exhibit poor transferability, while prompt-level attacks lack scalability and rely heavily on manual effort and human ingenuity. We propose AGILE (Activation-Guided Local Editing), a concise and effective two-stage framework that combines the advantages of these approaches. The first stage performs a scenario-based generation of context and rephrases the original malicious query to obscure its harmful intent. The second stage then utilizes information from the model's hidden states to guide fine-grained edits, effectively steering the model's internal representation of the input from a malicious toward a benign one. Extensive experiments demonstrate that this method achieves a state-of-the-art Attack Success Rate, with gains of up to 37.74% over the strongest baseline, and exhibits excellent transferability to black-box models. Our analysis further demonstrates that AGILE maintains substantial effectiveness against prominent defense mechanisms, highlighting the limitations of current safeguards and providing valuable insights for future defense development. Our code is available at https://github.com/yunsaijc/AGILE.
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction
Mehradfar, Asal, Sepehri, Mohammad Shahab, Hernandez-Lobato, Jose Miguel, Kwon, Glen S., Soltanolkotabi, Mahdi, Avestimehr, Salman, Rasoulianboroujeni, Morteza
The discovery of new ionizable lipids for efficient lipid nanoparticle (LNP)-mediated RNA delivery remains a critical bottleneck for RNA-based therapeutics development. Recent advances have highlighted the potential of machine learning (ML) to predict transfection efficiency from molecular structure, enabling high-throughput virtual screening and accelerating lead identification. However, existing approaches are hindered by inadequate data quality, ineffective feature representations, low predictive accuracy, and poor generalizability. Here, we present LANTERN (Lipid nANoparticle Transfection Efficiency pRedictioN), a robust ML framework for predicting transfection efficiency based on ionizable lipid representation. We benchmarked a diverse set of ML models against AGILE, a previously published model developed for transfection prediction. Our results show that combining simpler models with chemically informative features, particularly count-based Morgan fingerprints, outperforms more complex models that rely on internally learned embeddings, such as AGILE. We also show that a multi-layer perceptron trained on a combination of Morgan fingerprints and Expert descriptors achieved the highest performance ($\text{R}^2$ = 0.8161, r = 0.9053), significantly exceeding AGILE ($\text{R}^2$ = 0.2655, r = 0.5488). We show that the models in LANTERN consistently have strong performance across multiple evaluation metrics. Thus, LANTERN offers a robust benchmarking framework for LNP transfection prediction and serves as a valuable tool for accelerating lipid-based RNA delivery systems design.
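The abstract reports both R² and Pearson's r for each model. The sketch below computes the two metrics from scratch, assuming the usual definitions (coefficient of determination against the mean baseline, and the sample correlation coefficient). How LANTERN computes them is not stated here, and the function names and toy numbers are purely illustrative.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Sample Pearson correlation between targets and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Toy data (not from the paper): predictions are perfectly correlated with
# the targets but off by a factor of two, so r is 1 while R^2 is negative.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [2.0, 4.0, 6.0, 8.0]
toy_r = pearson_r(targets, predictions)
toy_r2 = r_squared(targets, predictions)
```

The toy case shows why the two numbers are worth reporting together: r only measures how well predictions track the ordering and linear trend of the targets, while R² also penalizes errors in scale and offset.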
Your Learned Constraint is Secretly a Backward Reachable Tube
Qadri, Mohamad, Swamy, Gokul, Francis, Jonathan, Kaess, Michael, Bajcsy, Andrea
Inverse Constraint Learning (ICL) is the problem of inferring constraints from safe (i.e., constraint-satisfying) demonstrations. The hope is that these inferred constraints can then be used downstream to search for safe policies for new tasks and, potentially, under different dynamics. Our paper explores the question of what mathematical entity ICL recovers. Somewhat surprisingly, we show that both in theory and in practice, ICL recovers the set of states where failure is inevitable, rather than the set of states where failure has already happened. In the language of safe control, this means we recover a backward reachable tube (BRT) rather than a failure set. In contrast to the failure set, the BRT depends on the dynamics of the data collection system. We discuss the implications of the dynamics-conditionedness of the recovered constraint on both the sample-efficiency of policy search and the transferability of learned constraints.
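To make the distinction between a failure set and a BRT concrete, here is a minimal sketch on a toy discrete system. Everything in it is an assumption for illustration and is not taken from the paper: a cart at integer position `x` with speed `v` drives toward a wall at `WALL`, and each step it may change its speed by one of the allowed control values. The BRT is computed as the fixed point of "every available control leads into the unsafe set".

```python
from itertools import product

WALL = 10       # hypothetical: position WALL means the cart has hit the wall
MAX_SPEED = 3   # hypothetical speed limit


def step(state, control):
    """Toy dynamics: adjust speed by `control`, then move forward."""
    x, v = state
    v_next = max(0, min(MAX_SPEED, v + control))
    x_next = min(WALL, x + v_next)  # positions past the wall collapse onto it
    return (x_next, v_next)


def backward_reachable_tube(states, failure_set, controls):
    """States from which failure is inevitable under every control choice."""
    tube = set(failure_set)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for s in states:
            if s not in tube and all(step(s, u) in tube for u in controls):
                tube.add(s)
                changed = True
    return tube


states = list(product(range(WALL + 1), range(MAX_SPEED + 1)))
failure_set = {s for s in states if s[0] == WALL}

# Weak brakes: speed can drop by at most 1 per step.
brt_weak = backward_reachable_tube(states, failure_set, controls=(-1, 0, 1))
# Strong brakes: the cart can stop within a single step.
brt_strong = backward_reachable_tube(states, failure_set, controls=(-3, 0, 1))

doomed_but_not_failed = sorted(brt_weak - failure_set)
# -> [(7, 3), (8, 3), (9, 2), (9, 3)]
```

With weak brakes, four states that have not yet collided are already in the BRT because the cart cannot stop in time. With strong brakes the BRT collapses to the failure set itself, which illustrates the abstract's point that the BRT, unlike the failure set, depends on the dynamics.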
Bridging Adaptivity and Safety: Learning Agile Collision-Free Locomotion Across Varied Physics
Zhong, Yichao, Zhang, Chong, He, Tairan, Shi, Guanya
Real-world legged locomotion systems often need to reconcile agility and safety for different scenarios. Moreover, the underlying dynamics are often unknown and time-variant (e.g., payload, friction). In this paper, we introduce BAS (Bridging Adaptivity and Safety), which builds upon the pipeline of prior work Agile But Safe (ABS) (He et al., 2024b) and is designed to provide adaptive safety even in dynamic environments with uncertainties. BAS involves an agile policy to avoid obstacles rapidly and a recovery policy to prevent collisions, a physical parameter estimator that is trained concurrently with the agile policy, and a learned control-theoretic RA (reach-avoid) value network that governs the policy switch. The agile policy and the RA network are both conditioned on physical parameters to make them adaptive. To mitigate the distribution shift issue, we further introduce an on-policy fine-tuning phase for the estimator to enhance its robustness and accuracy. The simulation results show that BAS achieves 50% better safety than baselines in dynamic environments while maintaining a higher speed on average. In real-world experiments, BAS shows its capability in complex environments with unknown physics (e.g., slippery floors with unknown friction, unknown payloads up to 8 kg), while baselines lack adaptivity, leading to collisions or degraded agility. As a result, BAS achieves a 19.8% increase in speed and a collision rate 2.36 times lower than that of ABS in the real world.
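The policy switch described above can be pictured with a small sketch. This is not the BAS implementation: the function signature, the threshold, and in particular the sign convention (a larger RA value is treated here as less safe) are assumptions chosen only to show the control flow of a value-gated switch between two policies.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def select_action(
    obs: Vector,
    physical_params: Vector,
    agile_policy: Callable[[Vector, Vector], Vector],
    recovery_policy: Callable[[Vector], Vector],
    ra_value: Callable[[Vector, Vector], float],
    threshold: float = 0.0,
) -> Vector:
    """Pick the recovery policy when the RA value signals danger.

    Assumed convention (not from the source): ra_value > threshold means the
    agile policy can no longer be trusted to avoid a collision.
    """
    if ra_value(obs, physical_params) > threshold:
        return recovery_policy(obs)
    # Both the agile policy and the RA value see the estimated physical
    # parameters, mirroring the conditioning described in the abstract.
    return agile_policy(obs, physical_params)
```

In a full system `physical_params` would come from the learned estimator at each control step, so the switch adapts as payload or friction estimates change.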
Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning
Bhat, Prashant, Renjith, Bharath, Arani, Elahe, Zonooz, Bahram
Continual learning (CL) remains a significant challenge for deep neural networks, as they are prone to forgetting previously acquired knowledge. Several approaches have been proposed in the literature, such as experience rehearsal, regularization, and parameter isolation, to address this problem. Although almost zero forgetting can be achieved in task-incremental learning, class-incremental learning remains highly challenging due to the problem of inter-task class separation. Limited access to previous task data makes it difficult to discriminate between classes of current and previous tasks. To address this issue, we propose 'Attention-Guided Incremental Learning' (AGILE), a novel rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks. AGILE utilizes lightweight, learnable task projection vectors to transform the latent representations of a shared task attention module toward task distribution. Through extensive empirical evaluation, we show that AGILE significantly improves generalization performance by mitigating task interference and outperforming rehearsal-based approaches in several CL scenarios. Furthermore, AGILE can scale well to a large number of tasks with minimal overhead while remaining well-calibrated with reduced task-recency bias.
In recent years, deep neural networks (DNNs) have been shown to perform better than humans on certain specific tasks, such as Atari games (Silver et al., 2018) and classification (He et al., 2015). Although impressive, these models are trained on static data and are unable to adapt their behavior to novel tasks while maintaining performance on previous tasks when the data evolve over time (Fedus et al., 2020). Continual learning (CL) refers to a training paradigm in which DNNs are exposed to a sequence of tasks and are expected to learn potentially incrementally or online (Parisi et al., 2019). CL has remained one of the most daunting tasks for DNNs, as acquiring new information significantly deteriorates the performance of previously learned tasks, a phenomenon termed "catastrophic forgetting" (French, 1999; McCloskey & Cohen, 1989).
I Was the First AI Minister in History
"A Minister of Artificial Intelligence who is the age of my son, appointed to regulate a hypothetical technology, proves to me that your government has too much time and resources on its hands." Those were the words of a senior government official during a bilateral meeting in 2017, soon after I was appointed as the world's first Minister for Artificial Intelligence. Upon hearing that remark, I distinctly recall feeling a pang of indignation at their equating youth with incompetence, but even more so at their clear disregard for and trivialization of AI. Six years into my role of leading the UAE's strategy to become the most prepared country for AI, the past year has been an exhilarating sprint of unprecedented AI advancements. It is now undeniable that AI is no longer a hypothetical technology, but one that warrants far more government time and resources across the globe.
2023: It's Time To Adopt A Strategy For Change - AI Magazine
We are clearly in a period of recession. But for all businesses, it's a good time to put a change strategy in place. Here's why 2023 needs to be the year to optimize and automate your IT… The pandemic has shown companies that they need to be more agile in order to react quickly to sometimes unexpected events. The looming economic recession is an example of an unexpected factor whose repercussions may well exceed those of the periods of confinement that we have experienced during the pandemic. Unfortunately, companies tend to suspend development during economic downturns, cancel contracts, delay projects, and generally "batten down the hatches" to weather the storm. At first sight, this approach, often motivated by financial reasons, seems logical.
Intelligent automation: The next step towards making technology more productive
The cardinal focus of every business is to boost productivity, increase revenue and streamline operations, three things that are imperative for survival in today's highly competitive landscape. Over the last few years, many companies have accelerated their digital transformation journey to meet these goals effectively. They have started leveraging robotic process automation (RPA), machine learning (ML), artificial intelligence (AI), big data analytics and other cutting-edge technologies. And while the pandemic has brought drastic changes in our lifestyle, it has also taught us to stay prepared for uncertain and unexpected events. Business leaders are also taking this lesson to heart and using futuristic technologies like automation to remain agile in an unpredictable future.