Maghsudi, Setareh
Quantum-Inspired Reinforcement Learning in the Presence of Epistemic Ambivalence
Habibi, Alireza, Ghoorchian, Saeed, Maghsudi, Setareh
The complexity of online decision-making under uncertainty stems from the requirement of finding a balance between exploiting known strategies and exploring new possibilities. Naturally, the type of uncertainty plays a crucial role in developing effective decision-making strategies. In this paper, we focus on a specific form of uncertainty known as epistemic ambivalence (EA), which emerges from conflicting pieces of evidence or contradictory experiences. EA creates a delicate interplay between uncertainty and confidence that distinguishes it from epistemic uncertainty, which typically diminishes with new information; indeed, ambivalence can persist even after additional knowledge is acquired. To address this phenomenon, we propose a novel framework, called the epistemically ambivalent Markov decision process (EA-MDP), that aims to understand and control EA in decision-making processes. The framework borrows the concept of a quantum state from the quantum-mechanics formalism and, at its core, assesses the probability and reward of every possible outcome. We calculate the reward function using quantum measurement techniques and prove the existence of an optimal policy and an optimal value function in the EA-MDP framework. We also propose the EA-epsilon-greedy Q-learning algorithm. To evaluate the impact of EA on decision-making and the effectiveness of our framework, we study two distinct experimental setups, namely the two-state problem and the lattice problem. Our results show that, using our methods, the agent converges to the optimal policy in the presence of EA.
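To make the Q-learning component concrete, below is a minimal sketch of an EA-epsilon-greedy loop in which the reward of an epistemically ambivalent state is the Born-rule expectation over outcome amplitudes, i.e. outcome i occurs with probability |alpha_i|^2. The amplitudes, outcome rewards, and toy transition dynamics are illustrative assumptions, not the paper's exact setup.

    import numpy as np

    rng = np.random.default_rng(0)

    def ea_reward(amplitudes, outcome_rewards):
        """Expected reward of an epistemically ambivalent state: outcome
        probabilities follow the Born rule, p_i = |alpha_i|^2."""
        probs = np.abs(amplitudes) ** 2
        probs /= probs.sum()                      # normalize the quantum state
        return float(np.dot(probs, outcome_rewards))

    # Illustrative two-state problem: 2 states, 2 actions, and a superposition
    # of two conflicting outcomes per (state, action) pair.
    n_states, n_actions = 2, 2
    amplitudes = (rng.normal(size=(n_states, n_actions, 2))
                  + 1j * rng.normal(size=(n_states, n_actions, 2)))
    outcome_rewards = rng.uniform(-1, 1, size=(n_states, n_actions, 2))

    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.1
    state = 0
    for _ in range(5000):
        # epsilon-greedy action selection
        action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
        reward = ea_reward(amplitudes[state, action], outcome_rewards[state, action])
        next_state = (state + action) % n_states  # toy deterministic transition
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

    print(np.round(Q, 3))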
From Code to Play: Benchmarking Program Search for Games Using Large Language Models
Eberhardinger, Manuel, Goodman, James, Dockhorn, Alexander, Perez-Liebana, Diego, Gaina, Raluca D., Çakmak, Duygu, Maghsudi, Setareh
Large language models (LLMs) have shown impressive capabilities in generating program code, opening exciting opportunities for applying program synthesis to games. In this work, we explore the potential of LLMs to directly synthesize usable code for a wide range of gaming applications, focusing on two programming languages, Python and Java. We use an evolutionary hill-climbing algorithm in which the mutations and the seeds of the initial programs are controlled by LLMs. For Python, the framework covers various game-related tasks, including five miniature versions of Atari games, ten levels of Baba is You, an environment inspired by Asteroids, and a maze generation task. For Java, the framework contains 12 games from the TAG tabletop games framework. Across 29 tasks, we evaluate 12 language models for Python and 8 for Java. Our findings suggest that the performance of LLMs depends more on the task than on model size. While larger models generate more executable programs, these do not always yield higher-quality solutions and are much more expensive. No single model has a clear overall advantage, although one model may be better on any specific task. Trying many models on a problem and using the best results across them is therefore more reliable than relying on just one.
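The search loop itself is simple; the heavy lifting is in the LLM-driven mutation. Below is a minimal sketch of a (1+1)-style hill climber under stated assumptions: a toy single-character mutator stands in for the language-model call, and non-executable candidates score negative infinity.

    import random

    random.seed(0)

    def hill_climb(seed_program, fitness, mutate, budget=2000):
        """(1+1) evolutionary hill climbing: keep a mutated program only if it
        does not decrease fitness; non-executable candidates score -inf."""
        best, best_score = seed_program, fitness(seed_program)
        for _ in range(budget):
            candidate = mutate(best)
            try:
                score = fitness(candidate)
            except Exception:                    # non-executable program
                score = float("-inf")
            if score >= best_score:
                best, best_score = candidate, score
        return best, best_score

    # Toy stand-in for the LLM mutator: edit one character at random.
    # In the benchmark, this call would instead prompt a language model.
    ALPHABET = "abcdefghijklmnopqrstuvwxyz'()"
    def toy_mutate(p):
        i = random.randrange(len(p))
        return p[:i] + random.choice(ALPHABET) + p[i + 1:]

    target = "print('hello')"
    fitness = lambda p: -sum(a != b for a, b in zip(p, target))

    print(hill_climb("a" * len(target), fitness, toy_mutate))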
Decentralized Task Offloading and Load-Balancing for Mobile Edge Computing in Dense Networks
Yahya, Mariam, Conzelmann, Alexander, Maghsudi, Setareh
We study the problem of decentralized task offloading and load-balancing in a dense network with numerous devices and a set of edge servers. Solving this problem optimally is complicated due to the unknown network information and random task sizes. The shared network resources also influence the users' decisions and resource distribution. Our solution combines the mean field multi-agent multi-armed bandit (MAB) game with a load-balancing technique that adjusts the servers' rewards to achieve a target population profile despite the distributed user decision-making. Numerical results demonstrate the efficacy of our approach and the convergence to the target load distribution.
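A minimal sketch of the interplay between decentralized bandit learning and reward-based load balancing: each user runs an independent UCB index over the servers, while the server reward shrinks with congestion relative to a target population profile. The congestion model and adjustment rule below are assumptions for illustration, not the paper's mechanism.

    import numpy as np

    rng = np.random.default_rng(1)
    n_users, n_servers, horizon = 200, 4, 2000
    target = np.array([0.4, 0.3, 0.2, 0.1])   # target population profile

    counts = np.ones((n_users, n_servers))    # one fictitious pull avoids /0
    means = np.zeros((n_users, n_servers))

    for t in range(1, horizon + 1):
        ucb = means + np.sqrt(2 * np.log(t) / counts)
        choices = ucb.argmax(axis=1)          # each user picks a server
        load = np.bincount(choices, minlength=n_servers) / n_users
        # Reward shrinks with congestion relative to the target profile, so
        # over- (under-)loaded servers become less (more) attractive.
        reward = 1.0 / (1.0 + load / target)
        for u, s in enumerate(choices):
            r = reward[s] + 0.1 * rng.standard_normal()
            counts[u, s] += 1
            means[u, s] += (r - means[u, s]) / counts[u, s]

    print("final load:", np.round(load, 3), "target:", target)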
Distributed Management of Fluctuating Energy Resources in Dynamic Networked Systems
Cheng, Xiaotong, Tsetis, Ioannis, Maghsudi, Setareh
Modern power systems integrate renewable distributed energy resources (DERs) as an environmentally friendly enhancement to meet ever-increasing demands. However, the inherent unreliability of renewable energy makes developing DER management algorithms imperative. We study the energy-sharing problem in a system consisting of several DERs. Each agent harvests and distributes renewable energy in its neighborhood to optimize the network's performance while minimizing energy waste. We model this problem as a bandit convex optimization problem with constraints that correspond to each node's limitations on energy production. We propose distributed decision-making policies to solve the formulated problem, using the notion of dynamic regret as the performance metric. We also include an adjustment strategy in our developed algorithm to reduce constraint violations. Besides, we design a policy that deals with non-stationary environments. Theoretical analysis shows the effectiveness of our proposed algorithm. Numerical experiments using a real-world dataset show the superior performance of our proposal compared to state-of-the-art methods.
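A generic sketch of the bandit-convex-optimization ingredient, assuming a one-point gradient estimator and projection onto box constraints that stand in for each node's energy limits; the time-varying cost, constraint set, and step sizes are all illustrative, not the proposed policy.

    import numpy as np

    rng = np.random.default_rng(2)
    d, horizon = 3, 3000
    delta, eta, cap = 0.2, 0.002, 1.0  # perturbation radius, step size, energy cap

    def loss(x, t):
        """Unknown, slowly drifting cost; only zeroth-order (bandit) feedback."""
        drift = 0.5 * np.sin(2 * np.pi * t / horizon)
        return np.sum((x - drift) ** 2)

    def project(x):
        """Projection onto the feasible box [0, cap]^d (production limits)."""
        return np.clip(x, 0.0, cap)

    x = np.full(d, 0.5)
    for t in range(horizon):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)             # random unit direction
        f = loss(x + delta * u, t)         # single bandit evaluation per round
        grad_est = (d / delta) * f * u     # one-point gradient estimate
        x = project(x - eta * grad_est)

    print("final decision:", np.round(x, 3))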
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
He, Qiang, Zhou, Tianyi, Fang, Meng, Maghsudi, Setareh
Representation rank is an important concept for understanding the role of neural networks (NNs) in deep reinforcement learning (DRL): it measures the expressive capacity of value networks. Existing studies focus on maximizing this rank without bound; however, that approach introduces overly complex models into the learning process, thus undermining performance. Hence, fine-tuning the representation rank presents a challenging and crucial optimization problem. To address this issue, we find a guiding principle for adaptive control of the representation rank. We employ the Bellman equation as a theoretical foundation and derive an upper bound on the cosine similarity between the value network's representations of consecutive state-action pairs. We then leverage this upper bound to propose a novel regularizer, namely the BEllman Equation-based automatic rank Regularizer (BEER). This regularizer adaptively regularizes the representation rank, thus improving the DRL agent's performance. We first validate the effectiveness of automatic rank control in illustrative experiments. Then, we scale BEER up to complex continuous control tasks by combining it with the deterministic policy gradient method. Among 12 challenging DeepMind Control tasks, BEER outperforms the baselines by a large margin. Besides, BEER demonstrates significant advantages in Q-value approximation. Our code is available at https://github.com/sweetice/BEER-ICLR2024.
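A sketch of how such a regularizer could enter a critic loss, assuming PyTorch, penultimate-layer features as the representations, and a fixed stand-in bound of 0.9; the paper derives the actual bound from the Bellman equation (see the linked repository).

    import torch
    import torch.nn.functional as F

    def beer_regularizer(phi_t, phi_tp1, bound):
        """Penalize the cosine similarity between representations of
        consecutive state-action pairs only where it exceeds the bound.
        phi_t, phi_tp1: [batch, dim] features of the value network;
        bound: scalar or [batch] Bellman-derived upper bound."""
        cos = F.cosine_similarity(phi_t, phi_tp1, dim=-1)
        return F.relu(cos - bound).mean()

    # Toy usage with random features and an assumed bound of 0.9; in training,
    # this term would be added to the TD loss with a small coefficient.
    phi_t, phi_tp1 = torch.randn(32, 64), torch.randn(32, 64)
    reg = beer_regularizer(phi_t, phi_tp1, bound=0.9)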
Meta Learning in Bandits within Shared Affine Subspaces
Bilaj, Steven, Dhouib, Sofien, Maghsudi, Setareh
We study the problem of meta-learning several contextual stochastic bandit tasks by leveraging their concentration around a low-dimensional affine subspace, which we learn via online principal component analysis to reduce the expected regret over the encountered bandits. We propose and theoretically analyze two strategies that solve the problem: one based on the principle of optimism in the face of uncertainty and the other via Thompson sampling. Our framework is generic and includes previously proposed approaches as special cases.

In the applications mentioned above, the tasks often relate to each other despite being different. For instance, subgroups of patients have comparable features. As another example, holidays or discount periods promote similar interests in the products of an e-commerce website. That observation motivates us to look beyond a single task to uncover a relation between different ones to accelerate learning on newly encountered tasks. That problem, referred to as meta-learning or learning-to-learn (LTL), has mainly appeared in the offline learning literature so far (Hutter et al., 2019). Nevertheless, an emergent body of literature combines LTL and MAB to accelerate learning and reduce the average regret per task (Cella et al., 2020; Cella and Pontil, 2021; Bilaj et al., 2023).
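A minimal sketch of the online-PCA ingredient, assuming Oja's rule and, for brevity, a known subspace center; the bandit strategies built on top (optimism and Thompson sampling) are omitted.

    import numpy as np

    rng = np.random.default_rng(3)
    d, k = 10, 2                              # ambient and subspace dimensions

    def oja_update(U, x, lr=0.01):
        """One step of Oja's rule toward the top-k principal subspace,
        followed by re-orthonormalization via QR."""
        U = U + lr * np.outer(x, x @ U)
        Q, _ = np.linalg.qr(U)
        return Q

    # Task parameters concentrated near a k-dimensional affine subspace;
    # the center is assumed known here for brevity.
    basis = np.linalg.qr(rng.standard_normal((d, k)))[0]
    center = rng.standard_normal(d)
    U = np.linalg.qr(rng.standard_normal((d, k)))[0]
    for _ in range(2000):
        theta = center + basis @ rng.standard_normal(k) + 0.01 * rng.standard_normal(d)
        U = oja_update(U, theta - center)

    # Squared overlap close to k means the subspace has been recovered.
    print("subspace overlap:", round(float(np.linalg.norm(basis.T @ U) ** 2), 3))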
Eigensubspace of Temporal-Difference Dynamics and How It Improves Value Approximation in Reinforcement Learning
He, Qiang, Zhou, Tianyi, Fang, Meng, Maghsudi, Setareh
We propose a novel value approximation method, namely the Eigensubspace Regularized Critic (ERC), for deep reinforcement learning (RL). ERC is motivated by an analysis of the dynamics of the Q-value approximation error in the Temporal-Difference (TD) method, which follows a path defined by the 1-eigensubspace of the transition kernel associated with the Markov decision process (MDP). This analysis reveals a fundamental property of TD learning that has remained unused in previous deep RL approaches. In ERC, we propose a regularizer that guides the approximation error toward the 1-eigensubspace, resulting in a more efficient and stable path of value approximation. Moreover, we theoretically prove the convergence of the ERC method. Besides, theoretical analysis and experiments demonstrate that ERC effectively reduces the variance of value functions. Among 26 tasks in the DMControl benchmark, ERC outperforms state-of-the-art methods on 20. It also shows significant advantages in Q-value approximation and variance reduction. Our code is available at https://sites.google.com/view/erc-ecml23/.
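The 1-eigensubspace of a transition kernel is spanned by the constant vector, so an error lying in it is identical across states. One plausible reading of such a regularizer, sketched below in PyTorch, penalizes each sample's deviation from the batch-mean TD error; this is an assumption for illustration, not the authors' exact loss (see the linked code for that).

    import torch

    def erc_style_regularizer(q_values, td_targets, beta=0.05):
        """Nudge the approximation error toward the 1-eigensubspace by
        penalizing deviations from the batch-mean error (assumed reading)."""
        err = q_values - td_targets
        return beta * ((err - err.mean()) ** 2).mean()

    # Toy usage; in practice this would be added to the critic's TD loss.
    q, targets = torch.randn(256), torch.randn(256)
    reg = erc_style_regularizer(q, targets)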
Learning of Generalizable and Interpretable Knowledge in Grid-Based Reinforcement Learning Environments
Eberhardinger, Manuel, Maucher, Johannes, Maghsudi, Setareh
Understanding the interactions of agents trained with deep reinforcement learning is crucial for deploying agents in games or in the real world. In the former, unreasonable actions confuse players. In the latter, the effect is even more significant, as unexpected behavior causes accidents with potentially grave and long-lasting consequences for the individuals involved. In this work, we propose using program synthesis to imitate reinforcement learning policies after observing a trajectory of the action sequence. Programs have the advantage of being inherently interpretable and verifiable for correctness. We adapt the state-of-the-art program synthesis system DreamCoder to learn concepts in grid-based environments, specifically a navigation task and two miniature versions of the Atari games Space Invaders and Asterix. By inspecting the generated libraries, we can make inferences about the concepts the black-box agent has learned and better understand the agent's behavior. We achieve the same by visualizing the agent's decision-making process for the imitated sequences. We evaluate our approach with different types of program synthesizers based on a search-only method, a neural-guided search, and a language model fine-tuned on code.
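To illustrate the search-only end of the synthesizer spectrum, here is a toy sketch: a miniature rule-based DSL for grid navigation, enumerated exhaustively and scored by agreement with a recorded trajectory. The DSL, observations, and trajectory are hypothetical; DreamCoder's library learning is not modeled.

    from itertools import product

    # A miniature DSL for grid navigation: one "if condition then A else B" rule.
    CONDITIONS = {
        "wall_ahead": lambda obs: obs["wall_ahead"],
        "goal_right": lambda obs: obs["goal_right"],
    }
    ACTIONS = ["forward", "turn_left", "turn_right"]

    def make_program(cond_name, then_act, else_act):
        cond = CONDITIONS[cond_name]
        return lambda obs: then_act if cond(obs) else else_act

    def imitation_score(program, trajectory):
        """Fraction of the agent's recorded actions the program reproduces."""
        return sum(program(o) == a for o, a in trajectory) / len(trajectory)

    # Hypothetical (observation, action) pairs recorded from a black-box agent.
    trajectory = [
        ({"wall_ahead": True,  "goal_right": False}, "turn_right"),
        ({"wall_ahead": False, "goal_right": False}, "forward"),
        ({"wall_ahead": False, "goal_right": True},  "forward"),
    ]

    # Search-only synthesis: enumerate the whole DSL, keep the best program.
    best = max(
        (make_program(c, a1, a2) for c, a1, a2 in product(CONDITIONS, ACTIONS, ACTIONS)),
        key=lambda p: imitation_score(p, trajectory),
    )
    print("best program agreement:", imitation_score(best, trajectory))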
Federated Learning in UAV-Enhanced Networks: Joint Coverage and Convergence Time Optimization
Yahya, Mariam, Maghsudi, Setareh, Stanczak, Slawomir
Federated learning (FL) involves several devices that collaboratively train a shared model without transferring their local data. FL reduces the communication overhead, making it a promising learning method in UAV-enhanced wireless networks with scarce energy resources. Despite the potential, implementing FL in UAV-enhanced networks is challenging, as conventional UAV placement methods that maximize coverage increase the FL delay significantly. Moreover, the uncertainty and lack of a priori information about crucial variables, such as channel quality, exacerbate the problem. In this paper, we first analyze the statistical characteristics of a UAV-enhanced wireless sensor network (WSN) with energy harvesting. We then develop a model and solution based on the multi-objective multi-armed bandit theory to maximize the network coverage while minimizing the FL delay. Besides, we propose another solution that is particularly useful with large action sets and strict energy constraints at the UAVs. Our proposal uses a scalarized best-arm identification algorithm to find the optimal arms that maximize the ratio of the expected reward to the expected energy cost by sequentially eliminating one or more arms in each round. Then, we derive the upper bound on the error probability of our multi-objective and cost-aware algorithm. Numerical results show the effectiveness of our approach.
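A minimal sketch of a cost-aware successive-elimination loop on the reward-to-cost ratio, under assumed Bernoulli rewards, noisy energy costs, and an illustrative confidence radius; the paper's algorithm and its error-probability bound are more refined.

    import numpy as np

    rng = np.random.default_rng(4)
    true_reward = np.array([0.7, 0.5, 0.6, 0.3])  # e.g., expected coverage utility
    true_cost = np.array([0.9, 0.4, 0.5, 0.6])    # e.g., expected energy per pull

    active = list(range(4))
    r_sum, c_sum, n = np.zeros(4), np.zeros(4), np.zeros(4)

    for t in range(1, 5001):
        for arm in active:                        # pull every surviving arm once
            r_sum[arm] += rng.random() < true_reward[arm]
            c_sum[arm] += max(true_cost[arm] + 0.05 * rng.standard_normal(), 1e-3)
            n[arm] += 1
        ratio = r_sum[active] / c_sum[active]     # estimated reward/cost ratio
        radius = 2.0 * np.sqrt(np.log(t + 1) / n[active])  # illustrative radius
        # Eliminate arms whose optimistic ratio falls below the best pessimistic one.
        keep = ratio + radius >= (ratio - radius).max()
        active = [a for a, k in zip(active, keep) if k]
        if len(active) == 1:
            break

    print("surviving arm(s):", active,
          "true ratios:", np.round(true_reward / true_cost, 2))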
Piecewise-Stationary Combinatorial Semi-Bandit with Causally Related Rewards
Nourani-Koliji, Behzad, Bilaj, Steven, Balef, Amir Rezaei, Maghsudi, Setareh
We study the piecewise-stationary combinatorial semi-bandit problem with causally related rewards. In our nonstationary environment, variations in the base arms' distributions, in the causal relationships between rewards, or in both change the reward generation process. In such an environment, an optimal decision-maker must track both sources of change and adapt accordingly. The problem is aggravated in the combinatorial semi-bandit setting, where the decision-maker only observes the outcome of the selected bundle of arms. The core of our proposed policy is the Upper Confidence Bound (UCB) algorithm. To overcome the challenge of nonstationarity, the agent relies on an adaptive approach: it employs a change-point detector based on the Generalized Likelihood Ratio (GLR) test. Besides, we introduce the notion of group restart as a new alternative restarting strategy for decision-making in structured environments. Finally, our algorithm integrates a mechanism to trace variations of the underlying graph structure, which captures the causal relationships between the rewards in the bandit setting. Theoretically, we establish a regret upper bound that reflects the effects of the number of structural and distributional changes on performance. Our numerical experiments in real-world scenarios demonstrate the applicability and superior performance of our proposal compared to state-of-the-art benchmarks.
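A minimal sketch of the change-point-detection ingredient, assuming Bernoulli rewards: the GLR statistic scans every split of the observation window and compares the two-segment fit against a single segment. The threshold here is illustrative; in practice it would be tuned to control false alarms.

    import numpy as np

    def bernoulli_kl(p, q, eps=1e-12):
        """KL divergence between Bernoulli(p) and Bernoulli(q)."""
        p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
        return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

    def glr_change_detected(x, threshold):
        """Bernoulli GLR test: scan every split point s of the window and
        compare the two-segment likelihood against a single segment."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        for s in range(1, n):
            m1, m2, m = x[:s].mean(), x[s:].mean(), x.mean()
            glr = s * bernoulli_kl(m1, m) + (n - s) * bernoulli_kl(m2, m)
            if glr > threshold:
                return True
        return False

    rng = np.random.default_rng(5)
    window = np.concatenate([rng.random(100) < 0.3, rng.random(100) < 0.7])
    print(glr_change_detected(window, threshold=np.log(200)))  # illustrative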