Double-Linear Thompson Sampling for Context-Attentive Bandits
Bouneffouf, Djallel, Féraud, Raphaël, Upadhyay, Sohini, Khazaeni, Yasaman, Rish, Irina
In this paper, we analyze and extend an online learning framework known as the Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where, due to observation costs, only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent is free to choose which variables to observe. We derive a novel algorithm, called Context-Attentive Thompson Sampling (CATS), which builds upon the Linear Thompson Sampling approach, adapting it to the Context-Attentive Bandit setting. We provide a theoretical regret analysis and an extensive empirical evaluation demonstrating the advantages of the proposed approach over several baseline methods on a variety of real-life datasets.
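As a grounding sketch (assuming nothing beyond the abstract), the following Python illustrates the Linear Thompson Sampling building block together with a toy budgeted variable-selection step; the class names and the importance-based selection heuristic are hypothetical, not the paper's exact CATS rule.

```python
# Minimal sketch of per-arm Linear Thompson Sampling plus a budgeted choice
# of context variables, in the spirit of the Context-Attentive Bandit setting.
# The selection heuristic below is an illustrative assumption, not CATS itself.
import numpy as np

class LinearTS:
    def __init__(self, n_arms, dim, v=0.25):
        self.B = [np.eye(dim) for _ in range(n_arms)]    # per-arm precision
        self.f = [np.zeros(dim) for _ in range(n_arms)]  # per-arm response vector
        self.v = v                                       # posterior scale

    def select(self, x):
        scores = []
        for B, f in zip(self.B, self.f):
            mu = np.linalg.solve(B, f)
            cov = self.v ** 2 * np.linalg.inv(B)
            theta = np.random.multivariate_normal(mu, cov)  # posterior sample
            scores.append(x @ theta)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.B[arm] += np.outer(x, x)
        self.f[arm] += reward * x

def observe_subset(full_context, budget, importance):
    """Observe only the `budget` most important variables; zero out the rest."""
    keep = np.argsort(importance)[-budget:]
    x = np.zeros_like(full_context)
    x[keep] = full_context[keep]
    return x
```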
An Empirical Study of Human Behavioral Agents in Bandits, Contextual Bandits and Reinforcement Learning
Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Reinen, Jenna, Rish, Irina
Artificial behavioral agents are often evaluated on how consistently they take sequential actions in an environment to maximize some notion of cumulative reward. However, human decision making in real life usually involves different strategies and behavioral trajectories that lead to the same empirical outcome. Motivated by the clinical literature on a wide range of neurological and psychiatric disorders, we propose here a more general and flexible parametric framework for sequential decision making that involves a two-stream reward processing mechanism. We demonstrate that this framework is flexible and unified enough to incorporate a family of problems spanning multi-armed bandits (MAB), contextual bandits (CB) and reinforcement learning (RL), which decompose the sequential decision-making process at different levels. Inspired by the known reward-processing abnormalities of many mental disorders, our clinically inspired agents demonstrate interesting behavioral trajectories and comparable performance on simulated tasks with particular reward distributions, a real-world dataset capturing human decision making in gambling tasks, and the PacMan game across different reward stationarities in a lifelong learning setting.
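A minimal sketch of the two-stream reward-processing idea, assuming a split-Q-style agent in which gains and losses are learned in separate streams and recombined with tunable weights; the parameter names and the specific update rule here are illustrative assumptions.

```python
# Illustrative two-stream Q-learning agent: positive and negative rewards are
# tracked separately and recombined with weights that can be varied to mimic
# different reward-processing profiles. Weights and names are assumptions.
import numpy as np

class TwoStreamQ:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95,
                 w_pos=1.0, w_neg=1.0):
        self.Qp = np.zeros((n_states, n_actions))  # stream for gains
        self.Qn = np.zeros((n_states, n_actions))  # stream for losses
        self.alpha, self.gamma = alpha, gamma
        self.w_pos, self.w_neg = w_pos, w_neg      # reward-processing biases

    def q(self, s):
        return self.w_pos * self.Qp[s] + self.w_neg * self.Qn[s]

    def act(self, s, eps=0.1):
        if np.random.rand() < eps:
            return np.random.randint(self.Qp.shape[1])
        return int(np.argmax(self.q(s)))

    def update(self, s, a, r, s_next):
        # Each stream sees only its own sign of the reward signal.
        target_p = max(r, 0.0) + self.gamma * self.Qp[s_next].max()
        target_n = min(r, 0.0) + self.gamma * self.Qn[s_next].max()
        self.Qp[s, a] += self.alpha * (target_p - self.Qp[s, a])
        self.Qn[s, a] += self.alpha * (target_n - self.Qn[s, a])
```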
Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior
Lin, Baihan, Bouneffouf, Djallel, Cecchi, Guillermo
Studies of the Prisoner's Dilemma mainly treat the choice to cooperate or defect as an atomic action. We propose instead to study the behavior of online learning algorithms in the Iterated Prisoner's Dilemma (IPD) game, exploring the full spectrum of reinforcement learning agents: multi-armed bandits, contextual bandits and reinforcement learning. We evaluate them in an IPD tournament where multiple agents compete in a sequential fashion. This allows us to analyze the dynamics of policies learned by multiple self-interested, independent, reward-driven agents, and to study the capacity of these algorithms to fit human behavior. Results suggest that basing decisions on the current situation alone is the worst strategy in this kind of social dilemma game. Multiple findings on online learning behavior, along with clinical validations, are presented.
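For concreteness, a minimal sketch of an IPD match between two bandit-style agents, using the standard (R, S, T, P) = (3, 0, 5, 1) payoffs; the epsilon-greedy agents below are placeholders for the richer MAB/CB/RL agents compared in the paper.

```python
# Sketch of iterated prisoner's dilemma rounds between two bandit agents.
import numpy as np

PAYOFF = {  # (my action, opponent action) -> my reward; 0=cooperate, 1=defect
    (0, 0): 3, (0, 1): 0,
    (1, 0): 5, (1, 1): 1,
}

class EpsGreedyAgent:
    def __init__(self, eps=0.1):
        self.counts = np.zeros(2)
        self.values = np.zeros(2)
        self.eps = eps

    def act(self):
        if np.random.rand() < self.eps:
            return np.random.randint(2)
        return int(np.argmax(self.values))

    def update(self, action, reward):
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def play(agent_a, agent_b, rounds=1000):
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = agent_a.act(), agent_b.act()
        ra, rb = PAYOFF[(a, b)], PAYOFF[(b, a)]
        agent_a.update(a, ra); agent_b.update(b, rb)
        total_a += ra; total_b += rb
    return total_a, total_b
```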
Spectral Clustering using Eigenspectrum Shape Based Nystrom Sampling
Bouneffouf, Djallel
Spectral clustering has shown superior performance in analyzing cluster structure. However, its computational complexity limits its application to large-scale data. To address this problem, many low-rank matrix-approximation algorithms have been proposed, including the Nystrom method, an approach with proven approximation error bounds. Several algorithms provide recipes for constructing Nystrom approximations with variable accuracies and computing times. This paper proposes a scalable Nystrom-based clustering algorithm with a new sampling procedure, Centroid Minimum Sum of Squared Similarities (CMS3), and a heuristic for when to use it. Our heuristic depends on the eigenspectrum shape of the dataset and yields competitive low-rank approximations on test datasets compared to other state-of-the-art methods.
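The core Nystrom step the paper builds on can be sketched as below; the uniform landmark sampling shown is a placeholder where CMS3 would substitute its centroid-based selection, and the eigenspectrum-shape heuristic would decide when to invoke it.

```python
# Sketch of a Nystrom low-rank approximation of an RBF similarity matrix.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_approx(X, m, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=m, replace=False)  # landmark points (uniform here)
    C = rbf_kernel(X, X[idx], gamma)                 # n x m cross-similarity block
    W = C[idx, :]                                    # m x m landmark block
    # K ~= C W^+ C^T, using the pseudo-inverse for numerical safety
    return C @ np.linalg.pinv(W) @ C.T
```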
Contextual Bandit with Missing Rewards
Bouneffouf, Djallel, Upadhyay, Sohini, Khazaeni, Yasaman
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side information, or context, available to a decision-maker) where the reward associated with each context-based decision may not always be observed ("missing rewards"). This new problem is motivated by certain online settings, including clinical trial and ad recommendation applications. To address the missing-rewards setting, we propose to combine the standard contextual bandit approach with an unsupervised learning mechanism such as clustering. Unlike standard contextual bandit methods, by leveraging clustering to estimate missing rewards, we are able to learn from every incoming event, even those with missing rewards. Promising empirical results are obtained on several real-life datasets.
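A minimal sketch of the imputation idea, assuming a LinUCB-style linear bandit whose missing rewards are filled in with per-cluster running averages; the class and the warm-up clustering protocol are illustrative assumptions, not the paper's exact method.

```python
# Sketch: impute a missing reward from a cluster average before the usual
# linear-bandit update, so that every event contributes to learning.
import numpy as np
from sklearn.cluster import KMeans  # e.g. fitted once on a warm-up batch

class ImputingLinearBandit:
    def __init__(self, n_arms, dim, alpha=1.0):
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]
        self.alpha = alpha
        self.kmeans = None            # e.g. KMeans(10).fit(warmup_contexts)
        self.stats = {}               # (arm, cluster) -> (mean reward, count)

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            theta = np.linalg.solve(A, b)
            bonus = self.alpha * np.sqrt(x @ np.linalg.solve(A, x))
            scores.append(x @ theta + bonus)     # LinUCB-style score
        return int(np.argmax(scores))

    def update(self, arm, x, reward=None):
        if self.kmeans is not None:
            c = int(self.kmeans.predict(x[None, :])[0])
            mean, n = self.stats.get((arm, c), (0.0, 0))
            if reward is None:
                reward = mean                    # impute from cluster average
            else:                                # refresh the cluster estimate
                self.stats[(arm, c)] = (mean + (reward - mean) / (n + 1), n + 1)
        if reward is None:
            return                               # no clustering fitted yet: skip
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```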
Computing the Dirichlet-Multinomial Log-Likelihood Function
Bouneffouf, Djallel
The Dirichlet-multinomial (DMN) distribution is commonly used to model over-dispersion in count data. Precise and fast numerical computation of the DMN log-likelihood function is important for performing statistical inference with this distribution, and remains a challenge. To address this, we use mathematical properties of the gamma function to derive a closed-form expression for the DMN log-likelihood function. Compared to existing methods, the closed form has a lower computational complexity and is therefore much faster, without compromising computational accuracy.
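For reference, the standard closed-form evaluation of the DMN log-likelihood via the log-gamma function looks as follows; the paper's contribution is a cheaper closed form, which this direct gammaln-based version does not reproduce.

```python
# Direct DMN log-likelihood for one observed count vector, via log-gamma.
import numpy as np
from scipy.special import gammaln

def dmn_loglik(counts, alpha):
    """log P(counts | alpha) for one Dirichlet-multinomial observation."""
    counts, alpha = np.asarray(counts, float), np.asarray(alpha, float)
    n, a = counts.sum(), alpha.sum()
    return (gammaln(n + 1) - gammaln(counts + 1).sum()   # multinomial coefficient
            + gammaln(a) - gammaln(n + a)                 # normalizer ratio
            + (gammaln(counts + alpha) - gammaln(alpha)).sum())
```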
Solving Constrained CASH Problems with ADMM
Ram, Parikshit, Liu, Sijia, Vijaykeerthi, Deepak, Wang, Dakuo, Bouneffouf, Djallel, Bramble, Greg, Samulowitz, Horst, Gray, Alexander G.
The CASH problem (Combined Algorithm Selection and Hyperparameter optimization) has been widely studied in the context of automatically configuring machine learning (ML) pipelines, and various solvers and toolkits are available. However, CASH solvers do not directly handle black-box constraints such as fairness, robustness, or other domain-specific custom constraints. We present our recent approach [Liu et al., 2020] that leverages the ADMM optimization framework to decompose CASH into multiple smaller problems, and demonstrate how ADMM facilitates the incorporation of black-box constraints.
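To illustrate the decomposition principle only (not the actual CASH solver), here is a generic ADMM skeleton applied to the lasso problem, splitting the objective into a smooth subproblem, a proximal subproblem, and a dual update that ties them together.

```python
# Generic ADMM skeleton for min f(x) + g(z) s.t. x = z, shown on the lasso.
import numpy as np

def lasso_admm(A, b, lam=0.1, rho=1.0, iters=200):
    n = A.shape[1]
    x = z = u = np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))   # factor once, reuse
    for _ in range(iters):
        # x-step: smooth least-squares subproblem (two triangular solves)
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-step: proximal operator of the l1 term (soft-thresholding)
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # dual update enforces the consensus constraint x = z
        u = u + x - z
    return z
```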
Online Learning with Corrupted Context: Corrupted Contextual Bandits
Bouneffouf, Djallel
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side information, or context, available to a decision-maker) where the context used at each decision may be corrupted ("useless context"). This new problem is motivated by certain online settings, including clinical trial and ad recommendation applications. To address the corrupted-context setting, we propose to combine the standard contextual bandit approach with a classical multi-armed bandit mechanism. Unlike standard contextual bandit methods, we are able to learn from all iterations, even those with corrupted context, by improving the computation of the expected reward for each arm. Promising empirical results are obtained on several real-life datasets.
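A minimal sketch of the combination idea, assuming each arm keeps both a context-free UCB estimate and a linear context-based estimate, falling back to the former when the context is flagged as corrupted; the blend weight and the corruption flag are assumptions for illustration.

```python
# Sketch: blend a context-free UCB score with a linear context-based score,
# so iterations with corrupted context still update the per-arm mean reward.
import numpy as np

class HybridBandit:
    def __init__(self, n_arms, dim, w=0.5):
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]
        self.counts = np.ones(n_arms)          # context-free statistics
        self.means = np.zeros(n_arms)
        self.t, self.w = 1, w

    def select(self, x, context_ok):
        ucb = self.means + np.sqrt(2 * np.log(self.t) / self.counts)
        if not context_ok:                     # context corrupted: plain UCB
            return int(np.argmax(ucb))
        lin = np.array([x @ np.linalg.solve(A, b)
                        for A, b in zip(self.A, self.b)])
        return int(np.argmax(self.w * lin + (1 - self.w) * ucb))

    def update(self, arm, x, reward, context_ok):
        self.t += 1
        self.counts[arm] += 1                  # always learn the mean reward
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        if context_ok:                         # only trust x when it is clean
            self.A[arm] += np.outer(x, x)
            self.b[arm] += reward * x
```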
Hyper-parameter Tuning for the Contextual Bandit
Bouneffouf, Djallel, Claeys, Emmanuelle
We study the problem of learning the exploration-exploitation trade-off in the contextual bandit problem with a linear reward function. In traditional algorithms that solve the contextual bandit problem, the exploration is a parameter tuned by the user. In contrast, our proposed algorithm learns to choose the right exploration parameter in an online manner, based on the observed context and the immediate reward received for the chosen action. We present two algorithms that use a bandit to find the optimal exploration parameter of the contextual bandit algorithm, which we hope is a first step toward the automation of multi-armed bandit algorithms.
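One way to realize the "bandit over bandit" idea is sketched below: an outer UCB chooses among candidate exploration coefficients for the inner contextual bandit and is credited with the observed reward; the candidate grid and the crediting scheme are assumptions.

```python
# Sketch: outer UCB over a grid of candidate exploration coefficients.
import numpy as np

class ExplorationTuner:
    def __init__(self, candidates=(0.0, 0.5, 1.0, 2.0)):
        self.candidates = np.array(candidates)
        self.counts = np.ones(len(candidates))
        self.means = np.zeros(len(candidates))
        self.t = 1

    def pick_alpha(self):
        ucb = self.means + np.sqrt(2 * np.log(self.t) / self.counts)
        self.last = int(np.argmax(ucb))        # remember the chosen arm
        return float(self.candidates[self.last])

    def feedback(self, reward):
        self.t += 1
        i = self.last
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]
```

At each round, pick_alpha() would supply the exploration coefficient to the inner contextual bandit's confidence bonus, and feedback(r) would credit the outer bandit once the reward is observed.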
Reinforcement Learning Models of Human Behavior: Reward Processing in Mental Disorders
Lin, Baihan, Cecchi, Guillermo, Bouneffouf, Djallel, Reinen, Jenna, Rish, Irina
For the AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Empirically, the proposed model outperforms Q-Learning and Double Q-Learning in artificial scenarios with certain reward distributions and on real-world human decision-making gambling tasks. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward-processing abnormalities across multiple mental conditions and user preferences in long-term recommendation systems.