Agents
Dealing with Incompatibilities among Procedural Goals under Uncertainty
Morveli-Espinoza, Mariela, Nieves, Juan Carlos, Possebom, Ayslan Trevizan, Tacla, Cesar Augusto
Considering rational agents, we focus on the problem of selecting goals out of a set of incompatible ones. We consider three forms of incompatibility introduced by Castelfranchi and Paglieri, namely terminal, instrumental (or resource-based), and superfluity. We represent the agent's plans by means of structured arguments whose premises are pervaded with uncertainty. We measure the strength of these arguments in order to determine the set of compatible goals. We propose two novel ways of calculating the strength of these arguments, depending on the kind of incompatibility that exists between them. The first is the logical strength value, denoted by a three-dimensional vector that is calculated from a probabilistic interval associated with each argument. The vector represents the precision of the interval, its location, and the combination of precision and location. This type of representation and treatment of the strength of a structured argument has not previously been defined in the state of the art. The second way of calculating the strength of an argument is based on the cost of the plans (regarding the necessary resources) and the preference of the goals associated with the plans. Building on this novel approach for measuring the strength of structured arguments, we propose a semantics for the selection of plans and goals that is based on Dung's abstract argumentation theory. Finally, we make a theoretical evaluation of our proposal.
Strategy Proof Mechanisms for Facility Location at Limited Locations
Facility location problems often permit facilities to be located at any position. But what if this is not the case in practice? What if facilities can only be located at particular locations like a highway exit or close to a bus stop? We consider here the impact of such constraints on the location of facilities on the performance of strategy proof mechanisms for locating facilities. We study four different performance objectives: the total distance agents must travel to their closest facility, the maximum distance any agent must travel to their closest facility, and the utilitarian and egalitarian welfare. We show that constraining facilities to a limited set of locations makes all four objectives harder to approximate in general.
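The two distance objectives named above can be illustrated with a minimal Python sketch on the line. The helper names (`total_distance`, `max_distance`, `constrained_median`) and the rule of snapping the median report to the nearest permitted location are assumptions made for illustration; the abstract does not say which mechanisms the paper analyses.

```python
def total_distance(agents, facilities):
    """Sum over agents of the distance to their closest facility."""
    return sum(min(abs(a - f) for f in facilities) for a in agents)


def max_distance(agents, facilities):
    """Largest distance any agent travels to their closest facility."""
    return max(min(abs(a - f) for f in facilities) for a in agents)


def nearest_allowed(point, allowed):
    """Permitted location closest to `point` (ties go to the leftmost)."""
    return min(allowed, key=lambda loc: (abs(loc - point), loc))


def constrained_median(agents, allowed):
    """Hypothetical rule: take the (leftmost) median report, then snap it
    to the nearest permitted location."""
    ordered = sorted(agents)
    median = ordered[(len(ordered) - 1) // 2]
    return nearest_allowed(median, allowed)


agents = [0, 4, 10]
allowed = [1, 9]                       # e.g. highway exits
site = constrained_median(agents, allowed)
print(site, total_distance(agents, [site]), max_distance(agents, [site]))
print(total_distance(agents, [4]), max_distance(agents, [4]))  # unconstrained median
```

In this toy instance the unconstrained median (position 4) gives a total distance of 10 and a maximum distance of 6, while the constrained choice (position 1) gives 13 and 9, which shows how limiting the permitted locations can worsen both objectives.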
AutoFS: Automated Feature Selection via Diversity-aware Interactive Reinforcement Learning
Fan, Wei, Liu, Kunpeng, Liu, Hao, Wang, Pengyang, Ge, Yong, Fu, Yanjie
In this paper, we study the problem of balancing effectiveness and efficiency in automated feature selection. Feature selection is a fundamental task in machine learning and predictive analysis. After exploring many feature selection methods, we observe a computational dilemma: 1) traditional feature selection methods (e.g., mRMR) are mostly efficient but struggle to identify the best subset; 2) the emerging reinforced feature selection methods automatically navigate the feature space to explore the best subset, but are usually inefficient. Are automation and efficiency always at odds? Can we bridge the gap between effectiveness and efficiency under automation? Motivated by this computational dilemma, this study develops a novel feature space navigation method. To that end, we propose an Interactive Reinforced Feature Selection (IRFS) framework that guides agents not just by self-exploration experience, but also by diverse external skilled trainers, to accelerate learning for feature exploration. Specifically, we formulate the feature selection problem as an interactive reinforcement learning framework. In this framework, we first model two trainers skilled at different searching strategies: (1) a KBest based trainer; (2) a Decision Tree based trainer. We then develop two strategies: (1) to identify assertive and hesitant agents to diversify agent training, and (2) to enable the two trainers to take the teaching role in different stages to fuse the experiences of the trainers and diversify the teaching process. Such a hybrid teaching strategy can help agents to learn broader knowledge and, thereafter, be more effective. Finally, we present extensive experiments on real-world datasets to demonstrate the improved performance of our method: it is more efficient than existing reinforced selection and more effective than classic selection.
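As a rough illustration of what a KBest-style trainer could suggest to an agent, the sketch below ranks features by a univariate score and keeps the top k. The scoring function (absolute Pearson correlation with the target) and the names `pearson` and `k_best` are assumptions chosen for a dependency-free example; the abstract does not specify the score used in IRFS.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (var_x * var_y) ** 0.5


def k_best(columns, target, k):
    """Return the names of the k features with the highest absolute
    correlation with the target (ties broken alphabetically)."""
    scores = {name: abs(pearson(col, target)) for name, col in columns.items()}
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


columns = {
    "a": [1, 2, 3, 4],   # perfectly correlated with the target
    "b": [1, 0, 1, 0],   # weakly related
    "c": [4, 3, 2, 1],   # perfectly anti-correlated
}
target = [2, 4, 6, 8]
print(k_best(columns, target, 2))
```

A reinforced agent could treat such a ranking as external advice during early training and rely more on its own exploration later, which is the general idea behind the trainer-guided strategy the abstract describes.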
Strategy Proof Mechanisms for Facility Location in Euclidean and Manhattan Space
We study the impact on mechanisms for facility location of moving from one dimension to two (or more) dimensions and Euclidean or Manhattan distances. We consider three fundamental axiomatic properties: anonymity, which is a basic fairness property; Pareto optimality, which is one of the most important efficiency properties; and strategy proofness, which ensures agents do not have an incentive to misreport. We also consider how well such mechanisms can approximate the optimal welfare. Our results are somewhat negative. Moving from one dimension to two (or more) dimensions often makes these axiomatic properties more difficult to achieve. For example, with two facilities in Euclidean space or with just a single facility in Manhattan space, no mechanism is anonymous, Pareto optimal and strategy proof. By contrast, mechanisms on the line exist with all three properties. We also show that approximation ratios may increase when moving to two (or more) dimensions. All our impossibility results are minimal. If we drop one of the three axioms (anonymity, Pareto optimality or strategy proofness), multiple mechanisms satisfy the other two axioms.
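The one-dimensional baseline can be checked by brute force on a small grid. The sketch below assumes the well-known median mechanism as the example of a strategy proof rule on the line (the abstract does not name a specific mechanism) and contrasts it with the mean, which an agent can manipulate. The function names are hypothetical.

```python
from itertools import product


def median_mechanism(reports):
    """Place the facility at the (leftmost) median of the reports."""
    ordered = sorted(reports)
    return ordered[(len(ordered) - 1) // 2]


def mean_mechanism(reports):
    """Place the facility at the average of the reports."""
    return sum(reports) / len(reports)


def is_strategy_proof(mechanism, points, n_agents):
    """Exhaustively test every profile on `points`: no agent may get
    strictly closer to the facility by misreporting their location."""
    for profile in product(points, repeat=n_agents):
        truthful_site = mechanism(profile)
        for i, true_loc in enumerate(profile):
            for lie in points:
                misreport = profile[:i] + (lie,) + profile[i + 1:]
                if abs(mechanism(misreport) - true_loc) < abs(truthful_site - true_loc):
                    return False
    return True


print(is_strategy_proof(median_mechanism, range(5), 3))  # True
print(is_strategy_proof(mean_mechanism, range(5), 3))    # False
```

The check is exhaustive only for the chosen grid and number of agents, so it illustrates the property rather than proving it.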
Reinforcement Learning for Strategic Recommendations
Theocharous, Georgios, Chandak, Yash, Thomas, Philip S., de Nijs, Frits
Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business. These systems are in their infancy in the industry and in need of practical solutions to some fundamental research challenges. At Adobe Research, we have been implementing such systems for various use-cases, including points of interest recommendations, tutorial recommendations, next step guidance in multi-media editing software, and ad recommendation for optimizing lifetime value. There are many research challenges when building these systems, such as modeling the sequential behavior of users, deciding when to intervene and offer recommendations without annoying the user, evaluating policies offline with high confidence, safe deployment, non-stationarity, building systems from passive data that do not contain past recommendations, resource constraint optimization in multi-user systems, scaling to large and dynamic action spaces, and handling and incorporating human cognitive biases. In this paper we cover various use-cases and research challenges we solved to make these systems practical.
Advancing the Scientific Frontier with Increasingly Autonomous Systems
Amini, Rashied, Azari, Abigail, Bhaskaran, Shyam, Beauchamp, Patricia, Castillo-Rogez, Julie, Castano, Rebecca, Chung, Seung, Day, John, Doyle, Richard, Feather, Martin, Fesq, Lorraine, Frank, Jeremy, Furlong, P. Michael, Ingham, Michel, Kennedy, Brian, Kolcio, Ksenia, Nesnas, Issa, Rasmussen, Robert, Reeves, Glenn, Sorice, Cristina, Theiling, Bethany, Wyatt, Jay
A close partnership between people and partially autonomous machines has enabled decades of space exploration. But to further expand our horizons, our systems must become more capable. Increasing the nature and degree of autonomy - allowing our systems to make and act on their own decisions as directed by mission teams - enables new science capabilities and enhances science return. The 2011 Planetary Science Decadal Survey (PSDS) and on-going pre-Decadal mission studies have identified increased autonomy as a core technology required for future missions. However, even as scientific discovery has necessitated the development of autonomous systems and past flight demonstrations have been successful, institutional barriers have limited the maturation of autonomy and its infusion on existing planetary missions. Consequently, the authors and endorsers of this paper recommend that new programmatic pathways be developed to infuse autonomy, infrastructure to support autonomous systems be invested in, new practices be adopted, and the cost-saving value of autonomy for operations be studied.
Grounded Language Learning Fast and Slow
Hill, Felix, Tieleman, Olivier, von Glehn, Tamara, Wong, Nathaniel, Merzic, Hamza, Clark, Stephen
Recent work has shown that large text-based neural language models, trained with conventional supervised learning objectives, acquire a surprising propensity for few- and one-shot learning. Here, we show that an embodied agent situated in a simulated 3D world, and endowed with a novel dual-coding external memory, can exhibit similar one-shot word learning when trained with conventional reinforcement learning algorithms. After a single introduction to a novel object via continuous visual perception and a language prompt ("This is a dax"), the agent can re-identify the object and manipulate it as instructed ("Put the dax on the bed"). In doing so, it seamlessly integrates short-term, within-episode knowledge of the appropriate referent for the word "dax" with long-term lexical and motor knowledge acquired across episodes (i.e. "bed" and "putting"). We find that, under certain training conditions and with a particular memory writing mechanism, the agent's one-shot word-object binding generalizes to novel exemplars within the same ShapeNet category, and is effective in settings with unfamiliar numbers of objects. We further show how dual-coding memory can be exploited as a signal for intrinsic motivation, stimulating the agent to seek names for objects that may be useful for later executing instructions. Together, the results demonstrate that deep neural networks can exploit meta-learning, episodic memory and an explicitly multi-modal environment to account for 'fast-mapping', a fundamental pillar of human cognitive development and a potentially transformative capacity for agents that interact with human users.
Artificial Intelligence Assisted Collaborative Edge Caching in Small Cell Networks
Pervej, Md Ferdous, Tan, Le Thanh, Hu, Rose Qingyang
Edge caching is a new paradigm that has been exploited over the past several years to reduce the load on the core network and to enhance content delivery performance. Many existing caching solutions only consider homogeneous caching placement due to the immense complexity associated with heterogeneous caching models. Unlike these legacy modeling paradigms, this paper considers heterogeneous content preferences of the users with heterogeneous caching models at the edge nodes. Besides, aiming to maximize the cache hit ratio (CHR) in a two-tier heterogeneous network, we let the edge nodes collaborate. However, due to complex combinatorial decision variables, the formulated problem is hard to solve in polynomial time. Moreover, there does not even exist a ready-to-use tool or software to solve the problem. We propose a modified particle swarm optimization (M-PSO) algorithm that efficiently solves the complex constrained problem in a reasonable time. Using numerical analysis and simulation, we validate that the proposed algorithm significantly enhances the CHR performance compared with the existing baseline caching schemes.
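The objective named above, the cache hit ratio under collaboration, can be sketched in a few lines of Python. The data layout (a request as a `(node, content)` pair, a `neighbors` map of collaborating nodes) and the function name `cache_hit_ratio` are assumptions for illustration; the paper's system model and its M-PSO placement algorithm are not reproduced here.

```python
def cache_hit_ratio(requests, caches, neighbors):
    """Fraction of requests served from an edge cache.

    requests  : list of (node, content) pairs
    caches    : dict mapping node -> set of cached contents
    neighbors : dict mapping node -> iterable of collaborating nodes
    A request counts as a hit if the content is cached at the
    requesting node or at any node it collaborates with.
    """
    if not requests:
        return 0.0
    hits = 0
    for node, content in requests:
        reachable = {node, *neighbors.get(node, ())}
        if any(content in caches.get(peer, set()) for peer in reachable):
            hits += 1
    return hits / len(requests)


caches = {"A": {"x"}, "B": {"y"}}
requests = [("A", "x"), ("A", "y"), ("B", "z"), ("B", "x")]

print(cache_hit_ratio(requests, caches, {}))                          # 0.25
print(cache_hit_ratio(requests, caches, {"A": ["B"], "B": ["A"]}))    # 0.75
```

A placement optimizer such as a particle swarm method would search over the contents of `caches` to maximize this value subject to storage limits; that search is outside the scope of this sketch.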
Solution To Value Aggregation
AI Safety researchers attempting to align values of highly capable intelligent systems with those of humanity face a number of challenges including personal value extraction, multi-agent value merger and finally in-silico encoding. State-of-the-art research in value alignment shows difficulties in every stage of this process, but merger of incompatible preferences is a particularly difficult challenge to overcome. In this paper we assume that the value extraction problem will be solved and propose a possible way to implement an AI solution which optimally aligns with individual preferences of each user. We conclude by analyzing benefits and limitations of the proposed approach. Since the birth of the field of Artificial Intelligence (AI), researchers have worked on creating ever more capable machines, but with recent success in multiple subdomains of AI [1–7], the safety and security of such systems and of predicted future superintelligences [8, 9] have become paramount [10, 11].
Competing AI: How competition feedback affects machine learning
Ginart, Antonio, Zhang, Eva, Zou, James
This paper studies how competition affects machine learning (ML) predictors. As ML becomes more ubiquitous, it is often deployed by companies to compete over customers. For example, digital platforms like Yelp use ML to predict user preference and make recommendations. A service that is more often queried by users, perhaps because it more accurately anticipates user preferences, is also more likely to obtain additional user data (e.g. in the form of a Yelp review). Thus, competing predictors cause feedback loops whereby a predictor's performance impacts what training data it receives and biases its predictions over time. We introduce a flexible model of competing ML predictors that enables both rapid experimentation and theoretical tractability. We show with empirical and mathematical analysis that competition causes predictors to specialize for specific sub-populations at the cost of worse performance over the general population. We further analyze the impact of predictor specialization on the overall prediction quality experienced by users. We show that having too few or too many competing predictors in a market can hurt the overall prediction quality. Our theory is complemented by experiments on several real datasets using popular learning algorithms, such as neural networks and nearest neighbor methods.