 behavior


Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models

Neural Information Processing Systems

However, existing models and benchmarks are commonly tailored to specific tasks, falling short of capturing the full complexity of online shopping. Large Language Models (LLMs), with their multi-task and few-shot learning abilities, have the potential to profoundly transform online shopping by alleviating task-specific engineering efforts and by providing users with interactive conversations.


On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Neural Information Processing Systems

Training neural networks with batch normalization and weight decay has become a common practice in recent years. In this work, we show that their combined use may result in a surprising periodic behavior of optimization dynamics: the training process regularly exhibits destabilizations that, however, do not lead to complete divergence but cause a new period of training. We rigorously investigate the mechanism underlying the discovered periodic behavior from both empirical and theoretical points of view and analyze the conditions in which it occurs in practice. We also demonstrate that periodic behavior can be regarded as a generalization of two previously opposing perspectives on training with batch normalization and weight decay, namely the equilibrium presumption and the instability presumption.


Behavior From the Void: Unsupervised Active Pre-Training

Neural Information Processing Systems

We introduce a new unsupervised pre-training method for reinforcement learning called APT, which stands for Active Pre-Training. APT learns behaviors and representations by actively searching for novel states in reward-free environments. The key novel idea is to explore the environment by maximizing a non-parametric entropy computed in an abstract representation space, which avoids challenging density modeling and consequently allows our approach to scale much better in environments that have high-dimensional observations (e.g., image observations). We empirically evaluate APT by exposing task-specific reward after a long unsupervised pre-training phase. In Atari games, APT achieves human-level performance on 12 games and obtains highly competitive performance compared to canonical fully supervised RL algorithms. On DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency and dramatically improves performance on tasks that are extremely difficult to train from scratch.


Reviews: Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

Neural Information Processing Systems

The paper investigates the problem of inferring an agent's belief of the system dynamics of an MDP, given demonstrations of its behavior and the reward function it was optimizing. Knowledge of this internal belief can be used for Inverse Reinforcement Learning of an unknown task in the same environment. Furthermore, given the action provided by the agent, its intended action on the true dynamics can be inferred. This allows for assistive tele-operation, by applying the intended actions to the system instead of the provided ones. The proposed method models the agent using the model derived in maximum causal entropy inverse reinforcement learning.


Intellectual abilities of artificial intelligence (AI) - Semiwiki

#artificialintelligence

To understand AI’s capabilities, we need to recognize the different components and subsets of AI. Terms like Neural Networks, Machine Learning (ML), and Deep Learning need to be defined and explained. In general, Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and…


A critique of pure learning and what artificial neural networks can learn from animal brains

#artificialintelligence

Not long after the invention of computers in the 1940s, expectations were high. Many believed that computers would soon achieve or surpass human-level intelligence. Herbert Simon, a pioneer of artificial intelligence (AI), famously predicted in 1965 that "machines will be capable, within twenty years, of doing any work a man can do," that is, that they would achieve general AI. Of course, these predictions turned out to be wildly off the mark. In the tech world today, optimism is high again.


How AI can help you stay ahead of cybersecurity threats

#artificialintelligence

Since the 2013 Target breach, it's been clear that companies need to respond better to security alerts even as volumes have gone up. With this year's fast-spreading ransomware attacks and ever-tightening compliance requirements, response must be much faster. Adding staff is tough with the cybersecurity hiring crunch, so companies are turning to machine learning and artificial intelligence (AI) to automate tasks and better detect bad behavior. In a cybersecurity context, AI is software that perceives its environment well enough to identify events and take action against a predefined purpose. AI is particularly good at recognizing patterns and anomalies within them, which makes it an excellent tool to detect threats.


The Great AI Paradox

MIT Technology Review

You've probably heard versions of each of the following ideas. With computers becoming remarkably adept at driving, understanding speech, and other tasks, more jobs could soon be automated than society is prepared to handle. This "superintelligence" will largely make human labor unnecessary. In fact, we'd better hope that machines don't eliminate us altogether, either accidentally or on purpose. Even though the first scenario is already under way, it won't necessarily lead to the second one.


Robot Planning

AI Magazine

Drew McDermott

Research on planning for robots is in such a state of flux that there is disagreement about what planning is and whether it is necessary. We can take planning to be the optimization and debugging of a robot's program by reasoning about possible courses of execution. It is necessary to the extent that fragments of robot programs are combined at run time. There are several strands of research in the field; I survey six: (1) attempts to avoid planning; (2) the design of flexible plan notations; (3) theories of time-constrained planning; (4) planning by projecting and repairing faulty plans; (5) motion planning; and (6) the learning of optimal behaviors from reinforcements. More research is needed on formal semantics for robot plans.


The Timing of Bids in Internet Auctions

AI Magazine

Many bidders on eBay use bidding strategies that involve late bids, incremental bids, or both. Based on field evidence, we discuss the manner in which late bids are caused both by sophisticated, strategic reasoning and by irrationality and inexperience; the interaction of late bidding with incremental bidding; and the relation between market design and artificial agent design. Participants in internet markets can be human bidders bidding in person or artificial agents used by human bidders. Thus, the performance of market rules depends on what behavior the rules elicit from human and artificial agents. At the same time, the performance of software agents, and the decisions of bidders whether to use them, depend on how they interact with humans and other software agents in the market.