
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models

Neural Information Processing Systems

However, existing models and benchmarks are commonly tailored to specific tasks, falling short of capturing the full complexity of online shopping. Large Language Models (LLMs), with their multi-task and few-shot learning abilities, have the potential to profoundly transform online shopping by alleviating task-specific engineering efforts and by providing users with interactive conversations.


On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

Neural Information Processing Systems

Training neural networks with batch normalization and weight decay has become a common practice in recent years. In this work, we show that their combined use may result in a surprising periodic behavior of optimization dynamics: the training process regularly exhibits destabilizations that, however, do not lead to complete divergence but cause a new period of training. We rigorously investigate the mechanism underlying the discovered periodic behavior from both empirical and theoretical points of view and analyze the conditions in which it occurs in practice. We also demonstrate that periodic behavior can be regarded as a generalization of two previously opposing perspectives on training with batch normalization and weight decay, namely the equilibrium presumption and the instability presumption.
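The mechanism behind such destabilizations can be illustrated with a toy calculation (our own sketch, not the paper's experiments): for a scale-invariant loss of the kind batch normalization induces, the gradient magnitude scales as 1/||w||, so as weight decay shrinks ||w||, the effective step size grows until the update destabilizes.

```python
import numpy as np

# Toy illustration of the effective-learning-rate mechanism: for a
# scale-invariant loss f(w) = L(w / ||w||), the gradient magnitude
# scales as 1 / ||w||. Weight decay shrinks ||w||, which therefore
# amplifies the effective step size over the course of training.

def scale_invariant_loss_grad(w, target=np.array([1.0, 0.0])):
    """Gradient of f(w) = -<w/||w||, target> with respect to w."""
    n = np.linalg.norm(w)
    u = w / n
    # Jacobian of u w.r.t. w is (I - u u^T) / ||w||, so the gradient
    # of the loss shrinks as the weight norm grows (and vice versa).
    return -(target - u * (u @ target)) / n

w = np.array([3.0, 4.0])                               # ||w|| = 5
g1 = np.linalg.norm(scale_invariant_loss_grad(w))
g2 = np.linalg.norm(scale_invariant_loss_grad(2 * w))  # ||w|| = 10
# Doubling the weight norm halves the gradient magnitude.
```

Scaling the weights up halves the gradient; shrinking them (as weight decay does) correspondingly amplifies it, which is the tension the two presumptions in the abstract try to resolve.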


Behavior From the Void: Unsupervised Active Pre-Training

Neural Information Processing Systems

We introduce a new unsupervised pre-training method for reinforcement learning called APT, which stands for Active Pre-Training. APT learns behaviors and representations by actively searching for novel states in reward-free environments. The key novel idea is to explore the environment by maximizing a non-parametric entropy computed in an abstract representation space, which avoids challenging density modeling and consequently allows our approach to scale much better in environments that have high-dimensional observations (e.g., image observations). We empirically evaluate APT by exposing task-specific reward after a long unsupervised pre-training phase. In Atari games, APT achieves human-level performance on 12 games and obtains highly competitive performance compared to canonical fully supervised RL algorithms. On DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency and dramatically improves performance on tasks that are extremely difficult to train from scratch.
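The non-parametric entropy objective can be sketched as follows (an assumed simplified form based on the abstract, not the authors' implementation): each state's intrinsic reward grows with its distance to its k nearest neighbors in representation space, so isolated, novel states are rewarded more.

```python
import numpy as np

# Sketch of an APT-style intrinsic reward: visited-state entropy is
# approximated non-parametrically via distances to the k nearest
# neighbors in a representation space. Here the "representations"
# are plain vectors for illustration; APT learns them from pixels.

def apt_intrinsic_reward(z_batch, k=3):
    """Per-state reward ~ log(1 + mean distance to k nearest neighbors)."""
    diff = z_batch[:, None, :] - z_batch[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)      # pairwise Euclidean distances
    np.fill_diagonal(dists, np.inf)            # exclude self-distance
    knn = np.sort(dists, axis=1)[:, :k]        # k nearest neighbors per state
    return np.log(1.0 + knn.mean(axis=1))      # isolated states score higher

rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.05, size=(9, 4))   # densely visited region
outlier = np.full((1, 4), 5.0)                 # rarely visited state
rewards = apt_intrinsic_reward(np.vstack([cluster, outlier]))
# The outlier (index 9) receives the largest intrinsic reward.
```

Maximizing this reward drives the policy toward under-visited regions without any density model, which is what lets the approach scale to image observations.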


Reviews: Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

Neural Information Processing Systems

The paper investigates the problem of inferring an agent's belief of the system dynamics of an MDP, given demonstrations of its behavior and the reward function it was optimizing. Knowledge of this internal belief can be used for Inverse Reinforcement Learning of an unknown task in the same environment. Furthermore, given the action provided by the agent, its intended action on the true dynamics can be inferred. This allows for assistive tele-operation, by applying the intended actions to the system instead of the provided ones. The proposed method models the agent using the model derived in maximum causal entropy inverse reinforcement learning.


[100%OFF] Graph Neural Networks: Basics, Codes And Simulations For AI

#artificialintelligence

Graph AI carries immense potential for us to explore, connect the dots, and build intelligent applications using the Internet of Behaviors (IoB). Many graph neural networks have achieved state-of-the-art results on both node and graph classification tasks. However, even though GNNs have revolutionized graph representation learning, students often have only a limited understanding of the area. The purpose of this course is to cover the field from basic to cutting-edge concepts and technologies. Graphs are all around us; real-world objects are often defined in terms of their connections to other things.
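The basic building block behind those node- and graph-classification results can be sketched in a few lines (our own minimal example, not from the course): a graph-convolution layer averages each node's neighborhood features and applies a learned transformation.

```python
import numpy as np

# Minimal mean-aggregation graph convolution: H' = ReLU(D^-1 (A + I) H W).
# Each node mixes its own features with its neighbors', then applies a
# shared linear map and nonlinearity. Weights here are toy values.

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer with self-loops and mean aggregation."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)      # node degrees (incl. self)
    agg = (a_hat @ feats) / deg                 # mean over each neighborhood
    return np.maximum(agg @ weight, 0.0)        # linear map + ReLU

# Tiny 3-node path graph: 0 - 1 - 2
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.eye(3)                               # one-hot node features
weight = np.full((3, 2), 0.5)                   # toy weight matrix
out = gcn_layer(adj, feats, weight)             # (3, 2) node embeddings
```

Stacking such layers lets information propagate across multi-hop neighborhoods, which is the core idea the course builds on.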


Predicting Others' Behavior on the Road With Artificial Intelligence

#artificialintelligence

Researchers have created a machine-learning system that efficiently predicts the future trajectories of multiple road users, such as drivers, cyclists, and pedestrians, which could enable an autonomous vehicle to navigate city streets more safely. Humans may be one of the biggest roadblocks to fully autonomous vehicles operating on city streets: if a robot is going to navigate a vehicle safely through downtown Boston, it must be able to predict what nearby drivers, pedestrians, and cyclists are going to do next. A new machine-learning system may someday help driverless cars predict those next moves in real time.


Artificial Intelligence: Using Advanced Analytics to Detect Conduct and Patterns of Behavior

#artificialintelligence

Artificial intelligence (AI) adoption has been largely accepted in the legal community, as many have realized the value of technology that can detect relevant content and produce better outcomes. Incorporating AI into document review workflows or using insights to inform case strategy is transformative and drives better results. From government requests to civil litigation and internal investigations, high-profile and fast-moving matters require efficient processes. Deploying technology strategically will help teams identify key documents and themes early in the case and manage the assessment and review of data efficiently. The continued evolution of AI tools, such as the ability to detect conduct and behavior through sentiment analysis and pattern processing, will further assist with investigatory compliance but can also be used proactively.



A critique of pure learning and what artificial neural networks can learn from animal brains

#artificialintelligence

Not long after the invention of computers in the 1940s, expectations were high. Many believed that computers would soon achieve or surpass human-level intelligence. Herbert Simon, a pioneer of artificial intelligence (AI), famously predicted in 1965 that "machines will be capable, within twenty years, of doing any work a man can do"--to achieve general AI. Of course, these predictions turned out to be wildly off the mark. In the tech world today, optimism is high again.


On the Behavior of Convolutional Nets for Feature Extraction

Garcia-Gasulla, Dario, Parés, Ferran, Vilalta, Armand, Moreno, Jonatan, Ayguadé, Eduard, Labarta, Jesús, Cortés, Ulises, Suzumura, Toyotaro

Journal of Artificial Intelligence Research

Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data), and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previously learnt by the CNN after processing millions of images, without requiring an expensive training phase. Contributions to this field (commonly known as feature representation transfer or transfer learning) have been purely empirical so far, extracting all CNN features from a single layer close to the output and testing their performance by feeding them to a classifier. This approach has provided consistent results, although its relevance is limited to classification tasks. In a completely different approach, in this paper we statistically measure the discriminative power of every single feature found within a deep CNN, when used for characterizing every class of 11 datasets. We seek to provide new insights into the behavior of CNN features, particularly the ones from convolutional layers, as this can be relevant for their application to knowledge representation and reasoning. Our results confirm that low and middle level features may behave differently to high level features, but only under certain conditions. We find that all CNN features can be used for knowledge representation purposes both by their presence or by their absence, doubling the information a single CNN feature may provide. We also study how much noise these features may include, and propose a thresholding approach to discard most of it. All these insights have a direct application to the generation of CNN embedding spaces.
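The per-feature characterization the abstract describes can be sketched roughly as follows (the exact statistics and threshold are our assumptions, not the authors'): score each feature per class by its mean activation, zero out sub-threshold activations as noise, and read strong presence and consistent absence as equally informative signals.

```python
import numpy as np

# Hedged sketch of per-feature, per-class characterization: threshold
# weak activations as noise, then summarize each feature by its mean
# activation within each class. High values = characteristic presence;
# zeros = characteristic absence. Threshold value is illustrative.

def characterize_features(activations, labels, noise_threshold=0.1):
    """activations: (n_samples, n_features); labels: (n_samples,) class ids.
    Returns a (n_classes, n_features) profile of denoised mean activations."""
    denoised = np.where(activations < noise_threshold, 0.0, activations)
    classes = np.unique(labels)
    return np.stack([denoised[labels == c].mean(axis=0) for c in classes])

rng = np.random.default_rng(1)
acts = rng.uniform(0.0, 0.05, size=(8, 4))   # mostly sub-threshold noise
acts[:4, 0] = 0.9                            # feature 0 fires for class 0 only
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
profile = characterize_features(acts, labels)
# Feature 0 characterizes class 0 by presence and class 1 by absence.
```

In this toy case feature 0 is discriminative both ways, mirroring the paper's point that a single CNN feature can carry information through its presence or its absence.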