AITopics | Instructional Material


Fighting Failures with FIRE: Failure Identification to Reduce Expert Burden in Intervention-Based Learning

Ablett, Trevor, Marić, Filip, Kelly, Jonathan

arXiv.org Artificial Intelligence, Dec-8-2023

Supervised imitation learning, also known as behavioral cloning, suffers from distribution drift leading to failures during policy execution. One approach to mitigate this issue is to allow an expert to correct the agent's actions during task execution, based on the expert's determination that the agent has reached a "point of no return." The agent's policy is then retrained using this new corrective data. This approach alone can enable high-performance agents to be learned, but at a substantial cost: the expert must vigilantly observe execution until the policy reaches a specified level of success, and even at that point, there is no guarantee that the policy will always succeed. To address these limitations, we present FIRE (Failure Identification to Reduce Expert Burden in intervention-based learning), a system that can predict when a running policy will fail, halt its execution, and request a correction from the expert. Unlike existing approaches that learn only from expert data, our approach learns from both expert and non-expert data, akin to adversarial learning. We demonstrate experimentally for a series of challenging manipulation tasks that our method is able to recognize state-action pairs that lead to failures. This permits seamless integration into an intervention-based learning system, where we show an order-of-magnitude gain in sample efficiency compared with a state-of-the-art inverse reinforcement learning method and dramatically improved performance over an equivalent amount of data learned with behavioral cloning.
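The predict, halt, and request-correction loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `run_with_failure_gate`, `failure_score`, and `threshold` are hypothetical, the failure predictor is a hand-written stub rather than a learned model, and the expert's correction is simplified to a single replacement action.

```python
from typing import Callable, List, Tuple

State = float
Action = float


def run_with_failure_gate(
    policy: Callable[[State], Action],
    failure_score: Callable[[State, Action], float],  # hypothetical predictor
    expert: Callable[[State], Action],
    step: Callable[[State, Action], State],
    state: State,
    horizon: int,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[State, List[Tuple[State, Action]]]:
    """Run a policy, deferring to the expert whenever failure is predicted."""
    corrections: List[Tuple[State, Action]] = []
    for _ in range(horizon):
        action = policy(state)
        if failure_score(state, action) > threshold:
            # Predicted failure: halt the policy and request a correction.
            action = expert(state)
            corrections.append((state, action))  # data for later retraining
        state = step(state, action)
    return state, corrections


# Toy 1-D example: the policy drifts upward, the expert pushes back toward 0.
final_state, corrections = run_with_failure_gate(
    policy=lambda s: 1.0,
    failure_score=lambda s, a: 1.0 if abs(s + a) > 2.0 else 0.0,
    expert=lambda s: -1.0 if s > 0 else 1.0,
    step=lambda s, a: s + a,
    state=0.0,
    horizon=5,
)
```

In this toy run the expert is consulted only on the steps where the stub predictor fires, which is the burden-reduction idea the abstract describes: the expert does not need to watch every step.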

international conference, learning, proceedings, (16 more...)


2007.00245

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(16 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)


What's coming up at #NeurIPS2023?

AIHub, Dec-7-2023, 11:51:30 GMT

The thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023) is due to kick off on Sunday 10 December and run until Saturday 16 December. There is a bumper programme of events, including invited talks, orals, posters, tutorials, workshops, and socials, not to mention AIhub's session on science communication. There are seven invited talks this year. There will also be a total of 14 tutorials, held on Monday 11 December, in person only.

monday 11, neurips2023, workshop, (4 more...)


Genre: Instructional Material > Course Syllabus & Notes (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.61)


Testing LLM performance on the Physics GRE: some observations

Gupta, Pranav

arXiv.org Artificial Intelligence, Dec-7-2023

With the recent developments in large language models (LLMs) and their widespread availability through open source models and/or low-cost APIs, several exciting products and applications are emerging, many of which are in the field of STEM educational technology for K-12 and university students. There is a need to evaluate these powerful language models on several benchmarks, in order to understand their risks and limitations. In this short paper, we summarize and analyze the performance of Bard, a popular LLM-based conversational service made available by Google, on the standardized Physics GRE examination.

bard, language model, llm, (16 more...)


2312.04613

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre:

Research Report (0.40)
Instructional Material (0.34)

Industry:

Education > Curriculum > Subject-Specific Education (0.49)
Education > Educational Setting > Higher Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)


Digital Life Project: Autonomous 3D Characters with Social Intelligence

Cai, Zhongang, Jiang, Jianping, Qing, Zhongfei, Guo, Xinying, Zhang, Mingyuan, Lin, Zhengyu, Mei, Haiyi, Wei, Chen, Wang, Ruisi, Yin, Wanqi, Fan, Xiangyu, Du, Han, Pan, Liang, Gao, Peng, Yang, Zhitao, Gao, Yang, Li, Jiaqi, Ren, Tianxiang, Wei, Yukun, Wang, Xiaogang, Loy, Chen Change, Yang, Lei, Liu, Ziwei

arXiv.org Artificial Intelligence, Dec-7-2023

In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters that can engage in social interactions and express themselves through articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models personalities with systematic few-shot exemplars, incorporates a reflection process based on psychology principles, and emulates autonomy by initiating dialogue topics; 2) MoMat-MoGen: a text-driven motion synthesis paradigm for controlling the character's digital body. It integrates motion matching, a proven industry technique to ensure motion quality, with cutting-edge advancements in motion generation for diversity. Extensive experiments demonstrate that each module achieves state-of-the-art performance in its respective domain. Collectively, they enable virtual characters to initiate and sustain dialogues autonomously, while evolving their socio-psychological states. Concurrently, these characters can perform contextually relevant bodily movements. Additionally, a motion captioning module further allows the virtual character to recognize and appropriately respond to human players' actions. Homepage: https://digital-life-project.com/

interaction, preprint arxiv, psychological state, (15 more...)


2312.04547

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Leisure & Entertainment > Games (0.66)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.45)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(3 more...)


Unnatural Algorithms in Machine Learning

Goodbrake, Christian

arXiv.org Machine Learning, Dec-7-2023

Natural gradient descent has a remarkable property that in the small learning rate limit, it displays an invariance with respect to network reparameterizations, leading to robust training behavior even for highly covariant network parameterizations. We show that optimization algorithms with this property can be viewed as discrete approximations of natural transformations from the functor determining an optimizer's state space from the diffeomorphism group of its configuration manifold, to the functor determining that state space's tangent bundle from this group. Algorithms with this property enjoy greater efficiency when used to train poorly parameterized networks, as the network evolution they generate is approximately invariant to network reparameterizations. More specifically, the flow generated by these algorithms in the limit as the learning rate vanishes is invariant under smooth reparameterizations, the respective flows of the parameters being determined by equivariant maps. By casting this property as a natural transformation, we allow for generalizations beyond equivariance with respect to group actions; this framework can account for non-invertible maps such as projections, creating a framework for the direct comparison of training behavior across non-isomorphic network architectures, and the formal examination of limiting behavior as network size increases by considering inverse limits of these projections, should they exist. We introduce a simple method of introducing this naturality more generally and examine a number of popular machine learning training algorithms, finding that most are unnatural.
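The reparameterization invariance mentioned at the start of this abstract can be checked numerically on a toy problem. The sketch below is an illustration of the general property only, not the paper's category-theoretic construction. It assumes a unit-variance Gaussian model with mean `mu` (so the Fisher information with respect to `mu` is 1), a squared-error loss toward a hypothetical `target`, and an arbitrary reparameterization `mu = exp(w)`; all function names are made up for the example.

```python
import math


def grad_mu(mu: float, target: float) -> float:
    """Gradient of the loss 0.5 * (mu - target)**2 with respect to mu."""
    return mu - target


def natural_step_mu(mu: float, target: float, lr: float) -> float:
    """Natural gradient step in mu; the Fisher information of N(mu, 1) is 1."""
    return mu - lr * grad_mu(mu, target)


def natural_step_w(w: float, target: float, lr: float) -> float:
    """Natural gradient step in w, where mu = exp(w)."""
    mu = math.exp(w)
    jac = mu                           # d(mu)/d(w)
    grad_w = jac * grad_mu(mu, target)  # chain rule
    fisher_w = jac ** 2                 # Fisher metric pulled back to w
    return w - lr * grad_w / fisher_w


def plain_step_w(w: float, target: float, lr: float) -> float:
    """Ordinary gradient descent step in w, for comparison."""
    mu = math.exp(w)
    return w - lr * mu * grad_mu(mu, target)


mu0, target, lr = 2.0, 5.0, 1e-3
w0 = math.log(mu0)

mu_direct = natural_step_mu(mu0, target, lr)
mu_via_natural = math.exp(natural_step_w(w0, target, lr))
mu_via_plain = math.exp(plain_step_w(w0, target, lr))

natural_gap = abs(mu_via_natural - mu_direct)  # second order in lr
plain_gap = abs(mu_via_plain - mu_direct)      # first order in lr
```

With a small learning rate, the natural gradient step lands at nearly the same `mu` in either parameterization, while the plain gradient step in `w` does not. The remaining gap for the natural step shrinks quadratically with the learning rate, which is consistent with the invariance holding only in the small learning rate limit.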

artificial intelligence, machine learning, optimization problem, (15 more...)


2312.04739

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Oceania > Tonga (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > China (0.04)

Genre:

Instructional Material (0.46)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)


Wake-Sleep Consolidated Learning

Sorrenti, Amelia, Bellitto, Giovanni, Salanitri, Federica Proietto, Pennisi, Matteo, Palazzo, Simone, Spampinato, Concetto

arXiv.org Artificial Intelligence, Dec-6-2023

We propose Wake-Sleep Consolidated Learning (WSCL), a learning strategy leveraging Complementary Learning System theory and the wake-sleep phases of the human brain to improve the performance of deep neural networks for visual classification tasks in continual learning settings. Our method learns continually via the synchronization between distinct wake and sleep phases. During the wake phase, the model is exposed to sensory input and adapts its representations, ensuring stability through a dynamic parameter freezing mechanism and storing episodic memories in a short-term temporary memory (similarly to what happens in the hippocampus). During the sleep phase, the training process is split into NREM and REM stages. In the NREM stage, the model's synaptic weights are consolidated using replayed samples from the short-term and long-term memory and the synaptic plasticity mechanism is activated, strengthening important connections and weakening unimportant ones. In the REM stage, the model is exposed to previously unseen realistic visual sensory experience, and the dreaming process is activated, which enables the model to explore the potential feature space, thus preparing synapses for future knowledge. We evaluate the effectiveness of our approach on three benchmark datasets: CIFAR-10, Tiny-ImageNet and FG-ImageNet. In all cases, our method outperforms the baselines and prior work, yielding a significant performance gain on continual visual classification tasks. Furthermore, we demonstrate the usefulness of all processing stages and the importance of dreaming to enable positive forward transfer.

continual learning, international conference, learning, (15 more...)


2401.08623

Country:

North America > United States (0.14)
Europe > Italy (0.05)
Europe > Austria (0.04)
(2 more...)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Sleep (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)


Coherent Soft Imitation Learning

Watson, Joe, Huang, Sandy H., Heess, Nicolas

arXiv.org Artificial Intelligence, Dec-6-2023

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions. Choosing BC or IRL for imitation depends on the quality and state-action coverage of the demonstrations, as well as additional access to the Markov decision process. Hybrid strategies that combine BC and IRL are not common, as initial policy optimization against inaccurate rewards diminishes the benefit of pretraining the policy with BC. This work derives an imitation method that captures the strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement learning setting, we show that the behaviour-cloned policy can be used as both a shaped reward and a critic hypothesis space by inverting the regularized policy update. This coherency facilitates fine-tuning cloned policies using the reward estimate and additional interactions with the environment. This approach conveniently achieves imitation learning through initial behaviour cloning, followed by refinement via RL with online or offline data sources. The simplicity of the approach enables graceful scaling to high-dimensional and vision-based tasks, with stable learning and minimal hyperparameter tuning, in contrast to adversarial approaches. For the open-source implementation and simulation results, see https://joemwatson.github.io/csil/.
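The idea of inverting the regularized policy update to read a reward off a cloned policy can be illustrated with a small discrete-action sketch. It relies on the standard identity for KL-regularized policies, pi(a|s) proportional to prior(a|s) * exp(Q(s,a) / alpha), so that alpha * (log pi - log prior) acts as a shaped reward. The exact form used in the paper may differ; the function names, the uniform prior, and the temperature value below are assumptions made for the example.

```python
import math
from typing import List


def shaped_reward(log_pi_bc: List[float], log_prior: List[float],
                  alpha: float) -> List[float]:
    """Per-action reward from inverting the soft policy update:
    r(a) = alpha * (log pi_bc(a) - log prior(a))."""
    return [alpha * (lp - lq) for lp, lq in zip(log_pi_bc, log_prior)]


def soft_policy(log_prior: List[float], values: List[float],
                alpha: float) -> List[float]:
    """KL-regularized policy: pi(a) proportional to prior(a) * exp(value(a) / alpha)."""
    logits = [lq + v / alpha for lq, v in zip(log_prior, values)]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# One state with three actions; pi_bc stands in for a behaviour-cloned policy.
pi_bc = [0.7, 0.2, 0.1]
prior = [1 / 3, 1 / 3, 1 / 3]
alpha = 0.5  # assumed temperature

log_pi_bc = [math.log(p) for p in pi_bc]
log_prior = [math.log(p) for p in prior]

reward = shaped_reward(log_pi_bc, log_prior, alpha)
recovered = soft_policy(log_prior, reward, alpha)
```

Plugging the shaped reward back into the soft policy update returns the cloned policy itself, so the reward and the policy are mutually consistent. Actions the cloned policy prefers over the prior get positive reward, and disfavoured actions get negative reward.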

demonstration, learning, regularization, (13 more...)


2305.16498

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)


A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education

Doughty, Jacob, Wan, Zipiao, Bompelli, Anishka, Qayum, Jubahed, Wang, Taozhi, Zhang, Juran, Zheng, Yujia, Doyle, Aidan, Sridhar, Pragnya, Agarwal, Arav, Bogart, Christopher, Keylor, Eric, Kultur, Can, Savelka, Jaromir, Sakr, Majd

arXiv.org Artificial Intelligence, Dec-5-2023

There is a constant need for educators to develop and maintain effective up-to-date assessments. While there is a growing body of research in computing education on utilizing large language models (LLMs) in generation and engagement with coding exercises, the use of LLMs for generating programming MCQs has not been extensively explored. We analyzed the capability of GPT-4 to produce multiple-choice questions (MCQs) aligned with specific learning objectives (LOs) from Python programming classes in higher education. Specifically, we developed an LLM-powered (GPT-4) system for generation of MCQs from high-level course context and module-level LOs. We evaluated 651 LLM-generated and 449 human-crafted MCQs aligned to 246 LOs from 6 Python courses. We found that GPT-4 was capable of producing MCQs with clear language, a single correct choice, and high-quality distractors. We also observed that the generated MCQs appeared to be well-aligned with the LOs. Our findings can be leveraged by educators wishing to take advantage of the state-of-the-art generative models to support MCQ authoring efforts.

bloom, distractor, mcq, (16 more...)


doi: 10.1145/3636243.3636256

2312.03173

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)
Oceania > Australia > New South Wales > Sydney (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.46)
Education > Educational Setting > Higher Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Toward Energy-Efficient Massive MIMO: Graph Neural Network Precoding for Mitigating Non-Linear PA Distortion

Feys, Thomas, Van der Perre, Liesbet, Rottenberg, François

arXiv.org Artificial Intelligence, Dec-5-2023

Massive MIMO systems are typically designed assuming linear power amplifiers (PAs). However, PAs are most energy efficient close to saturation, where non-linear distortion arises. For conventional precoders, this distortion can coherently combine at user locations, limiting performance. We propose a graph neural network (GNN) to learn a mapping between channel and precoding matrices, which maximizes the sum rate affected by non-linear distortion, using a high-order polynomial PA model. In the distortion-limited regime, this GNN-based precoder outperforms zero forcing (ZF), ZF plus digital pre-distortion (DPD) and the state-of-the-art distortion-aware beamforming (DAB) precoder. At an input back-off of -3 dB, the proposed precoder increases the sum rate over ZF by 8.60 and 8.84 bits/channel use for two and four users respectively. Radiation patterns show that these gains are achieved by transmitting the non-linear distortion in non-user directions. In the four-user case, for a fixed sum rate, the total consumed power (PA and processing) of the GNN precoder is 3.24 and 1.44 times lower compared to ZF and ZF plus DPD respectively. A complexity analysis shows six orders of magnitude reduction compared to DAB precoding. This opens perspectives to operate PAs closer to saturation, which drastically increases their energy efficiency.

distortion, gnn, precoder, (16 more...)


2312.04591

Country:

South America (0.04)
North America > Central America (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Government (0.67)
Telecommunications (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Effective Backdoor Mitigation Depends on the Pre-training Objective

Verma, Sahil, Bhatt, Gantavya, Schwarzschild, Avi, Singhal, Soumye, Das, Arnav Mohanty, Shah, Chirag, Dickerson, John P, Bilmes, Jeff

arXiv.org Artificial Intelligence, Dec-5-2023

Despite the advanced capabilities of contemporary machine learning (ML) models, they remain vulnerable to adversarial and backdoor attacks. This vulnerability is particularly concerning in real-world deployments, where compromised models may exhibit unpredictable behavior in critical scenarios. Such risks are heightened by the prevalent practice of collecting massive, internet-sourced datasets for pre-training multimodal models, as these datasets may harbor backdoors. Various techniques have been proposed to mitigate the effects of backdooring in these models, such as CleanCLIP, which is the current state-of-the-art approach. In this work, we demonstrate that the efficacy of CleanCLIP in mitigating backdoors is highly dependent on the particular objective used during model pre-training. We observe that stronger pre-training objectives correlate with harder-to-remove backdoor behaviors. We show this by training multimodal models on two large datasets consisting of 3 million (CC3M) and 6 million (CC6M) datapoints, under various pre-training objectives, followed by poison removal using CleanCLIP. We find that CleanCLIP is ineffective when stronger pre-training objectives are used, even with extensive hyperparameter tuning. Our findings underscore critical considerations for ML practitioners who pre-train models using large-scale web-curated data and are concerned about potential backdoor threats. Notably, our results suggest that simpler pre-training objectives are more amenable to effective backdoor removal. This insight is pivotal for practitioners seeking to balance the trade-offs between using stronger pre-training objectives and security against backdoor attacks.

accuracy, dataset, hyperparam, (12 more...)


2311.14948

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Ukraine > Crimea > Sevastopol (0.04)
Asia > Nepal (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)
Research Report > Promising Solution (0.89)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
