AITopics

2307.05574

Country:

Oceania > New Zealand (0.04)
North America > Bermuda (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(8 more...)

Genre:

Instructional Material (0.93)
Research Report (0.64)

Industry:

Law (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

arXiv.org Artificial IntelligenceJul-10-2023

Selective Sampling and Imitation Learning via Online Regression

Sekhari, Ayush, Sridharan, Karthik, Sun, Wen, Wu, Runzhe

We consider the problem of Imitation Learning (IL) by actively querying noisy expert for feedback. While imitation learning has been empirically successful, much of prior work assumes access to noiseless expert feedback which is not practical in many applications. In fact, when one only has access to noisy expert feedback, algorithms that rely on purely offline data (non-interactive IL) can be shown to need a prohibitively large number of samples to be successful. In contrast, in this work, we provide an interactive algorithm for IL that uses selective sampling to actively query the noisy expert for feedback. Our contributions are twofold: First, we provide a new selective sampling algorithm that works with general function classes and multiple actions, and obtains the best-known bounds for the regret and the number of queries. Next, we extend this analysis to the problem of IL with noisy expert feedback and provide a new IL algorithm that makes limited queries. Our algorithm for selective sampling leverages function approximation, and relies on an online regression oracle w.r.t.~the given model class to predict actions, and to decide whether to query the expert for its label. On the theoretical side, the regret bound of our algorithm is upper bounded by the regret of the online regression oracle, while the query complexity additionally depends on the eluder dimension of the model class. We complement this with a lower bound that demonstrates that our results are tight. We extend our selective sampling algorithm for IL with general function approximation and provide bounds on both the regret and the number of queries made to the noisy expert. A key novelty here is that our regret and query complexity bounds only depend on the number of times the optimal policy (and not the noisy expert, or the learner) go to states that have a small margin.

algorithm, artificial intelligence, machine learning, (15 more...)

2307.04998

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.45)
Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Dao, Xuan-Quy, Le, Ngoc-Bich, Phan, Xuan-Dung, Ngo, Bac-Bien

Can ChatGPT pass the Vietnamese National High School Graduation Examination?

arXiv.org Artificial IntelligenceJul-10-2023

This research article highlights the potential of AI-powered chatbots in education and presents the results of using ChatGPT, a large language model, to complete the Vietnamese National High School Graduation Examination (VNHSGE). The study dataset included 30 essays in the literature test case and 1,700 multiple-choice questions designed for other subjects. The results showed that ChatGPT was able to pass the examination with an average score of 6-7, demonstrating the technology's potential to revolutionize the educational landscape. The analysis of ChatGPT performance revealed its proficiency in a range of subjects, including mathematics, English, physics, chemistry, biology, history, geography, civic education, and literature, which suggests its potential to provide effective support for learners. However, further research is needed to assess ChatGPT performance on more complex exam questions and its potential to support learners in different contexts. As technology continues to evolve and improve, we can expect to see the use of AI tools like ChatGPT become increasingly common in educational settings, ultimately enhancing the educational experience for both students and educators.

large language model, machine learning, natural language, (19 more...)

2306.0917

Country:

North America > United States (0.14)
Asia > Vietnam (0.05)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Setting > K-12 Education > Secondary School (0.72)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-9-2023

A User Study on Explainable Online Reinforcement Learning for Adaptive Systems

Metzger, Andreas, Laufer, Jan, Feit, Felix, Pohl, Klaus

Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty. Online RL facilitates learning from actual operational data and thereby leverages feedback only available at runtime. However, Online RL requires the definition of an effective and correct reward function, which quantifies the feedback to the RL algorithm and thereby guides learning. With Deep RL gaining interest, the learned knowledge is no longer explicitly represented, but is represented as a neural network. For a human, it becomes practically impossible to relate the parametrization of the neural network to concrete RL decisions. Deep RL thus essentially appears as a black box, which severely limits the debugging of adaptive systems. We previously introduced the explainable RL technique XRL-DINE, which provides visual insights into why certain decisions were made at important time points. Here, we introduce an empirical user study involving 54 software engineers from academia and industry to assess (1) the performance of software engineers when performing different tasks using XRL-DINE and (2) the perceived usefulness and ease of use of XRL-DINE.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2307.04098

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
Europe > Spain > Andalusia > Seville Province > Seville (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)
Instructional Material > Online (0.62)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Gladkoff, Serge, Han, Lifeng, Nenadic, Goran

Student's t-Distribution: On Measuring the Inter-Rater Reliability When the Observations are Scarce

arXiv.org Artificial IntelligenceJul-9-2023

In natural language processing (NLP) we always rely on human judgement as the golden quality evaluation method. However, there has been an ongoing debate on how to better evaluate inter-rater reliability (IRR) levels for certain evaluation tasks, such as translation quality evaluation (TQE), especially when the data samples (observations) are very scarce. In this work, we first introduce the study on how to estimate the confidence interval for the measurement value when only one data (evaluation) point is available. Then, this leads to our example with two human-generated observational scores, for which, we introduce ``Student's \textit{t}-Distribution'' method and explain how to use it to measure the IRR score using only these two data points, as well as the confidence intervals (CIs) of the quality evaluation. We give quantitative analysis on how the evaluation confidence can be greatly improved by introducing more observations, even if only one extra observation. We encourage researchers to report their IRR scores in all possible means, e.g. using Student's \textit{t}-Distribution method whenever possible; thus making the NLP evaluation more meaningful, transparent, and trustworthy. This \textit{t}-Distribution method can be also used outside of NLP fields to measure IRR level for trustworthy evaluation of experimental investigations, whenever the observational data is scarce. Keywords: Inter-Rater Reliability (IRR); Scarce Observations; Confidence Intervals (CIs); Natural Language Processing (NLP); Translation Quality Evaluation (TQE); Student's \textit{t}-Distribution

artificial intelligence, evaluation, natural language, (17 more...)

2303.04526

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Health & Medicine (1.00)
Education (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Yuan, Zhenyuan, Zhu, Minghui

Lightweight Distributed Gaussian Process Regression for Online Machine Learning

arXiv.org Artificial IntelligenceJul-9-2023

In this paper, we study the problem where a group of agents aim to collaboratively learn a common static latent function through streaming data. We propose a lightweight distributed Gaussian process regression (GPR) algorithm that is cognizant of agents' limited capabilities in communication, computation and memory. Each agent independently runs agent-based GPR using local streaming data to predict test points of interest; then the agents collaboratively execute distributed GPR to obtain global predictions over a common sparse set of test points; finally, each agent fuses results from distributed GPR with agent-based GPR to refine its predictions. By quantifying the transient and steady-state performances in predictive variance and error, we show that limited inter-agent communication improves learning performances in the sense of Pareto. Monte Carlo simulation is conducted to evaluate the developed algorithm.

agg, artificial intelligence, machine learning, (18 more...)

2105.04738

Country:

North America > United States > Pennsylvania (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Massachusetts (0.04)
(8 more...)

Genre:

Research Report (0.49)
Instructional Material > Online (0.40)

Industry:

Education (0.67)
Energy (0.45)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Generative Models for Decision-Making and Control

Janner, Michael

Deep model-based reinforcement learning methods offer a conceptually simple approach to the decision-making and control problem: use learning for the purpose of estimating an approximate dynamics model, and offload the rest of the work to classical trajectory optimization. However, this combination has a number of empirical shortcomings, limiting the usefulness of model-based methods in practice. The dual purpose of this thesis is to study the reasons for these shortcomings and to propose solutions for the uncovered problems. We begin by generalizing the dynamics model itself, replacing the standard single-step formulation with a model that predicts over probabilistic latent horizons. The resulting model, trained with a generative reinterpretation of temporal difference learning, leads to infinite-horizon variants of the procedures central to model-based control, including the model rollout and model-based value estimation.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

2306.0881

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Workflow (1.00)
Research Report (1.00)
Instructional Material (0.92)

Industry: Education > Educational Setting (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations

Sun, Tong Steven, Gao, Yuyang, Khaladkar, Shubham, Liu, Sijia, Zhao, Liang, Kim, Young-Ho, Hong, Sungsoo Ray

The local explanation provides heatmaps on images to explain how Convolutional Neural Networks (CNNs) derive their output. Due to its visual straightforwardness, the method has been one of the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective about the local explanation as a valuable and indispensable envision in building CNNs versus the process that exhausts them due to the heuristic nature of detecting vulnerability. Moreover, steering the CNNs based on the vulnerability learned from the diagnosis seemed highly challenging. To mitigate the gap, we designed DeepFuse, the first interactive design that realizes the direct feedback loop between a user and CNNs in diagnosing and revising CNN's vulnerability using local explanations. DeepFuse helps CNN engineers to systemically search "unreasonable" local explanations and annotate the new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotation such that the model doesn't introduce similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants made a more accurate and "reasonable" model than the current state-of-the-art. Also, participants found the way DeepFuse guides case-based reasoning can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward to make XAI-driven insights more actionable.

artificial intelligence, deep learning, machine learning, (16 more...)

2307.04036

Country:

North America > United States > Michigan (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > Idaho > Bonneville County > Idaho Falls (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (0.92)
Research Report > New Finding (0.67)
(2 more...)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kim, Taewoon, Cochez, Michael, François-Lavet, Vincent, Neerincx, Mark, Vossen, Piek

A Machine with Short-Term, Episodic, and Semantic Memory Systems

Inspired by the cognitive science theory of the explicit human memory systems, we have modeled an agent with short-term, episodic, and semantic memory systems, each of which is modeled with a knowledge graph. To evaluate this system and analyze the behavior of this agent, we designed and released our own reinforcement learning agent environment, "the Room", where an agent has to learn how to encode, store, and retrieve memories to maximize its return by answering questions. We show that our deep Q-learning based agent successfully learns whether a short-term memory should be forgotten, or rather be stored in the episodic or semantic memory systems. Our experiments indicate that an agent with human-like memory systems can outperform an agent without this memory structure in the environment.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

doi: 10.1609/aaai.v37i1.25075

2212.02098

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
(3 more...)

Genre:

Research Report (0.50)
Instructional Material (0.48)

Industry: Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Oikawa, Hiroki, Ge, Hangli, Koshizuka, Noboru

Efficient Compressed Ratio Estimation Using Online Sequential Learning for Edge Computing

Owing to the widespread adoption of the Internet of Things, a vast amount of sensor information is being acquired in real time. Accordingly, the communication cost of data from edge devices is increasing. Compressed sensing (CS), a data compression method that can be used on edge devices, has been attracting attention as a method to reduce communication costs. In CS, estimating the appropriate compression ratio is important. There is a method to adaptively estimate the compression ratio for the acquired data using reinforcement learning (RL). However, the computational costs associated with existing RL methods that can be utilized on edges are often high. In this study, we developed an efficient RL method for edge devices, referred to as the actor--critic online sequential extreme learning machine (AC-OSELM), and a system to compress data by estimating an appropriate compression ratio on the edge using AC-OSELM. The performance of the proposed method in estimating the compression ratio is evaluated by comparing it with other RL methods for edge devices. The experimental results indicate that AC-OSELM demonstrated the same or better compression performance and faster compression ratio estimation than the existing methods.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2211.04284

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > France (0.04)

Genre:

Instructional Material > Online (0.40)
Research Report > New Finding (0.34)

Industry: Information Technology (0.49)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)