AITopics

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Neural Information Processing SystemsMar-13-2024, 10:29:15 GMT

General Stochastic Networks for Classification

We extend generative stochastic networks to supervised learning of representations. In particular, we introduce a hybrid training objective considering a generative and discriminative cost function governed by a trade-off parameter λ. We use a new variant of network training involving noise injection, i.e. walkback training, to jointly optimize multiple network layers. Neither additional regularization constraints, such as l1, l2 norms or dropout variants, nor pooling-or convolutional layers were added. Nevertheless, we are able to obtain state-of-the-art performance on the MNIST dataset, without using permutation invariant digits and outperform baseline models on sub-variants of the MNIST and rectangles dataset significantly.

bengio, gsn, neural information processing system, (13 more...)

Country: Europe > Austria > Styria > Graz (0.04)

Genre: Instructional Material (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Neural Information Processing SystemsMar-13-2024, 09:00:30 GMT

Sequential Monte Carlo for Graphical Models

We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.

partition function, sampler, smc sampler, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(7 more...)

Genre: Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Neural Information Processing SystemsMar-13-2024, 03:00:21 GMT

Deep Knowledge Tracing Chris Piech, Jonathan Huang

Knowledge tracing--where a machine models the knowledge of a student as they interact with coursework--is a well established problem in computer supported education. Though effectively modeling student knowledge would have high educational impact, the task has many inherent challenges. In this paper we explore the utility of using Recurrent Neural Networks (RNNs) to model student learning. The RNN family of models have important advantages over previous methods in that they do not require the explicit encoding of human domain knowledge, and can capture more complex representations of student knowledge.

dataset, knowledge, student, (16 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kantō > Chiba Prefecture > Chiba (0.04)

Genre:

Research Report (0.94)
Instructional Material (0.70)

Industry:

Education > Educational Setting (0.71)
Education > Educational Technology > Educational Software > Computer Based Training (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Neural Information Processing SystemsMar-13-2024, 00:45:51 GMT

Large-Scale Bayesian Multi-Label Learning via Topic-Based Label Embeddings

We present a scalable Bayesian multi-label learning model based on learning lowdimensional label embeddings. Our model assumes that each label vector is generated as a weighted combination of a set of topics (each topic being a distribution over labels), where the combination weights (i.e., the embeddings) for each label vector are conditioned on the observed feature vector. This construction, coupled with a Bernoulli-Poisson link function for each label of the binary label vector, leads to a model with a computational cost that scales in the number of positive labels in the label matrix. This makes the model particularly appealing for real-world multi-label learning problems where the label matrix is usually very massive but highly sparse. Using a data-augmentation strategy leads to full local conjugacy in our model, facilitating simple and very efficient Gibbs sampling, as well as an Expectation Maximization algorithm for inference. Also, predicting the label vector at test time does not require doing an inference for the label embeddings and can be done in closed form. We report results on several benchmark data sets, comparing our model with various state-of-the art methods.

inference, label matrix, positive label, (12 more...)

Country: Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Energy > Power Industry (0.46)
Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)

RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education

Han, Jieun, Yoo, Haneul, Myung, Junho, Kim, Minsun, Lee, Tak Yeon, Ahn, So-Yeon, Oh, Alice

The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems still remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as Foreign Language (EFL) writing courses. During the study, students engaged in dialogues with ChatGPT to revise their essays. RECIPE4U includes comprehensive records of these interactions, including conversation logs, students' intent, students' self-rated satisfaction, and students' essay edit histories. In particular, we annotate the students' utterances in RECIPE4U with 13 intention labels based on our coding schemes. We establish baseline results for two subtasks in task-oriented dialogue systems within educational contexts: intent detection and satisfaction estimation. As a foundational step, we explore student-ChatGPT interaction patterns through RECIPE4U and analyze them by focusing on students' dialogue, essay data statistics, and students' essay edits. We further illustrate potential applications of RECIPE4U dataset for enhancing the incorporation of LLMs in educational frameworks. RECIPE4U is publicly available at https://zeunie.github.io/RECIPE4U/.

chatgpt, student, utterance, (13 more...)

2403.08272

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre:

Instructional Material (1.00)
Research Report > Experimental Study (0.46)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Jacobs, Sven, Jaschke, Steffen

Evaluating the Application of Large Language Models to Generate Feedback in Programming Education

This study investigates the application of large language models, specifically GPT-4, to enhance programming education. The research outlines the design of a web application that uses GPT-4 to provide feedback on programming tasks, without giving away the solution. A web application for working on programming tasks was developed for the study and evaluated with 51 students over the course of one semester. The results show that most of the feedback generated by GPT-4 effectively addressed code errors. However, challenges with incorrect suggestions and hallucinated issues indicate the need for further improvements.

language model, student, unit test, (17 more...)

2403.09744

Country:

North America > United States > New York (0.05)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Siegen (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(4 more...)

Genre:

Instructional Material (0.94)
Research Report > New Finding (0.34)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AcademiaOS: Automating Grounded Theory Development in Qualitative Research with Large Language Models

Übellacker, Thomas

AcademiaOS is a first attempt to automate grounded theory development in qualitative research with large language models. Using recent large language models' language understanding, generation, and reasoning capabilities, AcademiaOS codes curated qualitative raw data such as interview transcripts and develops themes and dimensions to further develop a grounded theoretical model, affording novel insights. A user study (n=19) suggests that the system finds acceptance in the academic community and exhibits the potential to augment humans in qualitative research. AcademiaOS has been made open-source for others to build upon and adapt to their use cases.

academiaos, dimension, theory development, (16 more...)

2403.08844

Country:

Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
North America > United States > Ohio (0.04)
North America > Canada (0.04)
(2 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)
Overview (0.92)
Research Report > Experimental Study (0.68)

Industry:

Law (1.00)
Health & Medicine > Consumer Health (1.00)
Government (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Yildirim, Murat Onur, Yildirim, Elif Ceren Gok, Mocanu, Decebal Constantin, Vanschoren, Joaquin

FOCIL: Finetune-and-Freeze for Online Class Incremental Learning by Training Randomly Pruned Sparse Experts

Class incremental learning (CIL) in an online continual learning setting strives to acquire knowledge on a series of novel classes from a data stream, using each data point only once for training. This is more realistic compared to offline modes, where it is assumed that all data from novel class(es) is readily available. Current online CIL approaches store a subset of the previous data which creates heavy overhead costs in terms of both memory and computation, as well as privacy issues. In this paper, we propose a new online CIL approach called FOCIL. It fine-tunes the main architecture continually by training a randomly pruned sparse subnetwork for each task. Then, it freezes the trained connections to prevent forgetting. FOCIL also determines the sparsity level and learning rate per task adaptively and ensures (almost) zero forgetting across all tasks without storing any replay data. Experimental results on 10-Task CIFAR100, 20-Task CIFAR100, and 100-Task TinyImagenet, demonstrate that our method outperforms the SOTA by a large margin. The code is publicly available at https://github.com/muratonuryildirim/FOCIL.

focil, learning, online class incremental learning, (11 more...)

2403.14684

Country: Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre:

Research Report (1.00)
Instructional Material > Online (0.88)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing SystemsMar-12-2024, 19:59:07 GMT

End-to-End Kernel Learning with Supervised Convolutional Kernel Networks

In this paper, we introduce a new image representation based on a multilayer kernel machine. Unlike traditional kernel methods where data representation is decoupled from the prediction task, we learn how to shape the kernel with supervision. We proceed by first proposing improvements of the recently-introduced convolutional kernel networks (CKNs) in the context of unsupervised learning; then, we derive backpropagation rules to take advantage of labeled training data. The resulting model is a new type of convolutional neural network, where optimizing the filters at each layer is equivalent to learning a linear subspace in a reproducing kernel Hilbert space (RKHS). We show that our method achieves reasonably competitive performance for image classification on some standard "deep learning" datasets such as CIFAR-10 and SVHN, and also for image super-resolution, demonstrating the applicability of our approach to a large variety of image-related tasks.

convolutional kernel network, convolutional neural network, kernel, (11 more...)

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre: Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)