AITopics

2403.14668

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Tennessee > Shelby County > Memphis (0.05)
North America > United States > Florida > Orange County > Orlando (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Mar-4-2024

A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing

Wang, Yu

The pretrain-finetune paradigm represents a transformative approach in natural language processing (NLP). This paradigm distinguishes itself through the use of large pretrained language models, demonstrating remarkable efficiency in finetuning tasks, even with limited training data. This efficiency is especially beneficial for research in social sciences, where the number of annotated samples is often quite limited. Our tutorial offers a comprehensive introduction to the pretrain-finetune paradigm. We first delve into the fundamental concepts of pretraining and finetuning, followed by practical exercises using real-world applications. We demonstrate the application of the paradigm across various tasks, including multi-class classification and regression. Emphasizing its efficacy and user-friendliness, the tutorial aims to encourage broader adoption of this paradigm. To this end, we have provided open access to all our code and datasets. The tutorial is particularly valuable for quantitative researchers in psychology, offering them an insightful guide into this innovative approach.

language model, prediction, pretrain-finetune paradigm, (14 more...)

2403.02504

Country:

Oceania > New Zealand (0.05)
Oceania > Australia (0.04)
North America > United States (0.04)
(4 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Chernozhukov, Victor, Hansen, Christian, Kallus, Nathan, Spindler, Martin, Syrgkanis, Vasilis

Applied Causal Inference Powered by ML and AI

arXiv.org Machine Learning Mar-4-2024

This book aims to provide a working introduction to the emerging fusion of modern statistical inference, also known as machine learning (ML) or artificial intelligence (AI), and causal inference methods. The book is aimed at upper-level undergraduates and master's-level students as well as doctoral students focusing on applied empirical research. A sufficient background for the core material is one semester of introductory econometrics and one semester of machine learning. We hope the book is also useful to empirical researchers looking to apply modern methods in their work. The book provides an overview of key ideas in both predictive inference and causal inference and shows how predictive tools are key ingredients to answering many causal questions.

equation modelling and conditional exogeneity, intervention induce new counterfactual distribution, random assignment randomized controlled trial, (17 more...)

arXiv.org Machine Learning

2403.02467

Country:

North America > Canada > Ontario > Toronto (0.27)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
North America > United States > New York (0.04)
(21 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > Strength High (1.00)
(5 more...)

Industry:

Marketing (1.00)
Law (1.00)
Information Technology (1.00)
(10 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Bornschein, Jorg, Li, Yazhe, Rannen-Triki, Amal

Transformers for Supervised Online Continual Learning

Transformers have become the dominant architecture for sequence modeling tasks such as natural language processing or audio processing, and they are now even considered for tasks that are not naturally sequential, such as image classification. Their ability to attend to and process a set of tokens as context enables them to develop in-context few-shot learning abilities. However, their potential for online continual learning remains relatively unexplored. In online continual learning, a model must adapt to a non-stationary stream of data, minimizing the cumulative next-step prediction loss. We focus on the supervised online continual learning setting, where we learn a predictor $x_t \rightarrow y_t$ for a sequence of examples $(x_t, y_t)$. Inspired by the in-context learning capabilities of transformers and their connection to meta-learning, we propose a method that leverages these strengths for online continual learning. Our approach explicitly conditions a transformer on recent observations, while at the same time training it online with stochastic gradient descent, following the procedure introduced with Transformer-XL. We incorporate replay to maintain the benefits of multi-epoch training while adhering to the sequential protocol. We hypothesize that this combination enables fast adaptation through in-context learning and sustained long-term improvement via parametric learning. Our method demonstrates significant improvements over previous state-of-the-art results on CLOC, a challenging large-scale real-world benchmark for image geo-localization.
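The online protocol described in the abstract (score each example before training on it, and accumulate the next-step loss) can be sketched in a few lines. This is only an illustration of the evaluation loop, not the authors' transformer method: the learner here is a one-feature logistic regression, and the names `online_cumulative_loss` and `make_stream`, the learning rate, and the synthetic stream whose labeling rule flips halfway through are all assumptions made for the example.

```python
import math
import random

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def online_cumulative_loss(stream, lr=0.5):
    """Predict-then-update loop: each example is scored BEFORE the model trains on it."""
    w, b = 0.0, 0.0          # parameters of a 1-D logistic regression (assumed toy learner)
    total = 0.0
    for x, y in stream:
        p = sigmoid(w * x + b)
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # clip to keep the log finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        grad = p - y                           # gradient of the log loss w.r.t. the logit
        w -= lr * grad * x                     # single SGD step on the example just seen
        b -= lr * grad
    return total

def make_stream(n=400, seed=0):
    """Hypothetical non-stationary stream: the labeling rule flips at the midpoint."""
    rng = random.Random(seed)
    stream = []
    for t in range(n):
        x = rng.uniform(-1.0, 1.0)
        flipped = t >= n // 2
        stream.append((x, int((x > 0) != flipped)))
    return stream

stream = make_stream()
adaptive = online_cumulative_loss(stream, lr=0.5)  # keeps learning along the stream
frozen = online_cumulative_loss(stream, lr=0.0)    # never updates, always predicts 0.5
```

With `lr=0.0` the model always predicts 0.5, so its cumulative loss is exactly `len(stream) * ln 2`; the adaptive learner pays a burst of loss right after the flip and then recovers, which is the behavior the cumulative metric is meant to capture.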

artificial intelligence, machine learning, natural language, (16 more...)

2403.01554

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Albania > Fier County (0.04)

Genre:

Research Report (1.00)
Instructional Material > Online (1.00)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)

Exploring the Design of Generative AI in Supporting Music-based Reminiscence for Older Adults

Jin, Yucheng, Cai, Wanling, Chen, Li, Zhang, Yizhe, Doherty, Gavin, Jiang, Tonglin

Music-based reminiscence has the potential to positively impact the psychological well-being of older adults. However, the aging process and physiological changes, such as memory decline and limited verbal communication, may impede the ability of older adults to recall their memories and life experiences. Given the advanced capabilities of generative artificial intelligence (AI) systems, such as generated conversations and images, and their potential to facilitate the reminiscing process, this study aims to explore the design of generative AI to support music-based reminiscence in older adults. This study follows a user-centered design approach incorporating various stages, including detailed interviews with two social workers and two design workshops (involving ten older adults). Our work contributes to an in-depth understanding of older adults' attitudes toward utilizing generative AI for supporting music-based reminiscence and identifies concrete design considerations for the future design of generative AI to enhance the reminiscence experience of older adults.

older adult, participant, reminiscence, (13 more...)

2403.01413

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.05)
Asia > China > Hong Kong (0.05)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Personal > Interview (0.93)
Overview (0.92)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Niu, Yulei, Guo, Wenliang, Chen, Long, Lin, Xudong, Chang, Shih-Fu

We study the problem of procedure planning in instructional videos, which aims to make a goal-oriented sequence of action steps given partial visual state observations. The motivation for this problem is to learn a structured and plannable state and action space. Recent works succeeded in sequence modeling of steps with only sequence-level annotations accessible during training, which overlooked the roles of states in the procedures. In this work, we point out that State CHangEs MAtter (SCHEMA) for procedure planning in instructional videos. We aim to establish a more structured state space by investigating the causal relations between steps and states in procedures. Specifically, we explicitly represent each step as state changes and track the state changes in procedures. For step representation, we leverage the commonsense knowledge in large language models (LLMs) to describe the state changes of steps via our designed chain-of-thought prompting. For state change tracking, we align visual state observations with language state descriptions via cross-modal contrastive learning, and explicitly model the intermediate states of the procedure using LLM-generated state descriptions. Experiments on CrossTask, COIN, and NIV benchmark datasets demonstrate that our proposed SCHEMA model achieves state-of-the-art performance and obtains explainable visualizations.

Humans are natural experts in procedure planning, i.e., arranging a sequence of instruction steps to achieve a specific goal. Procedure planning is an essential and fundamental reasoning ability for embodied AI systems and is crucial in complicated real-world problems like robotic navigation (Tellex et al., 2011; Jansen, 2020; Brohan et al., 2022). Instruction steps in procedural tasks are commonly state-modifying actions that induce state changes of objects. For example, for the task of "grilling steak", a raw steak would be first topped with pepper after "seasoning the steak", then placed on the grill before "closing the lid", and become cooked pieces after "cutting the steak". These before-states and after-states reflect fine-grained information like shape, color, size, and location of entities. Therefore, the planning agents need to figure out both the temporal relations between action steps and the causal relations between steps and states.

procedure planning, state change, video, (14 more...)

2403.01599

Country:

North America > United States (0.14)
Asia > China > Hong Kong (0.04)

Genre:

Research Report (1.00)
Workflow (0.93)
Instructional Material > Course Syllabus & Notes (0.82)

Industry:

Education > Educational Technology > Audio & Video (0.92)
Education > Educational Technology > Media (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures

Lankford, Séamus

evaluation and explainable ai architecture, infrastructure rapid prototype development, language resource and evaluation conference, (14 more...)

In the current machine translation (MT) landscape, the Transformer architecture stands out as the gold standard, especially for high-resource language pairs. This research delves into its efficacy for low-resource language pairs including both the English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi language pairs. Notably, the study identifies the optimal hyperparameters and subword model type to significantly improve the translation quality of Transformer models for low-resource language pairs. The scarcity of parallel datasets for low-resource languages can hinder MT development. To address this, gaHealth was developed, the first bilingual corpus of health data for the Irish language. Focusing on the health domain, models developed using this in-domain dataset exhibited very significant improvements in BLEU score when compared with models from the LoResMT2021 Shared Task. A subsequent human evaluation using the multidimensional quality metrics error taxonomy showcased the superior performance of the Transformer system in reducing both accuracy and fluency errors compared to an RNN-based counterpart. Furthermore, this thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models. These tools considerably simplify the setup and evaluation process, making MT more accessible to both developers and translators. Notably, adaptNMT, grounded in the OpenNMT ecosystem, promotes eco-friendly natural language processing research by highlighting the environmental footprint of model development. Fine-tuning of MLLMs by adaptMLLM demonstrated advancements in translation performance for two low-resource language pairs: English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi, compared to baselines from the LoResMT2021 Shared Task.

2403.0158

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
(37 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
(2 more...)

Industry:

Information Technology > Services (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Fernandez, Nigel, Scarlatos, Alexander, Lan, Andrew

SyllabusQA: A Course Logistics Question Answering Dataset

arXiv.org Artificial Intelligence Mar-2-2024

In educational applications, artificial intelligence (AI) approaches have shown significant promise in improving learning outcomes (Aleven et al., 2016; VanLehn, 2011), by automatically providing feedback to students or engaging in tutoring dialogues with them. The key idea is to use AI to create an on-demand virtual teaching assistant to interact with many students simultaneously; see, e.g., Khanmigo from Khan Academy (Academy, 2022). These approaches can scale up the effort of expert human teachers and tutors, and relieve them from doing repetitive tasks so that they can focus on providing [...]

Moreover, text similarity metrics may not be suitable for some open-ended natural language generation tasks (Amidei et al., 2018). As an example, the answer "The final exam will be on Dec 15" has high surface-level textual similarity with the reference answer, "The final exam is on Dec 14", but contains a critical factual error that may lead to significant negative consequences to students. Meanwhile, human instructors and teaching assistants often answer student questions in a concise way, without giving any unnecessary information. Therefore, it is important for LLM-based approaches to generate answers that are both concise and precise.
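The exam-date example can be made concrete with a small sketch. It assumes a simple token-overlap F1 as the surface similarity metric, which is a stand-in chosen for illustration and not necessarily a metric used in the paper; the helper names `token_f1` and `numbers_agree` are likewise hypothetical.

```python
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """Token-level F1 overlap between two strings (a simple surface similarity)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def numbers_agree(candidate: str, reference: str) -> bool:
    """Crude factual check (assumed for this sketch): numeric tokens must match exactly."""
    def nums(text: str) -> set:
        return {tok for tok in text.split() if tok.isdigit()}
    return nums(candidate) == nums(reference)

reference = "The final exam is on Dec 14"
wrong_date = "The final exam will be on Dec 15"

score = token_f1(wrong_date, reference)          # 5 shared tokens out of 8 and 7
factual = numbers_agree(wrong_date, reference)   # the dates differ
```

Here five of the tokens are shared, so the overlap score is 2/3 even though the one token that matters to a student (the date) is wrong, and the crude numeric check flags the mismatch.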

information, question type, syllabus qa, (16 more...)

2403.14666

Country:

Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(5 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Rohani, Narjes, Rohani, Behnam, Manataki, Areti

ClickTree: A Tree-based Method for Predicting Math Students' Performance Based on Clickstream Data

arXiv.org Artificial Intelligence Mar-1-2024

The prediction of student performance and the analysis of students' learning behavior play an important role in enhancing online courses. By analysing a massive amount of clickstream data that captures student behavior, educators can gain valuable insights into the factors that influence academic outcomes and identify areas of improvement in courses. In this study, we developed ClickTree, a tree-based methodology, to predict student performance in mathematical assignments based on students' clickstream data. We extracted a set of features, including problem-level, assignment-level and student-level features, from the extensive clickstream data and trained a CatBoost tree to predict whether a student successfully answers a problem in an assignment. The developed method achieved an AUC of 0.78844 in the Educational Data Mining Cup 2023 and ranked second in the competition. Furthermore, our results indicate that students encounter more difficulties with problem types in which they must select a subset of answers from a given set, as well as with problems on the subject of Algebra II. Additionally, students who performed well in answering end-unit assignment problems engaged more with in-unit assignments and answered more problems correctly, while those who struggled had a higher tutoring request rate. The proposed method can be utilized to improve students' learning experiences, and the above insights can be integrated into mathematical courses to enhance students' learning outcomes.

In recent years, massive amounts of log data have been collected from students' interactions with online courses, providing researchers with valuable information to analyze student behavior and its impact on academic performance (Yi et al., 2018; Aljohani et al., 2019). By examining clickstream data, educators can gain deeper insights into students' study habits, navigation patterns, and levels of engagement (Wen and Rosé, 2014; Li et al., 2020; Matcha et al., 2020).
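For readers unfamiliar with the AUC figure quoted above, the following minimal sketch shows what the metric measures: the probability that a randomly chosen positive example (a problem answered correctly) receives a higher predicted score than a randomly chosen negative one. The function name `roc_auc` and the six toy labels and scores are invented for illustration and have no connection to the competition data or to the authors' CatBoost model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predictions: 1 = student answered the problem correctly.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]

auc = roc_auc(labels, scores)   # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a rank-based formulation gives the same value more efficiently on large datasets.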

assignment, clickstream data, student, (16 more...)

2403.14664

Country:

North America > United States > New York (0.04)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Valletta (0.04)
Asia > Thailand > Chiang Mai > Chiang Mai (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Mar-1-2024

VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

Wang, Ruoqi, Wang, Haitao, Luo, Qiong, Wang, Feng, Wu, Hejun

Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created on raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data to get cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from radio astronomers. Addressing this challenge, we propose VisRec, a model-agnostic semi-supervised learning approach to the reconstruction of visibility data. Specifically, VisRec consists of both a supervised learning module and an unsupervised learning module. In the supervised learning module, we introduce a set of data augmentation functions to produce diverse training examples. In comparison, the unsupervised learning module in VisRec augments unlabeled data and uses reconstructions from non-augmented visibility data as pseudo-labels for training. This hybrid approach allows VisRec to effectively leverage both labeled and unlabeled data. This way, VisRec performs well even when labeled data is scarce. Our evaluation results show that VisRec outperforms all baseline methods in reconstruction quality, robustness against common observation perturbation, and generalizability to different telescope configurations.

reconstruction, visibility data, visrec, (15 more...)

2403.00897

Country:

North America > United States > Arizona (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report > New Finding (0.88)
Instructional Material > Course Syllabus & Notes (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)