AITopics | Instructional Material

Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

arXiv.org Artificial Intelligence, Feb-23-2024

Steering the behavior of a strong model pre-trained on internet-scale data can be difficult due to the scarcity of competent supervisors. Recent studies reveal that, despite supervisory noise, a strong student model may surpass its weak teacher when fine-tuned on specific objectives. Yet, the effectiveness of such weak-to-strong generalization remains limited, especially in the presence of large capability gaps. In this paper, we propose to address this challenge by harnessing a diverse set of specialized teachers, instead of a single generalist one, that collectively supervise the strong student. Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision: (i) we progressively alternate student training and teacher assignment, leveraging the growth of the strong student to identify plausible supervision; (ii) we conservatively enforce teacher-student and local-global consistency, leveraging their dependencies to reject potential annotation noise. We validate the proposed method through visual recognition tasks on the OpenAI weak-to-strong benchmark and additional multi-domain datasets. Our code is available at https://github.com/yuejiangliu/csl.
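
The two components named in the abstract (teacher assignment and consistency-based rejection) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `assign_teacher` and `filter_labels`, the confidence threshold, and the rule "reject a label only when the student confidently disagrees" are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def assign_teacher(teacher_labels: Sequence[int], student_probs: Sequence[float]) -> int:
    """Pick the teacher whose proposed label the current student finds most plausible.

    teacher_labels[t] is the class proposed by specialist teacher t for one sample;
    student_probs is the student's class distribution for that same sample.
    (Hypothetical assignment rule, assumed for this sketch.)
    """
    return max(range(len(teacher_labels)), key=lambda t: student_probs[teacher_labels[t]])


def filter_labels(
    labels: Sequence[int],
    student_probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[Optional[int]]:
    """Keep a teacher label unless the student confidently predicts another class."""
    kept: List[Optional[int]] = []
    for label, probs in zip(labels, student_probs):
        student_label = max(range(len(probs)), key=lambda i: probs[i])
        confident = probs[student_label] >= threshold
        # None marks a sample whose label is rejected as likely annotation noise.
        kept.append(None if confident and student_label != label else label)
    return kept
```

In an alternating scheme, these two steps would be rerun after each round of student training, so that a stronger student yields better assignments and stricter filtering.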

student, supervisor, weak supervisor, (14 more...)

arXiv.org Artificial Intelligence

2402.15505

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Build your own chatbots and more with this bundle -- 125 off this week

PCWorld, Feb-22-2024, 10:00:00 GMT

You've probably heard of ChatGPT by now, and heard about some of the impacts that artificial intelligence is having on the working world. AI is shaping the future, but humans still shape AI, and if you want to know how, then The Ultimate AI ChatGPT & Python Programming Bundle is for you. This bundle includes 14 courses from some of the web's top instructors, including John Elder (4.4/5-star instructor rating), Hugo Ferro (4.6/5-star rating), and Dr. Chris Mall (4.4/5-star rating). Through these courses, you'll learn how to use Python (and its various libraries) for machine learning, data science, and more. You'll also get a crash course in how to build your own chatbots that use ChatGPT to answer queries.

5-star rating, chatgpt & python programming bundle, ultimate ai chatgpt, (2 more...)

PCWorld

Genre: Instructional Material > Course Syllabus & Notes (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education

Yang, Rui, Yang, Boming, Ouyang, Sixun, She, Tianwei, Feng, Aosong, Jiang, Yuang, Lecue, Freddy, Lu, Jinghui, Li, Irene

arXiv.org Artificial Intelligence, Feb-22-2024

In the domain of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated promise in text-generation tasks. However, their educational applications, particularly for domain-specific queries, remain underexplored. This study investigates LLMs' capabilities in educational scenarios, focusing on concept graph recovery and question-answering (QA). We assess LLMs' zero-shot performance in creating domain-specific concept graphs and introduce TutorQA, a new expert-verified NLP-focused benchmark for scientific graph reasoning and QA. TutorQA consists of five tasks with 500 QA pairs. To tackle TutorQA queries, we present CGLLM, a pipeline integrating concept graphs with LLMs for answering diverse questions. Our results indicate that LLMs' zero-shot concept graph recovery is competitive with supervised methods, showing an average 3% F1 score improvement. In TutorQA tasks, LLMs achieve up to 26% F1 score enhancement. Moreover, human evaluation and analysis show that CGLLM generates answers with more fine-grained concepts.
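
The F1 figures quoted for concept graph recovery can be made concrete with a small sketch. It assumes that a concept graph is scored as a set of directed prerequisite edges and that F1 is computed over exact edge matches; the abstract does not state the evaluation protocol, so `edge_f1` and the example edges are illustrative only.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str]  # (prerequisite concept, dependent concept); assumed representation


def edge_f1(predicted: Iterable[Edge], gold: Iterable[Edge]) -> float:
    """F1 score between predicted and gold edge sets, using exact edge matches."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or an empty gold set
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


gold_edges = [("tokenization", "language models"), ("language models", "transformers"),
              ("transformers", "BERT")]
predicted_edges = [("tokenization", "language models"), ("language models", "transformers"),
                   ("parsing", "BERT"), ("BERT", "GPT")]
score = edge_f1(predicted_edges, gold_edges)  # precision 2/4, recall 2/3
```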

concept graph, graph, project description, (16 more...)

arXiv.org Artificial Intelligence

2402.14293

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QACP: An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners

Xiao, Rui, Han, Lu, Zhou, Xiaoying, Wang, Jiong, Zong, Na, Zhang, Pengyu

arXiv.org Artificial Intelligence, Feb-22-2024

In online learning platforms, particularly in rapidly growing computer programming courses, addressing thousands of students' learning queries incurs considerable human cost. The creation of intelligent assistant large language models (LLMs) tailored for programming education necessitates distinct data support. However, in real application scenarios, the data resources for training such LLMs are relatively scarce. Therefore, to address the data scarcity in intelligent educational systems for programming, this paper proposes a new Chinese question-and-answer dataset for Python learners. To ensure the authenticity and reliability of the sources of the questions, we collected questions actually asked by students and categorized them along various dimensions, such as the type of question and the type of learner. This annotation principle is designed to enhance the effectiveness and quality of online programming education, providing a solid data foundation for developing programming teaching assistants (TAs). Furthermore, we conducted comprehensive evaluations of various LLMs proficient in processing and generating Chinese content, highlighting the potential limitations of general LLMs as intelligent teaching assistants in computer programming courses.

language model, learner, llm, (17 more...)

arXiv.org Artificial Intelligence

2402.07913

Genre:

Instructional Material > Online (0.68)
Research Report > New Finding (0.47)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Education > Educational Setting > Higher Education (0.50)
Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Nonsmooth Nonparametric Regression via Fractional Laplacian Eigenmaps

Shi, Zhaoyang, Balasubramanian, Krishnakumar, Polonik, Wolfgang

arXiv.org Machine Learning, Feb-22-2024

Laplacian based nonparametric regression is a widely used approach in machine learning that leverages the Laplacian Eigenmaps algorithm to perform regression tasks without relying on explicit parametric models. The nonparametric nature of the approach makes it flexible and adaptable to data generating process without imposing strict assumptions about the functional form of the relationship between the response and the covariates. Existing theoretical studies of this approach are restricted to establishing minimax rates of convergence and adaptivity properties when the true regression function lies in Sobolev spaces; see Section 1.1 for details. Such spaces are inherently smooth in nature and exclude important function classes in nonparametric statistics, such as piecewise constant or polynomial functions, bump functions and other such nonsmooth function classes. In this work, using the framework of fractional Laplacians, we propose a novel approach called Principal Component Regression using Fractional Laplacian Eigenmaps (PCR-FLE) for nonsmooth and nonparametric regression. The PCR-FLE algorithm generalizes the PCR-LE algorithm by Green et al. (2023) and the PCR-WLE algorithm by Shi et al. (2024), and is designed to naturally handle the case when the true regression function lies in an L

fractional sobolev space, laplacian, sobolev space, (16 more...)

arXiv.org Machine Learning

2402.14985

Country:

North America > United States > California > Yolo County > Davis (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.84)
Instructional Material > Course Syllabus & Notes (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

How to speak to the public about AI – a Michael Wooldridge talk at #AAAI2024 today

AIHub, Feb-21-2024, 14:22:14 GMT

After being struck down by Covid, our Managing Editor Lucy Smith has been unable to make the journey to AAAI 2024 to deliver the AIhub training session on science communication for AI researchers. However, Professor Michael Wooldridge has very kindly stepped in, and will present on "How to speak to the public about AI". In this talk, Michael will elaborate on 14 lessons that he has learnt during his time communicating about AI. His vast experience in speaking about AI has ranged from media interviews, to advising the UK Government, to giving the prestigious Royal Institution Christmas Lectures. The talk will take place today (Wednesday 21 February) from 14:00 – 15:00 in room 113, Vancouver Convention Centre.

michael wooldridge talk

AIHub

Genre: Instructional Material (0.65)

Technology: Information Technology > Artificial Intelligence (1.00)

Interview with Célian Ringwald: Natural language processing and knowledge graphs

AIHub, Feb-20-2024, 12:39:24 GMT

The AAAI/SIGAI Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. This year, 30 students have been selected for this programme, and we'll be hearing from them over the course of the next few months. In this interview, Célian Ringwald tells us about his work on natural language processing and knowledge graphs. I am a PhD student at the Université Côte d'Azur in Inria, the French Institute in Research in AI. I am part of the Wimmics team, a research group bridging formal semantics and social semantics on the web.

extraction, knowledge graph, language processing and knowledge graph, (12 more...)

AIHub

Country:

Europe > France > Provence-Alpes-Côte d'Azur (0.25)
Europe > Belgium (0.05)

Genre:

Personal > Interview (0.35)
Instructional Material > Course Syllabus & Notes (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)

User Modeling and User Profiling: A Comprehensive Survey

Purificato, Erasmo, Boratto, Ludovico, De Luca, Ernesto William

arXiv.org Artificial Intelligence, Feb-20-2024

The integration of artificial intelligence (AI) into daily life, particularly through information retrieval and recommender systems, has necessitated advanced user modeling and profiling techniques to deliver personalized experiences. These techniques aim to construct accurate user representations based on the rich amounts of data generated through interactions with these systems. This paper presents a comprehensive survey of the current state, evolution, and future directions of user modeling and profiling research. We provide a historical overview, tracing the development from early stereotype models to the latest deep learning techniques, and propose a novel taxonomy that encompasses all active topics in this research area, including recent trends. Our survey highlights the paradigm shifts towards more sophisticated user profiling methods, emphasizing implicit data collection, multi-behavior modeling, and the integration of graph data structures. We also address the critical need for privacy-preserving techniques and the push towards explainability and fairness in user modeling approaches. By examining the definitions of core terminology, we aim to clarify ambiguities and foster a clearer understanding of the field by proposing two novel encyclopedic definitions of the main terms. Furthermore, we explore the application of user modeling in various domains, such as fake news detection, cybersecurity, and personalized education. This survey serves as a comprehensive resource for researchers and practitioners, offering insights into the evolution of user modeling and profiling and guiding the development of more personalized, ethical, and effective AI systems.

13th international conference, fifteenth acm international conference, ieee international conference, (17 more...)

arXiv.org Artificial Intelligence

2402.0966

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
North America > United States > California > San Francisco County > San Francisco (0.13)
(42 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material > Online (0.92)
(2 more...)

Industry:

Media (1.00)
Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications > Social Media (1.00)
(14 more...)

Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals

Wu, Te-Lin, Spangher, Alex, Alipoormolabashi, Pegah, Freedman, Marjorie, Weischedel, Ralph, Peng, Nanyun

arXiv.org Artificial Intelligence, Feb-20-2024

The ability to sequence unordered events is an essential skill for comprehending and reasoning about real-world task procedures. It often requires a thorough understanding of temporal common sense and multimodal information, as these procedures are often communicated through a combination of texts and images. Such capability is essential for applications such as sequential task planning and multi-source instruction summarization. While humans are capable of reasoning about and sequencing unordered multimodal procedural instructions, whether current machine learning models have such essential capability is still an open question. In this work, we benchmark models' capability of reasoning over and sequencing unordered multimodal instructions by curating datasets from popular online instructional manuals and collecting comprehensive human annotations. We find models not only perform significantly worse than humans but also seem incapable of efficiently utilizing the multimodal information. To improve machines' performance on multimodal event sequencing, we propose sequentiality-aware pretraining techniques that exploit the sequential alignment properties of both texts and images, resulting in significant improvements of more than 5%.

category, dataset, wikihow, (16 more...)

arXiv.org Artificial Intelligence

2110.08486

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China (0.04)
North America > United States > New York (0.04)
Asia > Macao (0.04)

Genre:

Research Report (0.82)
Instructional Material > Training Manual (0.60)

Industry:

Education > Educational Setting > Online (0.88)
Education > Educational Technology > Educational Software > Computer Based Training (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Li, Haoran, Dong, Qingxiu, Tang, Zhengyang, Wang, Chaojun, Zhang, Xingxing, Huang, Haoyang, Huang, Shaohan, Huang, Xiaolong, Huang, Zeqiang, Zhang, Dongdong, Gu, Yuxian, Cheng, Xin, Wang, Xun, Chen, Si-Qing, Dong, Li, Lu, Wei, Sui, Zhifang, Wang, Benyou, Lam, Wai, Wei, Furu

arXiv.org Artificial Intelligence, Feb-20-2024

We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs). Unlike prior work that relies on seed examples or existing datasets to construct instruction tuning data, GLAN exclusively utilizes a pre-curated taxonomy of human knowledge and capabilities as input and generates large-scale synthetic instruction data across all disciplines. Specifically, inspired by the systematic structure of the human education system, we build the taxonomy semi-automatically, with the help of LLMs, by decomposing human knowledge and capabilities into various fields, sub-fields and, ultimately, distinct disciplines. Subsequently, we generate a comprehensive list of subjects for every discipline and proceed to design a syllabus tailored to each subject, again utilizing LLMs. With the fine-grained key concepts detailed in every class session of the syllabus, we are able to generate diverse instructions with broad coverage across the entire spectrum of human knowledge and skills. Extensive experiments on large language models (e.g., Mistral) demonstrate that GLAN excels in multiple dimensions, from mathematical reasoning, coding, academic exams, and logical reasoning to general instruction following, without using task-specific training data for these tasks. In addition, GLAN allows for easy customization: new fields or skills can be added by simply incorporating a new node into the taxonomy.
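
The top-down flow described in this abstract (taxonomy, then subjects, then syllabus concepts, then instructions) can be sketched as nested generation calls. This is a minimal illustration, not GLAN's code: the `LLM` callable type, the prompt wording, the record fields, and the deterministic `fake_llm` stand-in are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a function that maps a prompt to a list of generated items.
LLM = Callable[[str], List[str]]


def build_instructions(taxonomy: Dict[str, List[str]], llm: LLM) -> List[Dict[str, str]]:
    """Walk field -> discipline -> subject -> key concept and collect instructions."""
    records: List[Dict[str, str]] = []
    for field, disciplines in taxonomy.items():
        for discipline in disciplines:
            for subject in llm(f"List subjects for the discipline: {discipline}"):
                for concept in llm(f"List key concepts from a syllabus for: {subject}"):
                    for instruction in llm(f"Write an instruction exercising: {concept}"):
                        records.append({
                            "field": field,
                            "discipline": discipline,
                            "subject": subject,
                            "concept": concept,
                            "instruction": instruction,
                        })
    return records


def fake_llm(prompt: str) -> List[str]:
    """Deterministic stand-in for a real model: returns two items per prompt."""
    topic = prompt.split(": ", 1)[1]
    return [f"{topic}/1", f"{topic}/2"]


records = build_instructions({"Natural sciences": ["Physics"]}, fake_llm)
```

Adding a discipline to the `taxonomy` dictionary is enough to extend coverage in this sketch, which mirrors the customization property the abstract claims.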

arxiv preprint arxiv, class session, instruction, (14 more...)

arXiv.org Artificial Intelligence

2402.13064

Country:

Asia > China > Hong Kong (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.48)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Energy (1.00)
Education > Educational Setting (0.93)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
