AITopics | educational value


educational value


Improving Romanian LLM Pretraining Data using Diversity and Quality Filtering

Negoita, Vlad, Masala, Mihai, Rebedea, Traian

arXiv.org Artificial Intelligence | Nov-4-2025

Large Language Models (LLMs) have recently exploded in popularity, often matching or outperforming human abilities on many tasks. One of the key factors in training LLMs is the availability and curation of high-quality data. Data quality is especially crucial for under-represented languages, where high-quality corpora are scarce. In this work we study the characteristics and coverage of Romanian pretraining corpora and we examine how they differ from English data. By training a lightweight multitask model on carefully LLM-annotated Romanian texts, we are able to analyze and perform multi-level filtering (e.g., educational value, topic, format) to generate high-quality pretraining datasets. Our experiments show noteworthy trends in the topics present in Romanian and English data, while also proving the effectiveness of filtering data through improved LLM pretraining performance across multiple benchmarks.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.0109

Country: North America > Mexico (0.28)

Genre: Research Report (0.64)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)


Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Ali, Mehdi, Brack, Manuel, Lübbering, Max, Wendt, Elias, Khan, Abbas Goher, Rutmann, Richard, Jude, Alex, Kraus, Maurice, Weber, Alexander Arno, Kaczér, David, Mai, Florian, Flek, Lucie, Sifa, Rafet, Flores-Herr, Nicolas, Köhler, Joachim, Schramowski, Patrick, Fromm, Michael, Kersting, Kristian

arXiv.org Artificial Intelligence | Jun-3-2025

High-quality multilingual training data is essential for effectively pretraining large language models (LLMs). Yet, the availability of suitable open-source multilingual datasets remains limited. Existing state-of-the-art datasets mostly rely on heuristic filtering methods, restricting both their cross-lingual transferability and scalability. Here, we introduce JQL, a systematic approach that efficiently curates diverse and high-quality multilingual data at scale while significantly reducing computational demands. JQL distills LLMs' annotation capabilities into lightweight annotators based on pretrained multilingual embeddings. These models exhibit robust multilingual and cross-lingual performance, even for languages and scripts unseen during training. Evaluated empirically across 35 languages, the resulting annotation pipeline substantially outperforms current heuristic filtering methods like Fineweb2. JQL notably enhances downstream model training quality and increases data retention rates. Our research provides practical insights and valuable resources for multilingual data curation, raising the standards of multilingual dataset development.

annotator, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.22232

Country:

North America > United States (0.46)
Europe > Germany (0.28)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)


ConQuer: A Framework for Concept-Based Quiz Generation

Fu, Yicheng, Wang, Zikui, Yang, Liuxin, Huo, Meiqing, Dai, Zhongdongming

arXiv.org Artificial Intelligence | Mar-18-2025

Quizzes play a crucial role in education by reinforcing students' understanding of key concepts and encouraging self-directed exploration. However, compiling high-quality quizzes can be challenging and require deep expertise and insight into specific subject matter. Although LLMs have greatly enhanced the efficiency of quiz generation, concerns remain regarding the quality of these AI-generated quizzes and their educational impact on students. To address these issues, we introduce ConQuer, a concept-based quiz generation framework that leverages external knowledge sources. We employ comprehensive evaluation dimensions to assess the quality of the generated quizzes, using LLMs as judges. Our experiment results demonstrate a 4.8% improvement in evaluation scores and a 77.52% win rate in pairwise comparisons against baseline quiz sets. Ablation studies further underscore the effectiveness of each component in our framework. Code available at https://github.com/sofyc/ConQuer.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.14662

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
(5 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


The best STEM toys in 2024 for kids of all ages

STEM toys are a fantastic way to combine fun and learning, sparking curiosity and creativity in children of all ages. Whether shopping for science gifts for kids or searching for activities that teach valuable skills, these toys offer engaging ways to explore science, technology, engineering, and math. From hands-on activities like the GraviTrax JUNIOR Starter-Set for younger kids to the advanced challenges of the LEGO Technic NASA Mars Rover Perseverance for teens, the best STEM toys offer something for every age and interest. STEM toys can range from really high-tech gadgets to simple, hands-on activities like marble runs and model rocket kits to even more traditional options like classic paper toys. These toys are designed to build essential skills in science, technology, engineering, and math, providing fun ways to learn while encouraging problem-solving and creativity.

creativity, engineering, stem toy, (10 more...)

Popular Science

Industry:

Education (1.00)
Government > Space Agency (0.35)

Technology: Information Technology > Artificial Intelligence > Robots (0.52)


The best robot kits for kids in 2024

Popular Science | Oct-7-2024, 14:00:00 GMT

Building a robot at home is more than just a fun activity--it's a hands-on way to explore the exciting world of STEM [Science, Technology, Engineering, and Math]. Whether you're searching for a children's toy robot to inspire curiosity or a more advanced robot-building kit for older kids or teens, like our best overall Sillbird STEM 12-in-1 Education Solar Robot Toy, the best robot kits offer options for all ages and skill levels. Robot building kits offer a perfect blend of creativity and learning, teaching essential skills like coding, problem-solving, and engineering through play. From preschool-friendly robot toys to beginner robotics kits for older children, these sets provide a fantastic introduction to the basics of robotics.

artificial intelligence, robot, robot kit, (17 more...)

Popular Science

Genre: Instructional Material (0.70)

Industry:

Education (1.00)
Energy > Renewable (0.71)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)


Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models

Zhang, Peiyi, Zhang, Yazhou, Wang, Bo, Rong, Lu, Qin, Jing

arXiv.org Artificial Intelligence | Sep-19-2024

With the recent evolution of large language models (LLMs), concerns about aligning such models with human values have grown. Previous research has primarily focused on assessing LLMs' performance in terms of the Helpful, Honest, Harmless (3H) basic principles, while often overlooking their alignment with educational values in the Chinese context. To fill this gap, we present Edu-Values, the first Chinese education values evaluation benchmark designed to measure LLMs' alignment ability across seven dimensions: professional ideology, cultural literacy, educational knowledge and skills, education laws and regulations, teachers' professional ethics, basic competencies, and subject knowledge. We meticulously design and compile 1,418 questions, including multiple-choice, multi-modal question answering, subjective analysis, adversarial prompts, and questions on traditional Chinese culture. We conduct both human evaluation and automatic evaluation over 11 state-of-the-art (SoTA) LLMs, and highlight three main findings: (1) due to differences in educational culture, Chinese LLMs significantly outperform English LLMs, with Qwen 2 ranking first with a score of 81.37; (2) LLMs perform well in subject knowledge and teaching skills but struggle with teachers' professional ethics and basic competencies; (3) LLMs excel at multiple-choice questions but perform poorly on subjective analysis and multi-modal tasks. This demonstrates the effectiveness and potential of the proposed benchmark. Our dataset is available at https://github.com/zhangpeii/Edu-Values.git.

dimension, llm, multimodal question, (14 more...)

arXiv.org Artificial Intelligence

2409.12739

Country:

North America > United States (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Hong Kong (0.04)

Genre: Instructional Material (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)


AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts

Zhang, Yifan, Luo, Yifan, Yuan, Yang, Yao, Andrew Chi-Chih

arXiv.org Artificial Intelligence | Feb-12-2024

To improve language models' proficiency in mathematical reasoning via continual pretraining, we introduce a novel strategy that leverages base language models for autonomous data selection. Departing from conventional supervised fine-tuning or trained classifiers with human-annotated data, our approach utilizes meta-prompted language models as zero-shot verifiers to autonomously evaluate and select high-quality mathematical content, and we release the curated open-source AutoMathText dataset encompassing over 200GB of data. To demonstrate the efficacy of our method, we continuously pretrained a 7B-parameter Mistral language model on the AutoMathText dataset, achieving substantial improvements in downstream performance on the MATH dataset with a token amount reduced by orders of magnitude compared to previous continuous pretraining works. Our method showcases a 2 times increase in pretraining token efficiency compared to baselines, underscoring the potential of our approach in enhancing models' mathematical reasoning capabilities. The AutoMathText dataset is available at https://huggingface.co/datasets/math-ai/AutoMathText. The code is available at https://github.com/yifanzhang-pro/AutoMathText.

arxiv preprint arxiv, dataset, language model, (13 more...)

arXiv.org Artificial Intelligence

2402.07625

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems

Murakhovs'ka, Lidiya, Laban, Philippe, Xie, Tian, Xiong, Caiming, Wu, Chien-Sheng

arXiv.org Artificial Intelligence | Oct-26-2023

Making big purchases requires consumers to research or consult a salesperson to gain domain expertise. However, existing conversational recommender systems (CRS) often overlook users' lack of background knowledge, focusing solely on gathering preferences. In this work, we define a new problem space for conversational agents that aim to provide both product recommendations and educational value through mixed-type mixed-initiative dialog. We introduce SalesOps, a framework that facilitates the simulation and evaluation of such systems by leveraging recent advancements in large language models (LLMs). We build SalesBot and ShopperBot, a pair of LLM-powered agents that can simulate either side of the framework. A comprehensive human study compares SalesBot against professional salespeople, revealing that although SalesBot approaches professional performance in terms of fluency and informativeness, it lags behind in recommendation quality. We emphasize the distinct limitations both face in providing truthful information, highlighting the challenges of ensuring faithfulness in the CRS context. We release our code and make all data available.

conversational recommender system, educational value, salespeople vs salesbot, (1 more...)

arXiv.org Artificial Intelligence

2310.17749

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.89)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)


Will AI Destroy Education?

Communications of the ACM | Dec-18-2021, 09:15:05 GMT

Artificial intelligence is everywhere these days. The National AI Initiative Act became law in the U.S. on Jan. 1, 2021, aiming "to accelerate AI research and application for the Nation's economic prosperity and national security." The U.S. National Science Foundation launched in 2020 several AI Research Institutes to push forward the frontiers of artificial intelligence. One of the themes of this research initiative is "AI-Augmented Learning." This quest to improve education via technology reminds me of "Profession," a 1957 science-fiction story by Isaac Asimov.

ai destroy education, mooc, student, (10 more...)

Communications of the ACM

Country: North America > United States > Texas > Harris County > Houston (0.05)

Genre: Instructional Material > Online (0.59)

Industry:

Education > Educational Setting > Online (0.60)
Education > Educational Setting > Higher Education (0.54)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)

Technology:

Information Technology > Artificial Intelligence > Science Fiction (0.56)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.40)
Information Technology > Communications > Social Media (0.34)


Learning about Machine Learning: An Extended Assignment to Classify Twitter Accounts

Mustafaraj, Eni (Wellesley College) | Anderson, Scott D. (Wellesley College)

AAAI Conferences | May-18-2011

We describe a four-week series of assignments in an undergraduate AI course at a liberal arts college developing a supervised learning solution to the problem of classifying Twitter accounts as either a person account or a non-person account (e.g. organization or spambot). This problem employs real data in an ongoing research project by the first author, yet is accessible to students with limited programming expertise. The students were able to experience a complete cycle of creating a machine learning solution: exploring raw data, creating a training set, engineering features, comparing different classifiers, evaluating the results, and performing error analysis. We received positive feedback from the students and intend to refine the assignment and make it available (together with the created training data) for use by the research community.

classifier, student, tweet, (15 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Country: North America > United States > Massachusetts > Norfolk County > Wellesley (0.04)

Genre: Instructional Material (0.46)

Industry:

Information Technology > Services (1.00)
Education (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.47)
