educational value
Improving Romanian LLM Pretraining Data using Diversity and Quality Filtering
Negoita, Vlad, Masala, Mihai, Rebedea, Traian
Large Language Models (LLMs) have recently exploded in popularity, often matching or outperforming human abilities on many tasks. One of the key factors in training LLMs is the availability and curation of high-quality data. Data quality is especially crucial for under-represented languages, where high-quality corpora are scarce. In this work we study the characteristics and coverage of Romanian pretraining corpora and we examine how they differ from English data. By training a lightweight multitask model on carefully LLM-annotated Romanian texts, we are able to analyze and perform multi-level filtering (e.g., educational value, topic, format) to generate high-quality pretraining datasets. Our experiments show noteworthy trends in the topics present in Romanian and English data, while also proving the effectiveness of filtering data through improved LLM pretraining performance across multiple benchmarks.
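As a rough illustration of the multi-level filtering described in this abstract, the sketch below keeps a document only if its predicted annotations pass every filter level. The field names, label values, score scale, and threshold are all hypothetical; the abstract does not specify them, and this is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical per-document predictions from a multitask annotator."""
    educational_value: int  # assumed scale: 0 (none) to 5 (high)
    topic: str
    format: str


def keep(doc: Annotation,
         min_edu: int = 3,
         blocked_topics: frozenset = frozenset({"adult"}),
         blocked_formats: frozenset = frozenset({"spam"})) -> bool:
    """Return True only if the document passes every filter level."""
    return (doc.educational_value >= min_edu
            and doc.topic not in blocked_topics
            and doc.format not in blocked_formats)


# Toy annotations; real ones would come from the trained annotator.
docs = [
    Annotation(4, "science", "article"),  # passes all three filters
    Annotation(1, "science", "article"),  # educational value too low
    Annotation(5, "adult", "article"),    # blocked topic
    Annotation(4, "history", "spam"),     # blocked format
]
kept = [d for d in docs if keep(d)]
```

Combining the levels with a logical AND is only one possible design; a weighted score over the levels would be an equally plausible choice.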
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
Ali, Mehdi, Brack, Manuel, Lübbering, Max, Wendt, Elias, Khan, Abbas Goher, Rutmann, Richard, Jude, Alex, Kraus, Maurice, Weber, Alexander Arno, Kaczér, David, Mai, Florian, Flek, Lucie, Sifa, Rafet, Flores-Herr, Nicolas, Köhler, Joachim, Schramowski, Patrick, Fromm, Michael, Kersting, Kristian
High-quality multilingual training data is essential for effectively pretraining large language models (LLMs). Yet, the availability of suitable open-source multilingual datasets remains limited. Existing state-of-the-art datasets mostly rely on heuristic filtering methods, restricting both their cross-lingual transferability and scalability. Here, we introduce JQL, a systematic approach that efficiently curates diverse and high-quality multilingual data at scale while significantly reducing computational demands. JQL distills LLMs' annotation capabilities into lightweight annotators based on pretrained multilingual embeddings. These models exhibit robust multilingual and cross-lingual performance, even for languages and scripts unseen during training. Evaluated empirically across 35 languages, the resulting annotation pipeline substantially outperforms current heuristic filtering methods like Fineweb2. JQL notably enhances downstream model training quality and increases data retention rates. Our research provides practical insights and valuable resources for multilingual data curation, raising the standards of multilingual dataset development.
ConQuer: A Framework for Concept-Based Quiz Generation
Fu, Yicheng, Wang, Zikui, Yang, Liuxin, Huo, Meiqing, Dai, Zhongdongming
Quizzes play a crucial role in education by reinforcing students' understanding of key concepts and encouraging self-directed exploration. However, compiling high-quality quizzes can be challenging and requires deep expertise and insight into specific subject matter. Although LLMs have greatly enhanced the efficiency of quiz generation, concerns remain regarding the quality of these AI-generated quizzes and their educational impact on students. To address these issues, we introduce ConQuer, a concept-based quiz generation framework that leverages external knowledge sources. We employ comprehensive evaluation dimensions to assess the quality of the generated quizzes, using LLMs as judges. Our experimental results demonstrate a 4.8% improvement in evaluation scores and a 77.52% win rate in pairwise comparisons against baseline quiz sets. Ablation studies further underscore the effectiveness of each component in our framework. Code available at https://github.com/sofyc/ConQuer.
The best STEM toys in 2024 for kids of all ages
We may earn revenue from the products available on this page and participate in affiliate programs. STEM toys are a fantastic way to combine fun and learning, sparking curiosity and creativity in children of all ages. Whether shopping for science gifts for kids or searching for activities that teach valuable skills, these toys offer engaging ways to explore science, technology, engineering, and math. From hands-on activities like the GraviTrax JUNIOR Starter-Set for younger kids to the advanced challenges of the LEGO Technic NASA Mars Rover Perseverance for teens, the best STEM toys offer something for every age and interest. STEM toys can range from really high-tech gadgets to simple, hands-on activities like marble runs and model rocket kits to even more traditional options like classic paper toys. These toys are designed to build essential skills in science, technology, engineering, and math, providing fun ways to learn while encouraging problem-solving and creativity.
The best robot kits for kids in 2024
Building a robot at home is more than just a fun activity--it's a hands-on way to explore the exciting world of STEM [Science, Technology, Engineering, and Math]. Whether you're searching for a children's toy robot to inspire curiosity or a more advanced robot-building kit for older kids or teens, like our best overall Sillbird STEM 12-in-1 Education Solar Robot Toy, the best robot kits offer options for all ages and skill levels. Robot building kits offer a perfect blend of creativity and learning, teaching essential skills like coding, problem-solving, and engineering through play. From preschool-friendly robot toys to beginner robotics kits for older children, these sets provide a fantastic introduction to the basics of robotics.
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models
Zhang, Peiyi, Zhang, Yazhou, Wang, Bo, Rong, Lu, Qin, Jing
With the recent evolution of large language models (LLMs), concerns about aligning such models with human values have grown. Previous research has primarily focused on assessing LLMs' performance in terms of the Helpful, Honest, Harmless (3H) basic principles, while often overlooking their alignment with educational values in the Chinese context. To fill this gap, we present Edu-Values, the first Chinese education values evaluation benchmark designed to measure LLMs' alignment ability across seven dimensions: professional ideology, cultural literacy, educational knowledge and skills, education laws and regulations, teachers' professional ethics, basic competencies, and subject knowledge. We meticulously design and compile 1,418 questions, including multiple-choice, multi-modal question answering, subjective analysis, adversarial prompts, and questions on traditional Chinese culture. We conduct both human evaluation and automatic evaluation over 11 state-of-the-art (SoTA) LLMs, and highlight three main findings: (1) due to differences in educational culture, Chinese LLMs significantly outperform English LLMs, with Qwen 2 ranking first with a score of 81.37; (2) LLMs perform well in subject knowledge and teaching skills but struggle with teachers' professional ethics and basic competencies; (3) LLMs excel at multiple-choice questions but perform poorly on subjective analysis and multi-modal tasks. This demonstrates the effectiveness and potential of the proposed benchmark. Our dataset is available at https://github.com/zhangpeii/Edu-Values.git.
AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts
Zhang, Yifan, Luo, Yifan, Yuan, Yang, Yao, Andrew Chi-Chih
To improve language models' proficiency in mathematical reasoning via continual pretraining, we introduce a novel strategy that leverages base language models for autonomous data selection. Departing from conventional supervised fine-tuning or trained classifiers with human-annotated data, our approach utilizes meta-prompted language models as zero-shot verifiers to autonomously evaluate and select high-quality mathematical content, and we release the curated open-source AutoMathText dataset encompassing over 200GB of data. To demonstrate the efficacy of our method, we continuously pretrained a 7B-parameter Mistral language model on the AutoMathText dataset, achieving substantial improvements in downstream performance on the MATH dataset with a token amount reduced by orders of magnitude compared to previous continuous pretraining works. Our method showcases a 2 times increase in pretraining token efficiency compared to baselines, underscoring the potential of our approach in enhancing models' mathematical reasoning capabilities. The AutoMathText dataset is available at https://huggingface.co/datasets/math-ai/AutoMathText. The code is available at https://github.com/yifanzhang-pro/AutoMathText.
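One way a base language model could serve as a zero-shot verifier is to compare the logits it assigns to a "YES" and a "NO" answer token after a prompt asking whether a text is high-quality mathematical content. The sketch below assumes that formulation and assumes the two logits have already been obtained from a model; the abstract does not spell out the scoring rule, and the function names and the threshold are hypothetical.

```python
import math


def lm_score(logit_yes: float, logit_no: float) -> float:
    """Two-way softmax over the YES/NO logits, giving a score in [0, 1]."""
    m = max(logit_yes, logit_no)  # subtract the max for numerical stability
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


def select(scored_texts, threshold: float = 0.6):
    """Keep texts whose score meets an (assumed) threshold.

    scored_texts: iterable of (text, logit_yes, logit_no) tuples.
    """
    return [text for text, ly, ln in scored_texts
            if lm_score(ly, ln) >= threshold]


# Toy logits for illustration only; real values would come from a model.
examples = [
    ("Proof that sqrt(2) is irrational ...", 3.0, 0.5),
    ("Buy cheap watches online ...", -1.0, 2.0),
]
selected = select(examples)
```

Because the score is continuous, the same values could also be used to rank texts rather than apply a hard cutoff.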
Salespeople vs SalesBot: Exploring the Role of Educational Value in Conversational Recommender Systems
Murakhovs'ka, Lidiya, Laban, Philippe, Xie, Tian, Xiong, Caiming, Wu, Chien-Sheng
Making big purchases requires consumers to research or consult a salesperson to gain domain expertise. However, existing conversational recommender systems (CRS) often overlook users' lack of background knowledge, focusing solely on gathering preferences. In this work, we define a new problem space for conversational agents that aim to provide both product recommendations and educational value through mixed-type mixed-initiative dialog. We introduce SalesOps, a framework that facilitates the simulation and evaluation of such systems by leveraging recent advancements in large language models (LLMs). We build SalesBot and ShopperBot, a pair of LLM-powered agents that can simulate either side of the framework. A comprehensive human study compares SalesBot against professional salespeople, revealing that although SalesBot approaches professional performance in terms of fluency and informativeness, it lags behind in recommendation quality. We emphasize the distinct limitations both face in providing truthful information, highlighting the challenges of ensuring faithfulness in the CRS context. We release our code and make all data available.
Will AI Destroy Education?
Artificial intelligence is everywhere these days. The National AI Initiative Act became law in the U.S. on Jan. 1, 2021, aiming "to accelerate AI research and application for the Nation's economic prosperity and national security." In 2020, the U.S. National Science Foundation launched several AI Research Institutes to push forward the frontiers of artificial intelligence. One of the themes of this research initiative is "AI-Augmented Learning." This quest to improve education via technology reminds me of "Profession," a 1957 science-fiction story by Isaac Asimov.
Learning about Machine Learning: An Extended Assignment to Classify Twitter Accounts
Mustafaraj, Eni (Wellesley College) | Anderson, Scott D. (Wellesley College)
We describe a four-week series of assignments in an undergraduate AI course at a liberal arts college developing a supervised learning solution to the problem of classifying Twitter accounts as either a person account or a non-person account (e.g. organization or spambot). This problem employs real data in an ongoing research project by the first author, yet is accessible to students with limited programming expertise. The students were able to experience a complete cycle of creating a machine learning solution: exploring raw data, creating a training set, engineering features, comparing different classifiers, evaluating the results, and performing error analysis. We received positive feedback from the students and intend to refine the assignment and make it available (together with the created training data) for use by the research community.