AITopics

2412.19471

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.46)
Research Report > New Finding (0.34)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rajapakse, Chathura, Ariyarathna, Wathsala, Selvakan, Shanmugalingam

A Self-Efficacy Theory-based Study on the Teachers Readiness to Teach Artificial Intelligence in Public Schools in Sri Lanka

arXiv.org Artificial Intelligence, Dec-26-2024

The need for and challenges of teaching artificial intelligence (AI) at primary, secondary, and upper-secondary levels have been a major focus of recent academic discussions [1],[2],[3]. Often referred to as AI4K12 [4], this area explores global initiatives that introduce AI to students from kindergarten through high school. The rapid advancements in deep learning and generative AI technologies suggest AI will become a transformative force. This realisation has prompted governments and policymakers to recognise the need to prepare future citizens for a world heavily influenced by AI. As AI becomes increasingly integrated into information systems, concerns are mounting about citizens' ability to use these systems responsibly and understand the consequences of not doing so [5]. Furthermore, anxieties regarding AI's potential impact on societal sustainability highlight the need to equip future workforces with the skills to combine human creativity with AI's potential to create sustainable systems.

artificial intelligence, deep learning, machine learning

doi: 10.1145/3691354

2412.19425

Country:

Europe (0.93)
Asia > Sri Lanka (0.51)
North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.89)
Education > Educational Setting > K-12 Education > Secondary School (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

arXiv.org Artificial Intelligence, Dec-26-2024

Hierarchical Multi-agent Meta-Reinforcement Learning for Cross-channel Bidding

He, Shenghong, Yu, Chao

Real-time bidding (RTB) plays a pivotal role in online advertising ecosystems. Advertisers employ strategic bidding to optimize their advertising impact while adhering to various financial constraints, such as return-on-investment (ROI) and cost-per-click (CPC). Because traditional approaches focus primarily on bidding under fixed budget constraints, they cannot effectively manage the dynamic budget allocation problem, where the goal is to globally optimize bidding performance across multiple channels that share a budget. In this paper, we propose a hierarchical multi-agent reinforcement learning framework for multi-channel bidding optimization. In this framework, the top-level strategy applies a CPC-constrained diffusion model to dynamically allocate budgets among the channels according to their distinct features and complex interdependencies. The bottom-level strategy combines two components: a state-action decoupled actor-critic method, which addresses the extrapolation errors that out-of-distribution actions cause in offline learning, and a context-based meta-channel knowledge learning method, which improves the policy's state representation by drawing on knowledge shared among channels. Comprehensive experiments on a large-scale real-world industrial dataset from the Meituan ad bidding platform demonstrate that our method achieves state-of-the-art performance.
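To make the two quantities in this abstract concrete, the following is a minimal sketch of a shared budget split across channels plus a CPC constraint check. It is not the paper's method: the abstract describes a CPC-constrained diffusion model for allocation, whereas this sketch swaps in a plain proportional rule. The class `ChannelStats`, its field `expected_clicks_per_unit`, and all function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    """Hypothetical per-channel summary; field names are illustrative only."""
    name: str
    expected_clicks_per_unit: float  # assumed estimate of clicks per unit of budget


def allocate_budget(total_budget, channels):
    """Split a shared budget in proportion to each channel's estimated efficiency.

    This is a simple stand-in rule, not the diffusion-based allocator
    described in the abstract.
    """
    total_eff = sum(c.expected_clicks_per_unit for c in channels)
    if total_eff <= 0:
        raise ValueError("at least one channel needs positive efficiency")
    return {
        c.name: total_budget * c.expected_clicks_per_unit / total_eff
        for c in channels
    }


def cost_per_click(spend, clicks):
    """CPC is spend divided by clicks; treated as infinite when there are no clicks."""
    return spend / clicks if clicks > 0 else float("inf")


def satisfies_cpc(spend, clicks, cpc_cap):
    """Return True when the realised CPC does not exceed the advertiser's cap."""
    return cost_per_click(spend, clicks) <= cpc_cap
```

A caller would build one `ChannelStats` per channel, call `allocate_budget` once per allocation period, and use `satisfies_cpc` to check the realised spend and clicks against the cap.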

artificial intelligence, machine learning, reinforcement learning

2412.19064

Country: Asia > China (1.00)

Genre:

Research Report > Promising Solution (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)
Banking & Finance > Trading (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence, Dec-25-2024

Overview of MWE history, challenges, and horizons: standing at the 20th anniversary of the MWE workshop series via MWE-UD2024

Han, Lifeng, Evang, Kilian, Bhatia, Archna, Bouma, Gosse, Doğruöz, A. Seza, Garcia, Marcos, Giouli, Voula, Nivre, Joakim, Rademacher, Alexandre

The first MWE workshop was held with ACL in Sapporo, Japan, in 2003. This year, the joint MWE-UD workshop, co-located with the LREC-COLING 2024 conference, marked the 20th anniversary of MWE workshop events held over nearly two decades. Standing at this milestone, we look back on this workshop series and summarise the research topics and methodologies that researchers have pursued over the years. We also discuss the current challenges and the broader impacts and synergies of MWE research within the CL and NLP fields. Finally, we give future research perspectives. We hope this position paper can help researchers, students, and industrial practitioners interested in MWE gain a brief, accessible understanding of its history, current state, and possible future.

artificial intelligence, natural language, text processing

2412.18868

Country:

Europe (1.00)
North America > United States > New Mexico (0.29)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.25)

Genre: Instructional Material > Course Syllabus & Notes (0.71)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)

arXiv.org Artificial Intelligence, Dec-25-2024

LearnLM: Improving Gemini for Learning

LearnLM Team, Modi, Abhinit, Veerubhotla, Aditya Srikanth, Rysbek, Aliya, Huber, Andrea, Wiltshire, Brett, Veprek, Brian, Gillick, Daniel, Kasenberg, Daniel, Ahmed, Derek, Jurenka, Irina, Cohan, James, She, Jennifer, Wilkowski, Julia, Alarakyia, Kaiz, McKee, Kevin R., Wang, Lisa, Kunesch, Markus, Schaekermann, Mike, Pîslar, Miruna, Joshi, Nikhil, Mahmoudieh, Parsa, Jhun, Paul, Wiltberger, Sara, Mohamed, Shakir, Agarwal, Shashank, Phal, Shubham Milind, Lee, Sun Jae, Strinopoulos, Theofilos, Ko, Wei-Jen, Wang, Amy, Anand, Ankit, Bhoopchand, Avishkar, Wild, Dan, Pandya, Divya, Bar, Filip, Graham, Garth, Winnemoeller, Holger, Nagda, Mahvish, Kolhar, Prateek, Schneider, Renee, Zhu, Shaojian, Chan, Stephanie, Yadlowsky, Steve, Sounderajah, Viknesh, Assael, Yannis

Today's generative AI systems are tuned to present information by default rather than engage users in service of learning as a human tutor would. To address the wide range of potential education use cases for these systems, we reframe the challenge of injecting pedagogical behavior as one of pedagogical instruction following, where training and evaluation examples include system-level instructions describing the specific pedagogy attributes present or desired in subsequent model turns. This framing avoids committing our models to any particular definition of pedagogy, and instead allows teachers or developers to specify desired model behavior. It also clears a path to improving Gemini models for learning, by enabling the addition of our pedagogical data to post-training mixtures, alongside their rapidly expanding set of capabilities. Both represent important changes from our initial tech report. We show how training with pedagogical instruction following produces a LearnLM model (available on Google AI Studio) that is preferred substantially by expert raters across a diverse set of learning scenarios, with average preference strengths of 31% over GPT-4o, 11% over Claude 3.5, and 13% over the Gemini 1.5 Pro model LearnLM was based on.

large language model, machine learning, natural language

2412.16429

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (0.94)
Instructional Material > Course Syllabus & Notes (0.68)

Industry:

Health & Medicine (1.00)
Education > Curriculum (0.68)
Education > Educational Setting > Higher Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization

Liu, Sihao, Yang, Yibo, Li, Xiaojie, Clifton, David A., Ghanem, Bernard

Online continual learning (OCL) seeks to learn new tasks from data streams that appear only once, while retaining knowledge of previously learned tasks. Most existing methods rely on replay, focusing on enhancing memory retention through regularization or distillation. However, they often overlook the adaptability of the model, limiting the ability to learn generalizable and discriminative features incrementally from online training data. To address this, we introduce a plug-and-play module, S6MOD, which can be integrated into most existing methods and directly improve adaptability. Specifically, S6MOD introduces an extra branch after the backbone, where a mixture of discretization selectively adjusts parameters in a selective state space model, enriching selective scan patterns such that the model can adaptively select the most sensitive discretization method for current dynamics. We further design a class-conditional routing algorithm for dynamic, uncertainty-based adjustment and implement a contrastive discretization loss to optimize it. Extensive experiments combining our module with various models demonstrate that S6MOD significantly enhances model adaptability, leading to substantial performance gains and achieving state-of-the-art results.
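For readers unfamiliar with the term, "discretization" here refers to turning a continuous-time state equation h'(t) = a·h(t) + b·x(t) into a step-by-step recurrence h_k = a_bar·h_{k-1} + b_bar·x_k for a step size delta. The sketch below shows three standard rules (zero-order hold, forward Euler, bilinear) for a scalar system and a fixed-weight blend of them. The blend is only an assumed illustration of what mixing discretizations could look like; it is not S6MOD's class-conditional routing, and the function names are hypothetical.

```python
import math


def zoh(a, b, delta):
    """Zero-order hold: exact for inputs held constant over each step (a != 0)."""
    a_bar = math.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar


def forward_euler(a, b, delta):
    """First-order approximation: exp(delta * a) is replaced by 1 + delta * a."""
    return 1.0 + delta * a, delta * b


def bilinear(a, b, delta):
    """Bilinear (Tustin) rule, a trapezoidal approximation of the integral."""
    denom = 1.0 - delta * a / 2.0
    return (1.0 + delta * a / 2.0) / denom, delta * b / denom


def mixed_discretization(a, b, delta, weights):
    """Blend the three rules with fixed weights that sum to 1.

    Assumption for illustration only: the paper's routing is learned and
    class-conditional, which this fixed-weight blend does not reproduce.
    """
    methods = (zoh, forward_euler, bilinear)
    if len(weights) != len(methods) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per method, summing to 1")
    pairs = [method(a, b, delta) for method in methods]
    a_bar = sum(w * p[0] for w, p in zip(weights, pairs))
    b_bar = sum(w * p[1] for w, p in zip(weights, pairs))
    return a_bar, b_bar


def scan(a_bar, b_bar, xs, h0=0.0):
    """Run the discrete recurrence over an input sequence and return all states."""
    h = h0
    states = []
    for x in xs:
        h = a_bar * h + b_bar * x
        states.append(h)
    return states
```

For small step sizes the three rules give nearly identical coefficients; they diverge as delta grows, which is one reason the choice of rule can matter for how quickly the state responds to new inputs.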

artificial intelligence, learning, machine learning

2412.18177

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Instructional Material > Online (0.62)
Research Report (0.50)

Industry:

Health & Medicine (1.00)
Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Using Large Language Models for Automated Grading of Student Writing about Science

Impey, Chris, Wenger, Matthew, Garuda, Nikhil, Golchin, Shahriar, Stamer, Sarah

Assessing writing in large classes for formal or informal learners presents a significant challenge. Consequently, most large classes, particularly in science, rely on objective assessment tools such as multiple-choice quizzes, which have a single correct answer. The rapid development of AI has introduced the possibility of using large language models (LLMs) to evaluate student writing. An experiment was conducted using GPT-4 to determine if machine learning methods based on LLMs can match or exceed the reliability of instructor grading in evaluating short writing assignments on topics in astronomy. The audience consisted of adult learners in three massive open online courses (MOOCs) offered through Coursera. One course was on astronomy, the second was on astrobiology, and the third was on the history and philosophy of astronomy. The results should also be applicable to non-science majors in university settings, where the content and modes of evaluation are similar. The data comprised answers from 120 students to 12 questions across the three courses. GPT-4 was provided with total grades, model answers, and rubrics from an instructor for all three courses. In addition to evaluating how reliably the LLM reproduced instructor grades, the LLM was also tasked with generating its own rubrics. Overall, the LLM was more reliable than peer grading, both in aggregate and by individual student, and approximately matched instructor grades for all three online courses. The implication is that LLMs may soon be used for automated, reliable, and scalable grading of student science writing.
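The abstract does not say which statistic was used to compare LLM grades with instructor grades. As a hedged illustration of how such agreement could be quantified, the sketch below computes two common measures over paired scores: mean absolute difference and Pearson correlation. Both function names are hypothetical, and neither measure is claimed to be the one the authors used.

```python
import math


def mean_absolute_difference(grades_a, grades_b):
    """Average absolute gap between two graders' scores for the same answers."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(x - y) for x, y in zip(grades_a, grades_b)) / len(grades_a)


def pearson_correlation(grades_a, grades_b):
    """Pearson correlation between two graders' scores for the same answers."""
    n = len(grades_a)
    if n != len(grades_b) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_a = sum(grades_a) / n
    mean_b = sum(grades_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(grades_a, grades_b))
    var_a = sum((x - mean_a) ** 2 for x in grades_a)
    var_b = sum((y - mean_b) ** 2 for y in grades_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when one grader gives constant scores")
    return cov / math.sqrt(var_a * var_b)
```

With instructor grades as the reference, the same two functions could be applied once to the LLM's grades and once to peer grades, which gives a like-for-like comparison of the kind the abstract describes.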

large language model, machine learning, natural language

2412.18719

Country: North America > United States > Arizona (0.28)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Online (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Education > Educational Technology > Educational Software > Computer-Aided Assessment (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mathematics and Machine Creativity: A Survey on Bridging Mathematics with AI

Liang, Shizhe, Zhang, Wei, Zhong, Tianyang, Liu, Tianming

This paper presents a comprehensive overview of the applications of artificial intelligence (AI) in mathematical research, highlighting the transformative role AI has begun to play in this domain. Traditionally, AI advancements have heavily relied on theoretical foundations provided by mathematics and statistics. However, recent developments in AI, particularly in reinforcement learning (RL) and large language models (LLMs), have demonstrated the potential for AI to contribute back to mathematics by offering flexible algorithmic frameworks and powerful inductive reasoning capabilities that support various aspects of mathematical research. This survey aims to establish a bridge between AI and mathematics, providing insights into the mutual benefits and fostering deeper interdisciplinary understanding. In particular, we argue that while current AI and LLMs may struggle with complex deductive reasoning, their "inherent creativity", the ability to generate outputs at high throughput based on recognition of shallow patterns, holds significant potential to support and inspire mathematical research. This creative capability, often overlooked, could be the key to unlocking new perspectives and methodologies in mathematics. Furthermore, we address the lack of cross-disciplinary communication: mathematicians may not fully comprehend the latest advances in AI, while AI researchers frequently prioritize benchmark performance over real-world applications in frontier mathematical research. This paper seeks to close that gap, offering a detailed exploration of AI fundamentals, its strengths, and its emerging applications in the mathematical sciences.

arxiv preprint arxiv, large language model, machine learning

2412.16543

Country:

North America > United States (0.93)
North America > Canada > Alberta (0.28)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.46)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Clay, Viviane, Leadholm, Niels, Hawkins, Jeff

The Thousand Brains Project: A New Paradigm for Sensorimotor Intelligence

Artificial intelligence has advanced rapidly in the last decade, driven primarily by progress in the scale of deep-learning systems. Despite these advances, the creation of intelligent systems that can operate effectively in diverse, real-world environments remains a significant challenge. In this white paper, we outline the Thousand Brains Project, an ongoing research effort to develop an alternative, complementary form of AI, derived from the operating principles of the neocortex. We present an early version of a thousand-brains system, a sensorimotor agent that is uniquely suited to quickly learn a wide range of tasks and eventually implement any capabilities the human neocortex has. Core to its design is the use of a repeating computational unit, the learning module, modeled on the cortical columns found in mammalian brains. Each learning module operates as a semi-independent unit that can model entire objects, represents information through spatially structured reference frames, and both estimates and is able to effect movement in the world. Learning is a quick, associative process, similar to Hebbian learning in the brain, and leverages inductive biases around the spatial structure of the world to enable rapid and continual learning. Multiple learning modules can interact with one another both hierarchically and non-hierarchically via a "cortical messaging protocol" (CMP), creating more abstract representations and supporting multimodal integration. We outline the key principles motivating the design of thousand-brains systems and provide details about the implementation of Monty, our first instantiation of such a system. Code can be found at https://github.com/thousandbrainsproject/tbp.monty, along with more detailed documentation at https://thousandbrainsproject.readme.io/.

artificial intelligence, machine learning, module

2412.18354

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

YuLan-Mini: An Open Data-efficient Language Model

Hu, Yiwen, Song, Huatong, Deng, Jia, Wang, Jiapeng, Chen, Jie, Zhou, Kun, Zhu, Yutao, Jiang, Jinhao, Dong, Zican, Zhao, Wayne Xin, Wen, Ji-Rong

Effective pre-training of large language models (LLMs) has been challenging due to the immense resource demands and the complexity of the technical processes involved. This paper presents a detailed technical report on YuLan-Mini, a highly capable base model with 2.42B parameters that achieves top-tier performance among models of similar parameter scale. Our pre-training approach focuses on enhancing training efficacy through three key technical contributions: an elaborate data pipeline that combines data cleaning with data schedule strategies, a robust optimization method that mitigates training instability, and an effective annealing approach that incorporates targeted data selection and long context training. Remarkably, YuLan-Mini, trained on 1.08T tokens, achieves performance comparable to industry-leading models that require significantly more data. To facilitate reproduction, we release the full details of the data composition for each training phase.

large language model, machine learning, natural language

2412.17743

Country:

North America > United States (1.00)
Asia (1.00)
Europe > Austria > Vienna (0.15)

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Setting > K-12 Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)