AITopics

2311.07126

Country:

North America > United States > Massachusetts (0.28)
North America > United States > Wisconsin (0.14)
Europe > Germany (0.14)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-13-2023

Knowledge Tracing Challenge: Optimal Activity Sequencing for Students

Hicke, Yann

Knowledge tracing is a method used in education to assess and track the acquisition of knowledge by individual learners. It involves using a variety of techniques, such as quizzes, tests, and other forms of assessment, to determine what a learner knows and does not know about a particular subject. The goal of knowledge tracing is to identify gaps in understanding and provide targeted instruction to help learners improve their understanding and retention of material. This can be particularly useful in situations where learners are working at their own pace, such as in online learning environments. By providing regular feedback and adjusting instruction based on individual needs, knowledge tracing can help learners make more efficient progress and achieve better outcomes. Effectively solving the knowledge tracing (KT) problem would unlock the potential of computer-aided education applications such as intelligent tutoring systems, curriculum learning, and learning materials recommendations. In this paper, we present the results of implementing two Knowledge Tracing algorithms on a newly released dataset as part of the AAAI2023 Global Knowledge Tracing Challenge.

knowledge, sequence, student

2311.14707

Country:

North America > United States (0.04)
Asia > China (0.04)

Genre:

Research Report (0.82)
Instructional Material (0.67)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)

arXiv.org Machine Learning, Nov-13-2023

Faster Algorithms for Structured Linear and Kernel Support Vector Machines

Gu, Yuzhou, Song, Zhao, Zhang, Lichen

Quadratic programming is a ubiquitous prototype in convex programming. Many combinatorial optimizations on graphs and machine learning problems can be formulated as quadratic programming; for example, Support Vector Machines (SVMs). Linear and kernel SVMs have been among the most popular models in machine learning over the past three decades, prior to the deep learning era. Generally, a quadratic program has an input size of $\Theta(n^2)$, where $n$ is the number of variables. Assuming the Strong Exponential Time Hypothesis ($\textsf{SETH}$), it is known that no $O(n^{2-o(1)})$ algorithm exists (Backurs, Indyk, and Schmidt, NIPS'17). However, problems such as SVMs usually feature much smaller input sizes: one is given $n$ data points, each of dimension $d$, with $d \ll n$. Furthermore, SVMs are variants with only $O(1)$ linear constraints. This suggests that faster algorithms are feasible, provided the program exhibits certain underlying structures. In this work, we design the first nearly-linear time algorithm for solving quadratic programs whenever the quadratic objective has small treewidth or admits a low-rank factorization, and the number of linear constraints is small. Consequently, we obtain a variety of results for SVMs:
* For linear SVM, where the quadratic constraint matrix has treewidth $\tau$, we can solve the corresponding program in time $\widetilde O(n\tau^{(\omega+1)/2}\log(1/\epsilon))$;
* For linear SVM, where the quadratic constraint matrix admits a low-rank factorization of rank-$k$, we can solve the corresponding program in time $\widetilde O(nk^{(\omega+1)/2}\log(1/\epsilon))$;
* For Gaussian kernel SVM, where the data dimension $d = \Theta(\log n)$ and the squared dataset radius is small, we can solve it in time $O(n^{1+o(1)}\log(1/\epsilon))$. We also prove that when the squared dataset radius is large, then $\Omega(n^{2-o(1)})$ time is required.
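
For orientation, here is a sketch of the textbook soft-margin SVM dual, one standard way to write an SVM as a quadratic program. The symbols $\alpha$, $y$, $K$, $C$, $Q$, and $X$ are notation assumed for this illustration and are not taken from the abstract; the paper's own formulation may differ.

$$
\max_{\alpha \in \mathbb{R}^n} \; \mathbf{1}^\top \alpha - \tfrac{1}{2}\, \alpha^\top Q\, \alpha
\quad \text{s.t.} \quad y^\top \alpha = 0, \quad 0 \le \alpha \le C\,\mathbf{1},
\qquad Q = \operatorname{diag}(y)\, K \operatorname{diag}(y).
$$

Here $y \in \{-1,+1\}^n$ holds the labels and $K$ is the $n \times n$ kernel matrix. For a linear SVM with data matrix $X \in \mathbb{R}^{n \times d}$ we have $K = XX^\top$, so $Q$ has rank at most $d$, and apart from the box bounds there is a single linear equality constraint. This is the kind of structure (low-rank objective, few linear constraints) that the abstract refers to.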

algorithm, artificial intelligence, machine learning


2307.07735

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Virginia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Overview (0.65)
Research Report (0.49)
Instructional Material > Course Syllabus & Notes (0.45)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Mbacke, Sokhna Diarra, Clerc, Florence, Germain, Pascal

PAC-Bayesian Generalization Bounds for Adversarial Generative Models

arXiv.org Machine Learning, Nov-13-2023

... models and develop generalization bounds for models based on the Wasserstein distance and the total variation distance. Our first result on the Wasserstein distance assumes the instance space is bounded, while our second result takes advantage of dimensionality reduction. Our results naturally apply to Wasserstein GANs and Energy-Based GANs, and our bounds provide ...

Moreover, having generalization bounds not only contributes to the theoretical understanding of GANs themselves, but also to the understanding of the structure of real-life datasets, if those can be provably approximated by GAN-generated data. In addition, given that GANs are used for data augmentation in fields such as medical image classification (see e.g. Frid-Adar et al., 2018), theoretical guarantees can substantiate the soundness of such applications.

artificial intelligence, machine learning, pac-bayesian generalization bound


2302.08942

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.34)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

arXiv.org Machine Learning, Nov-12-2023

Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant

Qi, Xianbiao, Wang, Jianan, Zhang, Lei

This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively. We analyze these two challenges through several strategic measures, including the improvement of gradient flow and the imposition of constraints on a network's Lipschitz constant. To help understand the current optimization methodologies, we categorize them into two classes: explicit optimization and implicit optimization. Explicit optimization methods involve direct manipulation of optimizer parameters, including weight, gradient, learning rate, and weight decay. Implicit optimization methods, by contrast, focus on improving the overall landscape of a network by enhancing its modules, such as residual shortcuts, normalization methods, attention mechanisms, and activations. In this article, we provide an in-depth analysis of these two optimization classes and undertake a thorough examination of the Jacobian matrices and the Lipschitz constants of many widely used deep learning modules, highlighting existing issues as well as potential improvements. Moreover, we conduct a series of analytical experiments to substantiate our theoretical discussions. This article does not aim to propose a new optimizer or network. Rather, our intention is to present a comprehensive understanding of optimization in deep learning. We hope that this article will assist readers in gaining deeper insight into this field and encourage the development of more robust, efficient, and high-performing models.
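
As a small illustration of what a Lipschitz constant of a network module looks like in practice, the sketch below estimates the L2 Lipschitz constant of a linear map x -> Wx, which equals the largest singular value of W, using power iteration. It is not taken from the article: the function names, the pure-Python matrix representation (lists of rows), and the choice of power iteration are all assumptions made for this example. The product bound assumes 1-Lipschitz activations (such as ReLU) between layers and is generally loose.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]


def spectral_norm(W, iters=100, seed=0):
    """Estimate the largest singular value of W by power iteration on W^T W.

    For the linear map x -> W x this is its Lipschitz constant in the L2 norm.
    """
    rng = random.Random(seed)
    Wt = transpose(W)
    # Random start vector, so it is almost surely not orthogonal
    # to the top right-singular vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(len(W[0]))]
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0  # W maps the iterate to zero (e.g. W is the zero matrix)
        v = [x / norm for x in v]
    u = matvec(W, v)
    return math.sqrt(sum(x * x for x in u))


def lipschitz_upper_bound(layers):
    """Upper-bound the Lipschitz constant of a stack of linear layers.

    Assumes any activation between layers is 1-Lipschitz, so the bound is
    the product of the per-layer spectral norms.
    """
    return math.prod(spectral_norm(W) for W in layers)
```

For example, `lipschitz_upper_bound([W1, W2])` multiplies the two spectral norms; a bound well above 1 across many layers hints at exploding gradients, and one well below 1 at vanishing gradients.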

artificial intelligence, lipschitz constant, machine learning


2306.09338

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia (0.04)
Europe > Finland > Central Finland > Jyväskylä (0.04)

Genre:

Research Report (0.49)
Overview (0.46)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Nov-11-2023

Semantics-Empowered Communication: A Tutorial-cum-Survey

Lu, Zhilin, Li, Rongpeng, Lu, Kun, Chen, Xianfu, Hossain, Ekram, Zhao, Zhifeng, Zhang, Honggang

With the rapid rise of semantics-empowered communication (SemCom) research, academia and industry are showing unprecedented and growing interest in a wide range of its aspects (e.g., theories, applications, metrics and implementations). In this work, we primarily aim to provide a comprehensive survey on both the background and research taxonomy, as well as a detailed technical tutorial. Specifically, we start by reviewing the literature and answering the "what" and "why" questions in semantic transmissions. Afterwards, we present the ecosystems of SemCom, including history, theories, metrics, datasets and toolkits, on top of which the taxonomy for research directions is presented. Furthermore, we propose to categorize the critical enabling techniques by explicit and implicit reasoning-based methods, and elaborate on how they evolve and contribute to modern content & channel semantics-empowered communications. Besides reviewing and summarizing the latest efforts in SemCom, we discuss the relations with other communication levels (e.g., conventional communications) from a holistic and unified viewpoint. Subsequently, in order to facilitate future developments and industrial applications, we also highlight advanced practical techniques for boosting semantic accuracy, robustness, and large-scale scalability, to name a few. Finally, we discuss the technical challenges that shed light on future research opportunities.

communication, information, semcom

2212.08487

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.27)
North America > United States > Missouri > Jackson County > Kansas City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.81)
Research Report > New Finding (0.67)

Industry:

Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)
Energy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)

Villan, Fabiano, Santos, Renato P. dos

ChatGPT as Co-Advisor in Scientific Initiation: Action Research with Project-Based Learning in Elementary Education

Background: In the contemporary educational landscape, technology has the power to drive innovative pedagogical practices. Overcoming the resistance of teachers and students to adopting new methods and technologies is a challenge that needs to be addressed. Objectives: To evaluate the effectiveness of ChatGPT as a co-advisor in research projects and its influence on the implementation of Project-Based Learning (PBL), as well as overcoming resistance to the use of new pedagogical methodologies. Design: An action-research methodology was employed, including unstructured interviews and the application of questionnaires via Google Forms. Setting and Participants: The research was conducted in an elementary school, involving 353 students and 16 teachers. Data Collection and Analysis: Data were gathered through observations and notes in meetings and interviews, complemented by electronic questionnaires, with quantitative and qualitative analyses performed via Microsoft Excel and Google Forms. Results: The introduction of ChatGPT as a pedagogical tool led to increased student engagement and decreased teacher resistance, reflected in recognition at local science fairs. Conclusion: The study confirmed the utility of ChatGPT in school research co-orientation, highlighting its role in facilitating PBL and promoting cultural changes in educational practice, with proactive school management identified as a catalysing element in adapting to educational innovations.

acta sci, chatgpt, student

doi: 10.17648/acta.scientiae.7474

2311.14701

Country:

Europe > United Kingdom > England (0.04)
South America > Venezuela (0.04)
South America > Suriname (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting > K-12 Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Aliabadi, Roozbeh, Singh, Aditi, Wilson, Eryka

Transdisciplinary AI Education: The Confluence of Curricular and Community Needs in the Instruction of Artificial Intelligence

The integration of artificial intelligence (AI) into education has the potential to transform the way we learn and teach. In this paper, we examine the current state of AI in education and explore the potential benefits and challenges of incorporating this technology into the classroom. The approaches currently available for AI education often present students with experiences focused only on discrete computer science concepts, agnostic to a larger curriculum. However, teaching AI must not be siloed or interdisciplinary. Rather, AI instruction ought to be transdisciplinary, including connections to the broad curriculum and community in which students are learning. This paper delves into the AI program currently in development for Neom Community School and the larger Education, Research, and Innovation Sector in Neom, Saudi Arabia's new megacity under development. In this program, AI is both taught as a subject and used to learn other subjects within the curriculum through the school system's International Baccalaureate (IB) approach, which deploys learning through Units of Inquiry. This approach to education connects subjects across a curriculum under one major guiding question at a time. The proposed method offers a meaningful approach to introducing AI to students throughout these Units of Inquiry, as it shifts AI from a subject that students like or dislike to a subject that is taught throughout the curriculum.

ai curriculum, curriculum, student

doi: 10.1007/978-981-99-7947-9_11

2311.14702

Country:

Asia > Middle East > Saudi Arabia (0.25)
South America (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.94)

Industry:

Education > Educational Setting > K-12 Education (1.00)
Education > Curriculum > Subject-Specific Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)

Online Continual Learning via Logit Adjusted Softmax

Huang, Zhehao, Li, Tao, Yuan, Chenhe, Wu, Yingwen, Huang, Xiaolin

Online continual learning is a challenging problem where models must learn from a non-stationary data stream while avoiding catastrophic forgetting. Inter-class imbalance during training has been identified as a major cause of forgetting, leading to model prediction bias towards recently learned classes. In this paper, we show theoretically that inter-class imbalance is entirely attributable to imbalanced class priors, and that the function learned from intra-class intrinsic distributions is the Bayes-optimal classifier. Accordingly, we show that a simple adjustment of model logits during training can effectively resist prior class bias and pursue the corresponding Bayes-optimum. Our proposed method, Logit Adjusted Softmax, can mitigate the impact of inter-class imbalance not only in class-incremental but also in realistic general setups, with little additional computational cost. We evaluate our approach on various benchmarks and demonstrate significant performance improvements compared to prior art. For example, our approach improves the best baseline by 4.6% on CIFAR10.
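
To make the idea of adjusting logits by class priors concrete, here is a minimal sketch of a logit-adjusted cross-entropy loss in plain Python. It follows the general logit-adjustment recipe (add the log class prior to each logit before the softmax during training) and should not be read as the authors' exact method: the function names, the temperature `tau`, and the count-based prior estimate with additive smoothing are assumptions made for this illustration.

```python
import math
from collections import Counter


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def estimate_priors(labels, num_classes, smoothing=1.0):
    """Estimate class priors from observed labels (hypothetical helper).

    Additive smoothing keeps every prior strictly positive so that
    log(prior) is defined even for classes not seen yet.
    """
    counts = Counter(labels)
    total = len(labels) + smoothing * num_classes
    return [(counts.get(c, 0) + smoothing) / total for c in range(num_classes)]


def logit_adjusted_loss(logits, label, class_priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) per class.

    Frequent classes get a larger additive boost during training, so the
    raw logits are pushed to compensate and carry less prior bias.
    """
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    return -log_softmax(adjusted)[label]
```

With uniform priors the adjustment shifts every logit by the same constant, so the loss reduces to ordinary softmax cross-entropy. At inference time the raw, unadjusted logits would be used for prediction under this recipe.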

logit adjusted softmax, online continual learning

2311.0646

Genre:

Instructional Material > Online (0.60)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.60)

Neuro-GPT: Developing A Foundation Model for EEG

Cui, Wenhui, Jeong, Woojae, Thölke, Philipp, Medani, Takfarinas, Jerbi, Karim, Joshi, Anand A., Leahy, Richard M.

To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. The foundation model is pre-trained on a large-scale data set using a self-supervised task that learns how to reconstruct masked EEG segments. We then fine-tune the model on a Motor Imagery Classification task to validate its performance in a low-data regime (9 subjects). Our experiments demonstrate that applying a foundation model can significantly improve classification performance compared to a model trained from scratch, which provides evidence for the generalizability of the foundation model and its ability to address challenges of data scarcity and heterogeneity in EEG.

classification, encoder, gpt model

2311.03764

Country:

North America > United States > California (0.14)
Europe > Austria > Styria > Graz (0.05)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report (0.40)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.95)
Health & Medicine > Health Care Technology (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)