AITopics

2410.16711

Country:

North America > United States > New York (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre:

Instructional Material > Online (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Oct-22-2024

Deep Learning and Machine Learning -- Python Data Structures and Mathematics Fundamental: From Theory to Practice

Chen, Silin, Bi, Ziqian, Liu, Junyu, Peng, Benji, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Chen, Keyu, Yin, Caitlyn Heqi, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Ren, Jintao, Niu, Qian, Liu, Ming

This book provides a comprehensive introduction to the foundational concepts of machine learning (ML) and deep learning (DL). It bridges the gap between theoretical mathematics and practical application, focusing on Python as the primary programming language for implementing key algorithms and data structures. The book covers a wide range of topics, including basic and advanced Python programming, fundamental mathematical operations, matrix operations, linear algebra, and optimization techniques crucial for training ML and DL models. Advanced subjects like neural networks, optimization algorithms, and frequency domain methods are also explored, along with real-world applications of large language models (LLMs) and artificial intelligence (AI) in big data management. Designed for both beginners and advanced learners, the book emphasizes the critical role of mathematical principles in developing scalable AI solutions. Practical examples and Python code are provided throughout, ensuring readers gain hands-on experience in applying theoretical knowledge to solve complex problems in ML, DL, and big data analytics.

data mining, natural language, programming language, (20 more...)

2410.19849

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(13 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.92)
Summary/Review (0.85)

Industry:

Education (1.00)
Transportation > Passenger (0.92)
Transportation > Ground > Road (0.92)
(2 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

Raihan, Nishat, Siddiq, Mohammed Latif, Santos, Joanna C. S., Zampieri, Marcos

Large Language Models in Computer Science Education: A Systematic Literature Review

Large language models (LLMs) are steadily improving at a wide range of natural language processing (NLP) tasks, such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). Foundational models such as the Generative Pre-trained Transformer (GPT) and LLaMA series have set strong baseline performances in various NL and PL tasks. Additionally, several models have been fine-tuned specifically for code generation, showing significant improvements in code-related applications. Both foundational and fine-tuned models are increasingly used in education, helping students write, debug, and understand code. We present a comprehensive systematic literature review to examine the impact of LLMs in computer science and computer engineering education. We analyze their effectiveness in enhancing the learning experience, supporting personalized education, and aiding educators in curriculum development. We address five research questions to uncover insights into how LLMs contribute to educational outcomes, identify challenges, and suggest directions for future research.

large language model, machine learning, natural language, (17 more...)

2410.16349

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Europe > Switzerland (0.04)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
(8 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report > New Finding (0.66)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tsipidi, Eleftheria, Nowak, Franz, Cotterell, Ryan, Wilcox, Ethan, Giulianelli, Mario, Warstadt, Alex

Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse

The Uniform Information Density (UID) hypothesis posits that speakers tend to distribute information evenly across linguistic units to achieve efficient communication. Of course, information rate in texts and discourses is not perfectly uniform. While these fluctuations can be viewed as theoretically uninteresting noise on top of a uniform target, another explanation is that UID is not the only functional pressure regulating information content in a language. Speakers may also seek to maintain interest, adhere to writing conventions, and build compelling arguments. In this paper, we propose one such functional pressure, namely that speakers modulate information rate based on location within a hierarchically structured model of discourse. We term this the Structured Context Hypothesis and test it by predicting the surprisal contours of naturally occurring discourses extracted from large language models using predictors derived from discourse structure. We find that hierarchical predictors are significant predictors of a discourse's information contour and that deeply nested hierarchical predictors are more predictive than shallow ones. This work takes an initial step beyond UID to propose testable hypotheses for why the information rate fluctuates in predictable ways.

artificial intelligence, large language model, natural language, (18 more...)

2410.16062

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Asia > Singapore (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(22 more...)

Genre:

Research Report > New Finding (0.94)
Instructional Material (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.71)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)

Kumar, Sahil, Paikar, Deepa, Vutukuri, Kiran Sai, Ali, Haider, Ainala, Shashidhar Reddy, Krishnan, Aditya Murli, Zhang, Youshan

KatzBot: Revolutionizing Academic Chatbot for Enhanced Communication

Effective communication within universities is crucial for addressing the diverse information needs of students, alumni, and external stakeholders. However, existing chatbot systems often fail to deliver accurate, context-specific responses, resulting in poor user experiences. In this paper, we present KatzBot, an innovative chatbot powered by KatzGPT, a custom Large Language Model (LLM) fine-tuned on domain-specific academic data. KatzGPT is trained on two university-specific datasets: 6,280 sentence-completion pairs and 7,330 question-answer pairs. KatzBot outperforms established open-source LLMs, achieving higher accuracy and domain relevance. KatzBot offers a user-friendly interface, significantly enhancing user satisfaction in real-world applications. The source code is publicly available at https://github.com/AiAI-99/katzbot.

large language model, machine learning, natural language, (20 more...)

2410.16385

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel (0.04)
Oceania > Palau (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Information Technology > Security & Privacy (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reza, Mohi, Anastasopoulos, Ioannis, Bhandari, Shreya, Pardos, Zachary A.

PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation

With the right design [46], such interfaces could enable experts to steer the output of LLMs toward content that better aligns with the nuances and needs of their domains, and transform the role of the subject matter expert from a producer to a curator: a competent and critical judge who instructs the AI agent on what is needed, evaluates the output, and iterates on the instructions until the results are satisfactory. Instead of replacing human experts, these interfaces could help bridge human intelligence with machine intelligence to dramatically reduce the time and effort required to create content that adheres to expert tastes and standards. To realize the producer-to-curator shift and integrate domain expertise more closely into prompt engineering, we need authoring interfaces that: (i) deeply embed LLMs within existing expert workflows, augmenting content creation with carefully scaffolded interface support for prompt engineering; (ii) encourage experimentation on many prompt variations to systematically test the impact of changes in instructional wording on model output; (iii) offer mechanisms for curating prompt formulations that work well at various levels of abstraction; and (iv) integrate generation into the publishing workflow. However, designing authoring interfaces that support experts across all four fronts is difficult, as LLMs pose unique usability challenges tied to high metacognitive demands during prompt construction [45], and users can struggle to get the models to integrate well with their existing workflow, as even small perturbations such as adding a space at the end of a prompt can cause the LLM to change its output [37]. Recent literature on prompt engineering has also highlighted how designing effective prompts can be surprisingly difficult for domain experts who are not AI specialists [8, 51].

large language model, machine learning, natural language, (17 more...)

2410.16547

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(3 more...)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.70)
Education > Educational Setting > Higher Education (0.68)
Education > Educational Setting > K-12 Education (0.47)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Kadiyala, Ram Mohan Rao, Rao, M. V. P. Chandra Sekhara

1024m at SMM4H 2024: Tasks 3, 5 & 6 -- Ensembles of Transformers and Large Language Models for Medical Text Classification

Social media is a rich source of data in which users report information about their health and how various things have affected them. This paper presents several approaches using Transformers, Large Language Models, and their ensembles, along with their performance, advantages, and drawbacks, on three tasks of SMM4H'24: classifying texts on the impact of nature and outdoor spaces on the author's mental health (Task 3), binary classification of tweets reporting children's health disorders such as asthma, autism, ADHD, and speech disorder (Task 5), and binary classification of users self-reporting their age (Task 6).

classification, large language model, machine learning, (22 more...)

2410.15998

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Attention Deficit/Hyperactivity Disorder (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

George, Thomas, Nodet, Pierre, Bondu, Alexis, Lemaire, Vincent

Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark

Mislabeled examples are ubiquitous in real-world machine learning datasets, motivating the development of techniques for automatic detection. We show that most mislabeled-example detection methods can be viewed as probing trained machine learning models using a few core principles. We formalize a modular framework that encompasses these methods, parameterized by only 4 building blocks, as well as a Python library that demonstrates that these principles can actually be implemented. The focus is on classifier-agnostic concepts, with an emphasis on adapting methods developed for deep learning models to non-deep classifiers for tabular data. We benchmark existing methods on (artificial) Noisy Completely At Random (NCAR) as well as (realistic) Noisy Not At Random (NNAR) labeling noise from a variety of tasks with imperfect labeling rules. This benchmark provides new insights into existing methods in this setup and exposes their limitations.

artificial intelligence, inductive learning, machine learning, (19 more...)

2410.15772

Country:

North America > United States > California (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Seo, Minhyuk, Koh, Hyunseo, Choi, Jonghyun

Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling

arXiv.org Artificial Intelligence, Oct-19-2024

The majority of online continual learning (CL) advocates single-epoch training and imposes restrictions on the size of replay memory. However, single-epoch training would incur a different amount of computation per CL algorithm, and the additional storage cost of keeping logits or models alongside replay memory is largely ignored in calculating the storage budget. Arguing that different computational and storage budgets hinder fair comparison among CL algorithms in practice, we propose to use floating point operations (FLOPs) and total memory size in bytes as metrics for computational and memory budgets, respectively, to compare and develop CL algorithms in the same 'total resource budget.' To improve a CL method in a limited total budget, we propose adaptive layer freezing that does not update the layers for less informative batches to reduce computational costs with a negligible loss of accuracy. In addition, we propose a memory retrieval method that allows the model to learn the same amount of knowledge as using random retrieval in fewer iterations. Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms the state-of-the-art methods within the same total budget.

artificial intelligence, machine learning, setup, (16 more...)

2410.15143

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre:

Instructional Material > Online (0.70)
Research Report > New Finding (0.46)

Industry: Education > Educational Setting > Online (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Maity, Subhankar, Deroy, Aniket

Human-Centric eXplainable AI in Education

arXiv.org Artificial Intelligence, Oct-18-2024

As artificial intelligence (AI) becomes more integrated into educational environments, how can we ensure that these systems are both understandable and trustworthy? The growing demand for explainability in AI systems is a critical area of focus. This paper explores Human-Centric eXplainable AI (HCXAI) in the educational landscape, emphasizing its role in enhancing learning outcomes, fostering trust among users, and ensuring transparency in AI-driven tools, particularly through the innovative use of large language models (LLMs). What challenges arise in the implementation of explainable AI in educational contexts? It outlines comprehensive frameworks for developing HCXAI systems that prioritize user understanding and engagement, ensuring that educators and students can effectively interact with these technologies. Furthermore, what steps can educators, developers, and policymakers take to create more effective, inclusive, and ethically responsible AI solutions in education? The paper provides targeted recommendations to address this question, highlighting the necessity of prioritizing explainability. By doing so, how can we leverage AI's transformative potential to foster equitable and engaging educational experiences that support diverse learners? The rapid advancement of AI technologies has transformed various sectors, including education, by introducing innovative solutions that enhance teaching and learning experiences. In recent years, AI systems have increasingly been utilized for personalized learning, assessment, and feedback mechanisms (Maghsudi et al., 2021; Maity and Deroy, 2024a; Maity and Deroy, 2024b).

artificial intelligence, natural language, student, (16 more...)

2410.19822

Country:

Asia > Middle East > UAE (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia > India > West Bengal > Kharagpur (0.04)
(6 more...)

Genre:

Research Report (1.00)
Instructional Material (0.69)

Industry:

Government (1.00)
Education > Educational Setting (1.00)
Information Technology > Security & Privacy (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)