Instructional Material
Development of CNN Architectures using Transfer Learning Methods for Medical Image Classification
Basyal, Ganga Prasad, Zeng, David, Rimal, Bhaskar Pm
The application of deep learning-based architecture has seen a tremendous rise in recent years. For example, medical image classification using deep learning achieved breakthrough results. Convolutional Neural Networks (CNNs) are implemented predominantly in medical image classification and segmentation. On the other hand, transfer learning has emerged as a prominent supporting tool for enhancing the efficiency and accuracy of deep learning models. This paper investigates the development of CNN architectures using transfer learning techniques in the field of medical image classification using a timeline mapping model for key image classification challenges. Our findings help make an informed decision while selecting the optimum and state-of-the-art CNN architectures.
Deep Learning and Machine Learning -- Python Data Structures and Mathematics Fundamental: From Theory to Practice
Chen, Silin, Bi, Ziqian, Liu, Junyu, Peng, Benji, Zhang, Sen, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Chen, Keyu, Yin, Caitlyn Heqi, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Ren, Jintao, Niu, Qian, Liu, Ming
This book provides a comprehensive introduction to the foundational concepts of machine learning (ML) and deep learning (DL). It bridges the gap between theoretical mathematics and practical application, focusing on Python as the primary programming language for implementing key algorithms and data structures. The book covers a wide range of topics, including basic and advanced Python programming, fundamental mathematical operations, matrix operations, linear algebra, and optimization techniques crucial for training ML and DL models. Advanced subjects like neural networks, optimization algorithms, and frequency domain methods are also explored, along with real-world applications of large language models (LLMs) and artificial intelligence (AI) in big data management. Designed for both beginners and advanced learners, the book emphasizes the critical role of mathematical principles in developing scalable AI solutions. Practical examples and Python code are provided throughout, ensuring readers gain hands-on experience in applying theoretical knowledge to solve complex problems in ML, DL, and big data analytics.
Large Language Models in Computer Science Education: A Systematic Literature Review
Raihan, Nishat, Siddiq, Mohammed Latif, Santos, Joanna C. S., Zampieri, Marcos
Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing tasks (NLP), such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). Foundational models such as the Generative Pre-trained Transformer (GPT) and LLaMA series have set strong baseline performances in various NL and PL tasks. Additionally, several models have been fine-tuned specifically for code generation, showing significant improvements in code-related applications. Both foundational and fine-tuned models are increasingly used in education, helping students write, debug, and understand code. We present a comprehensive systematic literature review to examine the impact of LLMs in computer science and computer engineering education. We analyze their effectiveness in enhancing the learning experience, supporting personalized education, and aiding educators in curriculum development. We address five research questions to uncover insights into how LLMs contribute to educational outcomes, identify challenges, and suggest directions for future research.
Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse
Tsipidi, Eleftheria, Nowak, Franz, Cotterell, Ryan, Wilcox, Ethan, Giulianelli, Mario, Warstadt, Alex
The Uniform Information Density (UID) hypothesis posits that speakers tend to distribute information evenly across linguistic units to achieve efficient communication. Of course, information rate in texts and discourses is not perfectly uniform. While these fluctuations can be viewed as theoretically uninteresting noise on top of a uniform target, another explanation is that UID is not the only functional pressure regulating information content in a language. Speakers may also seek to maintain interest, adhere to writing conventions, and build compelling arguments. In this paper, we propose one such functional pressure; namely that speakers modulate information rate based on location within a hierarchically-structured model of discourse. We term this the Structured Context Hypothesis and test it by predicting the surprisal contours of naturally occurring discourses extracted from large language models using predictors derived from discourse structure. We find that hierarchical predictors are significant predictors of a discourse's information contour and that deeply nested hierarchical predictors are more predictive than shallow ones. This work takes an initial step beyond UID to propose testable hypotheses for why the information rate fluctuates in predictable ways
KatzBot: Revolutionizing Academic Chatbot for Enhanced Communication
Kumar, Sahil, Paikar, Deepa, Vutukuri, Kiran Sai, Ali, Haider, Ainala, Shashidhar Reddy, Krishnan, Aditya Murli, Zhang, Youshan
Effective communication within universities is crucial for addressing the diverse information needs of students, alumni, and external stakeholders. However, existing chatbot systems often fail to deliver accurate, context-specific responses, resulting in poor user experiences. In this paper, we present KatzBot, an innovative chatbot powered by KatzGPT, a custom Large Language Model (LLM) fine-tuned on domain-specific academic data. KatzGPT is trained on two university-specific datasets: 6,280 sentence-completion pairs and 7,330 question-answer pairs. KatzBot outperforms established existing open source LLMs, achieving higher accuracy and domain relevance. KatzBot offers a user-friendly interface, significantly enhancing user satisfaction in real-world applications. The source code is publicly available at \url{https://github.com/AiAI-99/katzbot}.
PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation
Reza, Mohi, Anastasopoulos, Ioannis, Bhandari, Shreya, Pardos, Zachary A.
With the right design [46], such interfaces could enable experts to steer the output of LLMs toward content that better aligns with the nuances and needs of their domains, and transform the role of the subject matter expert from a producer to a curator--a competent and critical judge who instructs the AI agent on what is needed, evaluates the output, and iterates on the instructions until the results are satisfactory. Instead of replacing human experts, these interfaces could help bridge human intelligence with machine intelligence to dramatically reduce the time and effort required to create content that adheres to expert tastes and standards. To realize the producer-to-curator shift and integrate domain expertise more closely into prompt engineering, we need authoring interfaces that: (i) deeply embed LLMs within existing expert workflows, augmenting content creation with carefully scaffolded interface support for prompt engineering; (ii) encourage experimentation on many prompt variations to systematically test the impact of changes in instructional wording on model output; (iii) offer mechanisms for curating prompt formulations that work well at various levels of abstraction; (iv) integrate generation into the publishing workflow. However, designing authoring interfaces that support experts across all four fronts is difficult as LLMs pose unique usability challenges tied to high metacognitive demands during prompt construction [45], and users can struggle to get the models to integrate well with their existing workflow as even small perturbations such as adding a space at the end of a prompt can cause the LLM to change its output [37]. For domain experts who aren't AI specialists, recent literature on prompt engineering has also highlighted how designing effective prompts can be surprisingly difficult for non-AI experts [8, 51].
1024m at SMM4H 2024: Tasks 3, 5 & 6 -- Ensembles of Transformers and Large Language Models for Medical Text Classification
Kadiyala, Ram Mohan Rao, Rao, M. V. P. Chandra Sekhara
Social media is a great source of data for users reporting information and regarding their health and how various things have had an effect on them. This paper presents various approaches using Transformers and Large Language Models and their ensembles, their performance along with advantages and drawbacks for various tasks of SMM4H'24 - Classifying texts on impact of nature and outdoor spaces on the author's mental health (Task 3), Binary classification of tweets reporting their children's health disorders like Asthma, Autism, ADHD and Speech disorder (task 5), Binary classification of users self-reporting their age (task 6).
Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark
George, Thomas, Nodet, Pierre, Bondu, Alexis, Lemaire, Vincent
Mislabeled examples are ubiquitous in real-world machine learning datasets, advocating the development of techniques for automatic detection. We show that most mislabeled detection methods can be viewed as probing trained machine learning models using a few core principles. We formalize a modular framework that encompasses these methods, parameterized by only 4 building blocks, as well as a Python library that demonstrates that these principles can actually be implemented. The focus is on classifier-agnostic concepts, with an emphasis on adapting methods developed for deep learning models to non-deep classifiers for tabular data. We benchmark existing methods on (artificial) Completely At Random (NCAR) as well as (realistic) Not At Random (NNAR) labeling noise from a variety of tasks with imperfect labeling rules. This benchmark provides new insights as well as limitations of existing methods in this setup.
Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
Seo, Minhyuk, Koh, Hyunseo, Choi, Jonghyun
The majority of online continual learning (CL) advocates single-epoch training and imposes restrictions on the size of replay memory. However, single-epoch training would incur a different amount of computations per CL algorithm, and the additional storage cost to store logit or model in addition to replay memory is largely ignored in calculating the storage budget. Arguing different computational and storage budgets hinder fair comparison among CL algorithms in practice, we propose to use floating point operations (FLOPs) and total memory size in Byte as a metric for computational and memory budgets, respectively, to compare and develop CL algorithms in the same 'total resource budget.' To improve a CL method in a limited total budget, we propose adaptive layer freezing that does not update the layers for less informative batches to reduce computational costs with a negligible loss of accuracy. In addition, we propose a memory retrieval method that allows the model to learn the same amount of knowledge as using random retrieval in fewer iterations. Empirical validations on the CIFAR-10/100, CLEAR-10/100, and ImageNet-1K datasets demonstrate that the proposed approach outperforms the state-of-the-art methods within the same total budget
Human-Centric eXplainable AI in Education
Maity, Subhankar, Deroy, Aniket
As artificial intelligence (AI) becomes more integrated into educational environments, how can we ensure that these systems are both understandable and trustworthy? The growing demand for explainability in AI systems is a critical area of focus. This paper explores Human-Centric eXplainable AI (HCXAI) in the educational landscape, emphasizing its role in enhancing learning outcomes, fostering trust among users, and ensuring transparency in AI-driven tools, particularly through the innovative use of large language models (LLMs). What challenges arise in the implementation of explainable AI in educational contexts? It outlines comprehensive frameworks for developing HCXAI systems that prioritize user understanding and engagement, ensuring that educators and students can effectively interact with these technologies. Furthermore, what steps can educators, developers, and policymakers take to create more effective, inclusive, and ethically responsible AI solutions in education? The paper provides targeted recommendations to address this question, highlighting the necessity of prioritizing explainability. By doing so, how can we leverage AI's transformative potential to foster equitable and engaging educational experiences that support diverse learners? The rapid advancement of AI technologies has transformed various sectors, including education, by introducing innovative solutions that enhance teaching and learning experiences. In recent years, AI systems have increasingly been utilized for personalized learning, assessment, and feedback mechanisms (Maghsudi et al., 2021; Maity and Deroy, 2024a; Maity and Deroy, 2024b).