AITopics

2305.09535

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)
Personal (0.46)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.66)

arXiv.org Artificial Intelligence, Aug-20-2023

LMTuner: An user-friendly and highly-integrable Training Framework for fine-tuning Large Language Models

Weng, Yixuan, Wang, Zhiqi, Liao, Huanxuan, He, Shizhu, Liu, Shengping, Liu, Kang, Zhao, Jun

With the burgeoning development in the realm of large language models (LLMs), the demand for efficient incremental training tailored to specific industries and domains continues to increase. Currently, the predominantly employed frameworks lack modular design, so it often takes a lot of coding work to kickstart the training of an LLM. To address this, we present "LMTuner", a highly usable, integrable, and scalable system for training LLMs expeditiously and with minimal user input. LMTuner comprises three main modules: the Interaction, Training, and Inference Modules. We advocate that LMTuner's usability and integrality alleviate the complexities in training large language models. Remarkably, even a novice user could commence training large language models within five minutes. Furthermore, it integrates DeepSpeed frameworks and supports Efficient Fine-Tuning methodologies like Low Rank Adaptation (LoRA), Quantized LoRA (QLoRA), etc., enabling the training of language models scaling from 300M to 130B parameters using a single server. The LMTuner's homepage (https://wengsyx.github.io/LMTuner/) and screencast video (https://youtu.be/nsXmWOmN3rE) are now publicly available.
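
The Low Rank Adaptation (LoRA) idea named in this abstract can be illustrated with a minimal sketch. It is not LMTuner's code: the function names (`matvec`, `lora_forward`, `param_counts`), the tiny shapes, and the scaling `alpha / r` are assumptions chosen for illustration, following the usual LoRA formulation in which a frozen weight `W` is augmented by a trainable low-rank product `B @ A`.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def lora_forward(x, w, a, b, alpha):
    """Frozen path W x plus the scaled low-rank path (alpha / r) * B (A x)."""
    r = len(a)  # rank = number of rows of A
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / r
    return [h + scale * u for h, u in zip(base, update)]

def param_counts(d_out, d_in, r):
    """Trainable parameters: full fine-tuning vs. a rank-r adapter."""
    return d_out * d_in, r * (d_in + d_out)

# Illustrative shapes only (hypothetical, far smaller than any real layer).
rng = random.Random(0)
d_out, d_in, r = 6, 8, 2
w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]  # zero init: the adapter starts as a no-op
x = [rng.gauss(0, 1) for _ in range(d_in)]

y = lora_forward(x, w, a, b, alpha=4)
full, adapter = param_counts(4096, 4096, 8)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and only the small `A` and `B` matrices would need gradients during fine-tuning.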

dataset, language model, lmtuner

2308.10252

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Ontario > Toronto (0.04)

Genre:

Instructional Material (0.69)
Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Aug-20-2023

PaniniQA: Enhancing Patient Education Through Interactive Question Answering

Cai, Pengshan, Yao, Zonghai, Liu, Fei, Wang, Dakuo, Reilly, Meghan, Zhou, Huixue, Li, Lingxi, Cao, Yi, Kapoor, Alok, Bajracharya, Adarsha, Berlowitz, Dan, Yu, Hong

Patient portals allow discharged patients to access their personalized discharge instructions in electronic health records (EHRs). However, many patients have difficulty understanding or memorizing their discharge instructions. In this paper, we present PaniniQA, a patient-centric interactive question answering system designed to help patients understand their discharge instructions. PaniniQA first identifies important clinical content from patients' discharge instructions and then formulates patient-specific educational questions. In addition, PaniniQA is equipped with answer verification functionality to provide timely feedback that corrects patients' misunderstandings. Our comprehensive automatic and human evaluation results demonstrate that PaniniQA is capable of improving patients' mastery of their medical instructions through effective interactions.

large language model, machine learning, question answering

2308.03253

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Health & Medicine > Health Care Providers & Services (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Yu, Longlin, Zhang, Cheng

Semi-Implicit Variational Inference via Score Matching

Semi-implicit variational inference (SIVI) greatly enriches the expressiveness of variational families by considering implicit variational distributions defined in a hierarchical manner. However, due to the intractable densities of variational distributions, current SIVI approaches often use surrogate evidence lower bounds (ELBOs) or employ expensive inner-loop MCMC runs for unbiased ELBOs for training. In this paper, we propose SIVI-SM, a new method for SIVI based on an alternative training objective via score matching. Leveraging the hierarchical structure of semi-implicit variational families, the score matching objective allows a minimax formulation where the intractable variational densities can be naturally handled with denoising score matching. We show that SIVI-SM closely matches the accuracy of MCMC and outperforms ELBO-based SIVI methods in a variety of Bayesian inference tasks.
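
The role of denoising score matching mentioned here can be made concrete with the generic identity below. The notation is an assumption, not taken from the paper: $q(x) = \int q(x \mid z)\, q(z)\, dz$ is a semi-implicit distribution with a tractable conditional $q(x \mid z)$, and $f$ is any candidate score function. The abstract does not give SIVI-SM's exact minimax objective, so this shows only the standard identity that makes the intractable marginal score avoidable.

```latex
\mathbb{E}_{q(x)}\!\left[\big\| f(x) - \nabla_x \log q(x) \big\|_2^2\right]
=
\mathbb{E}_{q(z)\, q(x \mid z)}\!\left[\big\| f(x) - \nabla_x \log q(x \mid z) \big\|_2^2\right]
+ C,
\qquad C \text{ independent of } f .
```

The right-hand side needs only the conditional score $\nabla_x \log q(x \mid z)$, which is available in closed form for common choices such as a Gaussian conditional, so the marginal density $q(x)$ never has to be evaluated.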

artificial intelligence, machine learning, variational inference

2308.10014

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.05)
Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (0.65)
Instructional Material (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models

Zhang, Liwen, Cai, Weige, Liu, Zhaowei, Yang, Zhi, Dai, Wei, Liao, Yujie, Qin, Qianru, Li, Yifei, Liu, Xingyu, Liu, Zhiqiang, Zhu, Zhoufan, Wu, Anbo, Guo, Xin, Chen, Yun

Large language models (LLMs) have demonstrated exceptional performance in various natural language processing tasks, yet their efficacy in more challenging and domain-specific tasks remains largely unexplored. This paper presents FinEval, a benchmark specifically designed to assess financial domain knowledge in LLMs. FinEval is a collection of high-quality multiple-choice questions covering Finance, Economy, Accounting, and Certificate. It includes 4,661 questions spanning 34 different academic subjects. To ensure a comprehensive model performance evaluation, FinEval employs a range of prompt types, including zero-shot and few-shot prompts, as well as answer-only and chain-of-thought prompts. We evaluate state-of-the-art Chinese and English LLMs on FinEval; the results show that only GPT-4 achieved an accuracy close to 70% across different prompt settings, indicating significant room for growth for LLMs in financial domain knowledge. Our work offers a more comprehensive financial knowledge evaluation benchmark, using mock exam data and covering a wide range of evaluated LLMs.
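
The four prompt settings named in this abstract (zero-shot or few-shot, answer-only or chain-of-thought) can be sketched as follows. The templates, the `build_prompt` and `accuracy` helpers, and the English wording are hypothetical; FinEval's actual prompts are not given in the abstract and would be in Chinese.

```python
def build_prompt(question, choices, examples=(), chain_of_thought=False):
    """Format one multiple-choice question, optionally preceded by few-shot examples.

    examples: iterable of (question, choices, answer_label) tuples; empty means zero-shot.
    chain_of_thought: if True, ask for reasoning instead of a bare answer label.
    """
    def render(q, opts):
        lines = [f"Question: {q}"]
        lines += [f"{label}. {text}" for label, text in zip("ABCD", opts)]
        return "\n".join(lines)

    parts = []
    for ex_q, ex_opts, ex_answer in examples:
        parts.append(render(ex_q, ex_opts) + f"\nAnswer: {ex_answer}")
    suffix = "Let's think step by step." if chain_of_thought else "Answer:"
    parts.append(render(question, choices) + "\n" + suffix)
    return "\n\n".join(parts)

def accuracy(predictions, gold):
    """Fraction of predicted answer labels that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Zero-shot, answer-only prompt for a made-up question.
zero_shot = build_prompt("Which statement reports cash inflows?",
                         ["Balance sheet", "Cash flow statement", "Ledger", "Invoice"])

# One-shot, chain-of-thought prompt using a made-up worked example.
shots = [("1 + 1 = ?", ["1", "2", "3", "4"], "B")]
few_shot_cot = build_prompt("2 + 2 = ?", ["2", "3", "4", "5"],
                            examples=shots, chain_of_thought=True)
```

Scoring then reduces to comparing the extracted answer label for each question with the gold label, for example `accuracy(["A", "B"], ["A", "C"])`.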

category, large language model, machine learning

2308.09975

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Middle East > Jordan (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Education (1.00)
Banking & Finance > Trading (0.93)
Banking & Finance > Economy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Story Visualization by Online Text Augmentation with Context Memory

Ahn, Daechul, Kim, Daneul, Song, Gwangmo, Kim, Seung Hwan, Lee, Honglak, Kang, Dongyeop, Choi, Jonghyun

Story visualization (SV) is a challenging text-to-image generation task because it requires not only rendering visual details from the text descriptions but also encoding a long-term context across multiple sentences. While prior efforts mostly focus on generating a semantically relevant image for each sentence, encoding a context spread across the given paragraph to generate contextually convincing images (e.g., with a correct character or with a proper background of the scene) remains a challenge. To this end, we propose a novel memory architecture for the Bi-directional Transformer framework with an online text augmentation that generates multiple pseudo-descriptions as supplementary supervision during training for better generalization to the language variation at inference. In extensive experiments on the two popular SV benchmarks, i.e., Pororo-SV and Flintstones-SV, the proposed method significantly outperforms the state of the art on various metrics including FID, character F1, frame accuracy, BLEU-2/3, and R-precision with similar or less computational complexity.

artificial intelligence, machine learning, natural language

2308.07575

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Michigan (0.04)

Genre:

Instructional Material > Online (0.61)
Instructional Material > Course Syllabus & Notes (0.61)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

AdaER: An Adaptive Experience Replay Approach for Continual Lifelong Learning

Li, Xingyu, Tang, Bo, Li, Haifeng

Continual lifelong learning is a machine learning framework inspired by human learning, where learners are trained to continuously acquire new knowledge in a sequential manner. However, the non-stationary nature of streaming training data poses a significant challenge known as catastrophic forgetting, which refers to the rapid forgetting of previously learned knowledge when new tasks are introduced. While some approaches, such as experience replay (ER), have been proposed to mitigate this issue, their performance remains limited, particularly in the class-incremental scenario, which is considered natural and highly challenging. In this paper, we present a novel algorithm, called adaptive-experience replay (AdaER), to address the challenge of continual lifelong learning. AdaER consists of two stages: memory replay and memory update. In the memory replay stage, AdaER introduces a contextually-cued memory recall (C-CMR) strategy, which selectively replays memories that are most conflicting with the current input data in terms of both data and task. In the memory update stage, AdaER incorporates an entropy-balanced reservoir sampling (E-BRS) strategy to enhance the performance of the memory buffer by maximizing information entropy. To evaluate the effectiveness of AdaER, we conduct experiments on established supervised continual lifelong learning benchmarks, specifically focusing on class-incremental learning scenarios. The results demonstrate that AdaER outperforms existing continual lifelong learning baselines, highlighting its efficacy in mitigating catastrophic forgetting and improving learning performance.
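
The memory update stage builds on reservoir sampling, which can be sketched as below. This is the plain, uniform reservoir sampling algorithm, not the paper's entropy-balanced variant (E-BRS), whose details the abstract does not give. The `label_entropy` helper is likewise an assumption: it only measures how evenly class labels are spread in the buffer, which is the quantity an entropy-balanced scheme would try to keep high.

```python
import math
import random
from collections import Counter

def reservoir_update(buffer, capacity, item, n_seen, rng):
    """Plain reservoir sampling step.

    n_seen counts the items observed so far, including `item` (1-based).
    After the call, every observed item is in the buffer with equal probability.
    """
    if len(buffer) < capacity:
        buffer.append(item)
    else:
        j = rng.randrange(n_seen)  # uniform integer in [0, n_seen)
        if j < capacity:
            buffer[j] = item

def label_entropy(buffer):
    """Shannon entropy (in nats) of the class labels stored in the buffer."""
    counts = Counter(label for _, label in buffer)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Toy stream of (sample_id, class_label) pairs with three classes.
stream = [(i, i % 3) for i in range(100)]
rng = random.Random(0)
buffer = []
for n_seen, sample in enumerate(stream, start=1):
    reservoir_update(buffer, 10, sample, n_seen, rng)
```

With three classes the entropy of the buffer is at most `math.log(3)`; a buffer dominated by one class scores lower, which is the imbalance an entropy-aware update rule would aim to avoid.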

artificial intelligence, learning, machine learning

2308.0381

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Fujian Province > Xiamen (0.04)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.48)

Industry: Education > Educational Setting > Continuing Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.66)

arXiv.org Artificial Intelligence, Aug-18-2023

Exploring the Power of Creative AI Tools and Game-Based Methodologies for Interactive Web-Based Programming

Kenwright, Benjamin

In recent years, the fields of artificial intelligence and web-based programming have seen tremendous advancements, enabling developers to create dynamic and interactive websites and applications. At the forefront of these advancements, creative AI tools and game-based methodologies have emerged as potent instruments, promising enhanced user experiences and increased engagement in educational environments. This chapter explores the potential of these tools and methodologies for interactive web-based programming, examining their benefits, limitations, and real-world applications. We examine the challenges and ethical considerations that arise when integrating these technologies into web development, such as privacy concerns and the potential for bias in AI-generated content. Through this exploration, we aim to provide insights into the exciting possibilities that creative AI tools and game-based methodologies offer for the future of web-based programming.

artificial intelligence, machine learning, natural language

2308.11649

Country:

Asia (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)

Genre:

Research Report (1.00)
Instructional Material (1.00)
Overview > Innovation (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Moon, Jun-Yeong, Park, Keon-Hee, Kim, Jung Uk, Park, Gyeong-Moon

Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning

arXiv.org Artificial Intelligence, Aug-18-2023

Continual learning aims to learn a model from a continuous stream of data, but it mainly assumes a fixed amount of data and a fixed number of tasks with clear task boundaries. However, in real-world scenarios, the amount of input data and the number of tasks change constantly in a stochastic rather than static way. Although recently introduced incremental learning scenarios with blurry task boundaries somewhat address these issues, they still do not fully reflect the statistical properties of real-world situations because of the fixed ratio of disjoint and blurry samples. In this paper, we propose a new Stochastic incremental Blurry task boundary scenario, called Si-Blurry, which reflects the stochastic properties of the real world. We find that there are two major challenges in the Si-Blurry scenario: (1) inter- and intra-task forgetting and (2) the class imbalance problem. To alleviate them, we introduce Mask and Visual Prompt tuning (MVP). In MVP, to address the inter- and intra-task forgetting issues, we propose a novel instance-wise logit masking and contrastive visual prompt tuning loss. Both help our model discern the classes to be learned in the current batch, which consolidates previous knowledge. In addition, to alleviate the class imbalance problem, we introduce a new gradient similarity-based focal loss and adaptive feature scaling to ease overfitting to the major classes and underfitting to the minor classes. Extensive experiments show that our proposed MVP significantly outperforms the existing state-of-the-art methods in our challenging Si-Blurry scenario.
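
The focal loss that the class imbalance remedy builds on can be sketched as follows. This is the standard focal loss, not the gradient similarity-based variant proposed in the paper, whose definition the abstract does not give. The function names and the default `gamma=2.0` are assumptions for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0):
    """Standard focal loss: -(1 - p_t)^gamma * log(p_t) for the target class.

    gamma = 0 recovers ordinary cross-entropy; larger gamma down-weights
    examples the model already classifies confidently.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

logits = [2.0, 0.5, -1.0]
easy = focal_loss(logits, target=0)           # confident, correct class
hard = focal_loss(logits, target=2)           # low-probability class
plain_ce = focal_loss(logits, target=0, gamma=0.0)
```

Because well-classified examples contribute less, frequent (major) classes dominate the gradient less, which is the general mechanism the abstract appeals to when easing overfitting to major classes.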

artificial intelligence, machine learning, scenario

2308.09303

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Aug-18-2023

Relation-aware graph structure embedding with co-contrastive learning for drug-drug interaction prediction

Jiang, Mengying, Liu, Guizhong, Zhao, Biao, Su, Yuanchao, Jin, Weiqiang

Relation-aware graph structure embedding is promising for predicting multi-relational drug-drug interactions (DDIs). Most existing methods begin by constructing a multi-relational DDI graph and then learn relation-aware graph structure embeddings (RaGSEs) of drugs from the DDI graph. Nevertheless, these approaches are usually limited in learning RaGSEs of new drugs, leading to serious over-fitting when the test DDIs involve such drugs. To alleviate this issue, we propose a novel DDI prediction method based on relation-aware graph structure embedding with co-contrastive learning, RaGSECo. The proposed RaGSECo constructs two heterogeneous drug graphs: a multi-relational DDI graph and a multi-attribute drug-drug similarity (DDS) graph. The two graphs are used respectively for learning and propagating the RaGSEs of drugs, aiming to ensure that all drugs, including new ones, possess effective RaGSEs. Additionally, we present a novel co-contrastive learning module to learn drug-pair (DP) representations. This mechanism learns DP representations from two distinct views (interaction and similarity views) and encourages these views to supervise each other collaboratively to obtain more discriminative DP representations. We evaluate the effectiveness of our RaGSECo on three different tasks using two real datasets. The experimental results demonstrate that RaGSECo outperforms existing state-of-the-art prediction methods.

artificial intelligence, machine learning, representation

2307.01507

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Washington > King County > Seattle (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.48)
Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)