He, Ting
Qwen2.5-Omni Technical Report
Xu, Jin, Guo, Zhifang, He, Jinzheng, Hu, Hangrui, He, Ting, Bai, Shuai, Chen, Keqin, Wang, Jialin, Fan, Yang, Dang, Kai, Zhang, Bin, Wang, Xiong, Chu, Yunfei, Lin, Junyang
In this report, we present Qwen2.5-Omni, an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner. To enable streaming of multimodal inputs, both the audio and visual encoders use a block-wise processing approach. To synchronize the timestamps of video inputs with audio, we organize the audio and video sequentially in an interleaved manner and propose a novel position embedding approach, named TMRoPE (Time-aligned Multimodal RoPE). To concurrently generate text and speech while avoiding interference between the two modalities, we propose the Thinker-Talker architecture. In this framework, Thinker functions as a large language model tasked with text generation, while Talker is a dual-track autoregressive model that directly utilizes the hidden representations from the Thinker to produce audio tokens as output. Both the Thinker and Talker models are designed to be trained and inferred in an end-to-end manner. To decode audio tokens in a streaming manner, we introduce a sliding-window DiT that restricts the receptive field, aiming to reduce the initial packet delay. Qwen2.5-Omni is comparable with the similarly sized Qwen2.5-VL and outperforms Qwen2-Audio. Furthermore, Qwen2.5-Omni achieves state-of-the-art performance on multimodal benchmarks like Omni-Bench. Notably, Qwen2.5-Omni's performance in end-to-end speech instruction following is comparable to its capabilities with text inputs, as evidenced by benchmarks such as MMLU and GSM8K. As for speech generation, Qwen2.5-Omni's streaming Talker outperforms most existing streaming and non-streaming alternatives in robustness and naturalness.
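To make the time-aligned interleaving concrete, here is a minimal sketch (not the official implementation) of ordering audio and video tokens by timestamp within fixed-length chunks, in the spirit of TMRoPE's interleaved organization; the chunk length, token rates, and per-chunk ordering below are illustrative assumptions.

def interleave_by_time(audio_tokens, video_tokens, chunk_sec=2.0):
    """audio_tokens / video_tokens: lists of (timestamp_sec, token)."""
    merged = []
    t = 0.0
    max_t = max(
        audio_tokens[-1][0] if audio_tokens else 0.0,
        video_tokens[-1][0] if video_tokens else 0.0,
    )
    while t <= max_t:
        # gather all tokens whose timestamp falls inside the current chunk
        a = [tok for ts, tok in audio_tokens if t <= ts < t + chunk_sec]
        v = [tok for ts, tok in video_tokens if t <= ts < t + chunk_sec]
        merged.extend(v + a)   # video tokens first, then audio, per time chunk
        t += chunk_sec
    return merged

# toy usage: 2 video tokens and 4 audio tokens per second
video = [(i * 0.5, f"v{i}") for i in range(8)]
audio = [(i * 0.25, f"a{i}") for i in range(16)]
print(interleave_by_time(audio, video))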
AI-assisted Knowledge Discovery in Biomedical Literature to Support Decision-making in Precision Oncology
He, Ting, Kreimeyer, Kory, Najjar, Mimi, Spiker, Jonathan, Fatteh, Maria, Anagnostou, Valsamo, Botsis, Taxiarchis
The delivery of appropriate targeted therapies to cancer patients requires the complete analysis of the molecular profiling of tumors and the patient's clinical characteristics in the context of existing knowledge and recent findings described in the biomedical literature and several other sources. We evaluated the potential contributions of specific natural language processing solutions to support knowledge discovery from biomedical literature. Two models from the Bidirectional Encoder Representations from Transformers (BERT) family, two Large Language Models, and PubTator 3.0 were tested for their ability to support the named entity recognition (NER) and relation extraction (RE) tasks. PubTator 3.0 and the BioBERT model performed best in the NER task (best F1-scores of 0.93 and 0.89, respectively), while BioBERT outperformed all other solutions in the RE task (best F1-score of 0.79) and in a specific use case it was applied to, recognizing nearly all entity mentions and most of the relations.
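As a hedged illustration of the NER setup described above, the sketch below runs a BERT-family token-classification model through the Hugging Face pipeline API; the checkpoint path is a placeholder for a BioBERT model fine-tuned for biomedical NER and is an assumption, not the authors' exact configuration.

from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/biobert-ner-checkpoint",   # placeholder, not a real model id
    aggregation_strategy="simple",            # merge word pieces into entity spans
)

text = "Patients with EGFR L858R mutations often respond to osimertinib."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))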
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
An, Keyu, Chen, Qian, Deng, Chong, Du, Zhihao, Gao, Changfeng, Gao, Zhifu, Gu, Yue, He, Ting, Hu, Hangrui, Hu, Kai, Ji, Shengpeng, Li, Yabin, Li, Zerui, Lu, Heng, Luo, Haoneng, Lv, Xiang, Ma, Bin, Ma, Ziyang, Ni, Chongjia, Song, Changhe, Shi, Jiaqi, Shi, Xian, Wang, Hao, Wang, Wen, Wang, Yuxuan, Xiao, Zhangyu, Yan, Zhijie, Yang, Yexin, Zhang, Bin, Zhang, Qinglin, Zhang, Shiliang, Zhao, Nan, Zheng, Siqi
This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emotion recognition, and audio event detection; and CosyVoice, which facilitates natural speech generation with control over multiple languages, timbre, speaking style, and speaker identity. SenseVoice-Small delivers exceptionally low-latency ASR for 5 languages, and SenseVoice-Large supports high-precision ASR for over 50 languages, while CosyVoice excels in multi-lingual voice generation, zero-shot in-context learning, cross-lingual voice cloning, and instruction-following capabilities. The models related to SenseVoice and CosyVoice have been open-sourced on Modelscope and Huggingface, along with the corresponding training, inference, and fine-tuning code released on GitHub. By integrating these models with LLMs, FunAudioLLM enables applications such as speech-to-speech translation, emotional voice chat, interactive podcasts, and expressive audiobook narration, thereby pushing the boundaries of voice interaction technology. Demos are available at https://fun-audio-llm.github.io, and the code can be accessed at https://github.com/FunAudioLLM.
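The speech-to-speech applications mentioned above follow a recognize-reason-synthesize composition; the sketch below shows that flow with hypothetical transcribe(), chat(), and synthesize() stand-ins for SenseVoice, an LLM, and CosyVoice, and is not the project's actual API.

def speech_to_speech(audio_in, transcribe, chat, synthesize):
    """Turn a user utterance (waveform) into a spoken reply (waveform)."""
    result = transcribe(audio_in)                      # SenseVoice-style ASR plus
    text, emotion = result["text"], result["emotion"]  # emotion/event tags
    reply = chat(text, style_hint=emotion)             # LLM produces the reply text
    return synthesize(reply, style=emotion)            # CosyVoice-style expressive TTS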
Energy-efficient Decentralized Learning via Graph Sparsification
Zhang, Xusheng, Chiu, Cho-Chun, He, Ting
This work aims at improving the energy efficiency of decentralized learning by optimizing the mixing matrix, which controls the communication demands during the learning process. Through rigorous analysis based on a state-of-the-art decentralized learning algorithm, the problem is formulated as a bi-level optimization, with the lower level solved by graph sparsification. A solution with guaranteed performance is proposed for the special case of a fully-connected base topology, and a greedy heuristic is proposed for the general case. Simulations based on a real topology and dataset show that the proposed solution can lower the energy consumption at the busiest node by 54%-76% while maintaining the quality of the trained model.
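For intuition about the sparsification step, here is a generic greedy heuristic, sketched under assumed (made-up) per-link energy costs, that trims communication links while preserving connectivity; it illustrates the flavor of the approach rather than reproducing the paper's algorithm.

import networkx as nx

def greedy_sparsify(G, energy, budget):
    """Remove up to `budget` edges, costliest first, keeping G connected.
    G: networkx Graph; energy: dict mapping edge (u, v) -> energy cost."""
    H = G.copy()
    removed = 0
    for u, v in sorted(H.edges, key=lambda e: energy[e], reverse=True):
        if removed == budget:
            break
        H.remove_edge(u, v)
        if nx.is_connected(H):
            removed += 1
        else:
            H.add_edge(u, v)   # undo: removing this link would disconnect the graph
    return H

# toy usage on a fully-connected 5-node base topology with placeholder costs
G = nx.complete_graph(5)
energy = {e: e[0] + e[1] + 1 for e in G.edges}
print(greedy_sparsify(G, energy, budget=4).edges)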
Sharing Models or Coresets: A Study based on Membership Inference Attack
Lu, Hanlin, Liu, Changchang, He, Ting, Wang, Shiqiang, Chan, Kevin S.
We present the first comparison between the federated learning approach and the coreset-based approach to distributed machine learning in terms of target model accuracy, communication cost, and data privacy. While each approach preserves data privacy to some extent thanks to not sharing the raw data, the exact extent of protection is unclear under sophisticated attacks that try to infer the raw data from the shared information; in particular, it has been shown in (Shokri et al., 2017) that models derived from a dataset can be used to infer the membership of the dataset (i.e., whether or not a given data record is contained in the dataset), known as the membership inference attack (MIA). We therefore measure privacy by the accuracy of this state-of-the-art attack strategy. Our experiments quantify the accuracy-privacy-cost tradeoff of each approach, and reveal a nontrivial comparison that can be used to guide the design of model training processes.
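To ground the privacy metric, the following is a minimal confidence-threshold membership inference sketch on synthetic data; it is a generic baseline for illustration only, not the attack of Shokri et al. (2017) used in the paper, and the threshold is an assumption.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.5, random_state=0)
target = LogisticRegression(max_iter=1000).fit(X_in, y_in)   # "members" = X_in

def confidence(model, X, y):
    # probability the model assigns to the true label of each example
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

thr = 0.5   # assumed threshold: members tend to receive higher confidence
guess_in = confidence(target, X_in, y_in) > thr      # ideally mostly True
guess_out = confidence(target, X_out, y_out) > thr   # ideally more often False
mia_accuracy = 0.5 * (guess_in.mean() + (~guess_out).mean())
print(f"membership inference accuracy: {mia_accuracy:.2f}")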
Online Learning of Facility Locations
Pasteris, Stephen, He, Ting, Vitale, Fabio, Wang, Shiqiang, Herbster, Mark
In this paper we consider an online learning version of the facility location problem, where users need to be served one at a time in a sequence of trials. The goal is to select, at each trial, a subset of a given set of sites, and then pay a loss equal to their total "opening cost" plus the minimum "connection cost" for connecting the user to one of the sites in the subset. More precisely, we are given a set of N sites. At the beginning of each trial, an opening cost and a connection cost for the arriving user are associated with each site and are unknown to the learner. At each trial, the learner has to select a subset of sites and incurs a loss given by the minimum connection cost over the selected sites plus the sum of the opening costs of all selected sites. After each subset selection, the opening and connection costs of all sites are revealed. To solve this problem, we design and rigorously analyse an algorithm belonging to the class of online learning algorithms that make use of the exponentiated gradient method [15]. We measure the performance of our method by comparing its cumulative loss with that of any fixed subset of sites.
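As a rough illustration of the per-trial loss structure and a multiplicative-weights (exponentiated-gradient flavored) update over the N sites, consider the toy sketch below; the subset rule, learning rate, and cost distributions are assumptions, and the paper's actual selection and update rules are more involved.

import numpy as np

rng = np.random.default_rng(0)
N, T, eta = 6, 100, 0.1
weights = np.ones(N) / N                        # one weight per site

for t in range(T):
    open_cost = rng.uniform(0.0, 0.2, size=N)   # revealed only after the choice
    conn_cost = rng.uniform(0.0, 1.0, size=N)

    chosen = weights >= weights.mean()          # crude subset rule (assumption)
    loss = open_cost[chosen].sum() + conn_cost[chosen].min()

    # multiplicative update: sites that look cheap in hindsight gain weight
    weights *= np.exp(-eta * (open_cost + conn_cost))
    weights /= weights.sum()

print("final site weights:", np.round(weights, 3))
print("last trial loss:", round(float(loss), 3))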
Robust Coreset Construction for Distributed Machine Learning
Lu, Hanlin, Li, Ming-Ju, He, Ting, Wang, Shiqiang, Narayanan, Vijay, Chan, Kevin S.
Motivated by the need to solve machine learning problems over distributed datasets, we explore the use of coresets to reduce the communication overhead. A coreset is a summary of the original dataset in the form of a small weighted set in the same sample space. Compared to other data summaries, a coreset has the advantage that it can be used as a proxy for the original dataset, potentially for different applications. However, existing coreset construction algorithms are each tailor-made for a specific machine learning problem. Thus, to solve different machine learning problems, one has to collect coresets of different types, defeating the purpose of saving communication overhead. We resolve this dilemma by developing coreset construction algorithms based on k-means/median clustering that give a provably good approximation for a broad range of machine learning problems with sufficiently continuous cost functions. Through evaluations on diverse datasets and machine learning problems, we verify the robust performance of the proposed algorithms.
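The core idea can be sketched as follows: summarize the dataset by k-means centers weighted by cluster sizes, and use the weighted summary as a stand-in for the full data in downstream learning; the choice of k and the approximation guarantees are the paper's contribution and are not captured by this toy code.

import numpy as np
from sklearn.cluster import KMeans

def kmeans_coreset(X, k, seed=0):
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    points = km.cluster_centers_                     # coreset points
    weights = np.bincount(km.labels_, minlength=k)   # how many points each represents
    return points, weights

X = np.random.default_rng(0).normal(size=(1000, 5))
points, weights = kmeans_coreset(X, k=20)
print(points.shape, int(weights.sum()))   # (20, 5) and 1000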
When Edge Meets Learning: Adaptive Control for Resource-Constrained Distributed Machine Learning
Wang, Shiqiang, Tuor, Tiffany, Salonidis, Theodoros, Leung, Kin K., Makaya, Christian, He, Ting, Chan, Kevin
Emerging technologies and applications including the Internet of Things (IoT), social networking, and crowd-sourcing generate large amounts of data at the network edge. Machine learning models are often built from the collected data to enable the detection, classification, and prediction of future events. Due to bandwidth, storage, and privacy concerns, it is often impractical to send all the data to a centralized location. In this paper, we consider the problem of learning model parameters from data distributed across multiple edge nodes, without sending raw data to a centralized place. Our focus is on a generic class of machine learning models that are trained using gradient-descent based approaches. We analyze the convergence rate of distributed gradient descent from a theoretical point of view, based on which we propose a control algorithm that determines the best trade-off between local updates and global parameter aggregation to minimize the loss function under a given resource budget. The performance of the proposed algorithm is evaluated via extensive experiments with real datasets, both on a networked prototype system and in a larger-scale simulated environment. The experimental results show that our proposed approach performs close to the optimum with various machine learning models and different data distributions.
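For concreteness, here is a minimal sketch of gradient descent with periodic parameter aggregation: each node runs tau local steps on its own data, after which the parameters are averaged. The adaptive choice of tau under a resource budget is the paper's contribution and is not reproduced here; tau, the model, and the data are illustrative assumptions.

import numpy as np

def local_step(w, X, y, lr=0.1):
    # one gradient step of least-squares regression on a node's local data
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

rng = np.random.default_rng(0)
nodes = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]
w_global = np.zeros(3)
tau, rounds = 5, 20                           # local steps per aggregation round

for _ in range(rounds):
    local_ws = []
    for X, y in nodes:
        w = w_global.copy()
        for _ in range(tau):                  # tau local updates
            w = local_step(w, X, y)
        local_ws.append(w)
    w_global = np.mean(local_ws, axis=0)      # global aggregation (simple average)

print("aggregated model:", np.round(w_global, 3))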