AITopics | Stripelis, Dimitris

Collaborating Authors

Stripelis, Dimitris


Fox-1 Technical Report

Hu, Zijian, Zhang, Jipeng, Pan, Rui, Xu, Zhaozhuo, Han, Shanshan, Jin, Han, Shah, Alay Dilipbhai, Stripelis, Dimitris, Yao, Yuhang, Avestimehr, Salman, He, Chaoyang, Zhang, Tong

arXiv.org Artificial Intelligence, Nov-17-2024

We present Fox-1, a series of small language models (SLMs) consisting of Fox-1-1.6B and Fox-1-1.6B-Instruct-v0.1. These models are pre-trained on 3 trillion tokens of web-scraped document data and fine-tuned with 5 billion tokens of instruction-following and multi-turn conversation data. To improve pre-training efficiency, the Fox-1-1.6B model introduces a novel 3-stage data curriculum across all the training data with 2K-8K sequence length. In architecture design, Fox-1 features a deeper layer structure, an expanded vocabulary, and Grouped Query Attention (GQA), offering a performant and efficient architecture compared to other SLMs. Fox-1 achieves better or on-par performance in various benchmarks compared to StableLM-2-1.6B, Gemma-2B, Qwen1.5-1.8B, and OpenELM-1.1B, with competitive inference speed and throughput. The model weights have been released under the Apache 2.0 license, with the aim of promoting the democratization of LLMs and making them fully accessible to the whole open-source community.
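The abstract mentions Grouped Query Attention without detail. As a rough illustration of the general idea (not Fox-1's actual configuration), the sketch below shows how several query heads can share one key/value head. The function names and the head counts in the usage note are hypothetical and chosen only for the example.

```python
def kv_head_for_query(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head shared by a given query head.

    In grouped query attention, query heads are split into equal-sized
    groups and every head in a group attends with the same K/V head.
    """
    if num_kv_heads <= 0 or num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a positive multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise IndexError("q_head out of range")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


def kv_cache_ratio(num_q_heads: int, num_kv_heads: int) -> float:
    """Size of the K/V cache relative to standard multi-head attention."""
    return num_kv_heads / num_q_heads
```

With a hypothetical 8 query heads and 2 key/value heads, heads 0 to 3 map to K/V head 0 and heads 4 to 7 map to K/V head 1, and the K/V cache is a quarter of the size it would be with one K/V head per query head.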

arxiv preprint arxiv, large language model, machine learning, (15 more...)

arXiv: 2411.05281

Country: North America > United States (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Alopex: A Computational Framework for Enabling On-Device Function Calls with LLMs

Ran, Yide, Xu, Zhaozhuo, Yao, Yuhang, Hu, Zijian, Han, Shanshan, Jin, Han, Shah, Alay Dilipbhai, Zhang, Jipeng, Stripelis, Dimitris, Zhang, Tong, Avestimehr, Salman, He, Chaoyang

arXiv.org Artificial Intelligence, Nov-7-2024

The rapid advancement of Large Language Models (LLMs) has led to their increased integration into mobile devices for personalized assistance, which enables LLMs to call external API functions to enhance their performance. However, challenges such as data scarcity, ineffective question formatting, and catastrophic forgetting hinder the development of on-device LLM agents. To tackle these issues, we propose Alopex, a framework that enables precise on-device function calls using the Fox LLM. Alopex introduces a logic-based method for generating high-quality training data and a novel "description-question-output" format for fine-tuning, reducing risks of function information leakage. Additionally, a data mixing strategy is used to mitigate catastrophic forgetting, combining function call data with textbook datasets to enhance performance in various tasks. Experimental results show that Alopex improves function call accuracy and significantly reduces catastrophic forgetting, providing a robust solution for integrating function call capabilities into LLMs without manual intervention.

large language model, machine learning, natural language, (19 more...)

arXiv: 2411.05209

Country:

North America > United States > Illinois (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Education (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)


TorchOpera: A Compound AI System for LLM Safety

Han, Shanshan, Yao, Yuhang, Hu, Zijian, Stripelis, Dimitris, Xu, Zhaozhuo, He, Chaoyang

arXiv.org Artificial Intelligence, Jun-16-2024

We introduce TorchOpera, a compound AI system for enhancing the safety and quality of prompts and responses for Large Language Models. TorchOpera ensures that all user prompts are safe, contextually grounded, and effectively processed, while enhancing LLM responses to be relevant and high quality. TorchOpera utilizes the vector database for contextual grounding, rule-based wrappers for flexible modifications, and specialized mechanisms for detecting and adjusting unsafe or incorrect content. We also provide a view of the compound AI system to reduce the computational cost. Extensive experiments show that TorchOpera ensures the safety, reliability, and applicability of LLMs in real-world settings while maintaining the efficiency of LLM responses.

large language model, machine learning, natural language, (18 more...)

arXiv: 2406.10847

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Information Technology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)


MetisFL: An Embarrassingly Parallelized Controller for Scalable & Efficient Federated Learning Workflows

Stripelis, Dimitris, Anastasiou, Chrysovalantis, Toral, Patrick, Asghar, Armaghan, Ambite, Jose Luis

arXiv.org Artificial Intelligence, Nov-13-2023

A Federated Learning (FL) system typically consists of two core processing entities: the federation controller and the learners. The controller is responsible for managing the execution of FL workflows across learners and the learners for training and evaluating federated models over their private datasets. While executing an FL workflow, the FL system has no control over the computational resources or data of the participating learners. Still, it is responsible for other operations, such as model aggregation, task dispatching, and scheduling. These computationally heavy operations generally need to be handled by the federation controller. Even though many FL systems have been recently proposed to facilitate the development of FL workflows, most of these systems overlook the scalability of the controller. To meet this need, we designed and developed a novel FL system called MetisFL, where the federation controller is the first-class citizen. MetisFL re-engineers all the operations conducted by the federation controller to accelerate the training of large-scale FL workflows. By quantitatively comparing MetisFL against other state-of-the-art FL systems, we empirically demonstrate that MetisFL leads to a 10-fold wall-clock time execution boost across a wide range of challenging FL workflows with increasing model sizes and federation sites.

artificial intelligence, machine learning, metisfl, (17 more...)

doi: 10.1145/3630048.3630186

arXiv: 2311.00334

Country: North America > United States > California (0.15)

Genre: Workflow (1.00)

Industry:

Education (0.94)
Health & Medicine (0.94)
Information Technology > Security & Privacy (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)


Secure & Private Federated Neuroimaging

Stripelis, Dimitris, Gupta, Umang, Saleem, Hamza, Dhinagar, Nikhil, Ghai, Tanmay, Anastasiou, Rafael Chrysovalantis, Asghar, Armaghan, Steeg, Greg Ver, Ravi, Srivatsan, Naveed, Muhammad, Thompson, Paul M., Ambite, Jose Luis

arXiv.org Artificial Intelligence, Aug-28-2023

The amount of biomedical data continues to grow rapidly. However, collecting data from multiple sites for joint analysis remains challenging due to security, privacy, and regulatory concerns. To overcome this challenge, we use Federated Learning, which enables distributed training of neural network models over multiple data sources without sharing data. Each site trains the neural network over its private data for some time, then shares the neural network parameters (i.e., weights, gradients) with a Federation Controller, which in turn aggregates the local models, sends the resulting community model back to each site, and the process repeats. Our Federated Learning architecture, MetisFL, provides strong security and privacy. First, sample data never leaves a site. Second, neural network parameters are encrypted before transmission and the global neural model is computed under fully-homomorphic encryption. Finally, we use information-theoretic methods to limit information leakage from the neural model to prevent a "curious" site from performing model inversion or membership attacks. We present a thorough evaluation of the performance of secure, private federated learning in neuroimaging tasks, including predicting Alzheimer's disease and estimating BrainAGE from magnetic resonance imaging (MRI) studies, in challenging, heterogeneous federated environments where sites have different amounts of data and statistical distributions.
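The aggregation step described in the abstract (sites share parameters, the controller combines them into a community model) can be sketched as a plain weighted average in the style of FedAvg. This is a minimal illustration under stated assumptions, not the MetisFL implementation: the `aggregate` function, the dictionary-of-lists model layout, and weighting by example count are all assumptions made for the example, and the encryption and information-theoretic protections the abstract mentions are omitted entirely.

```python
from typing import Dict, List

# Hypothetical model layout: parameter name -> flat list of values.
Model = Dict[str, List[float]]


def aggregate(local_models: List[Model], num_examples: List[int]) -> Model:
    """Average per-site parameters, weighting each site by its example count."""
    if not local_models or len(local_models) != len(num_examples):
        raise ValueError("need one example count per local model")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    community: Model = {}
    for name, reference in local_models[0].items():
        community[name] = [
            sum(model[name][i] * n for model, n in zip(local_models, num_examples)) / total
            for i in range(len(reference))
        ]
    return community


# Two sites with different amounts of data (values chosen for illustration).
site_a = {"w": [1.0, 2.0]}
site_b = {"w": [3.0, 4.0]}
community_model = aggregate([site_a, site_b], num_examples=[1, 3])
```

Here the site with three times as much data pulls the community model three times as strongly, so `community_model["w"]` is `[2.5, 3.5]`.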

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv: 2205.05249

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)


Federated Learning over Harmonized Data Silos

Stripelis, Dimitris, Ambite, Jose Luis

arXiv.org Artificial Intelligence, May-15-2023

Federated Learning is a distributed machine learning approach that enables geographically distributed data silos to collaboratively learn a joint machine learning model without sharing data. Most of the existing work operates on unstructured data, such as images or text, or on structured data assumed to be consistent across the different sites. However, sites often have different schemata, data formats, data values, and access patterns. The field of data integration has developed many methods to address these challenges, including techniques for data exchange and query rewriting using declarative schema mappings, and for entity linkage. Therefore, we propose an architectural vision for an end-to-end Federated Learning and Integration system, incorporating the critical steps of data harmonization and data imputation, to spur further research on the intersection of data management information systems and machine learning.

artificial intelligence, information management, machine learning, (13 more...)

arXiv: 2305.08985

Country: North America > United States > California (0.46)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (0.94)
Education (0.94)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.88)


Federated Progressive Sparsification (Purge, Merge, Tune)+

Stripelis, Dimitris, Gupta, Umang, Steeg, Greg Ver, Ambite, Jose Luis

arXiv.org Artificial Intelligence, May-15-2023

Federated learning is a promising approach for training machine learning models on decentralized data while keeping data private at each client. Model sparsification seeks to produce small neural models with comparable performance to large models; for example, for deployment on clients with limited memory or computational capabilities. We present FedSparsify, a simple yet effective sparsification strategy for federated training of neural networks based on progressive weight magnitude pruning. FedSparsify learns subnetworks smaller than 10% of the original network size with similar or better accuracy. Through extensive experiments, we demonstrate that FedSparsify results in an average 15-fold model size reduction, 4-fold model inference speedup, and a 3-fold training communication cost improvement across various challenging domains and model architectures. Finally, we also theoretically analyze FedSparsify's impact on the convergence of federated training. Overall, our results show that FedSparsify is an effective method to train extremely sparse and highly accurate models in federated learning settings.
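Progressive weight magnitude pruning, the technique the abstract names, can be illustrated with a short sketch: at each round, zero out the smallest-magnitude weights, with the pruned fraction growing over time. The abstract does not give FedSparsify's schedule or where pruning happens in the federation, so the linear ramp in `sparsity_schedule` and both function names are assumptions made for the example.

```python
from typing import List


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_to_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity_schedule(round_idx: int, total_rounds: int, final_sparsity: float) -> float:
    """Hypothetical linear ramp from low sparsity up to `final_sparsity`."""
    progress = min(1.0, (round_idx + 1) / total_rounds)
    return final_sparsity * progress


# Prune half of a small weight vector (values chosen for illustration).
example = magnitude_prune([0.5, -0.1, 2.0, 0.05], sparsity=0.5)
```

In this example the two smallest-magnitude entries are removed, leaving `[0.5, 0.0, 2.0, 0.0]`. In a federated setting the surviving weights would then be fine-tuned locally before the next pruning step.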

artificial intelligence, fedsparsify, machine learning, (18 more...)

arXiv: 2204.1243

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


Accelerating Federated Learning in Heterogeneous Data and Computational Environments

Stripelis, Dimitris, Ambite, Jose Luis

arXiv.org Artificial Intelligence, Aug-25-2020

There are situations where data relevant to a machine learning problem are distributed among multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Examples include data present in users' cellphones, manufacturing data of companies in a given industrial sector, or medical records located at different hospitals. Moreover, participating sites often have different data distributions and computational capabilities. Federated Learning provides an approach to learn a joint model over all the available data in these environments. In this paper, we introduce a novel distributed validation weighting scheme (DVW), which evaluates the performance of a learner in the federation against a distributed validation set. Each learner reserves a small portion (e.g., 5%) of its local training examples as a validation dataset and allows other learners' models to be evaluated against it. We empirically show that DVW results in better performance compared to established methods, such as FedAvg, both under synchronous and asynchronous communication protocols in data and computationally heterogeneous environments.
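The abstract describes scoring each learner's model against validation splits held by the learners, but not how those scores become weights. One plausible reading, sketched below, is to average each model's scores across the validation splits and normalize the averages so they sum to one. The `dvw_weights` name, the averaging rule, and the uniform fallback are assumptions made for the example, not the paper's exact scheme.

```python
from typing import Dict, List


def dvw_weights(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Turn per-learner validation scores into normalized mixing weights.

    scores[learner] holds that learner's model score (e.g. accuracy) on
    each site's reserved validation split.
    """
    if not scores or any(len(s) == 0 for s in scores.values()):
        raise ValueError("every learner needs at least one validation score")
    mean_score = {name: sum(s) / len(s) for name, s in scores.items()}
    total = sum(mean_score.values())
    if total == 0:
        # Assumed fallback: treat all learners equally if every score is zero.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: m / total for name, m in mean_score.items()}


# Two learners evaluated on two validation splits (values for illustration).
weights = dvw_weights({"a": [0.9, 0.7], "b": [0.4, 0.4]})
```

Learner "a" averages twice the score of learner "b", so it receives roughly two thirds of the weight when the models are combined.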

computer based training, educational technology, learner, (25 more...)

arXiv: 2008.11281

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
