Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis
Gai, Jiahao, Chen, Hao Mark, Wang, Zhican, Zhou, Hongyu, Zhao, Wanru, Lane, Nicholas, Fan, Hongxiang
Recent advances in code generation have illuminated the potential of employing large language models (LLMs) for general-purpose programming languages such as Python and C++, opening new opportunities for automating software development and enhancing programmer productivity. The potential of LLMs in software programming has sparked significant interest in exploring automated hardware generation. Although preliminary endeavors have been made to adopt LLMs for generating hardware description languages (HDLs), several challenges persist in this direction. First, the volume of available HDL training data is substantially smaller than that for software programming languages. Second, pre-trained LLMs, mainly tailored for software code, tend to produce HDL designs that are more error-prone. Third, generating HDL requires significantly more tokens than software programming, leading to inefficiencies in cost and energy consumption. To tackle these challenges, this paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware designs. Although code generation for domain-specific programming languages is not new in the literature, we aim to provide experimental results, insights, benchmarks, and evaluation infrastructure to investigate the suitability of HLS over low-level HDLs for LLM-assisted hardware design generation. To achieve this, we first fine-tune pre-trained models for HLS-based hardware generation, using a collected dataset of text prompts and corresponding reference HLS designs. We then propose an LLM-assisted framework to automate end-to-end hardware code generation, and investigate the impact of chain-of-thought and feedback-loop prompting techniques on HLS design generation. Owing to the limited timeframe of this research, we leave the evaluation of more advanced reasoning models to future work.
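To make the feedback-loop idea concrete, below is a minimal sketch of such a generate-check-repair cycle, assuming hypothetical helpers llm_generate and run_hls_tool as stand-ins for the fine-tuned model and the HLS toolchain; it illustrates the loop structure, not the paper's implementation.

```python
# Sketch of an LLM-assisted HLS generation loop with tool feedback.
# `llm_generate` and `run_hls_tool` are illustrative placeholders.

def llm_generate(prompt: str) -> str:
    """Placeholder for a fine-tuned code LLM; returns candidate HLS C++."""
    return "// candidate HLS design for: " + prompt

def run_hls_tool(code: str) -> tuple[bool, str]:
    """Placeholder for C-simulation/synthesis; returns (ok, tool log)."""
    return ("error" not in code, "synthesis log")

def generate_hls(prompt: str, max_rounds: int = 3) -> str:
    code = llm_generate(prompt)
    for _ in range(max_rounds):
        ok, log = run_hls_tool(code)
        if ok:  # design passed the toolchain checks
            break
        # feedback loop: append tool diagnostics to the prompt and retry
        code = llm_generate(f"{prompt}\nFix the following errors:\n{log}")
    return code

print(generate_hls("vector add with an unrolled inner loop"))
```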
Space for Improvement: Navigating the Design Space for Federated Learning in Satellite Constellations
Kim, Grace, Powell, Luca, Svoboda, Filip, Lane, Nicholas
Space has emerged as an exciting new application area for machine learning, with several missions equipping spacecraft with on-board deep learning capabilities. Pre-processing satellite data through on-board training is necessary to address the satellite downlink deficit, as transmission opportunities are too scarce to match the high rate of data generation. To scale this effort across entire constellations, collaborative training in orbit has been enabled through federated learning (FL). While current explorations of FL in this context have successfully adapted FL algorithms to scenario-specific constraints, these theoretical FL implementations face several limitations that prevent progress towards real-world deployment. To address this gap, we provide a holistic exploration of the FL-in-space domain on several fronts. 1) We develop a method for "space-ification" of existing FL algorithms, evaluated on 2) FLySTacK, our novel satellite-constellation design and hardware-aware testing platform, on which we perform rigorous algorithm evaluations. Finally, we introduce 3) AutoFLSat, a generalized, hierarchical, autonomous FL algorithm for space that reduces model training time by 12.5% to 37.5% relative to leading alternatives.
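As an illustration of the hierarchical aggregation that an algorithm like AutoFLSat implies, the sketch below performs two-level federated averaging: first within each constellation (e.g., over inter-satellite links), then globally. The structure and names are assumptions for exposition, not the paper's code.

```python
# Two-level (hierarchical) federated averaging sketch.
import numpy as np

def fedavg(models, weights):
    """Weighted average of parameter vectors."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return sum(wi * m for wi, m in zip(w, models))

# constellation -> list of (local model params, num local samples)
constellations = {
    "plane_A": [(np.array([1.0, 2.0]), 100), (np.array([1.2, 1.8]), 50)],
    "plane_B": [(np.array([0.8, 2.2]), 80)],
}

# level 1: intra-constellation aggregation (e.g., over inter-satellite links)
cluster_models, cluster_sizes = [], []
for sats in constellations.values():
    models, sizes = zip(*sats)
    cluster_models.append(fedavg(models, sizes))
    cluster_sizes.append(sum(sizes))

# level 2: global aggregation across constellations
global_model = fedavg(cluster_models, cluster_sizes)
print(global_model)
```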
MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation
Karargyris, Alexandros, Umeton, Renato, Sheller, Micah J., Aristizabal, Alejandro, George, Johnu, Bala, Srini, Beutel, Daniel J., Bittorf, Victor, Chaudhari, Akshay, Chowdhury, Alexander, Coleman, Cody, Desinghu, Bala, Diamos, Gregory, Dutta, Debo, Feddema, Diane, Fursin, Grigori, Guo, Junyi, Huang, Xinyuan, Kanter, David, Kashyap, Satyananda, Lane, Nicholas, Mallick, Indranil, Mascagni, Pietro, Mehta, Virendra, Natarajan, Vivek, Nikolov, Nikola, Padoy, Nicolas, Pekhimenko, Gennady, Reddi, Vijay Janapa, Reina, G Anthony, Ribalta, Pablo, Rosenthal, Jacob, Singh, Abhishek, Thiagarajan, Jayaraman J., Wuest, Anna, Xenochristou, Maria, Xu, Daguang, Yadav, Poonam, Rosenthal, Michael, Loda, Massimo, Johnson, Jason M., Mattson, Peter
Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf, an open framework for benchmarking machine learning in the medical domain. MedPerf will enable federated evaluation in which models are securely distributed to different facilities for evaluation, thereby empowering healthcare organizations to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status, and our roadmap. We call for researchers and organizations to join us in creating the MedPerf open benchmarking platform.
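As a rough illustration of federated evaluation, the sketch below sends a model to each facility's data and returns only aggregate metrics; the facility registry, metric, and orchestration are illustrative assumptions, not the MedPerf API.

```python
# Federated evaluation sketch: the model travels to the data, and only
# site-level metrics (never patient records) leave each facility.

def evaluate_at_facility(model, local_data):
    """Runs inside the facility; only aggregate metrics leave the site."""
    correct = sum(model(x) == y for x, y in local_data)
    return correct / len(local_data), len(local_data)

model = lambda x: x > 0.5  # stand-in classifier
facilities = {
    "hospital_1": [(0.9, True), (0.2, False), (0.7, True)],
    "hospital_2": [(0.4, False), (0.8, True)],
}

# benchmark orchestration: collect per-site metrics, report weighted summary
results = {name: evaluate_at_facility(model, data)
           for name, data in facilities.items()}
total = sum(n for _, n in results.values())
print({name: acc for name, (acc, _) in results.items()})
print("weighted accuracy:", sum(acc * n for acc, n in results.values()) / total)
```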
A Channel Coding Benchmark for Meta-Learning
Li, Rui, Bohdal, Ondrej, Mishra, Rajesh, Kim, Hyeji, Li, Da, Lane, Nicholas, Hospedales, Timothy
Meta-learning provides a popular and effective family of methods for data-efficient learning of new tasks. However, several important issues in meta-learning have proven hard to study thus far. For example, performance degrades in real-world settings where meta-learners must learn from a wide and potentially multi-modal distribution of training tasks, and when distribution shift exists between meta-train and meta-test task distributions. These issues are typically hard to study because the shape of task distributions, and the shift between them, are not straightforward to measure or control in standard benchmarks. We propose the channel coding problem as a benchmark for meta-learning. Channel coding is an important practical application where task distributions naturally arise, and fast adaptation to new tasks is practically valuable. We use this benchmark to study several aspects of meta-learning, including the impact of task distribution breadth and shift, which can be controlled in the coding problem. Going forward, this benchmark provides a tool for the community to study the capabilities and limitations of meta-learning, and to drive research on practically robust and effective meta-learners.
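To illustrate why channel coding yields controllable task distributions, the sketch below samples tasks as BPSK transmissions over AWGN channels, where the SNR range sets distribution breadth and a shifted range models meta-train/meta-test shift; the parameterization is an assumed simplification of the benchmark.

```python
# Sampling channel-coding tasks with controllable breadth and shift.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(snr_db_range):
    """A task: transmit random BPSK bits through an AWGN channel at some SNR."""
    snr_db = rng.uniform(*snr_db_range)
    sigma = 10 ** (-snr_db / 20)
    bits = rng.integers(0, 2, size=64)
    symbols = 2.0 * bits - 1.0  # BPSK mapping
    received = symbols + sigma * rng.normal(size=symbols.shape)
    return bits, received

# broad meta-train distribution vs. shifted meta-test distribution
train_task = sample_task(snr_db_range=(0, 20))
test_task = sample_task(snr_db_range=(-5, 0))  # distribution shift
print(train_task[1][:5], test_task[1][:5])
```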
Single Shot Structured Pruning Before Training
van Amersfoort, Joost, Alizadeh, Milad, Farquhar, Sebastian, Lane, Nicholas, Gal, Yarin
We introduce a method to speed up training by 2x and inference by 3x in deep neural networks using structured pruning applied before training. Unlike previous works on pruning before training which prune individual weights, our work develops a methodology to remove entire channels and hidden units with the explicit aim of speeding up training and inference. We introduce a compute-aware scoring mechanism which enables pruning in units of sensitivity per FLOP removed, allowing even greater speed-ups. Our method is fast, easy to implement, and needs just one forward/backward pass on a single batch of data to complete pruning before training begins.
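A minimal sketch of the compute-aware idea, assuming a squared gradient-weight product as the per-channel sensitivity proxy (the paper's exact criterion may differ): score channels by sensitivity per FLOP removed, using one forward/backward pass, then keep the top-scoring channels.

```python
# Structured pruning before training: rank conv channels by
# sensitivity / FLOPs from a single forward/backward pass.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(4, 3, 16, 16)  # one batch of data
loss = conv(x).pow(2).mean()
loss.backward()  # single backward pass

# per-output-channel sensitivity proxy and per-channel FLOP cost
sensitivity = (conv.weight.grad * conv.weight.detach()).pow(2).sum(dim=(1, 2, 3))
flops_per_channel = conv.weight[0].numel() * 16 * 16  # MACs per output map
score = sensitivity / flops_per_channel

keep = score.argsort(descending=True)[:4]  # keep top half of channels
pruned = nn.Conv2d(3, 4, kernel_size=3, padding=1)
pruned.weight.data = conv.weight.data[keep].clone()
pruned.bias.data = conv.bias.data[keep].clone()
print("kept channels:", keep.tolist())
```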
Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition
Gao, Yan, Parcollet, Titouan, Lane, Nicholas
Knowledge distillation has been widely used to compress existing deep learning models while preserving performance on a wide range of applications. In the specific context of Automatic Speech Recognition (ASR), distillation from ensembles of acoustic models has recently shown promising results in increasing recognition performance. In this paper, we propose an extension of multi-teacher distillation methods to joint CTC-attention end-to-end ASR systems. We also introduce two novel distillation strategies. The core intuition behind both is to integrate the error-rate metric into teacher selection rather than focusing solely on the observed losses. This way, we directly distill and optimize the student toward the relevant metric for speech recognition. We evaluated these strategies under a selection of training procedures on the TIMIT phoneme recognition task and observed promising error rates compared to a common baseline. Indeed, the best obtained phoneme error rate of 16.4% represents a state-of-the-art score for end-to-end ASR systems.
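A minimal sketch of the metric-driven teacher selection described above, with an assumed KL-based distillation loss: the teacher with the lowest error rate on the current batch is selected, and the student is distilled toward its distribution. The names and loss form are illustrative, not the paper's exact recipe.

```python
# Multi-teacher distillation with error-rate-driven teacher selection.
import torch
import torch.nn.functional as F

def distill_step(student_logits, teacher_logits_list, teacher_error_rates):
    """teacher_error_rates: e.g. per-teacher phoneme error rate on this batch."""
    best = min(range(len(teacher_logits_list)),
               key=lambda i: teacher_error_rates[i])  # metric-driven choice
    teacher = teacher_logits_list[best].detach()
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher, dim=-1),
                    reduction="batchmean")

student = torch.randn(2, 40, requires_grad=True)  # (batch, classes)
teachers = [torch.randn(2, 40) for _ in range(3)]
loss = distill_step(student, teachers, teacher_error_rates=[0.21, 0.17, 0.25])
loss.backward()
print(float(loss))
```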
Quaternion Neural Networks for Multi-channel Distant Speech Recognition
Qiu, Xinchi, Parcollet, Titouan, Ravanelli, Mirco, Lane, Nicholas, Morchid, Mohamed
Despite the significant progress in automatic speech recognition (ASR), distant ASR remains challenging due to noise and reverberation. A common approach to mitigate this issue consists of equipping the recording devices with multiple microphones that capture the acoustic scene from different perspectives. These multi-channel audio recordings contain specific internal relations between each signal. In this paper, we propose to capture these inter- and intra-structural dependencies with quaternion neural networks, which can jointly process multiple signals as whole quaternion entities. The quaternion algebra replaces the standard dot product with the Hamilton product, thus offering a simple and elegant way to model dependencies between elements. The quaternion layers are then coupled with a recurrent neural network, which can learn long-term dependencies in the time domain. We show that a quaternion long short-term memory (QLSTM) network, trained on the concatenated multi-channel speech signals, outperforms an equivalent real-valued LSTM on two different tasks of multi-channel distant speech recognition.
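For concreteness, the Hamilton product that replaces the dot product is shown below for arrays whose last axis holds the four quaternion components (r, i, j, k); this is standard quaternion algebra rather than the paper's full QLSTM layer.

```python
# Hamilton product: mixes four channels as whole quaternion entities,
# so cross-channel dependencies are modeled by construction.
import numpy as np

def hamilton_product(q, p):
    """q, p: arrays [..., 4] holding (r, i, j, k) components."""
    r1, i1, j1, k1 = np.moveaxis(q, -1, 0)
    r2, i2, j2, k2 = np.moveaxis(p, -1, 0)
    return np.stack([
        r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,  # real part
        r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,  # i part
        r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,  # j part
        r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,  # k part
    ], axis=-1)

# a quaternion "weight" acting on one 4-channel signal sample
w = np.array([0.5, 0.1, -0.2, 0.3])
x = np.array([1.0, 0.0, 2.0, -1.0])
print(hamilton_product(w, x))
```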