Collaborating Authors

 holmes


Meet Scotland's Whisky-Sniffing Robot Dog

WIRED

Inside Dewar's cavernous whisky warehouses, man's best mechanical friend--a Boston Dynamics robot dog with an ethanol sensor for a nose--is on the hunt for leaky barrels. Wooden barrels are what make the magic happen in your favorite bottle of whisky. At Bacardi Limited, the world's largest privately held spirits company, barrel leakage is a massive headache. Consider the company's Dewar's blended Scotch whisky brand (just one of the dozens it owns). Most of the time, Dewar's will have over 100 warehouses full of aging barrels of whisky, 25,000 casks in each one.


Our Greatest Living Biographer Is Back With His First Single-Subject Book in Decades. It's Enthralling.

Slate

Richard Holmes, our greatest living biographer, is back with an enthralling chronicle of the poet.


Supplementary Material

Neural Information Processing Systems

This supplementary material provides implementation details, hyper-parameter settings, additional results and visualisations.
- Section A presents a focus on the design choices we use for IMGEP-HOLMES
- Section B provides implementation details for the main paper evaluation procedure
- B.1: Quantitative evaluation of diversity
- B.2: Quantitative evaluation of Representational Similarity
- D.1: Complete RSA analysis of the hierarchy of behavioral characterizations learned

Figure 6: Focus on the different design choices made for the HOLMES architecture.

We summarize those components in Figure 6. The connection scheme is summarized in Figure 6. There are two main choices: when to split a node and how to redirect the patterns toward either the left or right children.
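
The two choices named above (when to split a node, and how to route patterns to the left or right child) can be illustrated with a toy binary tree. This is only a sketch under stated assumptions and not the IMGEP-HOLMES implementation: the `Node` class, the `capacity` split threshold, the scalar patterns and the median boundary are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    capacity: int                      # hypothetical split threshold
    patterns: List[float] = field(default_factory=list)
    boundary: Optional[float] = None   # routing threshold, set on split
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None

    def insert(self, x: float) -> None:
        # Choice 2 (assumed rule): route by comparing against the boundary.
        if not self.is_leaf():
            child = self.left if x < self.boundary else self.right
            child.insert(x)
            return
        self.patterns.append(x)
        # Choice 1 (assumed rule): split once the leaf exceeds its capacity.
        if len(self.patterns) > self.capacity:
            self._split()

    def _split(self) -> None:
        ordered = sorted(self.patterns)
        self.boundary = ordered[len(ordered) // 2]  # upper median
        self.left = Node(self.capacity)
        self.right = Node(self.capacity)
        # Hand the stored patterns to the children directly, without
        # triggering a further split during redistribution.
        for p in self.patterns:
            target = self.left if p < self.boundary else self.right
            target.patterns.append(p)
        self.patterns = []


root = Node(capacity=3)
for value in [0.1, 0.9, 0.2, 0.8]:
    root.insert(value)
root.insert(0.5)
```

In this toy version the fourth insertion triggers a split and later patterns are routed straight to a child; a real system would replace both rules with whatever criteria the paper specifies.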


Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment

arXiv.org Artificial Intelligence

Large language models (LLMs) such as GPT-3, OPT, and LLaMA have demonstrated remarkable accuracy in a wide range of tasks. However, training these models can incur significant expenses, often requiring tens of thousands of GPUs for months of continuous operation. Typically, this training is carried out in specialized GPU clusters equipped with homogeneous high-speed Remote Direct Memory Access (RDMA) network interface cards (NICs). The acquisition and maintenance of such dedicated clusters is challenging. Current LLM training frameworks, like Megatron-LM and Megatron-DeepSpeed, focus primarily on optimizing training within homogeneous cluster settings. In this paper, we introduce Holmes, a training framework for LLMs that employs thoughtfully crafted data and model parallelism strategies over the heterogeneous NIC environment. Our primary technical contribution lies in a novel scheduling method that intelligently allocates distinct computational tasklets in LLM training to specific groups of GPU devices based on the characteristics of their connected NICs. Furthermore, our proposed framework, utilizing pipeline parallel techniques, demonstrates scalability to multiple GPU clusters, even in scenarios without high-speed interconnects between nodes in distinct clusters. We conducted comprehensive experiments that involved various scenarios in the heterogeneous NIC environment. In most cases, our framework achieves performance levels close to those achievable with homogeneous RDMA-capable networks (InfiniBand or RoCE), significantly exceeding training efficiency within the pure Ethernet environment. Additionally, we verified that our framework outperforms other mainstream LLM frameworks under heterogeneous NIC environment in terms of training efficiency and can be seamlessly integrated with them.
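
As a rough illustration of allocating work to groups of GPUs according to their NICs, the sketch below buckets devices by cluster and NIC type. It is not the Holmes scheduler: the `Gpu` record, the `group_by_nic` helper and the NIC labels are hypothetical, and the idea that bandwidth-heavy communication would stay inside one bucket while pipeline stages span buckets is an assumption drawn only from the abstract's description.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Gpu(NamedTuple):
    rank: int
    cluster: str
    nic: str  # hypothetical labels, e.g. "infiniband", "roce", "ethernet"


def group_by_nic(gpus: List[Gpu]) -> Dict[Tuple[str, str], List[int]]:
    """Bucket GPU ranks by (cluster, NIC type), preserving input order."""
    groups: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for gpu in gpus:
        groups[(gpu.cluster, gpu.nic)].append(gpu.rank)
    return dict(groups)


gpus = [
    Gpu(0, "A", "infiniband"),
    Gpu(1, "A", "infiniband"),
    Gpu(2, "B", "roce"),
    Gpu(3, "B", "roce"),
    Gpu(4, "C", "ethernet"),
]
nic_groups = group_by_nic(gpus)
```

Each resulting bucket contains only devices that share a cluster and a NIC type, which is the precondition a NIC-aware scheduler would need before deciding which communication pattern to place where.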


Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond

arXiv.org Artificial Intelligence

In this study, we explore the potential of Multimodal Large Language Models (MLLMs) in improving embodied decision-making processes for agents. While Large Language Models (LLMs) have been widely used due to their advanced reasoning skills and vast world knowledge, MLLMs like GPT4-Vision offer enhanced visual understanding and reasoning capabilities. We investigate whether state-of-the-art MLLMs can handle embodied decision-making in an end-to-end manner and whether collaborations between LLMs and MLLMs can enhance decision-making. To address these questions, we introduce a new benchmark called PCA-EVAL, which evaluates embodied decision-making from the perspectives of Perception, Cognition, and Action. Additionally, we propose HOLMES, a multi-agent cooperation framework that allows LLMs to leverage MLLMs and APIs to gather multimodal information for informed decision-making. We compare end-to-end embodied decision-making and HOLMES on our benchmark and find that the GPT4-Vision model demonstrates strong end-to-end embodied decision-making abilities, outperforming GPT4-HOLMES in terms of average decision accuracy (+3%). However, this performance is exclusive to the latest GPT4-Vision model, surpassing the open-source state-of-the-art MLLM by 26%. Our results indicate that powerful MLLMs like GPT4-Vision hold promise for decision-making in embodied agents, offering new avenues for MLLM research.

The capacity to make well-informed decisions is essential for the survival and success of living organisms in their respective environments. Similarly, a major goal in embodied artificial intelligence is to develop agents, like robots, with sophisticated decision-making abilities. Recently, there has been a notable increase in leveraging exceptional reasoning capabilities and world knowledge of Large Language Models (LLMs) to enhance decision making in agents. However, LLMs are primarily designed to process textual context, creating a modality gap (Liang et al., 2022; Ren et al., 2023a) for the LLM-powered agent when dealing with multimodal observations in real-world scenarios.


MeeQA: Natural Questions in Meeting Transcripts

arXiv.org Artificial Intelligence

We present MeeQA, a dataset for natural-language question answering over meeting transcripts. It includes real questions asked during meetings by its participants. The dataset contains 48K question-answer pairs, extracted from 422 meeting transcripts, spanning multiple domains. Questions in transcripts pose a special challenge as they are not always clear, and considerable context may be required in order to provide an answer. Further, many questions asked during meetings are left unanswered. To improve baseline model performance on this type of question, we also propose a novel loss function, Flat Hierarchical Loss, designed to enhance performance over questions with no answer in the text. Our experiments demonstrate the advantage of using our approach over standard QA models.


What can Google's AI-powered Bard do? We tested it for you

#artificialintelligence

To use, or not to use, Bard? That is the Shakespearean question an Associated Press reporter sought to answer while testing out Google's artificially intelligent chatbot. The recently rolled-out bot dubbed Bard is the internet search giant's answer to the ChatGPT tool that Microsoft has been melding into its Bing search engine and other software. During several hours of interaction, the AP learned Bard is quite forthcoming about its unreliability and other shortcomings, including its potential for mischief in next year's U.S. presidential election. Even as it occasionally warned of the problems it could unleash, Bard repeatedly emphasized its belief that it will blossom into a force for good.


Explained: What Can Google's AI-Powered Bard Do

#artificialintelligence

To use, or not to use, Bard? That is the Shakespearean question an Associated Press reporter sought to answer while testing out Google's artificially intelligent chatbot. The recently rolled-out bot dubbed Bard is the internet search giant's answer to the ChatGPT tool that Microsoft has been melding into its Bing search engine and other software. During several hours of interaction, the AP learned Bard is quite forthcoming about its unreliability and other shortcomings, including its potential for mischief in next year's US presidential election. Even as it occasionally warned of the problems it could unleash, Bard repeatedly emphasized its belief that it will blossom into a force for good.


AI does a poor job of diagnosing COVID-19 from coughs, study finds • TechCrunch

#artificialintelligence

Early in the pandemic, a number of researchers, startups and institutions developed AI systems that they claimed could diagnose COVID-19 from the sound of a person's cough. At the time, we ourselves were enthusiastic about the prospect of AI that could be wielded as a weapon against the virus; in one headline, we endorsed cough-scrutinizing AI as "promising." But a recent study (first reported on by The Register) suggests that some cough-analyzing algorithms are less accurate than we -- and the public -- were led to believe. It serves as a cautionary tale for machine learning tech in healthcare, whose flaws aren't always immediately apparent. Researchers from The Alan Turing Institute and Royal Statistical Society, commissioned by the U.K. Health Security Agency, conducted an independent review of audio-based AI tech as a COVID-19 screening tool.


Who Said Science and Art Were Two Cultures? - Issue 108: Change

Nautilus

On a May evening in 1959, C.P. Snow, a popular novelist and former research scientist, gave a lecture before a gathering of dons and students at the University of Cambridge, his alma mater. He called his talk "The Two Cultures and the Scientific Revolution." Snow declared that a gulf of mutual incomprehension divided literary intellectuals and scientists. "The non-scientists have a rooted impression that the scientists are shallowly optimistic, unaware of man's condition," Snow said. "On the other hand, the scientists believe that the literary intellectuals are totally lacking in foresight, peculiarly unconcerned with their brother men, in a deep sense anti-intellectual, anxious to restrict both art and thought to the existential moment." Snow didn't expect much of his talk.