helm
Rare 19th century pistol used to rob Tulsa liquor store
'This pistol is something a bit different,' according to a firearms expert. Despite its generic appearance, the 19th-century firearm features a comparatively unusual design. It's difficult to resist raising an eyebrow at an Oklahoma robbery suspect's alleged recent weapon of choice. According to several Oklahoma news outlets, including KTUL, a 24-year-old man was arrested on December 6 by Tulsa police after allegedly robbing a liquor store using what employees described as an "old-timey musket."
- North America > United States > Oklahoma (0.46)
- North America > United States > Vermont (0.05)
- North America > United States > Connecticut (0.05)
- Asia > Middle East > Republic of Türkiye (0.05)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)
- Retail (0.51)
- Media (0.50)
SimBA: Simplifying Benchmark Analysis Using Performance Matrices Alone
Subramani, Nishant, Gomez, Alfredo, Diab, Mona
Modern language models are evaluated on large benchmarks, which are difficult to make sense of, especially for model selection. Using a model-centric lens on the raw evaluation numbers themselves, we propose SimBA, a three-phase framework to Simplify Benchmark Analysis. The three phases of SimBA are: stalk, where we conduct dataset and model comparisons; prowl, where we discover a representative subset; and pounce, where we use the representative subset to predict performance on a held-out set of models. Applying SimBA to three popular LM benchmarks (HELM, MMLU, and BigBenchLite) reveals that, across all three benchmarks, datasets and models relate strongly to one another (stalk). We develop a representative-set discovery algorithm that covers a benchmark using raw evaluation scores alone. Using our algorithm, we find that with 6.25% (1/16), 1.7% (1/58), and 28.4% (21/74) of the datasets for HELM, MMLU, and BigBenchLite respectively, we achieve coverage levels of at least 95% (prowl). Additionally, using just these representative subsets, we can both preserve model ranks and predict performance on a held-out set of models with near-zero mean squared error (pounce). Taken together, SimBA can help model developers improve efficiency during model training and dataset creators validate whether their newly created dataset differs from existing datasets in a benchmark. Our code is open source, available at https://github.com/nishantsubramani/simba.
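The "prowl" phase can be sketched as a greedy search over a performance matrix: keep adding the dataset that most improves how well the subset's mean score tracks the full-benchmark mean across models. A minimal illustration, assuming Pearson correlation of mean scores as the coverage notion (the paper's exact coverage measure may differ):

```python
# Hedged sketch of greedy representative-subset discovery over a raw
# score matrix. The coverage criterion here (Pearson correlation of
# subset means vs. full-benchmark means) is an illustrative assumption.

def mean(xs):
    return sum(xs) / len(xs)

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def greedy_representative_subset(scores, coverage=0.95):
    """scores[m][d]: model m's score on dataset d. Greedily add the
    dataset that best aligns subset means with full-benchmark means."""
    n_datasets = len(scores[0])
    full_means = [mean(row) for row in scores]
    chosen = []
    while len(chosen) < n_datasets:
        best_d, best_r = None, -2.0
        for d in range(n_datasets):
            if d in chosen:
                continue
            subset_means = [mean([row[i] for i in chosen + [d]])
                            for row in scores]
            r = pearson(subset_means, full_means)
            if r > best_r:
                best_d, best_r = d, r
        chosen.append(best_d)
        if best_r >= coverage:
            break
    return chosen, best_r
```

On a toy 3-model matrix where every dataset orders the models identically, a single dataset already reaches full coverage, mirroring the paper's 1/16 and 1/58 findings for HELM and MMLU.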
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Honda taps Silicon Valley startup in self-driving software deal
Honda Motor has signed a multiyear agreement to work with a U.S. AI startup in which it has an equity stake to develop self-driving capabilities, tapping into Silicon Valley know-how for next-generation automated technologies. The Japanese carmaker and Redwood City, California-based Helm.ai will collaborate on producing advanced driver-assistance systems for Honda's mass-market vehicles by 2027, the two said Wednesday in a statement. No value or duration of the contract was disclosed. Honda's bid to join the race to develop so-called end-to-end driving technology aims for partially automated acceleration and steering on both regular roads and highways. It follows early movers in driver-assistance software systems such as General Motors' Super Cruise, Tesla's Autopilot and BYD's God's Eye.
A Hybrid SMT-NRA Solver: Integrating 2D Cell-Jump-Based Local Search, MCSAT and OpenCAD
Ding, Tianyi, Li, Haokun, Ni, Xinpeng, Xia, Bican, Zhao, Tianqi
In this paper, we propose a hybrid framework for Satisfiability Modulo the Theory of Nonlinear Real Arithmetic (SMT-NRA for short). First, we introduce a two-dimensional cell-jump move, called \emph{$2d$-cell-jump}, generalizing the key operation, cell-jump, of the local search method for SMT-NRA. Then, we propose an extended local search framework, named \emph{$2d$-LS} (following the local search framework, LS, for SMT-NRA), integrating the model constructing satisfiability calculus (MCSAT) framework to improve search efficiency. To further improve the efficiency of MCSAT, we implement a recently proposed technique called \emph{sample-cell projection operator} for MCSAT, which is well suited for CDCL-style search in the real domain and helps guide the search away from conflicting states. Finally, we present a hybrid framework for SMT-NRA integrating MCSAT, $2d$-LS and OpenCAD, to improve search efficiency through information exchange. The experimental results demonstrate improvements in local search performance, highlighting the effectiveness of the proposed methods.
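As rough intuition for the local-search side of such a solver (this is not the paper's $2d$-cell-jump, which moves between sample points of cylindrical-algebraic-decomposition cells), the basic loop scores how badly the polynomial constraints are violated at a real-valued assignment and greedily moves one variable at a time toward lower violation:

```python
# Generic local search for nonlinear real constraints, a simplified
# stand-in for cell-jump-style moves. All names are illustrative.
import random

def violation(constraints, point):
    # Each constraint is a function g, read as g(x, ...) <= 0; the
    # score sums how far each one is from holding.
    return sum(max(0.0, g(*point)) for g in constraints)

def local_search(constraints, dim, steps=2000, seed=0):
    rng = random.Random(seed)
    point = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    best = violation(constraints, point)
    for _ in range(steps):
        if best == 0.0:
            return point  # every constraint satisfied: a model
        i = rng.randrange(dim)  # move one variable at a time
        candidate = point[:]
        candidate[i] += rng.uniform(-1.0, 1.0)
        v = violation(constraints, candidate)
        if v < best:  # accept only strictly improving moves
            point, best = candidate, v
    return point if best == 0.0 else None
```

The hybrid framework the paper describes would interleave such moves with MCSAT-style reasoning and fall back to complete methods (OpenCAD) when the local search stalls.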
- Asia > China (0.76)
- Europe > Switzerland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
HELM: Human-Preferred Exploration with Language Models
Liao, Shuhao, Lv, Xuxin, Cao, Yuhong, Lew, Jeric, Wu, Wenjun, Sartoretti, Guillaume
In autonomous exploration tasks, robots are required to explore and map unknown environments while efficiently planning in dynamic and uncertain conditions. Given the significant variability of environments, human operators often have specific preference requirements for exploration, such as prioritizing certain areas or optimizing for different aspects of efficiency. However, existing methods struggle to accommodate these human preferences adaptively, often requiring extensive parameter tuning or network retraining. With the recent advancements in Large Language Models (LLMs), which have been widely applied to text-based planning and complex reasoning, their potential for enhancing autonomous exploration is becoming increasingly promising. Motivated by this, we propose an LLM-based human-preferred exploration framework that seamlessly integrates a mobile robot system with LLMs. By leveraging the reasoning and adaptability of LLMs, our approach enables intuitive and flexible preference control through natural language while maintaining a task success rate comparable to state-of-the-art traditional methods. Experimental results demonstrate that our framework effectively bridges the gap between human intent and policy preference in autonomous exploration, offering a more user-friendly and adaptable solution for real-world robotic applications.
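The preference-control idea can be illustrated independently of any particular LLM: assume the model has already turned the operator's natural-language request into a small structured preference, and use it to re-rank frontier candidates. Everything below (field names, the scoring rule) is an illustrative assumption, not the paper's implementation:

```python
# Hypothetical sketch: a structured preference, assumed to have been
# extracted from natural language by an LLM, reweights which frontier
# an exploring robot visits next.

def pick_frontier(candidates, preference):
    """candidates: dicts with 'distance' (travel cost) and 'region'.
    preference: e.g. {"prioritize": ["lab"], "bonus": 10.0}."""
    def score(c):
        s = -c["distance"]  # nearer frontiers are cheaper to reach
        if c["region"] in preference.get("prioritize", []):
            s += preference.get("bonus", 10.0)  # preferred areas win
        return s
    return max(candidates, key=score)
```

With an empty preference the robot behaves like a plain nearest-frontier explorer; a request such as "check the lab first" changes the ranking without retraining anything, which is the adaptability the abstract emphasizes.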
- North America > United States (0.15)
- Asia > China (0.14)
HELM: Hierarchical Encoding for mRNA Language Modeling
Yazdani-Jahromi, Mehdi, Prakash, Mangal, Mansi, Tommaso, Moskalev, Artem, Liao, Rui
Messenger RNA (mRNA) plays a crucial role in protein synthesis, with its codon structure directly impacting biological properties. While Language Models (LMs) have shown promise in analyzing biological sequences, existing approaches fail to account for the hierarchical nature of mRNA's codon structure. We introduce Hierarchical Encoding for mRNA Language Modeling (HELM), a novel pre-training strategy that incorporates codon-level hierarchical structure into language model training. HELM modulates the loss function based on codon synonymity, aligning the model's learning process with the biological reality of mRNA sequences. We evaluate HELM on diverse mRNA datasets and tasks, demonstrating that HELM outperforms standard language model pre-training as well as existing foundation model baselines on six diverse downstream property prediction tasks and an antibody region annotation task, on average by around 8%. Additionally, HELM enhances the generative capabilities of the language model, producing diverse mRNA sequences that better align with the underlying true data distribution compared to non-hierarchical baselines. RNA analysis is becoming increasingly important in molecular biology (Liu et al., 2023; Fu, 2014). Messenger RNA (mRNA) is of particular interest due to its unique role in protein synthesis (Sahin et al., 2014). Language Models (LMs) have emerged as powerful tools for analyzing biological sequences, with notable successes in protein (Elnaggar et al., 2021; Ferruz et al., 2022; Lin et al., 2023; Hie et al., 2024) and DNA (Nguyen et al., 2024a; Zhou et al., 2023) research. Despite the importance of mRNA, the field still lacks specialized LMs tailored for its analysis. Existing RNA LMs (Li et al., 2023; Chen et al., 2023) focus on non-coding sequences and do not properly account for the codon hierarchy (Figure 1 right), which, as we demonstrate, falls short when dealing with mRNA tasks.
In this work, we aim to address this gap in mRNA language modeling by focusing specifically on the unique challenges presented by mRNA sequences. To address the limitations of existing bio-language modeling methods, we introduce Hierarchical Encoding for mRNA Language Modeling (HELM), a novel pre-training strategy for mRNA sequences. [Figure 1: tree diagram of the codon hierarchy used in HELM, categorizing codons into Start, Coding (grouped by amino acid), and Stop; this hierarchy informs the loss calculation.]
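The synonymity-modulated loss can be sketched with a toy codon table: probability mass the model places on synonymous codons (those encoding the same amino acid as the target) is partially credited, so a wrong-but-synonymous prediction is penalized less than a wrong amino acid. The discount factor and the table subset below are illustrative assumptions, not HELM's exact formulation:

```python
# Toy synonymity-aware negative log-likelihood in the spirit of HELM.
import math

# Tiny slice of the standard codon table (codon -> amino acid).
CODON_TO_AA = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
}
CODONS = list(CODON_TO_AA)

def synonymity_weighted_nll(probs, target, synonym_discount=0.5):
    """NLL where probability on synonymous codons is partially
    credited toward the target (an assumed discount rule)."""
    aa = CODON_TO_AA[target]
    credited = probs[CODONS.index(target)]
    for i, codon in enumerate(CODONS):
        if codon != target and CODON_TO_AA[codon] == aa:
            credited += synonym_discount * probs[i]
    return -math.log(credited)
```

With a uniform distribution over these seven codons, the loss for target "GCU" (three synonyms present) is lower than the plain NLL of -log(1/7), while "UGG" (no synonym in this slice) receives no discount.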
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Efficient Benchmarking of Language Models
Perlitz, Yotam, Bandel, Elron, Gera, Ariel, Arviv, Ofir, Ein-Dor, Liat, Shnarch, Eyal, Slonim, Noam, Shmueli-Scheuer, Michal, Choshen, Leshem
The increasing versatility of language models (LMs) has given rise to a new class of benchmarks that comprehensively assess a broad range of capabilities. Such benchmarks are associated with massive computational costs, reaching thousands of GPU hours per model. However, the efficiency aspect of these evaluation efforts has received little discussion in the literature. In this work, we present the problem of Efficient Benchmarking, namely intelligently reducing the computation costs of LM evaluation without compromising reliability. Using the HELM benchmark as a test case, we investigate how different benchmark design choices affect the computation-reliability tradeoff. We propose to evaluate the reliability of such decisions by using a new measure, Decision Impact on Reliability (DIoR for short). We find, for example, that the current leader on HELM may change merely by removing a low-ranked model from the benchmark, and observe that a handful of examples suffice to obtain the correct benchmark ranking. Conversely, a slightly different choice of HELM scenarios varies rankings widely. Based on our findings, we outline a set of concrete recommendations for more efficient benchmark design and utilization practices, leading to dramatic cost savings with minimal loss of benchmark reliability, often reducing computation by 100x or more.
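The reliability questions above boil down to rank agreement: does a cheaper evaluation produce the same model ordering as the full benchmark? A minimal check, using Kendall's tau as the agreement statistic (an illustration of the idea, not the paper's DIoR measure):

```python
# Compare the model ranking induced by a cheap evaluation (e.g. a
# subset of examples or scenarios) against the full-benchmark ranking.

def kendall_tau(a, b):
    """Rank agreement between two score lists over the same models:
    +1 for an identical ordering, -1 for a fully reversed one."""
    n = len(a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            if s > 0:
                concordant += 1  # pair ordered the same way
            elif s < 0:
                discordant += 1  # pair ordered oppositely
    pairs = n * (n - 1) / 2
    return (concordant - discordant) / pairs
```

A subset evaluation whose tau against the full benchmark stays near 1 across held-out models is the kind of "handful of examples suffice" result the abstract reports; a tau that swings when one low-ranked model is dropped is exactly the fragility DIoR is designed to quantify.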
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (8 more...)
Data-driven Semi-supervised Machine Learning with Surrogate Safety Measures for Abnormal Driving Behavior Detection
Zhang, Lanxin, Dong, Yongqi, Farah, Haneen, Zgonnikov, Arkady, van Arem, Bart
Detecting abnormal driving behavior is critical for road traffic safety and the evaluation of drivers' behavior. With the advancement of machine learning (ML) algorithms and the accumulation of naturalistic driving data, many ML models have been adopted for abnormal driving behavior detection. Most existing ML-based detectors rely on (fully) supervised ML methods, which require substantial labeled data. However, ground truth labels are not always available in the real world, and labeling large amounts of data is tedious. Thus, there is a need to explore unsupervised or semi-supervised methods to make the anomaly detection process more feasible and efficient. To fill this research gap, this study analyzes large-scale real-world data revealing several abnormal driving behaviors (e.g., sudden acceleration, rapid lane-changing) and develops a Hierarchical Extreme Learning Machine (HELM)-based semi-supervised ML method using partly labeled data to accurately detect the identified abnormal driving behaviors. Moreover, previous ML-based approaches predominantly utilize basic vehicle motion features (such as velocity and acceleration) to label and detect abnormal driving behaviors, while this study seeks to introduce Surrogate Safety Measures (SSMs) as the input features for ML models to improve the detection performance. Results from extensive experiments demonstrate the effectiveness of the proposed semi-supervised ML model with the introduced SSMs serving as important features. The proposed semi-supervised ML method outperforms other baseline semi-supervised or unsupervised methods regarding various metrics, e.g., delivering the best accuracy at 99.58% and the best F-1 measure at 0.9913. The ablation study further highlights the significance of SSMs for advancing detection performance.
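For readers unfamiliar with the underlying model family: an extreme learning machine uses a fixed random hidden layer and fits only the output weights by (ridge) least squares, which is what makes it cheap to train. The following is a toy single-layer version, a sketch of the building block rather than the paper's hierarchical, semi-supervised pipeline; the sizes, activation, and ridge value are arbitrary choices:

```python
# Toy single-layer extreme learning machine (ELM), stdlib only.
import math
import random

def solve(A, b):
    # Gaussian elimination with partial pivoting (small dense systems).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c]
                              for c in range(r + 1, n))) / M[r][r]
    return x

class ELM:
    """Random fixed hidden layer; output weights fit by ridge least
    squares, so training needs no gradient descent."""

    def __init__(self, n_hidden=10, ridge=1e-3, seed=0):
        self.h, self.ridge = n_hidden, ridge
        self.rng = random.Random(seed)

    def _hidden(self, X):
        # tanh(W x + b) per sample; W and b stay fixed after fit().
        return [[math.tanh(sum(w * v for w, v in zip(ws, row)) + b)
                 for ws, b in zip(self.W, self.b)] for row in X]

    def fit(self, X, y):
        d = len(X[0])
        self.W = [[self.rng.gauss(0, 1) for _ in range(d)]
                  for _ in range(self.h)]
        self.b = [self.rng.gauss(0, 1) for _ in range(self.h)]
        H = self._hidden(X)
        # beta = (H^T H + ridge * I)^(-1) H^T y
        HtH = [[sum(Hk[i] * Hk[j] for Hk in H)
                + (self.ridge if i == j else 0.0)
                for j in range(self.h)] for i in range(self.h)]
        Hty = [sum(Hk[i] * yk for Hk, yk in zip(H, y))
               for i in range(self.h)]
        self.beta = solve(HtH, Hty)
        return self

    def predict(self, X):
        return [sum(h * b for h, b in zip(row, self.beta))
                for row in self._hidden(X)]
```

A hierarchical ELM stacks such layers as an autoencoder front-end before the final least-squares fit; the paper's semi-supervised variant additionally exploits the unlabeled portion of the driving data.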
- Europe > Netherlands > South Holland > Delft (0.05)
- Europe > Switzerland (0.04)
- North America > United States > Hawaii (0.04)
- (2 more...)
- Transportation > Ground > Road (1.00)
- Health & Medicine (0.68)
- Automobiles & Trucks (0.66)
Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference
Wang, Chi, Liu, Susan Xueqing, Awadallah, Ahmed H.
Large Language Models (LLMs) have sparked significant interest in their generative capabilities, leading to the development of various commercial applications. The high cost of using the models drives application builders to maximize the value of generation under a limited inference budget. This paper presents a study of optimizing inference hyperparameters such as the number of responses, temperature and max tokens, which significantly affects the utility/cost of text generation. We design a framework named EcoOptiGen which leverages economical hyperparameter optimization and cost-based pruning. Experiments with the GPT-3.5/GPT-4 models on a variety of tasks verify its effectiveness. EcoOptiGen is implemented in the `autogen' package of the FLAML library: \url{https://aka.ms/autogen}.
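The interplay of budget and pruning can be sketched generically: spend trials on each configuration, stop a configuration early once it has cost more than the current best while scoring worse, and stop everything when the overall budget runs out. The config fields and the pruning rule here are simplified stand-ins, not EcoOptiGen's actual algorithm or the autogen API:

```python
# Hedged sketch of cost-aware hyperparameter search with pruning.

def cost_aware_search(configs, evaluate, budget):
    """evaluate(cfg) -> (utility, cost) for a single trial of cfg.
    Returns the best (utility, cfg, cost_spent_on_cfg) found."""
    spent = 0.0
    best = None
    for cfg in configs:
        util, cost_cfg = 0.0, 0.0
        for _ in range(cfg["n"]):  # cfg["n"]: responses sampled per run
            u, c = evaluate(cfg)
            util = max(util, u)    # best-of-n utility
            cost_cfg += c
            spent += c
            if spent > budget:     # global inference budget exhausted
                return best
            # Cost-based pruning: this config already costs more than
            # the incumbent did, yet still scores worse; abandon it.
            if best is not None and cost_cfg > best[2] and util < best[0]:
                break
        if best is None or util > best[0]:
            best = (util, cfg, cost_cfg)
    return best
```

The point of the pruning rule is that expensive settings (high `n`, large max tokens) are cut off as soon as they are dominated, so the budget concentrates on promising regions of the hyperparameter space.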
- North America > United States (0.04)
- Europe > United Kingdom > Wales > Denbighshire (0.04)
- Europe > United Kingdom > Scotland (0.04)
- (4 more...)
Multimodal LLMs for health grounded in individual-specific data
Belyaeva, Anastasiya, Cosentino, Justin, Hormozdiari, Farhad, Eswaran, Krish, Shetty, Shravya, Corrado, Greg, Carroll, Andrew, McLean, Cory Y., Furlotte, Nicholas A.
Foundation large language models (LLMs) have shown an impressive ability to solve tasks across a wide range of fields including health. To effectively solve personalized health tasks, LLMs need the ability to ingest a diversity of data modalities that are relevant to an individual's health status. In this paper, we take a step towards creating multimodal LLMs for health that are grounded in individual-specific data by developing a framework (HeLM: Health Large Language Model for Multimodal Understanding) that enables LLMs to use high-dimensional clinical modalities to estimate underlying disease risk. HeLM encodes complex data modalities by learning an encoder that maps them into the LLM's token embedding space, and handles simple modalities like tabular data by serializing them into text. Using data from the UK Biobank, we show that HeLM can effectively use demographic and clinical features in addition to high-dimensional time-series data to estimate disease risk. For example, HeLM achieves an AUROC of 0.75 for asthma prediction when combining tabular and spirogram data modalities compared with 0.49 when only using tabular data. Overall, we find that HeLM outperforms or performs at parity with classical machine learning approaches across a selection of eight binary traits. Furthermore, we investigate the downstream uses of this model such as its generalizability to out-of-distribution traits and its ability to power conversations around individual health and wellness.
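The "simple modalities" path described above, serializing tabular data into text, can be sketched in a few lines. The field names, template, and prompt below are illustrative assumptions, not the paper's actual format:

```python
# Hypothetical sketch: render tabular features as a text prompt that
# an LLM can ingest alongside its other (embedded) modalities.

def serialize_tabular(record, question):
    """record: dict of feature name -> value; question: task prompt."""
    parts = [f"{key.replace('_', ' ')}: {value}"
             for key, value in record.items()]
    return "Patient data: " + "; ".join(parts) + ". " + question
```

Complex modalities like spirograms take the other path in the framework: a learned encoder projects them directly into the LLM's token embedding space instead of going through text.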
- North America > United States > California > San Francisco County > San Francisco (0.14)
- South America > Uruguay > Artigas > Artigas (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Research Report > Experimental Study (0.50)
- Research Report > New Finding (0.32)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)