AITopics | hsa



Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback

Neural Information Processing Systems Feb-12-2026, 05:29:44 GMT

The standard assumption in reinforcement learning (RL) is that agents observe feedback for their actions immediately. However, in practice feedback is often observed in delay.

machine learning, qkh, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Nevada (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)


Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models

Hu, Xiang, Zhou, Zhanchao, Liang, Ruiqi, Li, Zehuan, Wu, Wei, Li, Jianguo

arXiv.org Artificial Intelligence Dec-1-2025

This work explores the challenge of building "Machines that Can Remember", framing long-term memory as the problem of efficient ultra-long context modeling. We argue that this requires three key properties: sparsity, random-access flexibility, and length generalization. To address ultra-long-context modeling, we leverage Hierarchical Sparse Attention (HSA), a novel attention mechanism that satisfies all three properties. We integrate HSA into Transformers to build HSA-UltraLong, which is an 8B-parameter MoE model trained on over 8 trillion tokens and is rigorously evaluated on different tasks with in-domain and out-of-domain context lengths to demonstrate its capability in handling ultra-long contexts. Results show that our model performs comparably to full-attention baselines on in-domain lengths while achieving over 90% accuracy on most in-context retrieval tasks with contexts up to 16M. This report outlines our experimental insights and open problems, contributing a foundation for future research in ultra-long context modeling.

Figure 1: Despite being pre-trained with an 8K context window and mid-trained up to 32K, HSA-UltraLong achieves near-perfect accuracy on S-NIAH even at a 16M-token context length. The red dashed line at 32K marks the boundary between in-domain (left) and out-of-domain (right).

context length, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.23319

Country:

North America > Mexico (0.28)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access

Hu, Xiang, Leng, Jiaqi, Zhao, Jun, Tu, Kewei, Wu, Wei

arXiv.org Artificial Intelligence Nov-4-2025

A key advantage of Recurrent Neural Networks (RNNs) over Transformers is that their linear computational and space complexity enables faster training and inference for long sequences. However, RNNs are fundamentally unable to randomly access historical context, and simply integrating attention mechanisms may undermine their efficiency advantages. To overcome this limitation, we propose Hierarchical Sparse Attention (HSA), a novel attention mechanism that enhances RNNs with long-range random access flexibility while preserving their merits in efficiency and length generalization. HSA divides inputs into chunks, selects the top-$k$ chunks, and hierarchically aggregates information. The core innovation lies in learning token-to-chunk relevance based on fine-grained token-level information inside each chunk. This approach enhances the precision of chunk selection across both in-domain and out-of-domain context lengths. To make HSA efficient, we further introduce a hardware-aligned kernel design. By combining HSA with Mamba, we introduce RAMba, which achieves perfect accuracy in passkey retrieval at context lengths of up to 64 million tokens despite pre-training on only 4K-length contexts, and significant improvements on various downstream tasks, with a nearly constant memory footprint. These results show RAMba's huge potential in long-context modeling.
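The chunk, select, and aggregate steps described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the chunk score here is a plain dot product with the chunk's mean key (an assumption made for brevity, whereas the abstract says relevance is learned from token-level information), there is no hardware-aligned kernel, and the name `chunked_topk_attention` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def chunked_topk_attention(query, keys, values, chunk_size, top_k):
    """Attend only to the top_k highest-scoring chunks of the context.

    Returns (output_vector, selected_chunk_indices).
    """
    # 1. Split the context into contiguous chunks.
    starts = range(0, len(keys), chunk_size)
    chunks = [(keys[s:s + chunk_size], values[s:s + chunk_size]) for s in starts]

    # 2. Score each chunk against the query.
    #    ASSUMPTION: the chunk summary is the mean of its keys.
    chunk_scores = []
    for chunk_keys, _ in chunks:
        dim = len(chunk_keys[0])
        mean_key = [sum(k[d] for k in chunk_keys) / len(chunk_keys) for d in range(dim)]
        chunk_scores.append(dot(query, mean_key))

    # 3. Keep only the top_k chunks (sparsity).
    order = sorted(range(len(chunks)), key=lambda i: chunk_scores[i], reverse=True)
    selected = order[:top_k]

    # 4. Token-level softmax attention inside each selected chunk.
    chunk_outputs = []
    for i in selected:
        chunk_keys, chunk_values = chunks[i]
        weights = softmax([dot(query, k) for k in chunk_keys])
        out_dim = len(chunk_values[0])
        chunk_outputs.append(
            [sum(w * v[d] for w, v in zip(weights, chunk_values)) for d in range(out_dim)]
        )

    # 5. Aggregate chunk outputs, weighted by a softmax over the chunk scores.
    mix = softmax([chunk_scores[i] for i in selected])
    out_dim = len(chunk_outputs[0])
    output = [sum(m * o[d] for m, o in zip(mix, chunk_outputs)) for d in range(out_dim)]
    return output, selected
```

Per query, the work in steps 4 and 5 depends on `top_k * chunk_size` rather than on the full context length, which is the property that makes this family of mechanisms attractive for long sequences.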

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2504.16795

Country:

North America > United States (0.93)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)


DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off

Zhang, Jusheng, Fan, Yijia, Cai, Kaitong, Huang, Zimeng, Sun, Xiaofei, Wang, Jian, Tang, Chengpei, Wang, Keze

arXiv.org Artificial Intelligence Oct-14-2025

This paper introduces DrDiff, a novel framework for long-text generation that overcomes the efficiency-quality trade-off through three core technologies. First, we design a dynamic expert scheduling mechanism that intelligently allocates computational resources during the diffusion process based on text complexity, enabling more efficient handling of text generation tasks of varying difficulty. Second, we introduce a Hierarchical Sparse Attention (HSA) mechanism that adaptively adjusts attention patterns according to a variety of input lengths, reducing computational complexity from O($n^2$) to O($n$) while maintaining model performance. Finally, we propose a soft absorption guidance optimization strategy that combines with DPM-solver++ to reduce diffusion steps, significantly improving generation speed. Comprehensive experiments on various long-text generation benchmarks demonstrate the superiority of our DrDiff over the existing SOTA methods.
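To make the O($n^2$) versus O($n$) claim concrete, the sketch below counts query-key score evaluations for full attention and for a sparse pattern in which every query attends to at most a fixed number of keys. The fixed per-query `budget` is an assumption chosen for illustration; the abstract does not specify which sparse pattern DrDiff uses, and both function names are hypothetical.

```python
def full_attention_pairs(n):
    """Every one of n queries scores every one of n keys."""
    return n * n


def sparse_attention_pairs(n, budget):
    """Every query scores at most `budget` keys (ASSUMED fixed budget)."""
    return n * min(n, budget)


if __name__ == "__main__":
    for n in (1024, 2048, 4096):
        print(n, full_attention_pairs(n), sparse_attention_pairs(n, budget=64))
```

Doubling the sequence length quadruples the full-attention count but only doubles the sparse count, as long as the budget stays fixed.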

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2509.02785

Country:

Asia (0.67)
North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)


Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems

Amizadeh, Saeed, Abdali, Sara, Li, Yinheng, Koishida, Kazuhito

arXiv.org Machine Learning Sep-22-2025

Transformers and their attention mechanism have been revolutionary in the field of Machine Learning. While originally proposed for language data, they quickly found their way to image, video, graph, and other data modalities with various signal geometries. Despite this versatility, generalizing the attention mechanism to scenarios where data is presented at different scales from potentially different modalities is not straightforward. The attempts to incorporate hierarchy and multi-modality within transformers are largely based on ad hoc heuristics, which are not seamlessly generalizable to similar problems with potentially different structures. To address this problem, in this paper, we take a fundamentally different approach: we first propose a mathematical construct to represent multi-modal, multi-scale data. We then mathematically derive the neural attention mechanics for the proposed construct from the first principle of entropy minimization. We show that the derived formulation is optimal in the sense of being the closest to the standard Softmax attention while incorporating the inductive biases originating from the hierarchical/geometric information of the problem. We further propose an efficient algorithm based on dynamic programming to compute our derived attention mechanism. By incorporating it within transformers, we show that the proposed hierarchical attention mechanism not only can be employed to train transformer models in hierarchical/multi-modal settings from scratch, but can also be used to inject hierarchical information into classical, pre-trained transformer models post training, resulting in more efficient models in a zero-shot manner.

hierarchy, node, transformer, (14 more...)

arXiv.org Machine Learning

2509.15448

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)


Spring-Brake! Handed Shearing Auxetics Improve Efficiency of Hopping and Standing

Sullivan, Joseph, Good, Ian, Burden, Samuel A., Lipton, Jeffrey Ian

arXiv.org Artificial Intelligence May-30-2025

Energy efficiency is critical to the success of legged robotics. Efficiency is lost through wasted energy during locomotion and standing. Including elastic elements has been shown to reduce movement costs, while including brakes can reduce standing costs. However, adding separate elements for each increases the mass and complexity of a leg, reducing overall system performance. Here we present a novel compliant mechanism using a Handed Shearing Auxetic (HSA) that acts as a spring and brake in a monopod hopping robot. The HSA acts as a parallel elastic actuator, reducing electrical power for dynamic hopping and matching the efficiency of state-of-the-art compliant hoppers. The HSA's auxetic behavior enables dual functionality. During static tasks, it locks under large forces with minimal input power by blocking deformation, creating high friction similar to a capstan mechanism. This allows the leg to support heavy loads without motor torque, addressing thermal inefficiency. The multi-functional design enhances both dynamic and static performance, offering a versatile solution for robotic applications.
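The capstan comparison can be made concrete with the classical capstan (belt-friction) relation, in which the load a wrapped line can hold grows exponentially with the friction coefficient and the wrap angle: T_load = T_hold * exp(mu * phi). The sketch below evaluates that textbook formula only. It is not the paper's model of the HSA, and the function name and sample values are illustrative assumptions.

```python
import math


def capstan_max_load(holding_force, friction_coefficient, wrap_angle_rad):
    """Largest load (same units as holding_force) held before slipping.

    Classical capstan equation: T_load = T_hold * exp(mu * phi).
    """
    return holding_force * math.exp(friction_coefficient * wrap_angle_rad)


if __name__ == "__main__":
    # ASSUMED sample values: 10 N holding force, mu = 0.3, one to three full wraps.
    for wraps in (1, 2, 3):
        angle = 2 * math.pi * wraps
        print(wraps, round(capstan_max_load(10.0, 0.3, angle), 1))
```

The exponential dependence on wrap angle is why a small holding input can resist a large load, which is the qualitative behavior the abstract attributes to the locked HSA.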

artificial intelligence, hsa, robot, (16 more...)

arXiv.org Artificial Intelligence

2505.22898

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)


Force and Speed in a Soft Stewart Platform

Ketchum, Jake, Avtges, James, Schlafly, Millicent, Young, Helena, Kim, Taekyoung, Truby, Ryan L., Murphey, Todd D.

arXiv.org Artificial Intelligence Apr-21-2025

Many soft robots struggle to produce dynamic motions with fast, large displacements. We develop a parallel 6 degree-of-freedom (DoF) Stewart-Gough mechanism using Handed Shearing Auxetic (HSA) actuators. By using soft actuators, we are able to use one third as many mechatronic components as a rigid Stewart platform, while retaining a working payload of 2 kg and an open-loop bandwidth greater than 16 Hz. We show that the platform is capable of both precise tracing and dynamic disturbance rejection when controlling a ball and sliding puck using a Proportional Integral Derivative (PID) controller. We develop a machine-learning-based kinematics model and demonstrate a functional workspace of roughly 10 cm in each translation direction and 28 degrees in each orientation. This 6 DoF device has many of the characteristics associated with rigid components (power, speed, and total workspace) while capturing the advantages of soft mechanisms.

Soft robots promise to be safer, more resilient, and more adaptable than their rigid counterparts. This is particularly valuable for systems that are expected to touch and interact with people. However, existing soft 6 DoF parallel mechanisms struggle to produce the forces, displacements, and response times required for mass adoption [1]-[3]. A substantial driver of this capability gap is the many limitations of soft actuator technologies.
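For readers unfamiliar with the PID control mentioned in the abstract, here is a minimal discrete-time sketch. The plant is a toy assumption (a unit-mass puck on a frictionless 1-D track, driven directly by the controller output), and the gains, time step, and names `PID` and `simulate_puck` are hypothetical. None of these values come from the paper.

```python
class PID:
    """Textbook discrete PID controller with a fixed time step."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # No derivative term on the first call, since there is no previous error yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_puck(setpoint=1.0, steps=1000, dt=0.01):
    """Drive an ASSUMED unit-mass puck from rest at 0 toward `setpoint`."""
    # Gains chosen so the continuous closed loop has a triple pole at s = -2.
    pid = PID(kp=12.0, ki=8.0, kd=6.0, dt=dt)
    position, velocity = 0.0, 0.0
    for _ in range(steps):
        force = pid.update(setpoint, position)
        velocity += force * dt      # unit mass: acceleration equals force
        position += velocity * dt
    return position


if __name__ == "__main__":
    print(simulate_puck())
```

On a real platform the measurement would come from a sensor and the output would be mapped to actuator commands through a kinematics model. The sketch only shows the control law itself.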

artificial intelligence, machine learning, platform, (18 more...)

arXiv.org Artificial Intelligence

2504.13127

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Torque Responsive Metamaterials Enable High Payload Soft Robot Arms

Good, Ian, Balaji, Srivatsan, Oh, David, Thomas, Sawyer, Lipton, Jeffrey I.

arXiv.org Artificial Intelligence Jan-16-2025

Soft robots have struggled to support large forces and moments while also supporting their own weight against gravity. This limits their ability to reach certain configurations necessary for tasks such as inspection and pushing objects up. We have overcome this limitation by creating an electrically driven metamaterial soft arm using handed shearing auxetics (HSA) and bendable extendable torque resistant (BETR) shafts. These use the large force and torque capacity of HSAs and the nestable torque transmission of BETRs to create a strong soft arm. We found that the HSA arm was able to push 2.3 kg vertically and lift more than 600 g when positioned horizontally, supporting 0.33 Nm of torque at the base. The arm is able to move between waypoints while carrying the large payload and demonstrates consistent movement with path variance below 5 mm. The HSA arm's ability to perform active grasping with HSA grippers was also demonstrated, requiring 20 N of pull force to dislodge the object. Finally, we test the arm in a pipe inspection task. The arm is able to locate all the defects while sliding against the inner surface of the pipe, demonstrating its compliance.

artificial intelligence, betr, hsa, (16 more...)

arXiv.org Artificial Intelligence

2501.09819

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.44)


Torsion Resistant Strain Limiting Layers Enable High Grip Strength of Electrically-Driven Handed Shearing Auxetic Grippers

Good, Ian, Balaji, Srivatsan, Lipton, Jeffrey I.

arXiv.org Artificial Intelligence Dec-10-2024

Soft grippers have demonstrated a strong ability to successfully pick and manipulate many objects. A key limitation to their wider adoption is their inability to grasp larger payloads due to objects slipping out of grasps. We have overcome this limitation by introducing a torsionally rigid strain limiting layer (TR-SLL). This reduces out-of-plane bending while maintaining the gripper's softness and in-plane flexibility. We characterize the design space of the strain limiting layer and Handed Shearing Auxetic (HSA) actuators for a soft gripper using simulation and experiment. The inclusion of the TR-SLL with HSAs enables HSA grippers to be made with a single digit. We found that the use of our TR-SLL HSA gripper enabled pinch grasping of payloads over 1 kg. We demonstrate a lifting capacity of 5 kg when loading using the TR-SLL. We also demonstrate a peak pinch grasp force of 5.8 N, and a peak planar caging force of 14.5 N. Finally, we test the TR-SLL gripper on a suite of 43 YCB objects. We show success on 37 objects, demonstrating significant capabilities.

INTRODUCTION

Soft robotic fingers have focused on emulating the compliance of humans and other biota when bending [1]. However, the key to humans' remarkable grip is that our fingers can simultaneously bend while resisting torsion and lateral loading. People rely on a rigid skeleton with discrete joints to provide this selective compliance. We build upon a previous conference paper that introduced the torsion resistant strain limiting layer (TR-SLL) [2]. The TR-SLL provides soft grippers with the same torsion resistance as a skeleton without discretization. This work extends this to entirely electrically driven grippers. This allows a single Handed Shearing Auxetic (HSA) to be used in a gripper and produce a high holding force. The TR-SLLs constrict bending and serve as a reaction body for the HSA.

artificial intelligence, gripper, tr-sll, (17 more...)

arXiv.org Artificial Intelligence

2412.07976

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.34)


Accelerating Complex Disease Treatment through Network Medicine and GenAI: A Case Study on Drug Repurposing for Breast Cancer

Hamed, Ahmed Abdeen, Fandy, Tamer E.

arXiv.org Artificial Intelligence Jun-27-2024

The objective of this research is to introduce a network specialized in predicting drugs that can be repurposed by investigating real-world evidence sources, such as clinical trials and biomedical literature. Specifically, it aims to generate drug combination therapies for complex diseases (e.g., cancer, Alzheimer's). We present a multilayered network medicine approach, empowered by a highly configured ChatGPT prompt engineering system, which is constructed on the fly to extract drug mentions in clinical trials. Additionally, we introduce a novel algorithm that connects real-world evidence with disease-specific signaling pathways (e.g., KEGG database). This sheds light on the repurposability of drugs if they are found to bind with one or more protein constituents of a signaling pathway. To demonstrate, we instantiated the framework for breast cancer and found that, out of 46 breast cancer signaling pathways, the framework identified 38 pathways that were covered by at least two drugs. This evidence signals the potential for combining those drugs. Specifically, the most covered signaling pathway, ID hsa:2064, was covered by 108 drugs, some of which can be combined. Conversely, the signaling pathway ID hsa:1499 was covered by only two drugs, indicating a significant gap for further research. Our network medicine framework, empowered by GenAI, shows promise in identifying drug combinations with a high degree of specificity, knowing the exact signaling pathways and proteins that serve as targets. It is noteworthy that ChatGPT successfully accelerated the process of identifying drug mentions in clinical trials, though further investigations are required to determine the relationships among the drug mentions.
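The coverage criterion in this abstract (a drug covers a pathway if it binds at least one of the pathway's protein constituents, and a pathway covered by two or more drugs is a candidate for combination) can be sketched as a simple set intersection. All identifiers and data below are toy placeholders, not values from the paper, and the function names are hypothetical.

```python
def pathway_coverage(pathway_proteins, drug_targets):
    """Map each pathway ID to the sorted drugs that bind at least one of its proteins."""
    coverage = {}
    for pathway_id, proteins in pathway_proteins.items():
        coverage[pathway_id] = sorted(
            drug for drug, targets in drug_targets.items() if proteins & targets
        )
    return coverage


def combination_candidates(coverage, min_drugs=2):
    """Keep only the pathways covered by at least `min_drugs` drugs."""
    return {p: drugs for p, drugs in coverage.items() if len(drugs) >= min_drugs}


if __name__ == "__main__":
    # TOY DATA (assumed): three pathways, three drugs, five proteins.
    pathways = {"pathway_A": {"P1", "P2"}, "pathway_B": {"P3"}, "pathway_C": {"P4"}}
    drugs = {"drug_x": {"P1"}, "drug_y": {"P2", "P3"}, "drug_z": {"P5"}}
    coverage = pathway_coverage(pathways, drugs)
    print(coverage)
    print(combination_candidates(coverage))
```

Pathways with a short or empty coverage list correspond to the gaps the abstract flags for further research.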

hsa, pathway, protein, (16 more...)

arXiv.org Artificial Intelligence

2406.13106

Country:

North America > United States > New York > Broome County > Binghamton (0.04)
North America > United States > Texas > El Paso County > El Paso (0.04)
Europe > Poland > Pomerania Province > Gdańsk (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.83)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
