AITopics | hierarchical attention network

Collaborating Authors

hierarchical attention network

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Unlocking Biomedical Insights: Hierarchical Attention Networks for High-Dimensional Data Interpretation

Nair, Rekha R, Babu, Tina, Panthakkan, Alavikunhu, Al-Ahmad, Hussain, Balusamy, Balamurugan

arXiv.org Artificial IntelligenceOct-28-2025

The proliferation of high-dimensional datasets in fields such as genomics, healthcare, and finance has created an urgent need for machine learning models that are both highly accurate and inherently interpretable. While traditional deep learning approaches deliver strong predictive performance, their lack of transparency often impedes their deployment in critical, decision-sensitive applications. In this work, we introduce the Hierarchical Attention-based Interpretable Network (HAIN), a novel architecture that unifies multi-level attention mechanisms, dimensionality reduction, and explanation-driven loss functions to deliver interpretable and robust analysis of complex biomedical data. HAIN provides feature-level interpretability via gradientweighted attention and offers global model explanations through prototype-based representations. Comprehensive evaluation on The Cancer Genome Atlas (TCGA) dataset demonstrates that HAIN achieves a classification accuracy of 94.3%, surpassing conventional post-hoc interpretability approaches such as SHAP and LIME in both transparency and explanatory power. Furthermore, HAIN effectively identifies biologically relevant cancer biomarkers, supporting its utility for clinical and research applications. By harmonizing predictive accuracy with interpretability, HAIN advances the development of transparent AI solutions for precision medicine and regulatory compliance.

artificial intelligence, explanation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.2182

Country: Asia > India (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

The Application of Transformer-Based Models for Predicting Consequences of Cyber Attacks

Chhetri, Bipin, Namin, Akbar Siami

arXiv.org Artificial IntelligenceAug-19-2025

Cyberattacks are increasing, and securing against such threats is costing industries billions of dollars annually. Threat Modeling, that is, comprehending the consequences of these attacks, can provide critical support to cybersecurity professionals, enabling them to take timely action and allocate resources that could be used elsewhere. Cybersecurity is heavily dependent on threat modeling, as it assists security experts in assessing and mitigating risks related to identifying vulnerabilities and threats. Recently, there has been a pressing need for automated methods to assess attack descriptions and forecast the future consequences of the increasing complexity of cyberattacks. This study examines how Natural Language Processing (NLP) and deep learning can be applied to analyze the potential impact of cyberattacks by leveraging textual descriptions from the MITRE Common Weakness Enumeration (CWE) database. We emphasize classifying attack consequences into five principal categories: Availability, Access Control, Confidentiality, Integrity, and Other. This paper investigates the use of Bidirectional Encoder Representations from Transformers (BERT) in combination with Hierarchical Attention Networks (HANs) for Multi-label classification, evaluating their performance in comparison with conventional CNN and LSTM-based models. Experimental findings show that BERT achieves an overall accuracy of $0.972$, far higher than conventional deep learning models in multi-label classification. HAN outperforms baseline forms of CNN and LSTM-based models on specific cybersecurity labels. However, BERT consistently achieves better precision and recall, making it more suitable for predicting the consequences of a cyberattack.

artificial intelligence, conséquence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.1303

Country: North America > United States (0.68)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

BGM-HAN: A Hierarchical Attention Network for Accurate and Fair Decision Assessment on Semi-Structured Profiles

Liu, Junhua, Lee, Roy Ka-Wei, Lim, Kwan Hui

arXiv.org Artificial IntelligenceJul-24-2025

Human decision-making in high-stakes domains often relies on expertise and heuristics, but is vulnerable to hard-to-detect cognitive biases that threaten fairness and long-term outcomes. This work presents a novel approach to enhancing complex decision-making workflows through the integration of hierarchical learning alongside various enhancements. Focusing on university admissions as a representative high-stakes domain, we propose BGM-HAN, an enhanced Byte-Pair Encoded, Gated Multi-head Hierarchical Attention Network, designed to effectively model semi-structured applicant data. BGM-HAN captures multi-level representations that are crucial for nuanced assessment, improving both interpretability and predictive performance. Experimental results on real admissions data demonstrate that our proposed model significantly outperforms both state-of-the-art baselines from traditional machine learning to large language models, offering a promising framework for augmenting decision-making in domains where structure, context, and fairness matter. Source code is available at: https://github.com/junhua/bgm-han.

hierarchical attention network, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.17472

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.66)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Towards Objective and Unbiased Decision Assessments with LLM-Enhanced Hierarchical Attention Networks

Liu, Junhua, Lim, Kwan Hui, Lee, Roy Ka-Wei

arXiv.org Artificial IntelligenceNov-14-2024

How objective and unbiased are we while making decisions? This work investigates cognitive bias identification in high-stake decision making process by human experts, questioning its effectiveness in real-world settings, such as candidates assessments for university admission. We begin with a statistical analysis assessing correlations among different decision points among in the current process, which discovers discrepancies that imply cognitive bias and inconsistency in decisions. This motivates our exploration of bias-aware AI-augmented workflow that surpass human judgment. We propose BGM-HAN, an enhanced Hierarchical Attention Network with Byte-Pair Encoding, Gated Residual Connections and Multi-Head Attention. Using it as a backbone model, we further propose a Shortlist-Analyse-Recommend (SAR) agentic Figure 1: University Admission Decision Process: overview workflow, which simulate real-world decision-making. In our of current workflow and possible agentic augmentation experiments, both the proposed model and the agentic workflow significantly improves on both human judgment and alternative models, validated with real-world data. Source code is available at: critical to ensure long-term sustainable outcomes and fairness to https://github.com/junhua/bgm-han.

accuracy, attention network, hierarchical attention network, (12 more...)

arXiv.org Artificial Intelligence

2411.08504

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (1.00)
Research Report (1.00)

Industry: Education > Educational Setting (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

3HAN: A Deep Neural Network for Fake News Detection

Singhania, Sneha, Fernandez, Nigel, Rao, Shrisha

arXiv.org Artificial IntelligenceJun-21-2023

The rapid spread of fake news is a serious problem calling for AI solutions. We employ a deep learning based automated detector through a three level hierarchical attention network (3HAN) for fast, accurate detection of fake news. 3HAN has three levels, one each for words, sentences, and the headline, and constructs a news vector: an effective representation of an input news article, by processing an article in an hierarchical bottom-up manner. The headline is known to be a distinguishing feature of fake news, and furthermore, relatively few words and sentences in an article are more important than the rest. 3HAN gives a differential importance to parts of an article, on account of its three layers of attention. By experiments on a large real-world data set, we observe the effectiveness of 3HAN with an accuracy of 96.77%. Unlike some other deep learning models, 3HAN provides an understandable output through the attention weights given to different parts of an article, which can be visualized through a heatmap to enable further manual fact checking.

artificial intelligence, machine learning, vector, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-319-70096-0_59

2306.12014

Country:

North America > Canada (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.64)

Industry: Media > News (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning Relation-Specific Representations for Few-shot Knowledge Graph Completion

Li, Yuling, Yu, Kui, Zhang, Yuhong, Wu, Xindong

arXiv.org Artificial IntelligenceSep-27-2022

Recent years have witnessed increasing interest in few-shot knowledge graph completion (FKGC), which aims to infer unseen query triples for a few-shot relation using a few reference triples about the relation. The primary focus of existing FKGC methods lies in learning relation representations that can reflect the common information shared by the query and reference triples. To this end, these methods learn entity-pair representations from the direct neighbors of head and tail entities, and then aggregate the representations of reference entity pairs. However, the entity-pair representations learned only from direct neighbors may have low expressiveness when the involved entities have sparse direct neighbors or share a common local neighborhood with other entities. Moreover, merely modeling the semantic information of head and tail entities is insufficient to accurately infer their relational information especially when they have multiple relations. To address these issues, we propose a Relation-Specific Context Learning (RSCL) framework, which exploits graph contexts of triples to learn global and local relation-specific representations for few-shot relations. Specifically, we first extract graph contexts for each triple, which can provide long-term entity-relation dependencies. To encode the extracted graph contexts, we then present a hierarchical attention network to capture contextualized information of triples and highlight valuable local neighborhood information of entities. Finally, we design a hybrid attention aggregator to evaluate the likelihood of the query triples at the global and local levels. Experimental results on two public datasets demonstrate that RSCL outperforms state-of-the-art FKGC methods.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2203.11639

Country:

Asia > China > Anhui Province > Hefei (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.63)

Add feedback

Hierarchical Attention Network for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings

Han, Sooji, Mao, Rui, Cambria, Erik

arXiv.org Artificial IntelligenceSep-15-2022

Automatic depression detection on Twitter can help individuals privately and conveniently understand their mental health status in the early stages before seeing mental health professionals. Most existing black-box-like deep learning methods for depression detection largely focused on improving classification performance. However, explaining model decisions is imperative in health research because decision-making can often be high-stakes and life-and-death. Reliable automatic diagnosis of mental health problems including depression should be supported by credible explanations justifying models' predictions. In this work, we propose a novel explainable model for depression detection on Twitter. It comprises a novel encoder combining hierarchical attention mechanisms and feed-forward neural networks. To support psycholinguistic studies, our model leverages metaphorical concept mappings as input. Thus, it not only detects depressed individuals, but also identifies features of such users' tweets and associated metaphor concept mappings.

artificial intelligence, machine learning, social media, (17 more...)

arXiv.org Artificial Intelligence

2209.07494

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Aragón (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Clustering Text Using Attention

Singh, Lovedeep

arXiv.org Artificial IntelligenceJan-8-2022

There are various situations where the need is to group In simple terms, attention mechanism can be thought of an similar texts into same buckets. We do not have enough additional layer somewhere in a network architecture which previous experience or knowledge to run a classification gives the deep learning model extra controlling parameters to algorithm on top of the available data. Clustering is the refine its learning by paying attention to different parts of the fundamental and intuitive solution to such problems.

algorithm, attention mechanism, hierarchical attention network, (9 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICCCNT51525.2021.9579951

2201.02816

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > India > Chandigarh (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Interest with Hierarchical Attention Network for Click-Through Rate Prediction

Xu, Weinan, He, Hengxu, Tan, Minshi, Li, Yunming, Lang, Jun, Guo, Dongbai

arXiv.org Artificial IntelligenceMay-22-2020

Deep Interest Network (DIN) is a state-of-the-art model which uses attention mechanism to capture user interests from historical behaviors. User interests intuitively follow a hierarchical pattern such that users generally show interests from a higher-level then to a lower-level abstraction. Modeling such an interest hierarchy in an attention network can fundamentally improve the representation of user behaviors. We, therefore, propose an improvement over DIN to model arbitrary interest hierarchy: Deep Interest with Hierarchical Attention Network (DHAN). In this model, a multi-dimensional hierarchical structure is introduced on the first attention layer which attends to an individual item, and the subsequent attention layers in the same dimension attend to higher-level hierarchy built on top of the lower corresponding layers. To enable modeling of multiple dimensional hierarchies, an expanding mechanism is introduced to capture one to many hierarchies. This design enables DHAN to attend different importance to different hierarchical abstractions thus can fully capture user interests at different dimensions (e.g. category, price, or brand).To validate our model, a simplified DHAN has applied to Click-Through Rate (CTR) prediction and our experimental results on three public datasets with two levels of the one-dimensional hierarchy only by category. It shows the superiority of DHAN with significant AUC uplift from 12% to 21% over DIN. DHAN is also compared with another state-of-the-art model Deep Interest Evolution Network (DIEN), which models temporal interest. The simplified DHAN also gets slight AUC uplift from 1.0% to 1.7% over DIEN. A potential future work can be a combination of DHAN and DIEN to model both temporal and hierarchical interests.

artificial intelligence, dhan, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2005.12981

Country:

Asia > China (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > Promising Solution (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

HAXMLNet: Hierarchical Attention Network for Extreme Multi-Label Text Classification

You, Ronghui, Zhang, Zihan, Dai, Suyang, Zhu, Shanfeng

arXiv.org Machine LearningMar-24-2019

Extreme multi-label text classification (XMTC) addresses the problem of tagging each text with the most relevant labels from an extreme-scale label set. Traditional methods use bag-of-words (BOW) representations without context information as their features. The state-ot-the-art deep learning-based method, AttentionXML, which uses a recurrent neural network (RNN) and the multi-label attention, can hardly deal with extreme-scale (hundreds of thousands labels) problem. To address this, we propose our HAXMLNet, which uses an efficient and effective hierarchical structure with the multi-label attention. Experimental results show that HAXMLNet reaches a competitive performance with other state-of-the-art methods.

classification, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1904.12578

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback