AITopics | encoder-decoder framework

We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoderdecoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with attention mechanism on the encoder hidden states, and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework.

Add feedback

Adaptively Aligned Image Captioning via Adaptive Attention Time

Lun Huang, Wenmin Wang, Yaxian Xia, Jie Chen

Neural Information Processing SystemsFeb-15-2026, 10:09:11 GMT

AATallowstheframeworktolearn howmany attention steps to take to output a caption word at each decoding step. With AAT, an image region can be mapped to an arbitrary number of caption words while a caption word can also attend to an arbitrary number of image regions. AAT is deterministic and differentiable, and doesn't introduce any noise to the parameter gradients.

artificial intelligence, attention step, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

Add feedback

Deliberation Networks: Sequence Generation Beyond One-Pass Decoding

Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu

Neural Information Processing SystemsNov-21-2025, 12:51:09 GMT

A generated sequence is directly used as final output without further polishing. However, deliberation is a common behavior in human's daily life like reading

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Egypt (0.06)
Europe > Middle East (0.04)
Asia > Middle East (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

Decoding with Value Networks for Neural Machine Translation

Di He, Hanqing Lu, Yingce Xia, Tao Qin, Liwei Wang, Tie-Yan Liu

Neural Information Processing SystemsNov-21-2025, 06:13:20 GMT

Neural Information Processing Systems http://nips.cc/

machine learning, natural language, translation, (20 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Adaptively Aligned Image Captioning via Adaptive Attention Time

Lun Huang, Wenmin Wang, Yaxian Xia, Jie Chen

Neural Information Processing SystemsAug-20-2025, 11:27:34 GMT

Recent neural models for image captioning usually employ an encoder-decoder framework with an attention mechanism.

adaptive attention time, attention model, attention step, (11 more...)

Neural Information Processing Systems

Country:

Asia > Macao (0.04)
North America > Canada (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)

Add feedback

Romanized to Native Malayalam Script Transliteration Using an Encoder-Decoder Framework

Baiju, Bajiyo, Manohar, Kavya, Pillai, Leena G, Sherly, Elizabeth

arXiv.org Artificial IntelligenceDec-13-2024

In this work, we present the development of a reverse transliteration model to convert romanized Malayalam to native script using an encoder-decoder framework built with attention-based bidirectional Long Short Term Memory (Bi-LSTM) architecture. To train the model, we have used curated and combined collection of 4.3 million transliteration pairs derived from publicly available Indic language translitertion datasets, Dakshina and Aksharantar. We evaluated the model on two different test dataset provided by IndoNLP-2025-Shared-Task that contain, (1) General typing patterns and (2) Adhoc typing patterns, respectively. On the Test Set-1, we obtained a character error rate (CER) of 7.4%. However upon Test Set-2, with adhoc typing patterns, where most vowel indicators are missing, our model gave a CER of 22.7%.

artificial intelligence, machine learning, transliteration, (15 more...)

arXiv.org Artificial Intelligence

2412.09957

Country:

Asia > Singapore (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
Asia > India > Kerala > Thiruvananthapuram (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalizing Hyperedge Expansion for Hyper-relational Knowledge Graph Modeling

Liu, Yu, Yang, Shu, Ding, Jingtao, Yao, Quanming, Li, Yong

arXiv.org Artificial IntelligenceNov-9-2024

By representing knowledge in a primary triple associated with additional attribute-value qualifiers, hyper-relational knowledge graph (HKG) that generalizes triple-based knowledge graph (KG) has been attracting research attention recently. Compared with KG, HKG is enriched with the semantic qualifiers as well as the hyper-relational graph structure. However, to model HKG, existing studies mainly focus on either semantic information or structural information therein, which however fail to capture both simultaneously. To tackle this issue, in this paper, we generalize the hyperedge expansion in hypergraph learning and propose an equivalent transformation for HKG modeling, referred to as TransEQ. Specifically, the equivalent transformation transforms a HKG to a KG, which considers both semantic and structural characteristics. Then an encoder-decoder framework is developed to bridge the modeling research between KG and HKG. In the encoder part, KG-based graph neural networks are leveraged for structural modeling; while in the decoder part, various HKG-based scoring functions are exploited for semantic modeling. Especially, we design the sharing embedding mechanism in the encoder-decoder framework with semantic relatedness captured. We further theoretically prove that TransEQ preserves complete information in the equivalent transformation, and also achieves full expressivity. Finally, extensive experiments on three benchmarks demonstrate the superior performance of TransEQ in terms of both effectiveness and efficiency. On the largest benchmark WikiPeople, TransEQ significantly improves the state-of-the-art models by 15\% on MRR.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.06191

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Decoding with Value Networks for Neural Machine Translation

Di He, Hanqing Lu, Yingce Xia, Tao Qin, Liwei Wang, Tie-Yan Liu

Neural Information Processing SystemsOct-3-2024, 00:38:46 GMT

Neural Machine Translation (NMT) has become a popular technology in recent years, and beam search is its de facto decoding method due to the shrunk search space and reduced computational complexity. However, since it only searches for local optima at each time step through one-step forward looking, it usually cannot output the best target sentence.

machine translation, translation, value network, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Review Networks for Caption Generation

Neural Information Processing SystemsMar-12-2024, 14:58:52 GMT

We propose a novel extension of the encoder-decoder framework, called a review network. The review network is generic and can enhance any existing encoderdecoder model: in this paper, we consider RNN decoders with both CNN and RNN encoders. The review network performs a number of review steps with attention mechanism on the encoder hidden states, and outputs a thought vector after each review step; the thought vectors are used as the input of the attention mechanism in the decoder. We show that conventional encoder-decoders are a special case of our framework. Empirically, we show that our framework improves over state-ofthe-art encoder-decoder systems on the tasks of image captioning and source code captioning.

encoder, review network, vector, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Medical visual question answering using joint self-supervised learning

Zhou, Yuan, Mei, Jing, Yu, Yiqin, Syeda-Mahmood, Tanveer

arXiv.org Artificial IntelligenceFeb-25-2023

Visual Question Answering (VQA) becomes one of the most active research problems in the medical imaging domain. A well-known VQA challenge is the intrinsic diversity between the image and text modalities, and in the medical VQA task, there is another critical problem relying on the limited size of labelled image-question-answer data. In this study we propose an encoder-decoder framework that leverages the image-text joint representation learned from large-scaled medical image-caption data and adapted to the small-sized medical VQA task. The encoder embeds across the image-text dual modalities with self-attention mechanism and is independently pre-trained on the large-scaled medical image-caption dataset by multiple self-supervised learning tasks. Then the decoder is connected to the top of the encoder and fine-tuned using the small-sized medical VQA dataset. The experiment results present that our proposed method achieves better performance comparing with the baseline and SOTA methods.

machine learning, natural language, question answering, (19 more...)

arXiv.org Artificial Intelligence

2302.13069

Country: