AITopics | Luan, Huanbo

Luan, Huanbo

FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance

Zhu, Fengbin, Li, Junfeng, Pan, Liangming, Wang, Wenjie, Feng, Fuli, Wang, Chao, Luan, Huanbo, Chua, Tat-Seng

arXiv.org Artificial Intelligence, Mar-7-2025

Financial decision-making often relies on in-depth analysis across various data sources, including financial tables, news articles, and stock prices. In this work, we introduce FinTMMBench, the first comprehensive benchmark for evaluating temporal-aware multi-modal Retrieval-Augmented Generation (RAG) systems in finance. Built from heterogeneous data of NASDAQ 100 companies, FinTMMBench offers three significant advantages. 1) Multi-modal Corpus: It encompasses a hybrid of financial tables, news articles, daily stock prices, and visual technical charts as the corpus. 2) Temporal-aware Questions: Each question requires the retrieval and interpretation of its relevant data over a specific time period, including daily, weekly, monthly, quarterly, and annual periods. 3) Diverse Financial Analysis Tasks: The questions involve 10 different tasks, including information extraction, trend analysis, sentiment analysis, and event detection. We further propose a novel TMMHybridRAG method, which first leverages LLMs to convert data from other modalities (e.g., tabular, visual, and time-series data) into textual format and then incorporates temporal information in each node when constructing graphs and dense indexes. Its effectiveness has been validated in extensive experiments, but notable gaps remain, highlighting the challenges presented by FinTMMBench.
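
The idea of attaching temporal information to each retrievable node can be illustrated with a minimal sketch. This is not the TMMHybridRAG implementation: the `Node` schema, the keyword-overlap score, and the "ACME" corpus are all hypothetical, made up for illustration. It only shows how a time-period filter might be combined with a relevance ranking.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Node:
    """A retrievable item already converted to text (hypothetical schema)."""
    text: str
    start: date  # first day the content covers
    end: date    # last day the content covers


def overlaps(node: Node, start: date, end: date) -> bool:
    """True when the node's period intersects the queried period."""
    return node.start <= end and start <= node.end


def keyword_score(text: str, query: str) -> int:
    """Toy relevance score: number of query words found in the text."""
    words = set(text.lower().split())
    return sum(1 for w in query.lower().split() if w in words)


def retrieve(nodes, query, start, end, k=2):
    """Keep nodes inside the period, then rank them by the toy score."""
    in_period = [n for n in nodes if overlaps(n, start, end)]
    ranked = sorted(in_period, key=lambda n: keyword_score(n.text, query), reverse=True)
    return ranked[:k]


# Made-up corpus: two nodes cover March 2024, one covers March 2023.
corpus = [
    Node("ACME closing price rose over the week", date(2024, 3, 4), date(2024, 3, 8)),
    Node("ACME quarterly revenue table summary", date(2024, 1, 1), date(2024, 3, 31)),
    Node("ACME closing price fell over the week", date(2023, 3, 6), date(2023, 3, 10)),
]

hits = retrieve(corpus, "ACME closing price", date(2024, 3, 1), date(2024, 3, 31))
for node in hits:
    print(node.start, node.text)
```

The 2023 node matches the query words as well as the 2024 weekly node does, but the period filter excludes it. A real system would replace the keyword score with dense or graph-based retrieval.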

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.05185

Country:

Asia (0.29)
North America > United States > New York (0.14)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(2 more...)

MMDocBench: Benchmarking Large Vision-Language Models for Fine-Grained Visual Document Understanding

Zhu, Fengbin, Liu, Ziyang, Ng, Xiang Yao, Wu, Haohui, Wang, Wenjie, Feng, Fuli, Wang, Chao, Luan, Huanbo, Chua, Tat-Seng

arXiv.org Artificial Intelligence, Oct-25-2024

Large Vision-Language Models (LVLMs) have achieved remarkable performance in many vision-language tasks, yet their capabilities in fine-grained visual understanding remain insufficiently evaluated. Existing benchmarks either contain limited fine-grained evaluation samples that are mixed with other data, or are confined to object-level assessments in natural images. To holistically assess LVLMs' fine-grained visual understanding capabilities, we propose using document images with multi-granularity and multi-modal information to supplement natural images. In this light, we construct MMDocBench, a benchmark with various OCR-free document understanding tasks for the evaluation of fine-grained visual perception and reasoning abilities. MMDocBench defines 15 main tasks with 4,338 QA pairs and 11,353 supporting regions, covering various document images such as research papers, receipts, financial reports, Wikipedia tables, charts, and infographics. Based on MMDocBench, we conduct extensive experiments using 13 open-source and 3 proprietary advanced LVLMs, assessing their strengths and weaknesses across different tasks and document image types. The benchmark, task instructions, and evaluation code will be made publicly available.

large language model, lvlm, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.21311

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Graph Random Neural Network for Semi-Supervised Learning on Graphs

Feng, Wenzheng, Zhang, Jie, Dong, Yuxiao, Han, Yu, Luan, Huanbo, Xu, Qian, Yang, Qiang, Kharlamov, Evgeny, Tang, Jie

arXiv.org Machine Learning, Oct-26-2020

We study the problem of semi-supervised learning on graphs, for which graph neural networks (GNNs) have been extensively explored. However, most existing GNNs inherently suffer from over-smoothing, non-robustness, and weak generalization when labeled nodes are scarce. In this paper, we propose a simple yet effective framework, Graph Random Neural Networks (GRAND), to address these issues. In GRAND, we first design a random propagation strategy to perform graph data augmentation. Then we leverage consistency regularization to optimize the prediction consistency of unlabeled nodes across different data augmentations. Extensive experiments on graph benchmark datasets suggest that GRAND significantly outperforms state-of-the-art GNN baselines on semi-supervised node classification. Finally, we show that GRAND mitigates the issues of over-smoothing and non-robustness, exhibiting better generalization behavior than existing GNNs. The source code of GRAND is publicly available at https://github.com/Grand20/grand.
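
The two ingredients named in the abstract, random propagation as augmentation and consistency regularization across augmentations, can be sketched in plain Python. The abstract does not spell out the details, so everything concrete here is an assumption made for illustration: dropping whole node feature rows, averaging over propagation orders, using a row-normalized adjacency matrix, and measuring consistency with a squared error on the propagated features themselves (no classifier is trained).

```python
import random


def drop_node(features, drop_rate, rng):
    """Zero out whole node feature rows at random and rescale the kept rows."""
    keep = 1.0 - drop_rate
    out = []
    for row in features:
        if rng.random() < drop_rate:
            out.append([0.0] * len(row))
        else:
            out.append([x / keep for x in row])
    return out


def propagate(adj_norm, features, order):
    """Average the features over propagation steps 0..order."""
    n, d = len(features), len(features[0])
    total = [row[:] for row in features]
    current = features
    for _ in range(order):
        current = [
            [sum(adj_norm[i][j] * current[j][c] for j in range(n)) for c in range(d)]
            for i in range(n)
        ]
        for i in range(n):
            for c in range(d):
                total[i][c] += current[i][c]
    return [[v / (order + 1) for v in row] for row in total]


def consistency_loss(views):
    """Mean squared distance of each view's output to the average over views."""
    s, n, c = len(views), len(views[0]), len(views[0][0])
    loss = 0.0
    for i in range(n):
        mean = [sum(v[i][k] for v in views) / s for k in range(c)]
        for v in views:
            loss += sum((v[i][k] - mean[k]) ** 2 for k in range(c))
    return loss / (s * n)


# Toy 3-node path graph with self-loops, row-normalized (an assumption).
adj_norm = [
    [0.5, 0.5, 0.0],
    [1 / 3, 1 / 3, 1 / 3],
    [0.0, 0.5, 0.5],
]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

rng = random.Random(0)
views = [propagate(adj_norm, drop_node(features, 0.5, rng), order=2) for _ in range(2)]
loss = consistency_loss(views)
print(loss)
```

In a full training loop this term would be added to the supervised loss on labeled nodes, with the views passed through a classifier first; the sketch only shows how the augmented views and the consistency term relate.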

deep learning, neural network, rand, (17 more...)

arXiv.org Machine Learning

2005.11079

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Bilingual Lexicon Induction from Non-Parallel Data with Minimal Supervision

Zhang, Meng (Tsinghua University) | Peng, Haoruo (University of Illinois, Urbana-Champaign) | Liu, Yang (Tsinghua University) | Luan, Huanbo (Tsinghua University) | Sun, Maosong (Tsinghua University)

AAAI Conferences, Feb-14-2017

Building bilingual lexica from non-parallel data is a long-standing natural language processing research problem that could benefit thousands of resource-scarce languages which lack parallel data. Recent advances in continuous word representations have opened up new possibilities for this task, e.g., by establishing a cross-lingual mapping between word embeddings via a seed lexicon. The method is, however, unreliable when only a limited number of seeds is available, which is a realistic setting for resource-scarce languages. We tackle this limitation by introducing a novel matching mechanism into bilingual word representation learning. It captures extra translation pairs exposed by the seeds to incrementally improve the bilingual word embeddings. In our experiments, we find that the matching mechanism substantially improves the quality of the bilingual vector space, which in turn allows us to induce better bilingual lexica with as few as 10 seeds.

lexicon, machine translation, text processing, (18 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.28)
North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.47)

Representation Learning of Knowledge Graphs with Entity Descriptions

Xie, Ruobing (Tsinghua University) | Liu, Zhiyuan (Tsinghua University) | Jia, Jia (Tsinghua University) | Luan, Huanbo (Tsinghua University) | Sun, Maosong (Tsinghua University)

AAAI Conferences, Apr-19-2016

Representation learning (RL) of knowledge graphs aims to project both entities and relations into a continuous low-dimensional space. Most methods concentrate on learning representations with knowledge triples indicating relations between entities. In fact, most knowledge graphs also contain concise descriptions of entities, which existing methods cannot utilize well. In this paper, we propose a novel RL method for knowledge graphs that takes advantage of entity descriptions. More specifically, we explore two encoders, continuous bag-of-words and deep convolutional neural models, to encode the semantics of entity descriptions. We further learn knowledge representations with both triples and descriptions. We evaluate our method on two tasks, knowledge graph completion and entity classification. Experimental results on real-world datasets show that our method outperforms other baselines on the two tasks, especially under the zero-shot setting, which indicates that our method is capable of building representations for novel entities according to their descriptions. The source code of this paper can be obtained from https://github.com/xrb92/DKRL.

artificial intelligence, neural network, representation, (17 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning to Appreciate the Aesthetic Effects of Clothing

AAAI Conferences, Apr-19-2016

How do people describe clothing? Words like "formal" or "casual" are usually used. However, recent works often focus on accurately recognizing or extracting visual features (e.g., sleeve length, color distribution, and clothing pattern) from clothing images. How can we bridge the gap between the visual features and the aesthetic words? In this paper, we formulate this task as a novel three-level framework: visual features (VF) - image-scale space (ISS) - aesthetic words space (AWS). Using the image-scale space from the art field as an intermediate layer, we first propose a Stacked Denoising Autoencoder Guided by Correlative Labels (SDAE-GCL) to map the visual features to the image-scale space; then, according to the semantic distances computed by WordNet::Similarity, we map the aesthetic words most often used in online clothing shops to the image-scale space as well. Using upper-body menswear images downloaded from several global online clothing shops as experimental data, the results indicate that the proposed three-level framework captures the subtle relationship between visual features and aesthetic words better than several baselines. To demonstrate that our three-level framework and its implementation methods are universally applicable, we finally present some interesting analyses of menswear fashion trends over the last 10 years.

image understanding, image-scale space, neural network, (22 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.29)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Building Earth Mover's Distance on Bilingual Word Embeddings for Machine Translation

AAAI Conferences, Apr-19-2016

Following their monolingual counterparts, bilingual word embeddings are also on the rise. As a major application task, word translation has been relying on the nearest neighbor to connect embeddings cross-lingually. However, the nearest neighbor strategy suffers from its inherently local nature and fails to cope with variations in realistic bilingual word embeddings. Furthermore, it lacks a mechanism to deal with many-to-many mappings that often show up across languages. We introduce Earth Mover's Distance to this task by providing a natural formulation that translates words in a holistic fashion, addressing the limitations of the nearest neighbor. We further extend the formulation to a new task of identifying parallel sentences, which is useful for statistical machine translation systems, thereby expanding the application realm of bilingual word embeddings. We show encouraging performance on both tasks.
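
The contrast between nearest-neighbor translation and a holistic Earth Mover's Distance formulation can be seen on a toy example. The sketch below is a simplification and not the authors' formulation: it assumes two equal-size word sets with uniform weights, in which case an optimal transport plan can be taken to be a one-to-one assignment and brute force over permutations is exact. The 2-D "embeddings" are made up, and many-to-many mappings with frequency-based weights are not covered.

```python
import math
from itertools import permutations


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbor(source, target):
    """Translate each source word independently to its closest target word."""
    return [
        min(range(len(target)), key=lambda j: euclidean(s, target[j]))
        for s in source
    ]


def emd_uniform(source, target):
    """EMD between equal-size point sets with uniform weights.

    With uniform, equal masses an optimal plan is a permutation, so trying
    every permutation is exact (only practical for tiny inputs).
    """
    n = len(source)
    assert n == len(target), "this sketch assumes equal-size sets"
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        cost = sum(euclidean(source[i], target[perm[i]]) for i in range(n)) / n
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# Made-up 2-D embeddings for two source words and two target words.
source = [(0.0, 0.0), (1.0, 0.0)]
target = [(0.4, 0.0), (3.0, 0.0)]

nn = nearest_neighbor(source, target)
cost, plan = emd_uniform(source, target)
print("nearest neighbor:", nn)
print("EMD plan:", plan, "cost:", cost)
```

Both source words pick the same target under the nearest-neighbor rule, leaving the second target untranslated, while the transport plan must account for all of the target mass and therefore uses both targets.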

artificial intelligence, machine translation, translation, (18 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.29)

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Moodee: An Intelligent Mobile Companion for Sensing Your Stress from Your Social Media Postings

AAAI Conferences, Apr-19-2016

In this demo, we build a practical mobile application, Moodee, to help detect and relieve users' psychological stress by leveraging their social media data in online social networks, and provide an interactive user interface that presents the psychological stress states of users and their friends in a visual and intuitive way. Given users' online social media data as input, Moodee intelligently and automatically detects users' stress states. Moreover, Moodee recommends different links to users to help relieve their stress. The main technology of this demo is a novel hybrid model, a factor graph model combined with a Deep Neural Network, which can leverage social media content and social interaction information for stress detection. We think that Moodee can be helpful to people's mental health, which is a vital problem in

attention deficit-hyperactivity disorder, deep learning, moodee, (20 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.17)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Discrete Image Hashing Using Large Weakly Annotated Photo Collections

Zhang, Hanwang (National University of Singapore) | Zhao, Na (National University of Singapore) | Shang, Xindi (National University of Singapore) | Luan, Huanbo (Tsinghua University) | Chua, Tat-seng (National University of Singapore)

AAAI Conferences, Apr-19-2016

We address the problem of image hashing by learning binary codes from large and weakly supervised photo collections. Due to the explosive growth of user-generated media on the Web, this problem is becoming critical for large-scale visual applications like image retrieval. While most existing hashing methods fail to address this challenge well, our method shows promising improvement due to the following two key advantages. First, we formulate a novel hashing objective that can effectively mine implicit weak supervision by collaborative filtering. Second, we propose a discrete hashing algorithm with efficient optimization to overcome the inferior optimization that results from obtaining binary codes from real-valued solutions. In this way, our method can be considered a weakly supervised discrete hashing framework that jointly learns image semantics and their corresponding binary codes. Through training on one million weakly annotated images, our experimental results demonstrate that image retrieval using the proposed hashing method outperforms other state-of-the-art methods on image and video benchmarks.
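
For readers unfamiliar with hashing-based retrieval, the sketch below shows the relax-then-round baseline that the abstract contrasts against, not the authors' discrete algorithm: real-valued vectors are thresholded by sign into binary codes, and retrieval ranks database items by Hamming distance to the query code. The vectors are made up, and the function names are hypothetical.

```python
def binarize(vector):
    """Threshold a real-valued vector by sign into a tuple of bits."""
    return tuple(1 if x >= 0 else 0 for x in vector)


def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from closest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Made-up real-valued embeddings for three database images and one query.
database = [
    [0.9, -0.2, 0.1, -0.7],
    [-0.5, 0.4, 0.3, -0.1],
    [0.8, -0.1, -0.3, -0.6],
]
query = [0.7, -0.3, 0.2, -0.9]

codes = [binarize(v) for v in database]
ranking = rank_by_hamming(binarize(query), codes)
print(codes)
print(ranking)
```

Rounding after a real-valued optimization discards information (the third image's small negative coordinate flips a bit here), which is the kind of loss a discrete optimization over the codes aims to avoid.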

optimization problem, supervision, text processing, (20 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
(2 more...)

Iterative Learning of Parallel Lexicons and Phrases from Non-Parallel Corpora

AAAI Conferences, Jul-15-2015

While parallel corpora are an indispensable resource for data-driven multilingual natural language processing tasks such as machine translation, they are limited in quantity, quality, and coverage. As a result, learning translation models from non-parallel corpora has become increasingly important, especially for low-resource languages. In this work, we propose a joint model for iteratively learning parallel lexicons and phrases from non-parallel corpora. The model is trained using a Viterbi EM algorithm that alternates between constructing parallel phrases using lexicons and updating lexicons based on the constructed parallel phrases. Experiments on Chinese-English datasets show that our approach learns better parallel lexicons and phrases and improves translation performance significantly.

artificial intelligence, english phrase, machine translation, (16 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: Asia > China (0.29)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
