AITopics

2411.08599

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

arXiv.org Artificial Intelligence, Jun-12-2024

FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning

Li, Xiao, Zhu, Bolin, Liu, Sichen, Zhu, Yin, Liu, Yiwei, Cheng, Gong

The application of formulas is a fundamental ability of humans when addressing numerical reasoning problems. However, existing numerical reasoning datasets seldom explicitly indicate the formulas employed during the reasoning steps. To bridge this gap, we construct a dataset for formula-based numerical reasoning called FormulaReasoning, which consists of 5,420 reasoning-based questions. We use it to evaluate LLMs ranging in size from 7B to over 100B parameters with zero-shot and few-shot chain-of-thought methods, and we further explore retrieval-augmented LLMs provided with an external formula database associated with our dataset. We also experiment with supervised methods in which we divide the reasoning process into formula generation, parameter extraction, and numerical calculation, and perform data augmentation. Our empirical findings underscore the significant potential for improvement in existing models when applied to our complex, formula-driven FormulaReasoning.
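The three-stage split named in the abstract (formula generation, parameter extraction, numerical calculation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `FORMULAS` table, the two toy formulas, and the function names are hypothetical and are not taken from the dataset or the authors' models, which use learned components rather than a lookup table.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical formula table: target quantity -> (argument names, function).
# These two formulas are illustrative only and are not taken from the dataset.
FORMULAS: Dict[str, Tuple[List[str], Callable[..., float]]] = {
    "distance": (["speed", "time"], lambda speed, time: speed * time),
    "work": (["force", "distance"], lambda force, distance: force * distance),
}


def generate_formulas(target: str, known: Dict[str, float]) -> List[str]:
    """Stage 1: order the formulas needed to reach `target` from known values."""
    if target in known:
        return []
    arg_names, _ = FORMULAS[target]
    plan: List[str] = []
    for name in arg_names:
        for step in generate_formulas(name, known):
            if step not in plan:
                plan.append(step)
    plan.append(target)
    return plan


def extract_parameters(step: str, values: Dict[str, float]) -> List[float]:
    """Stage 2: look up the value bound to each argument of one formula."""
    arg_names, _ = FORMULAS[step]
    return [values[name] for name in arg_names]


def calculate(target: str, known: Dict[str, float]) -> float:
    """Stage 3: evaluate the planned formulas in order."""
    values = dict(known)
    for step in generate_formulas(target, known):
        _, fn = FORMULAS[step]
        values[step] = fn(*extract_parameters(step, values))
    return values[target]


# Toy question: force and speed/time are given, work is asked for.
result = calculate("work", {"force": 10.0, "speed": 2.0, "time": 3.0})
```

Keeping the stages separate means an error can be traced to a wrong formula, a wrong parameter binding, or a wrong calculation.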

formula, large language model, machine learning

2402.12692

Country:

Asia (0.93)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.94)
Education (0.69)
Leisure & Entertainment > Sports (0.46)
Materials > Metals & Mining (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

arXiv.org Artificial Intelligence, Sep-22-2023

Furthest Reasoning with Plan Assessment: Stable Reasoning Path with Retrieval-Augmented Large Language Models

Zhu, Yin, Luo, Zhiling, Cheng, Gong

Large Language Models (LLMs), acting as powerful reasoners and generators, exhibit extraordinary performance across various natural language tasks, such as question answering (QA). Among these tasks, Multi-Hop Question Answering (MHQA) is a widely discussed category that requires seamless integration between LLMs and the retrieval of external knowledge. Existing methods employ an LLM to generate reasoning paths and plans and use an Information Retriever (IR) to iteratively retrieve related knowledge, but these approaches have inherent flaws. On the one hand, the IR is hindered by the low quality of the queries generated by the LLM. On the other hand, the LLM is easily misled by irrelevant knowledge returned by the IR. These inaccuracies accumulate through the iterative interaction between the IR and the LLM and severely degrade the final effectiveness. To overcome these barriers, we propose a novel pipeline for MHQA called Furthest-Reasoning-with-Plan-Assessment (FuRePA), which comprises an improved framework (Furthest Reasoning) and an attached module (Plan Assessor). 1) Furthest Reasoning masks the previous reasoning path and generated queries from the LLM, encouraging it to generate a chain of thought from scratch in each iteration. This enables the LLM to break free of previous misleading thoughts and queries (if any). 2) The Plan Assessor is a trained evaluator that selects an appropriate plan from a group of candidate plans proposed by the LLM. Our methods are evaluated on three highly recognized public multi-hop question answering datasets and outperform the state of the art on most metrics (achieving a 10%-12% improvement in answer accuracy).
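The two ideas in the abstract, masking earlier reasoning from the prompt and choosing among candidate plans with an assessor, can be sketched as a small loop. This is only an illustration under stated assumptions: `toy_propose`, `toy_retrieve`, and `toy_score` are hypothetical stand-ins for the LLM, the retriever, and the trained Plan Assessor, and the `ANSWER:` stopping convention is invented here because the abstract does not describe how iteration ends.

```python
from typing import Callable, List, Optional


def build_prompt(question: str, evidence: List[str]) -> str:
    """Build the prompt for one iteration.

    Only the question and the evidence gathered so far are included.
    Earlier chains of thought and earlier queries are deliberately left out,
    so the model reasons from scratch each time.
    """
    lines = ["Question: " + question, "Evidence:"]
    lines.extend("- " + passage for passage in evidence)
    lines.append("Reason step by step, then propose the next plan.")
    return "\n".join(lines)


def select_plan(candidates: List[str], score: Callable[[str], float]) -> str:
    """Pick the highest-scoring candidate (stand-in for a trained assessor)."""
    return max(candidates, key=score)


def run(
    question: str,
    propose: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    score: Callable[[str], float],
    max_iters: int = 3,
) -> Optional[str]:
    evidence: List[str] = []
    for _ in range(max_iters):
        plan = select_plan(propose(build_prompt(question, evidence)), score)
        # Assumed convention: a plan starting with "ANSWER:" ends the loop.
        if plan.startswith("ANSWER:"):
            return plan[len("ANSWER:"):].strip()
        for passage in retrieve(plan):
            if passage not in evidence:
                evidence.append(passage)
    return None


# Hypothetical stubs so the sketch runs without any model or index.
def toy_propose(prompt: str) -> List[str]:
    if "Paris" in prompt:
        return ["ANSWER: Paris", "capital of Germany"]
    return ["capital of France", "weather today"]


def toy_retrieve(query: str) -> List[str]:
    return ["The capital of France is Paris."] if "France" in query else []


def toy_score(plan: str) -> float:
    return 1.0 if ("France" in plan or "Paris" in plan) else 0.0


answer = run("What is the capital of France?", toy_propose, toy_retrieve, toy_score)
```

Only `evidence` carries state between iterations, so a poor query in one round cannot reappear in the next prompt.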

arxiv preprint arxiv, large language model, machine learning

2309.12767

Country: North America > United States (1.00)

Genre:

Research Report (0.64)
Personal > Honors (0.46)

Industry:

Media (1.00)
Leisure & Entertainment > Sports (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Apr-13-2023

From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm

Chen, Jie, Li, Zilong, Zhu, Yin, Zhang, Junping, Pu, Jian

Existing Graph Neural Networks (GNNs) follow the message-passing mechanism, which conducts information interaction among nodes iteratively. While considerable progress has been made, such node interaction paradigms still have the following limitations. First, the scalability limitation precludes the broad application of GNNs in large-scale industrial settings, since node interaction among rapidly expanding neighbors incurs high computation and memory costs. Second, the over-smoothing problem restricts the discrimination ability of nodes, i.e., node representations of different classes become indistinguishable after repeated node interactions. In this work, we propose a novel hop interaction paradigm to address these limitations simultaneously. The core idea is to convert the interaction target among nodes to pre-processed multi-hop features inside each node. We design a simple yet effective HopGNN framework that can easily utilize existing GNNs to achieve hop interaction. Furthermore, we propose a multi-task learning strategy with a self-supervised learning objective to enhance HopGNN. We conduct extensive experiments on 12 benchmark datasets covering a wide range of domains, scales, and graph smoothness. Experimental results show that our methods achieve superior performance while maintaining high scalability and efficiency. The code is at https://github.com/JC-202/HopGNN.
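The idea of pre-processing multi-hop features so that each node carries its own stack of hop representations can be sketched in plain Python. The mean aggregation with a self-loop used here is an assumption chosen for simplicity (the abstract does not state the propagation rule), and the function names are hypothetical; the linked repository is the authoritative implementation.

```python
from typing import Dict, List

Matrix = List[List[float]]


def mean_aggregate(adj: Dict[int, List[int]], feats: Matrix) -> Matrix:
    """One propagation step: average each node's features with its neighbors'."""
    out: Matrix = []
    for node in range(len(feats)):
        group = [node] + adj[node]  # self-loop plus neighbors
        dim = len(feats[node])
        out.append([sum(feats[n][d] for n in group) / len(group) for d in range(dim)])
    return out


def precompute_hop_features(
    adj: Dict[int, List[int]], feats: Matrix, num_hops: int
) -> List[Matrix]:
    """Return, for every node, its feature vectors for hops 0..num_hops."""
    hops = [feats]
    for _ in range(num_hops):
        hops.append(mean_aggregate(adj, hops[-1]))
    # Regroup from hop-major to node-major layout.
    return [[hops[k][node] for k in range(num_hops + 1)] for node in range(len(feats))]


# Path graph 0 - 1 - 2 with a one-dimensional feature on node 0 only.
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = [[1.0], [0.0], [0.0]]
hop_feats = precompute_hop_features(adj, feats, num_hops=2)
```

After this one-off pass, a model can treat `hop_feats[node]` as a short sequence and learn interactions among hops within a single node, so training batches no longer need to expand graph neighborhoods.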

artificial intelligence, interaction, machine learning

2211.11761

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)

arXiv.org Artificial Intelligence, Nov-22-2022

DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data

Li, Xiao, Zhu, Yin, Liu, Sichen, Ju, Jiangzhou, Qu, Yuzhong, Cheng, Gong

Numerical reasoning over hybrid data containing tables and long texts has recently received research attention from the AI community. To generate an executable reasoning program consisting of math and table operations to answer a question, state-of-the-art methods use a retriever-generator pipeline. However, their retrieval results are static, while different generation steps may rely on different sentences. To attend to the retrieved information that is relevant to each generation step, in this paper, we propose DyRRen, an extended retriever-reranker-generator framework where each generation step is enhanced by a dynamic reranking of retrieved sentences. It outperforms existing baselines on the FinQA dataset.

artificial intelligence, machine learning, natural language

2211.12668

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Nov-4-2022

Binaural Rendering of Ambisonic Signals by Neural Networks

Zhu, Yin, Kong, Qiuqiang, Shi, Junjie, Liu, Shilei, Ye, Xuzhou, Wang, Ju-chiang, Zhang, Junping

Binaural rendering of ambisonic signals is of broad interest to virtual reality and immersive media. Conventional methods often require manually measured Head-Related Transfer Functions (HRTFs). To address this issue, we collect a paired ambisonic-binaural dataset and propose an end-to-end deep learning framework. Experimental results show that neural networks outperform the conventional method in objective metrics and achieve comparable subjective metrics. To validate the proposed framework, we experimentally explore different settings of the input features, model structures, output features, and loss functions. Our proposed system achieves an SDR of 7.32 and MOSs of 3.83, 3.58, 3.87, and 3.58 in the quality, timbre, localization, and immersion dimensions, respectively.

artificial intelligence, machine learning, rendering

2211.02301

Genre: Research Report (0.84)

Industry: Health & Medicine (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

AAAI Conferences, Apr-19-2016

Instilling Social to Physical: Co-Regularized Heterogeneous Transfer Learning

Wei, Ying (Hong Kong University of Science and Technology) | Zhu, Yin (Hong Kong University of Science and Technology) | Leung, Cane Wing-ki (Wisers Research) | Song, Yangqiu (West Virginia University) | Yang, Qiang (Hong Kong University of Science and Technology)

Ubiquitous computing tasks, such as human activity recognition (HAR), are enabling a wide spectrum of applications, ranging from healthcare to environment monitoring. The success of a ubiquitous computing task relies on sufficient physical sensor data with ground-truth labels, which are always scarce due to the expensive annotation process. Meanwhile, social media platforms provide a wealth of social and semantic context information. People frequently share what they are doing and where they are in the messages they post. This rich set of socially shared activities motivates us to transfer knowledge from social media to address the sparsity of labelled physical sensor data. To transfer the knowledge of social and semantic context, we propose a Co-Regularized Heterogeneous Transfer Learning (CoHTL) model, which builds a common semantic space derived from two heterogeneous domains. Our proposed method outperforms state-of-the-art methods on two ubiquitous computing tasks, namely human activity recognition and region function discovery.

sensor record, social media, text processing

Thirtieth AAAI Conference on Artificial Intelligence

Country: North America > United States > West Virginia (0.14)

Genre: Research Report (0.88)

Industry:

Health & Medicine (0.66)
Information Technology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.91)

AAAI Conferences, Jul-14-2014

Source Free Transfer Learning for Text Classification

Lu, Zhongqi (Hong Kong University of Science and Technology) | Zhu, Yin (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xiang, Evan Wei (Baidu Inc.) | Wang, Yujing (Microsoft Research Asia, Beijing) | Yang, Qiang (Hong Kong University of Science and Technology)

Transfer learning uses relevant auxiliary data to help the learning task in a target domain where labeled data is usually insufficient to train an accurate model. Given appropriate auxiliary data, researchers have proposed many transfer learning models. How to find such auxiliary data, however, has received little research attention so far. In this paper, we focus on the problem of auxiliary data retrieval and propose a transfer learning framework that effectively selects helpful auxiliary data from an open knowledge space (e.g., the World Wide Web). Because there is no need to manually select auxiliary data for different target domain tasks, we call our framework Source Free Transfer Learning (SFTL). For each target domain task, the SFTL framework iteratively queries for helpful auxiliary data based on the learned model and then updates the model using the retrieved auxiliary data. We highlight the automatic construction of queries and the robustness of the SFTL framework. Our experiments on the 20NewsGroup dataset and a Google search snippets dataset suggest that the framework is capable of achieving performance comparable to that of state-of-the-art methods with dedicated selections of auxiliary data.

artificial intelligence, auxiliary data, machine learning

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.29)
North America > United States > Wisconsin (0.14)

Genre: Research Report (0.67)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

AAAI Conferences, Jul-21-2012

Discovering Spammers in Social Networks

Zhu, Yin (Hong Kong University of Science and Technology (HKUST)) | Wang, Xiao (Renren Inc.) | Zhong, Erheng (Hong Kong University of Science and Technology (HKUST)) | Liu, Nathan N. (Hong Kong University of Science and Technology (HKUST)) | Li, He (Renren Inc.) | Yang, Qiang (Hong Kong University of Science and Technology (HKUST))

As the popularity of social media increases, as evidenced by Twitter, Facebook and China's Renren, spamming activities have also grown in number and variety. On social network sites, spammers often disguise themselves by creating fake accounts and hijacking normal users' accounts for personal gain. Unlike spammers in traditional systems such as SMS and email, spammers in social media behave like normal users and continually change their spamming strategies to fool anti-spamming systems. However, due to privacy and resource concerns, many social media websites cannot fully monitor all user content, making many previous approaches, such as topology-based and content-classification-based methods, infeasible. In this paper, we propose a novel method for spammer detection in social networks that exploits both social activities and users' social relations in an innovative and highly scalable manner. The proposed method detects spammers following collective activities based on users' social actions and relations. We have empirically tested our method on data from Renren.com, which is the largest social network in China, and demonstrated that our new method can improve the detection performance significantly.

optimization problem, spammer, survey article

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.55)

Genre: Research Report (0.34)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

AAAI Conferences, Aug-4-2011

Heterogeneous Transfer Learning for Image Classification

Zhu, Yin (Hong Kong University of Science and Technology) | Chen, Yuqiang (Shanghai Jiao Tong University) | Lu, Zhongqi (Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xue, Gui-Rong (Shanghai Jiao Tong University) | Yu, Yong (Shanghai Jiao Tong University) | Yang, Qiang (Hong Kong University of Science and Technology)

Transfer learning as a new machine learning paradigm has gained increasing attention lately. In situations where the training data in a target domain are not sufficient to learn predictive models effectively, transfer learning leverages auxiliary source data from other related source domains for learning. While most of the existing works in this area only focused on using the source data with the same structure as the target data, in this paper, we push this boundary further by proposing a heterogeneous transfer learning framework for knowledge transfer between text and images. We observe that for a target-domain classification problem, some annotated images can be found on many social Web sites, which can serve as a bridge to transfer knowledge from the abundant text documents available over the Web. A key question is how to effectively transfer the knowledge in the source data even though the text can be arbitrarily found. Our solution is to enrich the representation of the target images with semantic concepts extracted from the auxiliary source data through a novel matrix factorization method. By using the latent semantic features generated by the auxiliary data, we are able to build a better integrated image classifier. We empirically demonstrate the effectiveness of our algorithm on the Caltech-256 image dataset.

artificial intelligence, image classification, machine learning

Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.14)
North America > United States > Wisconsin (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)