AITopics | Zhang, Yue

Collaborating Authors

Zhang, Yue

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

Zhang, Yue, Zhang, Woyu, Wang, Shaocong, Lin, Ning, Yu, Yifei, He, Yangu, Wang, Bo, Jiang, Hao, Lin, Peng, Xu, Xiaoxin, Qi, Xiaojuan, Wang, Zhongrui, Zhang, Xumeng, Shang, Dashan, Liu, Qi, Cheng, Kwang-Ting, Liu, Ming

arXiv.org Artificial IntelligenceJul-12-2024

The brain is dynamic, associative and efficient. It reconfigures by associating the inputs with past experiences, with fused memory and processing. In contrast, AI models are static, unable to associate inputs with past experiences, and run on digital computers with physically separated memory and processing. We propose a hardware-software co-design, a semantic memory-based dynamic neural network (DNN) using memristor. The network associates incoming data with the past experience stored as semantic vectors. The network and the semantic memory are physically implemented on noise-robust ternary memristor-based Computing-In-Memory (CIM) and Content-Addressable Memory (CAM) circuits, respectively. We validate our co-designs, using a 40nm memristor macro, on ResNet and PointNet++ for classifying images and 3D points from the MNIST and ModelNet datasets, which not only achieves accuracy on par with software but also a 48.1% and 15.9% reduction in computational budget. Moreover, it delivers a 77.6% and 93.3% reduction in energy consumption.

artificial intelligence, machine learning, memristor, (17 more...)

arXiv.org Artificial Intelligence

2407.0899

Country: Asia > China (0.48)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

Add feedback

$R^3$: "This is My SQL, Are You With Me?" A Consensus-Based Multi-Agent System for Text-to-SQL Tasks

Xia, Hanchen, Jiang, Feng, Deng, Naihao, Wang, Cunxiang, Zhao, Guojiang, Mihalcea, Rada, Zhang, Yue

arXiv.org Artificial IntelligenceJul-10-2024

Large Language Models (LLMs) have demonstrated strong performance on various tasks. To unleash their power on the Text-to-SQL task, we propose $R^3$ (Review-Rebuttal-Revision), a consensus-based multi-agent system for Text-to-SQL tasks. $R^3$ outperforms the existing single LLM Text-to-SQL systems as well as the multi-agent Text-to-SQL systems by $1.3\%$ to $8.1\%$ on Spider and Bird. Surprisingly, we find that for Llama-3-8B, $R^3$ outperforms chain-of-thought prompting by over 20\%, even outperforming GPT-3.5 on the development set of Spider.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.14851

Country:

Asia (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models

Zhang, Yue, Ma, Ziqiao, Li, Jialu, Qiao, Yanyuan, Wang, Zun, Chai, Joyce, Wu, Qi, Bansal, Mohit, Kordjamshidi, Parisa

arXiv.org Artificial IntelligenceJul-9-2024

Vision-and-Language Navigation (VLN) has gained increasing attention over recent years and many approaches have emerged to advance their development. The remarkable achievements of foundation models have shaped the challenges and proposed methods for VLN research. In this survey, we provide a top-down review that adopts a principled framework for embodied planning and reasoning, and emphasizes the current methods and future opportunities leveraging foundation models to address VLN challenges. We hope our in-depth discussions could provide valuable resources and insights: on one hand, to milestone the progress and explore opportunities and potential roles for foundation models in this field, and on the other, to organize different challenges and solutions in VLN to foundation model researchers.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.07035

Country:

North America > United States (0.46)
Asia > China (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Add feedback

Evaluating LLMs' Inherent Multi-hop Reasoning Ability

Wu, Jian, Yang, Linyi, Wang, Zhen, Okumura, Manabu, Zhang, Yue

arXiv.org Artificial IntelligenceJul-5-2024

While Large Language Models (LLMs) excel in question-answering (QA) tasks, their multi-step reasoning abilities on multiple evidence integration on Multi-hop QA tasks remain underexplored. LLMs sometimes generate answers that rely on internal memory rather than reasoning given context, which brings concerns about the evaluation quality of real reasoning abilities. The counterfactual QA task can separate internal memory from reasoning abilities, but focusing solely on final-QA performance without evaluating the multi-step reasoning process is insufficient for reporting LLMs' real reasoning abilities. Current Multi-hop QA (MHQA) benchmarks are factual and annotated on open-source corpora such as Wikipedia, although useful for multi-step reasoning evaluation, showing limitations due to potential data contamination in LLMs pre-training stage. To address this issue, we introduce the Inherent Reasoning Evaluation (IRE) method, a novel evaluation way that jointly evaluates the LLMs' chain-of-reasoning performance based on the first knowledge-edited counterfactual multi-hop QA data which involves editing the original Wikipedia passages, reducing data contamination risks. The IRE comprehensively assesses reasoning chains through sub-QA and final-QA evaluations. Our comparisons reveal significant performance gaps for several LLMs between Wikipedia-based benchmarks and IRE, deeming data contamination issues in existing benchmarks. We believe that the IRE benchmark will enhance and facilitate trustworthy LLM evaluations.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2402.11924

Country:

Asia > India (1.00)
North America (0.68)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Leisure & Entertainment > Sports (0.93)
Textiles, Apparel & Luxury Goods (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels

Yan, Jianhao, Yan, Pingchuan, Chen, Yulong, Li, Judy, Zhu, Xianchao, Zhang, Yue

arXiv.org Artificial IntelligenceJul-4-2024

This study comprehensively evaluates the translation quality of Large Language Models (LLMs), specifically GPT-4, against human translators of varying expertise levels across multiple language pairs and domains. Through carefully designed annotation rounds, we find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators. We also observe the imbalanced performance across different languages and domains, with GPT-4's translation capability gradually weakening from resource-rich to resource-poor directions. In addition, we qualitatively study the translation given by GPT-4 and human translators, and find that GPT-4 translator suffers from literal translations, but human translators sometimes overthink the background information. To our knowledge, this study is the first to evaluate LLMs against human translators and analyze the systematic differences between their outputs, providing valuable insights into the current state of LLM-based translation and its potential limitations.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.03658

Country:

Europe > Portugal (0.14)
Europe > Bulgaria (0.14)
Europe > Belgium (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models

Kurisinkel, Litton Jose, Mishra, Pruthwik, Zhang, Yue

arXiv.org Artificial IntelligenceJul-4-2024

Time series models, typically trained on numerical data, are designed to forecast future values. These models often rely on weighted averaging techniques over time intervals. However, real-world time series data is seldom isolated and is frequently influenced by non-numeric factors. For instance, stock price fluctuations are impacted by daily random events in the broader world, with each event exerting a unique influence on price signals. Previously, forecasts in financial markets have been approached in two main ways: either as time-series problems over price sequence or sentiment analysis tasks. The sentiment analysis tasks aim to determine whether news events will have a positive or negative impact on stock prices, often categorizing them into discrete labels. Recognizing the need for a more comprehensive approach to accurately model time series prediction, we propose a collaborative modeling framework that incorporates textual information about relevant events for predictions. Specifically, we leverage the intuition of large language models about future changes to update real number time series predictions. We evaluated the effectiveness of our approach on financial market data.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2407.03689

Country:

North America > United States > Ohio (0.28)
North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.14)

Genre:

Financial News (1.00)
Press Release (0.93)
Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation

Yin, Yongjing, Zeng, Jiali, Li, Yafu, Meng, Fandong, Zhang, Yue

arXiv.org Artificial IntelligenceJul-2-2024

The fine-tuning of open-source large language models (LLMs) for machine translation has recently received considerable attention, marking a shift towards data-centric research from traditional neural machine translation. However, the area of data collection for instruction fine-tuning in machine translation remains relatively underexplored. In this paper, we present LexMatcher, a simple yet effective method for data curation, the design of which is driven by the coverage of senses found in bilingual dictionaries. The construction process comprises data retrieval from an existing corpus and data augmentation that supplements the infrequent senses of polysemous words. Utilizing LLaMA2 as our base model, our approach outperforms the established baselines on the WMT2022 test sets and also exhibits remarkable performance in tasks related to word sense disambiguation and specialized terminology translation. These results underscore the effectiveness of LexMatcher in enhancing LLM-based machine translation. The code, data, and models are available at https://github.com/ARIES-LM/Lexmatcher-MT.git.

large language model, machine learning, translation, (19 more...)

arXiv.org Artificial Intelligence

2406.01441

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees

Wang, Zhiyuan, Duan, Jinhao, Cheng, Lu, Zhang, Yue, Wang, Qingni, Shen, Hengtao, Zhu, Xiaofeng, Shi, Xiaoshuang, Xu, Kaidi

arXiv.org Artificial IntelligenceJun-29-2024

Uncertainty quantification (UQ) in natural language generation (NLG) tasks remains an open challenge, exacerbated by the intricate nature of the recent large language models (LLMs). This study investigates adapting conformal prediction (CP), which can convert any heuristic measure of uncertainty into rigorous theoretical guarantees by constructing prediction sets, for black-box LLMs in open-ended NLG tasks. We propose a sampling-based uncertainty measure leveraging self-consistency and develop a conformal uncertainty criterion by integrating the uncertainty condition aligned with correctness into the design of the CP algorithm. Experimental results indicate that our uncertainty measure generally surpasses prior state-of-the-art methods. Furthermore, we calibrate the prediction sets within the model's unfixed answer distribution and achieve strict control over the correctness coverage rate across 6 LLMs on 4 free-form NLG datasets, spanning general-purpose and medical domains, while the small average set size further highlights the efficiency of our method in providing trustworthy guarantees for practical open-ended NLG applications.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.00499

Country: North America > United States > Illinois (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Nash CoT: Multi-Path Inference with Preference Equilibrium

Zhang, Ziqi, Wang, Cunxiang, Xiao, Xiong, Zhang, Yue, Wang, Donglin

arXiv.org Artificial IntelligenceJun-18-2024

Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing out as a concise yet competitive approach. While self-consistency has indeed led to the improvements in LLM inference, the use of multi-path inference also escalates deployment costs. Therefore, maintaining the performance benefits of self-consistency inherited from multi-path inference while reducing the inference costs holds significant value. In this research, we conceptualize language decoding as a preference consensus game, constructing a bi-player gaming system within each local path, and introduce Nash Chain-of-Thought (Nash CoT). Specifically, for a given question, we leverage LLM to autonomously select the contextually relevant template and generate outputs guided by this template, aiming to reach Nash Equilibrium alongside normal generation in each path. This approach allows us to achieve comparable or improved performance compared to self-consistency while using fewer inference paths on various inference tasks, including Arabic reasoning, Commonsense Question answering, and Symbolic inference.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.07099

Country: Asia > Middle East > Qatar (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

AutoSurvey: Large Language Models Can Automatically Write Surveys

Wang, Yidong, Guo, Qi, Yao, Wenjin, Zhang, Hongbo, Zhang, Xin, Wu, Zhen, Zhang, Meishan, Dai, Xinyu, Zhang, Min, Wen, Qingsong, Ye, Wei, Zhang, Shikun, Zhang, Yue

arXiv.org Artificial IntelligenceJun-17-2024

This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence. Traditional survey paper creation faces challenges due to the vast volume and complexity of information, prompting the need for efficient survey methods. While large language models (LLMs) offer promise in automating this process, challenges such as context window limitations, parametric knowledge constraints, and the lack of evaluation benchmarks remain. AutoSurvey addresses these challenges through a systematic approach that involves initial retrieval and outline generation, subsection drafting by specialized LLMs, integration and refinement, and rigorous evaluation and iteration. Our contributions include a comprehensive solution to the survey problem, a reliable evaluation method, and experimental validation demonstrating AutoSurvey's effectiveness.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2406.10252

Country:

Asia > China (0.46)
North America > United States > Louisiana (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (0.68)
Education (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback