Ding, Nai
Active Use of Latent Constituency Representation in both Humans and Large Language Models
Liu, Wei, Xiang, Ming, Ding, Nai
Understanding how sentences are internally represented in the human brain, as well as in large language models (LLMs) such as ChatGPT, is a major challenge for cognitive science. Classic linguistic theories propose that the brain represents a sentence by parsing it into hierarchically organized constituents. In contrast, LLMs do not explicitly parse linguistic constituents, and their latent representations remain poorly characterized. Here, we demonstrate that humans and LLMs construct similar latent representations of hierarchical linguistic constituents by analyzing their behavior during a novel one-shot learning task, in which they infer which words should be deleted from a sentence. Both humans and LLMs tend to delete a complete constituent rather than a nonconstituent word string. In contrast, a naive sequence-processing model that has access to word properties and ordinal positions does not show this preference. Based on the word-deletion behavior, we can reconstruct the latent constituency-tree representation of a sentence for both humans and LLMs. These results demonstrate that a latent tree-structured constituency representation can emerge in both the human brain and LLMs.
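A minimal sketch of the core logic behind the deletion analysis: given a bracketed parse of a sentence, check whether a deleted word span coincides with a constituent. The toy parse, span indices, and function names below are illustrative assumptions, not the paper's actual stimuli or analysis pipeline.

```python
# Sketch: test whether a deleted word span is a constituent of a toy
# bracketed parse. The parse and spans are illustrative only.

def constituent_spans(tree, start=0):
    """Return the set of (start, end) word spans covered by each subtree.
    `tree` is a nested list of words, e.g. [["the", "cat"], ["sat"]]."""
    spans = set()
    pos = start
    for child in tree:
        if isinstance(child, list):
            child_spans, pos = constituent_spans(child, pos)
            spans |= child_spans
        else:
            spans.add((pos, pos + 1))  # a single word is a trivial constituent
            pos += 1
    spans.add((start, pos))
    return spans, pos

# Toy parse of "the cat chased the dog": ((the cat) (chased (the dog)))
parse = [["the", "cat"], ["chased", ["the", "dog"]]]
spans, n_words = constituent_spans(parse)

def is_constituent(deletion_span):
    return deletion_span in spans

print(is_constituent((3, 5)))  # "the dog"    -> True, a constituent
print(is_constituent((2, 4)))  # "chased the" -> False, crosses a boundary
```

Tallying `is_constituent` over many observed deletions is the kind of statistic that distinguishes constituency-sensitive behavior from a position-based baseline.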
Probing the Creativity of Large Language Models: Can models produce divergent semantic association?
Chen, Honghua, Ding, Nai
Large language models possess a remarkable capacity for processing language, but it remains unclear whether they can also generate creative content. The present study investigates the creative thinking of large language models from a cognitive perspective. We use the divergent association task (DAT), an objective measure of creativity that asks models to generate unrelated words and computes the semantic distance between them. We compare results across different models and decoding strategies. Our findings indicate that: (1) when using the greedy search strategy, GPT-4 outperforms 96% of humans, while GPT-3.5-turbo exceeds the average human level; (2) stochastic sampling and temperature scaling are effective at obtaining higher DAT scores for all models except GPT-4, but they face a trade-off between creativity and stability. These results imply that advanced large language models have divergent semantic associations, a fundamental process underlying creativity.
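A minimal sketch of the DAT scoring rule: the mean pairwise cosine distance between embeddings of the generated words, scaled by 100, following the published metric. The `embed` lookup below is a random-vector stand-in; the original task scores against GloVe vectors.

```python
# Sketch of DAT scoring: average pairwise cosine distance x 100 over a
# word list. The embedding table here is a placeholder, not GloVe.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
_vocab = {}  # hypothetical embedding table for illustration

def embed(word):
    if word not in _vocab:
        _vocab[word] = rng.normal(size=300)  # placeholder 300-d vector
    return _vocab[word]

def dat_score(words):
    """Average pairwise cosine distance x 100 over the word list."""
    dists = []
    for w1, w2 in combinations(words, 2):
        v1, v2 = embed(w1), embed(w2)
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        dists.append(1.0 - cos)
    return 100.0 * float(np.mean(dists))

print(dat_score(["arm", "eyes", "feet", "hand", "head", "leg", "body"]))
```

Higher scores indicate more semantically dispersed word sets, i.e., more divergent associations.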
CTP: A Causal Interpretable Model for Non-Communicable Disease Progression Prediction
Sun, Zhoujian, Zhang, Wenzhuo, Huang, Zhengxing, Ding, Nai, Luo, Cheng
Non-communicable diseases are the leading cause of death worldwide, emphasizing the need for accurate prediction of disease progression and informed clinical decision-making. Machine learning (ML) models have shown promise in this domain by capturing non-linear patterns within patient features. However, existing ML-based models cannot provide causally interpretable predictions or estimate treatment effects, limiting their value for clinical decision-making. In this study, we propose a novel model, causal trajectory prediction (CTP), to address these limitations. CTP combines trajectory prediction and causal discovery to enable accurate prediction of disease progression trajectories and to uncover causal relationships between features. By incorporating a causal graph into the prediction process, CTP ensures that ancestor features are not influenced by treatments applied to descendant features, thereby enhancing the interpretability of the model. By estimating bounds on treatment effects, even in the presence of unmeasured confounders, CTP provides valuable insights for clinical decision-making. We evaluate CTP on simulated and real medical datasets. Experimental results demonstrate that our model achieves satisfactory performance, highlighting its potential to support clinical decisions. Source code is available at https://github.com/DanielSun94/CFPA.
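A minimal sketch of the causal-graph constraint the abstract states: an intervention on a feature may propagate only to its descendants in the graph, never to its ancestors. The toy graph and reachability check below are illustrative, not the CTP model itself.

```python
# Sketch: under a causal DAG, a treatment applied to one feature can
# affect only that feature's descendants. Toy graph for illustration.
import numpy as np

# Adjacency: edge i -> j means feature i causally influences feature j.
# Toy graph: 0 -> 1 -> 2, plus 0 -> 2.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])

def descendants(A, node):
    """Nodes reachable from `node` along directed edges."""
    reached, frontier = set(), {node}
    while frontier:
        nxt = {j for i in frontier for j in np.flatnonzero(A[i]) if j not in reached}
        reached |= nxt
        frontier = nxt
    return reached

# A treatment applied to feature 1 may alter feature 2 but not feature 0.
treated = 1
affected = descendants(A, treated)
print(affected)           # {2}
assert 0 not in affected  # the ancestor stays untouched
```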
Replicating Complex Dialogue Policy of Humans via Offline Imitation Learning with Supervised Regularization
Sun, Zhoujian, Zhao, Chenyang, Huang, Zhengxing, Ding, Nai
Policy learning (PL) is the module of a task-oriented dialogue system that trains an agent to choose an action at each dialogue turn. Imitating human actions is a fundamental problem in PL, yet neither the supervised learning (SL) nor the reinforcement learning (RL) framework imitates humans well. Training RL models requires online interaction with user simulators, and simulating complex human policies is hard. The performance of SL-based models is restricted by the covariate shift problem. Specifically, a dialogue is a sequential decision-making process in which slight differences in current utterances and actions cause significant differences in subsequent utterances; the generalization ability of SL models is therefore restricted because the statistical characteristics of training and testing dialogue data gradually diverge. This study proposes an offline imitation learning model that learns policy from real dialogue datasets and does not require user simulators. It also exploits state-transition information, which alleviates the influence of the covariate shift problem. We introduce a regularization trick so that our model can be optimized effectively. We evaluate our model on four independent public dialogue datasets; the experimental results show that it performs better than baseline models on the action prediction task.
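A schematic of the kind of objective the abstract describes: an offline imitation term fit on logged human actions, plus a supervised auxiliary term that exploits state-transition information. The network shapes, the transition-prediction head, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: offline imitation of logged dialogue actions, regularized by
# a supervised next-state prediction term. Architecture is illustrative.
import torch
import torch.nn as nn

class DialoguePolicy(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.action_head = nn.Linear(hidden, action_dim)  # imitate logged actions
        self.next_state_head = nn.Linear(hidden + action_dim, state_dim)  # transition model

    def forward(self, state, action_onehot):
        h = self.encoder(state)
        action_logits = self.action_head(h)
        next_state_pred = self.next_state_head(torch.cat([h, action_onehot], dim=-1))
        return action_logits, next_state_pred

def loss_fn(model, state, action, action_onehot, next_state, lam=0.5):
    logits, next_pred = model(state, action_onehot)
    imitation = nn.functional.cross_entropy(logits, action)      # match human actions
    regularizer = nn.functional.mse_loss(next_pred, next_state)  # use transition info
    return imitation + lam * regularizer
```

Because both terms are computed on logged data, training needs no user simulator, which is the setting the abstract targets.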
Language Cognition and Language Computation -- Human and Machine Language Understanding
Wang, Shaonan, Ding, Nai, Lin, Nan, Zhang, Jiajun, Zong, Chengqing
Language is a multilevel symbolic system comprising phonetics, morphology, syntax, semantics, and pragmatics. Its most basic symbols can be combined into endlessly more complex symbol sequences, allowing flexible expression of meaning. As such, language is also considered the carrier of human thought and the most natural tool through which humans exchange ideas and express emotions. Because of these diverse and flexible characteristics, it is difficult both to study the mechanisms of human language understanding and to build computational models that can understand language. In the early days of computer science, pioneers of language research attempted cross-disciplinary work spanning computer science, linguistics, and cognitive science, aiming to establish connections between human language-understanding mechanisms and language-computation models [1, 2, 3, 4, 5, 6]. However, owing to the complexity of the problem, this interdisciplinary research gradually fragmented over the decades into subfields such as natural language understanding in computer science, psycholinguistics in cognitive psychology, and the neurobiology of language in cognitive neuroscience. In this paper, "cognitive science" mainly refers to the two fields of cognitive psychology and cognitive neuroscience, particularly the branches of psycholinguistics and the cognitive neuroscience of language [7]. Figure 1 shows the relationship between cognitive science and computer science in the study of language understanding; the two fields differ substantially in both their research questions and their methods.
The Neural Correlates of Linguistic Structure Building: Comments on Kazanina & Tavano (2022)
Ding, Nai
A recent perspective paper by Kazanina & Tavano (hereafter the KT perspective) argues that neural oscillations cannot provide a potential neural correlate for syntactic structure building. The view that neural oscillations can provide such a correlate is largely attributed to a 2016 study by Ding, Melloni, Zhang, Tian, and Poeppel (hereafter the DMZTP study). The KT perspective is thought-provoking, but it severely misinterprets the arguments in DMZTP and other studies, and it draws contradictory conclusions in different parts of the perspective, making the authors' position difficult to discern. In the following, I summarize a few misinterpretations and inconsistent arguments in the KT perspective and put forward suggestions for future studies.
On Tracking Dialogue State by Inheriting Slot Values in Mentioned Slot Pools
Sun, Zhoujian, Huang, Zhengxing, Ding, Nai
Dialogue state tracking (DST) is a component of a task-oriented dialogue system responsible for extracting and managing slot values from dialogue utterances, where each slot represents an essential piece of information needed to accomplish a task and slot values are updated at each dialogue turn. However, many DST models fail to update slot values appropriately: they may repeatedly inherit wrong slot values extracted in previous turns, causing the entire DST task to fail, and they also struggle with indirectly mentioned slots. This study designs a model with a mentioned slot pool (MSP) to tackle this update problem. The MSP is a slot-specific memory that records all mentioned slot values that may be inherited, and our model updates slot values according to the MSP and the dialogue context. The model rejects inheriting a previous slot value when it predicts that the value is wrong, and instead re-extracts the slot value from the current dialogue context. Because contextual information accumulates as the dialogue progresses, the new value is more likely to be correct. The model can also track indirectly mentioned slots by picking a value from the MSP. Experimental results show that our model reaches state-of-the-art DST performance on the MultiWOZ 2.1 and 2.2 datasets.
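A minimal sketch of the mentioned-slot-pool bookkeeping the abstract describes: each turn, the model decides per slot whether to inherit the previous value, re-extract from the current context, or pick a pooled value for an indirectly mentioned slot. The operation names and the `predict_op`/`extract_value`/`pick_from_pool` callables stand in for the model's learned components and are assumptions.

```python
# Sketch of MSP-style slot updating. The three operations and the
# learned components they call are illustrative assumptions.

class MentionedSlotPool:
    def __init__(self, slots):
        self.pool = {s: [] for s in slots}     # all values mentioned so far
        self.state = {s: None for s in slots}  # current dialogue state

    def update_turn(self, utterance, predict_op, extract_value, pick_from_pool):
        for slot in self.state:
            op = predict_op(slot, utterance, self.pool[slot])
            if op == "inherit":    # keep the previous value as-is
                pass
            elif op == "extract":  # previous value judged wrong: re-extract
                self.state[slot] = extract_value(slot, utterance)
            elif op == "pick":     # indirectly mentioned slot: reuse a pooled value
                self.state[slot] = pick_from_pool(slot, self.pool[slot])
            # Record any newly surfaced value for later inheritance.
            if self.state[slot] is not None and self.state[slot] not in self.pool[slot]:
                self.pool[slot].append(self.state[slot])
        return dict(self.state)
```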
Deep Neural Networks Evolve Human-like Attention Distribution during Reading Comprehension
Zou, Jiajie, Ding, Nai
Attention is a key mechanism for information selection in both biological brains and many state-of-the-art deep neural networks (DNNs). Here, we investigate whether humans and DNNs allocate attention in comparable ways when reading a text passage in order to subsequently answer a specific question. We analyze three transformer-based DNNs that reach human-level performance when trained on the reading comprehension task. We find that the DNN attention distribution quantitatively resembles the human attention distribution measured by fixation times. Human readers fixate longer on words that are more relevant to the question-answering task, demonstrating that attention is modulated by top-down reading goals on top of lower-level visual and text features of the stimulus. Further analyses reveal that the attention weights in DNNs are likewise influenced by both top-down reading goals and lower-level stimulus features, with shallow layers more strongly influenced by lower-level text features and deep layers attending more to task-relevant words. Additionally, deep layers' attention to task-relevant words emerges gradually as pre-trained DNN models are fine-tuned on the reading comprehension task, coinciding with the improvement in task performance. These results demonstrate that DNNs can evolve human-like attention distributions through task optimization, suggesting that human attention during goal-directed reading comprehension is itself a consequence of task optimization.
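A minimal sketch of the comparison the abstract describes: correlate a model's per-word attention distribution with human fixation times on the same words. The toy numbers below are illustrative placeholders; the study itself uses attention weights from fine-tuned transformers and eye-tracking measurements.

```python
# Sketch: rank-correlate per-word model attention with human fixation
# durations. All values below are made-up placeholders.
import numpy as np
from scipy.stats import spearmanr

words = ["the", "engine", "failed", "because", "of", "overheating"]
fixation_ms = np.array([120, 310, 280, 150, 90, 340], dtype=float)  # human gaze
attention = np.array([0.05, 0.25, 0.20, 0.10, 0.05, 0.35])          # model weights

# Normalize fixation times to a distribution over words before comparing.
fixation_p = fixation_ms / fixation_ms.sum()
rho, p = spearmanr(fixation_p, attention)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```

Running this per layer (or per attention head) is the natural way to probe the shallow-versus-deep layer contrast the abstract reports.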