AITopics | Du, Xinya

Plotting

Du, Xinya

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Large Language Models for Automated Open-domain Scientific Hypotheses Discovery

Yang, Zonglin, Du, Xinya, Li, Junxian, Zheng, Jie, Poria, Soujanya, Cambria, Erik

arXiv.org Artificial IntelligenceSep-6-2023

Hypothetical induction is recognized as the main reasoning type when scientists make observations about the world and try to propose hypotheses to explain those observations. Past research on hypothetical induction has a limited setting that (1) the observation annotations of the dataset are not raw web corpus but are manually selected sentences (resulting in a close-domain setting); and (2) the ground truth hypotheses annotations are mostly commonsense knowledge, making the task less challenging. In this work, we propose the first NLP dataset for social science academic hypotheses discovery, consisting of 50 recent papers published in top social science journals. Raw web corpora that are necessary for developing hypotheses in the published papers are also collected in the dataset, with the final goal of creating a system that automatically generates valid, novel, and helpful (to human researchers) hypotheses, given only a pile of raw web corpora. The new dataset can tackle the previous problems because it requires to (1) use raw web corpora as observations; and (2) propose hypotheses even new to humanity. A multi-module framework is developed for the task, as well as three different feedback mechanisms that empirically show performance gain over the base framework. Finally, our framework exhibits high performance in terms of both GPT-4 based evaluation and social science expert evaluation.

automated open-domain scientific hypothesis discovery, large language model, machine learning, (4 more...)

arXiv.org Artificial Intelligence

2309.02726

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)

Add feedback

Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning

P, Jishnu Jaykumar, Palanisamy, Kamalesh, Chao, Yu-Wei, Du, Xinya, Xiang, Yu

arXiv.org Artificial IntelligenceJul-8-2023

We propose a novel framework for few-shot learning by leveraging large-scale vision-language models such as CLIP. Motivated by the unimodal prototypical networks for few-shot learning, we introduce PROTO-CLIP that utilizes image prototypes and text prototypes for few-shot learning. Specifically, PROTO-CLIP adapts the image encoder and text encoder in CLIP in a joint fashion using few-shot examples. The two encoders are used to compute prototypes of image classes for classification. During adaptation, we propose aligning the image and text prototypes of corresponding classes. Such a proposed alignment is beneficial for few-shot classification due to the contributions from both types of prototypes. We demonstrate the effectiveness of our method by conducting experiments on benchmark datasets for few-shot learning as well as in the real world for robot perception.

artificial intelligence, few-shot learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.03073

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations

Li, Ruosen, Patel, Teerth, Du, Xinya

arXiv.org Artificial IntelligenceJul-6-2023

Nowadays, the quality of responses generated by different modern large language models (LLMs) are hard to evaluate and compare automatically. Recent studies suggest and predominantly use LLMs as a reference-free metric for open-ended question answering. More specifically, they use the recognized "strongest" LLM as the evaluator, which conducts pairwise comparisons of candidate models' answers and provides a ranking score. However, this intuitive method has multiple problems, such as bringing in self-enhancement (favoring its own answers) and positional bias. We draw insights and lessons from the educational domain (Cho and MacArthur, 2011; Walsh, 2014) to improve LLM-based evaluations. Specifically, we propose the (1) peer rank (PR) algorithm that takes into account each peer LLM's pairwise preferences of all answer pairs, and outputs a final ranking of models; and (2) peer discussion (PD), where we prompt two LLMs to discuss and try to reach a mutual agreement on preferences of two answers. We conduct experiments on two benchmark datasets. We find that our approaches achieve higher accuracy and align better with human judgments, respectively. Interestingly, PR can induce a relatively accurate self-ranking of models under the anonymous setting, where each model's name is unrevealed. Our work provides space to explore evaluating models that are hard to compare for humans.

information, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2307.02762

Genre:

Research Report > Experimental Study (0.48)
Research Report > New Finding (0.34)

Industry:

Banking & Finance (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Zero-Shot Classification by Logical Reasoning on Natural Language Explanations

Han, Chi, Pei, Hengzhi, Du, Xinya, Ji, Heng

arXiv.org Artificial IntelligenceMay-25-2023

Humans can classify data of an unseen category by reasoning on its language explanations. This ability is owing to the compositional nature of language: we can combine previously seen attributes to describe the new category. For example, we might describe a sage thrasher as "it has a slim straight relatively short bill, yellow eyes and a long tail", so that others can use their knowledge of attributes "slim straight relatively short bill", "yellow eyes" and "long tail" to recognize a sage thrasher. Inspired by this observation, in this work we tackle zero-shot classification task by logically parsing and reasoning on natural language expla-nations. To this end, we propose the framework CLORE (Classification by LOgical Reasoning on Explanations). While previous methods usually regard textual information as implicit features, CLORE parses explanations into logical structures and then explicitly reasons along thess structures on the input to produce a classification score. Experimental results on explanation-based zero-shot classification benchmarks demonstrate that CLORE is superior to baselines, which we further show mainly comes from higher scores on tasks requiring more logical reasoning. We also demonstrate that our framework can be extended to zero-shot classification on visual modality. Alongside classification decisions, CLORE can provide the logical parsing and reasoning process as a clear form of rationale. Through empirical analysis we demonstrate that CLORE is also less affected by linguistic biases than baselines.

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2211.03252

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning

Han, Chi, He, Qizheng, Yu, Charles, Du, Xinya, Tong, Hanghang, Ji, Heng

arXiv.org Artificial IntelligenceMay-22-2023

Probabilistic logical rule learning has shown great strength in logical rule mining and knowledge graph completion. It learns logical rules to predict missing edges by reasoning on existing edges in the knowledge graph. However, previous efforts have largely been limited to only modeling chain-like Horn clauses such as $R_1(x,z)\land R_2(z,y)\Rightarrow H(x,y)$. This formulation overlooks additional contextual information from neighboring sub-graphs of entity variables $x$, $y$ and $z$. Intuitively, there is a large gap here, as local sub-graphs have been found to provide important information for knowledge graph completion. Inspired by these observations, we propose Logical Entity RePresentation (LERP) to encode contextual information of entities in the knowledge graph. A LERP is designed as a vector of probabilistic logical functions on the entity's neighboring sub-graph. It is an interpretable representation while allowing for differentiable optimization. We can then incorporate LERP into probabilistic logical rule learning to learn more expressive rules. Empirical results demonstrate that with LERP, our model outperforms other rule learning methods in knowledge graph completion and is comparable or even superior to state-of-the-art black-box methods. Moreover, we find that our model can discover a more expressive family of logical rules. LERP can also be further combined with embedding learning methods like TransE to make it more interpretable.

artificial intelligence, logic & formal reasoning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.12738

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.84)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Association Learning (1.00)
(2 more...)

Add feedback

Logical Reasoning over Natural Language as Knowledge Representation: A Survey

Yang, Zonglin, Du, Xinya, Mao, Rui, Ni, Jinjie, Cambria, Erik

arXiv.org Artificial IntelligenceMar-21-2023

Logical reasoning is central to human cognition and intelligence. Past research of logical reasoning within AI uses formal language as knowledge representation~(and symbolic reasoners). However, reasoning with formal language has proved challenging~(e.g., brittleness and knowledge-acquisition bottleneck). This paper provides a comprehensive overview on a new paradigm of logical reasoning, which uses natural language as knowledge representation~(and pretrained language models as reasoners), including philosophical definition and categorization of logical reasoning, advantages of the new paradigm, benchmarks and methods, challenges of the new paradigm, desirable tasks & methods in the future, and relation to related NLP fields. This new paradigm is promising since it not only alleviates many challenges of formal representation but also has advantages over end-to-end neural methods.

artificial intelligence, expert system, survey article, (15 more...)

arXiv.org Artificial Intelligence

2303.12023

Country:

North America > United States (1.00)
Europe > Austria > Vienna (0.14)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Language Models as Inductive Reasoners

Yang, Zonglin, Dong, Li, Du, Xinya, Cheng, Hao, Cambria, Erik, Liu, Xiaodong, Gao, Jianfeng, Wei, Furu

arXiv.org Artificial IntelligenceDec-21-2022

Inductive reasoning is a core component of human intelligence. In the past research of inductive reasoning within computer science, logic language is used as representations of knowledge (facts and rules, more specifically). However, logic language can cause systematic problems for inductive reasoning such as disability of handling raw input such as natural language, sensitiveness to mislabeled data, and incapacity to handle ambiguous input. To this end, we propose a new task, which is to induce natural language rules from natural language facts, and create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language. New automatic metrics are also proposed and analysed for the evaluation of this task. With DEER, we investigate a modern approach for inductive reasoning where we use natural language as representation for knowledge instead of logic language and use pretrained language models as ''reasoners''. Moreover, we provide the first and comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts. We also propose a new framework drawing insights from philosophy literature for this task, which we show in the experiment section that surpasses baselines in both automatic and human evaluations.

artificial intelligence, logic & formal reasoning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2212.10923

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Retrieval-Augmented Generative Question Answering for Event Argument Extraction

Du, Xinya, Ji, Heng

arXiv.org Artificial IntelligenceNov-13-2022

Event argument extraction has long been studied as a sequential prediction problem with extractive-based methods, tackling each argument in isolation. Although recent work proposes generation-based methods to capture cross-argument dependency, they require generating and post-processing a complicated target sequence (template). Motivated by these observations and recent pretrained language models' capabilities of learning from demonstrations. We propose a retrieval-augmented generative QA model (R-GQA) for event argument extraction. It retrieves the most similar QA pair and augments it as prompt to the current example's context, then decodes the arguments as answers. Our approach outperforms substantially prior methods across various settings (i.e. fully supervised, domain transfer, and fewshot learning). Finally, we propose a clustering-based sampling strategy (JointEnc) and conduct a thorough analysis of how different strategies influence the few-shot learning performance. The implementations are available at https:// github.com/xinyadu/RGQA

machine learning, natural language, question answering, (20 more...)

arXiv.org Artificial Intelligence

2211.07067

Country:

Europe (0.94)
North America > United States > California (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Government (0.94)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Improving Event Duration Prediction via Time-aware Pre-training

Yang, Zonglin, Du, Xinya, Rush, Alexander, Cardie, Claire

arXiv.org Artificial IntelligenceNov-4-2020

End-to-end models in NLP rarely encode external world knowledge about length of time. We introduce two effective models for duration prediction, which incorporate external knowledge by reading temporal-related news sentences (time-aware pre-training). Specifically, one model predicts the range/unit where the duration value falls in (R-pred); and the other predicts the exact duration value E-pred. Our best model -- E-pred, substantially outperforms previous work, and captures duration information more accurately than R-pred. We also demonstrate our models are capable of duration prediction in the unsupervised setting, outperforming the baselines.

computational linguistics, deep learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2011.0261

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.47)

Add feedback