AITopics | Commonsense Reasoning

Collaborating Authors

Commonsense Reasoning

Knowledge that Everyone Knows. "People do not walk on their heads." The assertion comes about 900 statements deep into the 527,308 items that comprise the Open Mind common sense database. It's after "Laws are the rules of society" and before "The sky is blue during the day." This collection of mundane facts, which would take more than 20,000 pages to print out, consists entirely of statements so unremarkable they are barely worth stating. Most of us would correctly dismiss them as common sense.
– from D.C. Denison, Guess who's smarter. Boston Globe Online (page hosted at MIT), May 26, 2003.

News Overviews Instructional Materials AI-Alerts Classics

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Ren, Liliang, Liu, Yang, Lu, Yadong, Shen, Yelong, Liang, Chen, Chen, Weizhu

arXiv.org Artificial IntelligenceJun-11-2024

Efficiently modeling sequences with infinite context length has been a long-standing problem. Past works suffer from either the quadratic computation complexity or the limited extrapolation ability on length generalization. In this work, we present Samba, a simple hybrid architecture that layer-wise combines Mamba, a selective State Space Model (SSM), with Sliding Window Attention (SWA). Samba selectively compresses a given sequence into recurrent hidden states while still maintaining the ability to precisely recall memories with the attention mechanism. We scale Samba up to 3.8B parameters with 3.2T training tokens and show that Samba substantially outperforms the state-of-the-art models based on pure attention or SSMs on a wide range of benchmarks. When trained on 4K length sequences, Samba can be efficiently extrapolated to 256K context length with perfect memory recall and show improved token predictions up to 1M context length. As a linear-time sequence model, Samba enjoys a 3.73x higher throughput compared to Transformers with grouped-query attention when processing user prompts of 128K length, and 3.64x speedup when generating 64K tokens with unlimited streaming. A sample implementation of Samba is publicly available in https://github.com/microsoft/Samba.

architecture, arxiv preprint arxiv, context length, (14 more...)

arXiv.org Artificial Intelligence

2406.07522

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.68)

Add feedback

Improving Commonsense Bias Classification by Mitigating the Influence of Demographic Terms

Lee, JinKyu, Kim, Jihie

arXiv.org Artificial IntelligenceJun-11-2024

Understanding commonsense knowledge is crucial in the field of Natural Language Processing (NLP). However, the presence of demographic terms in commonsense knowledge poses a potential risk of compromising the performance of NLP models. This study aims to investigate and propose methods for enhancing the performance and effectiveness of a commonsense polarization classifier by mitigating the influence of demographic terms. Three methods are introduced in this paper: (1) hierarchical generalization of demographic terms (2) threshold-based augmentation and (3) integration of hierarchical generalization and threshold-based augmentation methods (IHTA). The first method involves replacing demographic terms with more general ones based on a term hierarchy ontology, aiming to mitigate the influence of specific terms. To address the limited bias-related information, the second method measures the polarization of demographic terms by comparing the changes in the model's predictions when these terms are masked versus unmasked. This method augments commonsense sentences containing terms with high polarization values by replacing their predicates with synonyms generated by ChatGPT. The third method combines the two approaches, starting with threshold-based augmentation followed by hierarchical generalization. The experiments show that the first method increases the accuracy over the baseline by 2.33%, and the second one by 0.96% over standard augmentation methods. The IHTA techniques yielded an 8.82% and 9.96% higher accuracy than threshold-based and standard augmentation methods, respectively.

accuracy, augmentation, demographic term, (15 more...)

arXiv.org Artificial Intelligence

2406.07229

Country:

North America > United States > California (0.14)
Asia > South Korea > Seoul > Seoul (0.05)
Asia > Afghanistan (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.55)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans

Sakai, Yusuke, Kamigaito, Hidetaka, Watanabe, Taro

arXiv.org Artificial IntelligenceJun-6-2024

It is very challenging to curate a dataset for language-specific knowledge and common sense in order to evaluate natural language understanding capabilities of language models. Due to the limitation in the availability of annotators, most current multilingual datasets are created through translation, which cannot evaluate such language-specific aspects. Therefore, we propose Multilingual CommonsenseQA (mCSQA) based on the construction process of CSQA but leveraging language models for a more efficient construction, e.g., by asking LM to generate questions/answers, refine answers and verify QAs followed by reduced human efforts for verification. Constructed dataset is a benchmark for cross-lingual language-transfer capabilities of multilingual LMs, and experimental results showed high language-transfer capabilities for questions that LMs could easily solve, but lower transfer capabilities for questions requiring deep knowledge or commonsense. This highlights the necessity of language-specific datasets for evaluation and training. Finally, our method demonstrated that multilingual LMs could create QA including language-specific knowledge, significantly reducing the dataset creation cost compared to manual creation. The datasets are available at https://huggingface.co/datasets/yusuke1997/mCSQA.

computational linguistic, linguistic, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2406.04215

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
(28 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

Add feedback

PlanAgent: A Multi-modal Large Language Agent for Closed-loop Vehicle Motion Planning

Zheng, Yupeng, Xing, Zebin, Zhang, Qichao, Jin, Bu, Li, Pengfei, Zheng, Yuhang, Xia, Zhongpu, Zhan, Kun, Lang, Xianpeng, Chen, Yaran, Zhao, Dongbin

arXiv.org Artificial IntelligenceJun-4-2024

Vehicle motion planning is an essential component of autonomous driving technology. Current rule-based vehicle motion planning methods perform satisfactorily in common scenarios but struggle to generalize to long-tailed situations. Meanwhile, learning-based methods have yet to achieve superior performance over rule-based approaches in large-scale closed-loop scenarios. To address these issues, we propose PlanAgent, the first mid-to-mid planning system based on a Multi-modal Large Language Model (MLLM). MLLM is used as a cognitive agent to introduce human-like knowledge, interpretability, and common-sense reasoning into the closed-loop planning. Specifically, PlanAgent leverages the power of MLLM through three core modules. First, an Environment Transformation module constructs a Bird's Eye View (BEV) map and a lane-graph-based textual description from the environment as inputs. Second, a Reasoning Engine module introduces a hierarchical chain-of-thought from scene understanding to lateral and longitudinal motion instructions, culminating in planner code generation. Last, a Reflection module is integrated to simulate and evaluate the generated planner for reducing MLLM's uncertainty. PlanAgent is endowed with the common-sense reasoning and generalization capability of MLLM, which empowers it to effectively tackle both common and complex long-tailed scenarios. Our proposed PlanAgent is evaluated on the large-scale and challenging nuPlan benchmarks. A comprehensive set of experiments convincingly demonstrates that PlanAgent outperforms the existing state-of-the-art in the closed-loop motion planning task. Codes will be soon released.

information, planagent, scenario, (14 more...)

arXiv.org Artificial Intelligence

2406.01587

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (1.00)
Transportation > Ground > Road (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Add feedback

Octopi: Object Property Reasoning with Large Tactile-Language Models

Yu, Samson, Lin, Kelvin, Xiao, Anxing, Duan, Jiafei, Soh, Harold

arXiv.org Artificial IntelligenceJun-4-2024

Physical reasoning is important for effective robot manipulation. Recent work has investigated both vision and language modalities for physical reasoning; vision can reveal information about objects in the environment and language serves as an abstraction and communication medium for additional context. Although these works have demonstrated success on a variety of physical reasoning tasks, they are limited to physical properties that can be inferred from visual or language inputs. In this work, we investigate combining tactile perception with language, which enables embodied systems to obtain physical properties through interaction and apply commonsense reasoning. We contribute a new dataset PhysiCLeAR, which comprises both physical/property reasoning tasks and annotated tactile videos obtained using a GelSight tactile sensor. We then introduce Octopi, a system that leverages both tactile representation learning and large vision-language models to predict and reason about tactile inputs with minimal language fine-tuning. Our evaluations on PhysiCLeAR show that Octopi is able to effectively use intermediate physical property predictions to improve its performance on various tactile-related tasks. PhysiCLeAR and Octopi are available at https://github.com/clear-nus/octopi.

physical property, reasoning, tactile video, (13 more...)

arXiv.org Artificial Intelligence

2405.02794

Country:

Asia > Singapore (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.68)
(2 more...)

Add feedback

Augmented Commonsense Knowledge for Remote Object Grounding

Mohammadi, Bahram, Hong, Yicong, Qi, Yuankai, Wu, Qi, Pan, Shirui, Shi, Javen Qinfeng

arXiv.org Artificial IntelligenceJun-3-2024

The vision-and-language navigation (VLN) task necessitates an agent to perceive the surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most of the existing methods employ the entire image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions, such as ''Bring me the blue cushion in the master bedroom''. To address enhancing representation, we propose an augmented commonsense knowledge model (ACK) to leverage commonsense information as a spatio-temporal knowledge graph for improving agent navigation. Specifically, the proposed approach involves constructing a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module to remove noisy and irrelevant knowledge. We further present ACK which consists of knowledge graph-aware cross-modal and concept aggregation modules to enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes object and knowledge temporal information. Moreover, we add a new pipeline for the commonsense-based decision-making process which leads to more accurate local action prediction. Experimental results demonstrate our proposed model noticeably outperforms the baseline and archives the state-of-the-art on the REVERIE benchmark.

commonsense knowledge, instruction, knowledge, (15 more...)

arXiv.org Artificial Intelligence

2406.01256

Country: Asia > South Korea > Busan > Busan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Add feedback

Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge

Park, Brendan, Janecek, Madeline, Ezzati-Jivan, Naser, Li, Yifeng, Emami, Ali

arXiv.org Artificial IntelligenceJun-3-2024

Large Language Models (LLMs) have demonstrated remarkable success in tasks like the Winograd Schema Challenge (WSC), showcasing advanced textual common-sense reasoning. However, applying this reasoning to multimodal domains, where understanding text and images together is essential, remains a substantial challenge. To address this, we introduce WinoVis, a novel dataset specifically designed to probe text-to-image models on pronoun disambiguation within multimodal contexts. Utilizing GPT-4 for prompt generation and Diffusion Attentive Attribution Maps (DAAM) for heatmap analysis, we propose a novel evaluation framework that isolates the models' ability in pronoun disambiguation from other visual processing challenges. Evaluation of successive model versions reveals that, despite incremental advancements, Stable Diffusion 2.0 achieves a precision of 56.7% on WinoVis, only marginally surpassing random guessing. Further error analysis identifies important areas for future research aimed at advancing text-to-image models in their ability to interpret and interact with the complex visual world.

computational linguistic, proceedings, pronoun, (12 more...)

arXiv.org Artificial Intelligence

2405.16277

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Ankner, Zachary, Blakeney, Cody, Sreenivasan, Kartik, Marion, Max, Leavitt, Matthew L., Paul, Mansheej

arXiv.org Artificial IntelligenceMay-30-2024

In this work, we investigate whether small language models can determine high-quality subsets of large-scale text datasets that improve the performance of larger language models. While existing work has shown that pruning based on the perplexity of a larger model can yield high-quality data, we investigate whether smaller models can be used for perplexity-based pruning and how pruning is affected by the domain composition of the data being pruned. We demonstrate that for multiple dataset compositions, perplexity-based pruning of pretraining data can \emph{significantly} improve downstream task performance: pruning based on perplexities computed with a 125 million parameter model improves the average performance on downstream tasks of a 3 billion parameter model by up to 2.04 and achieves up to a $1.45\times$ reduction in pretraining steps to reach commensurate baseline performance. Furthermore, we demonstrate that such perplexity-based data pruning also yields downstream performance gains in the over-trained and data-constrained regimes.

dataset, perplexity, pruning, (14 more...)

arXiv.org Artificial Intelligence

2405.20541

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
(12 more...)

Genre: Research Report > New Finding (0.93)

Industry:

Education (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning

Artkaew, Phakphum

arXiv.org Artificial IntelligenceMay-28-2024

Commonsense reasoning is one of the important aspect of natural language understanding, with several benchmarks developed to evaluate it. However, only a few of these benchmarks are available in languages other than English. Developing parallel benchmarks facilitates cross-lingual evaluation, enabling a better understanding of different languages. This research introduces a collection of Winograd Schemas in Thai, a novel dataset designed to evaluate commonsense reasoning capabilities in the context of the Thai language. Through a methodology involving native speakers, professional translators, and thorough validation, the schemas aim to closely reflect Thai language nuances, idioms, and cultural references while maintaining ambiguity and commonsense challenges. We evaluate the performance of popular large language models on this benchmark, revealing their strengths, limitations, and providing insights into the current state-of-the-art. Results indicate that while models like GPT-4 and Claude-3-Opus achieve high accuracy in English, their performance significantly drops in Thai, highlighting the need for further advancements in multilingual commonsense reasoning.

benchmark, language model, winograd schema, (13 more...)

arXiv.org Artificial Intelligence

2405.18375

Country:

North America > United States > New York (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.37)

Add feedback

Acquiring and Modelling Abstract Commonsense Knowledge via Conceptualization

He, Mutian, Fang, Tianqing, Wang, Weiqi, Song, Yangqiu

arXiv.org Artificial IntelligenceMay-18-2024

Conceptualization, or viewing entities and situations as instances of abstract concepts in mind and making inferences based on that, is a vital component in human intelligence for commonsense reasoning. Despite recent progress in artificial intelligence to acquire and model commonsense attributed to neural language models and commonsense knowledge graphs (CKGs), conceptualization is yet to be introduced thoroughly, making current approaches ineffective to cover knowledge about countless diverse entities and situations in the real world. To address the problem, we thoroughly study the role of conceptualization in commonsense reasoning, and formulate a framework to replicate human conceptual induction by acquiring abstract knowledge about events regarding abstract concepts, as well as higher-level triples or inferences upon them. We then apply the framework to ATOMIC, a large-scale human-annotated CKG, aided by the taxonomy Probase. We annotate a dataset on the validity of contextualized conceptualizations from ATOMIC on both event and triple levels, develop a series of heuristic rules based on linguistic features, and train a set of neural models to generate and verify abstract knowledge. Based on these components, a pipeline to acquire abstract knowledge is built. A large abstract CKG upon ATOMIC is then induced, ready to be instantiated to infer about unseen entities or situations. Finally, we empirically show the benefits of augmenting CKGs with abstract knowledge in downstream tasks like commonsense inference and zero-shot commonsense QA.

conceptualization, knowledge, proceedings, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.artint.2024.104149

2206.01532

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.14)
(52 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (1.00)
Education (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(2 more...)

Add feedback