AITopics | aser

Collaborating Authors

aser

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TASER: Table Agents for Schema-guided Extraction and Recommendation

Cho, Nicole, Fielding, Kirsty, Watson, William, Ganesh, Sumitra, Veloso, Manuela

arXiv.org Artificial IntelligenceOct-16-2025

Real-world financial documents report essential information about an entity's financial holdings that can span millions of different financial instrument types. Yet, these details are often buried in messy, multi-page, fragmented tables - for example, 99.4% of the tables in our dataset have no bounding boxes with the maximum number of rows amounting to 426 per table across 44 pages. To tackle these unique challenges from real-world tables, we present a continuously learning, agentic table extraction system, TASER (Table Agents for Schema-guided Extraction and Recommendation) that extracts highly unstructured, multi-page, heterogeneous tables into normalized, schema-conforming outputs. Our table agents execute on table detection, classification, extraction, and recommendations by leveraging an initial schema. Then, our Recommender Agent reviews the outputs, recommends schema revisions, and decides on the final recommendations, enabling TASER to outperform existing table detection models such as Table Transformer by 10.1%. Within this continuous learning process, we highlight that larger batch sizes result in a 104.3% increase in schema recommendations that are actionable and utilized, resulting in a 9.8% increase in extracted holdings - highlighting the importance of a continuous learning process. To train TASER, we have manually labeled 22,584 pages (28,150,449 tokens), 3,213 tables for $731,685,511,687 of holdings culminating in one of the first real financial table datasets. We release our dataset TASERTab to enable the research community to access real-world financial tables and outputs. Our results highlight the promise of agentic, schema-guided extraction systems for robust understanding of real-world financial tables.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2508.13404

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)
Asia > Middle East > UAE (0.28)

Genre:

Research Report (0.70)
Overview (0.46)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (1.00)
Materials > Metals & Mining (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization

Zhao, Weibo, Shi, Yubin, Lyu, Xinyu, Sui, Wanchen, Li, Shen, Li, Yong

arXiv.org Artificial IntelligenceDec-11-2024

Quantization stands as a pivotal technique for large language model (LLM) serving, yet it poses significant challenges particularly in achieving effective low-bit quantization. The limited numerical mapping makes the quantized model produce a non-trivial error, bringing out intolerable performance degration. This paper is anchored in the basic idea of model compression objectives, and delves into the layer-wise error distribution of LLMs during post-training quantization. Subsequently, we introduce ASER, an algorithm consisting of (1) Error Reconstruction: low-rank compensation for quantization error with LoRA-style matrices constructed by whitening SVD; (2) Activation Smoothing: outlier extraction to gain smooth activation and better error compensation. ASER is capable of quantizing typical LLMs to low-bit ones, particularly preserving accuracy even in W4A8 per-channel setup. Experimental results show that ASER is competitive among the state-of-the-art quantization algorithms, showing potential to activation quantization, with minor overhead.

activation, quantization, quantization error, (15 more...)

arXiv.org Artificial Intelligence

2411.07762

Country:

North America > United States > New Jersey (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Automatic Extraction of Relationships among Motivations, Emotions and Actions from Natural Language Texts

Yang, Fei

arXiv.org Artificial IntelligenceAug-1-2024

We propose a new graph-based framework to reveal relationships among motivations, emotions and actions explicitly given natural language texts. A directed acyclic graph is designed to describe human's nature. Nurture beliefs are incorporated to connect outside events and the human's nature graph. No annotation resources are required due to the power of large language models. Amazon Fine Foods Reviews dataset is used as corpus and food-related motivations are focused. Totally 92,990 relationship graphs are generated, of which 63% make logical sense. We make further analysis to investigate error types for optimization direction in future research.

motivation, nature design, node, (15 more...)

arXiv.org Artificial Intelligence

2408.00966

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)

Add feedback

IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce

Ding, Wenxuan, Wang, Weiqi, Kwok, Sze Heng Douglas, Liu, Minghao, Fang, Tianqing, Bai, Jiaxin, He, Junxian, Song, Yangqiu

arXiv.org Artificial IntelligenceJun-14-2024

Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utilization of purchase intentions by LMs. In this paper, we present IntentionQA, a double-task multiple-choice question answering benchmark to evaluate LMs' comprehension of purchase intentions in E-commerce. Specifically, LMs are tasked to infer intentions based on purchased products and utilize them to predict additional purchases. IntentionQA consists of 4,360 carefully curated problems across three difficulty levels, constructed using an automated pipeline to ensure scalability on large E-commerce platforms. Human evaluations demonstrate the high quality and low false-negative rate of our benchmark. Extensive experiments across 19 language models show that they still struggle with certain scenarios, such as understanding products and intentions accurately, jointly reasoning with products and intentions, and more, in which they fall far behind human performances. Our code and data are publicly available at https://github.com/HKUST-KnowComp/IntentionQA.

customer, intention, language model, (16 more...)

arXiv.org Artificial Intelligence

2406.10173

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Singapore (0.05)
North America > Canada > Ontario > Toronto (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)

Add feedback

Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset

Fang, Tianqing, Wang, Weiqi, Choi, Sehyun, Hao, Shibo, Zhang, Hongming, Song, Yangqiu, He, Bin

arXiv.org Artificial IntelligenceSep-15-2021

Reasoning over commonsense knowledge bases (CSKB) whose elements are in the form of free-text is an important yet hard task in NLP. While CSKB completion only fills the missing links within the domain of the CSKB, CSKB population is alternatively proposed with the goal of reasoning unseen assertions from external resources. In this task, CSKBs are grounded to a large-scale eventuality (activity, state, and event) graph to discriminate whether novel triples from the eventuality graph are plausible or not. However, existing evaluations on the population task are either not accurate (automatic evaluation with randomly sampled negative examples) or of small scale (human annotation). In this paper, we benchmark the CSKB population task with a new large-scale dataset by first aligning four popular CSKBs, and then presenting a high-quality human-annotated evaluation set to probe neural models' commonsense reasoning ability. We also propose a novel inductive commonsense reasoning model that reasons over graphs. Experimental results show that generalizing commonsense reasoning on unseen assertions is inherently a hard task. Models achieving high accuracy during training perform poorly on the evaluation set, with a large gap between human performance. We will make the data publicly available for future contributions. Codes and data are available at https://github.com/HKUST-KnowComp/CSKB-Population.

aser, personx, relation, (14 more...)

arXiv.org Artificial Intelligence

2109.07679

Country:

Asia > China > Hong Kong (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)

Add feedback

ASER: Towards Large-scale Commonsense Knowledge Acquisition via Higher-order Selectional Preference over Eventualities

Zhang, Hongming, Liu, Xin, Pan, Haojie, Ke, Haowen, Ou, Jiefu, Fang, Tianqing, Song, Yangqiu

arXiv.org Artificial IntelligenceApr-5-2021

Commonsense knowledge acquisition and reasoning have long been a core artificial intelligence problem. However, in the past, there has been a lack of scalable methods to collect commonsense knowledge. In this paper, we propose to develop principles for collecting commonsense knowledge based on selectional preference. We generalize the definition of selectional preference from one-hop linguistic syntactic relations to higher-order relations over linguistic graphs. Unlike previous commonsense knowledge definition (e.g., ConceptNet), the selectional preference (SP) knowledge only relies on statistical distribution over linguistic graphs, which can be efficiently and accurately acquired from the unlabeled corpus with modern tools. Following this principle, we develop a large-scale eventuality (a linguistic term covering activity, state, and event)-based knowledge graph ASER, where each eventuality is represented as a dependency graph, and the relation between them is a discourse relation defined in shallow discourse parsing. The higher-order selectional preference over collected linguistic graphs reflects various kinds of commonsense knowledge. Moreover, motivated by the observation that humans understand events by abstracting the observed events to a higher level and can thus transferring their knowledge to new events, we propose a conceptualization module to significantly boost the coverage of ASER. In total, ASER contains 438 million eventualities and 648 million edges between eventualities. After conceptualization with Probase, a selectional preference based concept-instance relational knowledge base, our concept graph contains 15 million conceptualized eventualities and 224 million edges between them. Detailed analysis is provided to demonstrate its quality. All the collected data, APIs, and tools are available at https://github.com/HKUST-KnowComp/ASER.

eventuality, knowledge, relation, (17 more...)

arXiv.org Artificial Intelligence

2104.02137

Country:

Oceania > Australia > Victoria > Melbourne (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
(23 more...)

Genre: Research Report (0.63)

Industry:

Media (0.67)
Education (0.67)
Consumer Products & Services (0.67)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)

Add feedback

DISCOS: Bridging the Gap between Discourse Knowledge and Commonsense Knowledge

Fang, Tianqing, Zhang, Hongming, Wang, Weiqi, Song, Yangqiu, He, Bin

arXiv.org Artificial IntelligenceDec-31-2020

Commonsense knowledge is crucial for artificial intelligence systems to understand natural language. Previous commonsense knowledge acquisition approaches typically rely on human annotations (e.g., ATOMIC) or text generation models (e.g., COMET). Human annotation could provide high-quality commonsense knowledge, yet its high cost often results in relatively small scale and low coverage. On the other hand, generation models have the potential to automatically generate more knowledge. Nonetheless, machine learning models often fit the training data too well to generate novel knowledge in high quality, thus still suffering from coverage problems. To address the limitations of previous approaches, in this paper, we propose an alternative commonsense knowledge acquisition framework DISCOS (from DIScourse to COmmonSense), which automatically mines expensive complex commonsense knowledge from more affordable linguistic knowledge resources. Experiments demonstrate that we can successfully convert discourse knowledge over eventualities from ASER, a large-scale discourse knowledge graph, into inferential if-then commonsense knowledge defined in ATOMIC without any additional annotation effort. Further study suggests that DISCOS significantly outperforms previous supervised approaches in terms of novelty and diversity with comparable quality. In total, we can acquire 3.4M ATOMIC-like inferential commonsense knowledge by populating ATOMIC on the core part of ASER. Codes and data are available at https://github.com/HKUST-KnowComp/DISCOS-commonsense.

commonsense knowledge, knowledge, relation, (16 more...)

arXiv.org Artificial Intelligence

2101.00154

Country:

Asia > China (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)

Add feedback

On the Role of Conceptualization in Commonsense Knowledge Graph Construction

He, Mutian, Song, Yangqiu, Xu, Kun, Yu, Dong

arXiv.org Artificial IntelligenceApr-7-2020

Commonsense knowledge graphs (CKGs) like Atomic and ASER are substantially different from conventional KGs as they consist of much larger number of nodes formed by loosely-structured text, which, though, enables them to handle highly diverse queries in natural language related to commonsense, leads to unique challenges for automatic KG construction methods. Besides identifying relations absent from the KG between nodes, such methods are also expected to explore absent nodes represented by text, in which different real-world things, or entities, may appear. To deal with the innumerable entities involved with commonsense in the real world, we introduce to CKG construction methods conceptualization, i.e., to view entities mentioned in text as instances of specific concepts or vice versa. We build synthetic triples by conceptualization, and further formulate the task as triple classification, handled by a discriminatory model with knowledge transferred from pretrained language models and fine-tuned by negative sampling. Experiments demonstrate that our methods can effectively identify plausible triples and expand the KG by triples of both new nodes and edges of high diversity and novelty.

conceptualization, node, relation, (13 more...)

arXiv.org Artificial Intelligence

2003.03239

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.62)

Add feedback

ASER: A Large-scale Eventuality Knowledge Graph

Zhang, Hongming, Liu, Xin, Pan, Haojie, Song, Yangqiu, Wing-Ki, Cane, Leung, null

arXiv.org Artificial IntelligenceMay-1-2019

Understanding human's language requires complex world knowledge. However, existing large-scale knowledge graphs mainly focus on knowledge about entities while ignoring knowledge about activities, states, or events, which are used to describe how entities or things act in the real world. To fill this gap, we develop ASER (activities, states, events, and their relations), a large-scale eventuality knowledge graph extracted from more than 11-billion-token unstructured textual data. ASER contains 15 relation types belonging to five categories, 194-million unique eventualities, and 64-million unique edges among them. Both human and extrinsic evaluations demonstrate the quality and effectiveness of ASER.

artificial intelligence, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

1905.0027

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
(3 more...)

Add feedback