AITopics | ta 3

ta 3

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

Gu, Jihao, Wang, Yingyao, Bu, Pi, Wang, Chen, Wang, Ziming, Song, Tengtao, Wei, Donglai, Yuan, Jiale, Zhao, Yingxiu, He, Yancheng, Li, Shilong, Liu, Jiaheng, Cao, Meng, Song, Jun, Tan, Yingshui, Li, Xiang, Su, Wenbo, Zheng, Zhicheng, Zhu, Xiaoyong, Zheng, Bo

arXiv.org Artificial Intelligence, Feb-17-2025

The evaluation of factual accuracy in large vision language models (LVLMs) has lagged behind their rapid development, making it challenging to fully reflect these models' knowledge capacity and reliability. In this paper, we introduce the first factuality-based visual question-answering benchmark in Chinese, named ChineseSimpleVQA, aimed at assessing the visual factuality of LVLMs across 8 major topics and 56 subtopics. The key features of this benchmark include a focus on the Chinese language, diverse knowledge types, multi-hop question construction, high-quality data, static consistency, and easy evaluation through short answers. Moreover, we contribute a rigorous data construction pipeline and decouple visual factuality into two parts: seeing the world (i.e., object recognition) and discovering knowledge. This decoupling allows us to analyze the capability boundaries and execution mechanisms of LVLMs. Subsequently, we evaluate 34 advanced open-source and closed-source models, revealing critical performance gaps within this field.

artificial intelligence, data mining, natural language, (14 more...)

2502.11718

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (0.89)
Information Technology > Artificial Intelligence > Natural Language (0.89)
Information Technology > Data Science > Data Mining > Knowledge Discovery (0.40)

RelaMiX: Exploring Few-Shot Adaptation in Video-based Action Recognition

Peng, Kunyu, Wen, Di, Schneider, David, Zhang, Jiaming, Yang, Kailun, Sarfraz, M. Saquib, Stiefelhagen, Rainer, Roitberg, Alina

arXiv.org Artificial Intelligence, Oct-28-2023

Domain adaptation is essential for activity recognition to ensure accurate and robust performance across diverse environments, sensor types, and data sources. Unsupervised domain adaptation methods have been extensively studied, yet they require large-scale unlabeled data from the target domain. In this work, we address Few-Shot Domain Adaptation for video-based Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation. This setting is attractive and promising for applications, as it requires recording and labeling only a few, or even a single, example per class in the target domain, which often includes activities that are rare yet crucial to recognize. We construct FSDA-AR benchmarks using five established datasets considering diverse domain types: UCF101, HMDB51, EPIC-KITCHEN, Sims4Action, and ToyotaSmartHome. Our results demonstrate that FSDA-AR performs comparably to unsupervised domain adaptation with significantly fewer (yet labeled) target domain samples. We further propose a novel approach, RelaMiX, to better leverage the few labeled target domain samples as knowledge guidance. RelaMiX encompasses a temporal relational attention network with relation dropout, alongside a cross-domain information alignment mechanism. Furthermore, it integrates a mechanism for mixing features within a latent space by using the few-shot target domain samples. The proposed RelaMiX solution achieves state-of-the-art performance on all datasets within the FSDA-AR benchmark. To encourage future research on few-shot domain adaptation for video-based activity recognition, our benchmarks and source code are made publicly available at https://github.com/KPeng9510/RelaMiX.

domain adaptation, recognition, target domain, (14 more...)

2305.0842

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hunan Province > Changsha (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size

Abeyrathna, K. Darshana, Abouzeid, Ahmed Abdulrahem Othman, Bhattarai, Bimal, Giri, Charul, Glimsdal, Sondre, Granmo, Ole-Christoffer, Jiao, Lei, Saha, Rupsa, Sharma, Jivitesh, Tunheim, Svein Anders, Zhang, Xuan

arXiv.org Artificial Intelligence, Jan-19-2023

The Tsetlin machine (TM) is a logic-based machine learning approach with the crucial advantages of being transparent and hardware-friendly. While TMs match or surpass deep learning accuracy for an increasing number of applications, large clause pools tend to produce clauses with many literals (long clauses). As such, they become less interpretable. Further, longer clauses increase the switching activity of the clause logic in hardware, consuming more power. This paper introduces a novel variant of TM learning, Clause Size Constrained TMs (CSC-TMs), where one can set a soft constraint on the clause size. As soon as a clause includes more literals than the constraint allows, it starts expelling literals. Accordingly, oversized clauses only appear transiently. To evaluate CSC-TM, we conduct classification, clustering, and regression experiments on tabular data, natural language text, images, and board games. Our results show that CSC-TM maintains accuracy with up to 80 times fewer literals. Indeed, the accuracy increases with shorter clauses for TREC, IMDb, and BBC Sports. After the accuracy peaks, it drops gracefully as the clause size approaches a single literal. We finally analyze CSC-TM power consumption and derive new convergence properties.
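
To make the notions of "literal" and "clause size" in the abstract above concrete, here is a minimal Python sketch. It assumes a clause is a conjunction of literals, where each literal is an input bit or its negation. The names `clause_output`, `exceeds_budget`, and `max_literals` are hypothetical, and the sketch does not reproduce the CSC-TM learning rule; it only shows how a clause is evaluated and how one might detect that it is over a size budget.

```python
from typing import Sequence, Tuple

# A literal is (feature_index, negated). A clause is a conjunction of literals.
Literal = Tuple[int, bool]


def clause_output(clause: Sequence[Literal], x: Sequence[int]) -> int:
    """Return 1 if every literal in the clause is satisfied by the bits in x, else 0.

    An empty clause returns 1 here; this is a convention chosen for the sketch.
    """
    for index, negated in clause:
        value = 1 - x[index] if negated else x[index]
        if value == 0:
            return 0
    return 1


def exceeds_budget(clause: Sequence[Literal], max_literals: int) -> bool:
    """True when the clause holds more literals than the (hypothetical) budget."""
    return len(clause) > max_literals


long_clause = [(0, False), (1, True), (2, False)]  # x0 AND (NOT x1) AND x2
short_clause = [(0, False), (1, True)]             # x0 AND (NOT x1)

print(clause_output(long_clause, [1, 0, 1]), exceeds_budget(long_clause, 2))
print(clause_output(short_clause, [1, 0, 0]), exceeds_budget(short_clause, 2))
```

A shorter clause is satisfied by more inputs and is easier to read, which is the interpretability argument the abstract makes; how literals are actually expelled during training is left to the paper.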

machine learning, natural language, ta 3, (19 more...)

2301.0819

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Norway > Southern Norway > Agder > Kristiansand (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

On the Convergence of Tsetlin Machines for the AND and the OR Operators

Jiao, Lei, Zhang, Xuan, Granmo, Ole-Christoffer

arXiv.org Artificial Intelligence, Sep-17-2021

The Tsetlin Machine (TM) is a novel machine-learning algorithm based on propositional logic, which has obtained state-of-the-art performance on several pattern recognition problems. In previous studies, the convergence properties of the TM for the 1-bit operation and the XOR operation have been analyzed. To complete the analyses of the basic digital operations, in this article we analyze the convergence when the input training samples follow the AND and OR operators, respectively. Our analyses reveal that the TM can converge almost surely to reproduce the AND and OR operators, which are learnt from training data over an infinite time horizon. The analyses of the AND and OR operators, together with the previously analyzed 1-bit and XOR operations, complete the convergence analyses of the basic operators in Boolean algebra.
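
As an illustration of what "reproducing" the AND and OR operators can mean in clause form, the following Python sketch writes both targets as conjunctive clauses over two input bits. It assumes a simplified output rule in which the prediction is 1 when at least one clause fires; that rule and the function names are assumptions made for this sketch, not details taken from the paper, and no learning or convergence behavior is modeled.

```python
from itertools import product


def and_by_clause(x1: int, x2: int) -> int:
    """AND as a single clause that includes both literals: x1 AND x2."""
    return x1 & x2


def or_by_votes(x1: int, x2: int) -> int:
    """OR as two single-literal clauses, (x1) and (x2), whose votes are summed.

    Assumed output rule for this sketch: predict 1 if at least one clause fires.
    """
    votes = x1 + x2
    return 1 if votes >= 1 else 0


# Print the truth tables for both operators.
for x1, x2 in product((0, 1), repeat=2):
    print(x1, x2, and_by_clause(x1, x2), or_by_votes(x1, x2))
```

The contrast is that AND fits in one clause, while OR under this assumed rule needs more than one clause voting together.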

ta 1, ta 3, ta 4, (17 more...)

2109.09488

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.34)
