Sparse Feature Coactivation Reveals Causal Semantic Modules in Large Language Models
Deng, Ruixuan, Hu, Xiaoyang, Gilberti, Miles, Storks, Shane, Taxali, Aman, Angstadt, Mike, Sripada, Chandra, Chai, Joyce
We identify semantically coherent, context-consistent network components in large language models (LLMs) using coactivation of sparse autoencoder (SAE) features collected from just a handful of prompts. Focusing on concept-relation prediction tasks, we show that ablating these components for concepts (e.g., countries and words) and relations (e.g., capital city and translation language) changes model outputs in predictable ways, while amplifying these components induces counterfactual responses. Notably, composing relation and concept components yields compound counterfactual outputs. Further analysis reveals that while most concept components emerge from the very first layer, more abstract relation components are concentrated in later layers. Lastly, we show that extracted components more comprehensively capture concepts and relations than individual features while maintaining specificity. Overall, our findings suggest a modular organization of knowledge accessed through compositional operations, and advance methods for efficient, targeted LLM manipulation.
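The core operation the abstract describes, identifying SAE features that coactivate across a handful of prompts sharing a concept and then ablating them, can be sketched in miniature. This is a toy illustration only: the activations are simulated, the thresholding rule is an assumption, and the decode step back to the residual stream is omitted.

```python
import numpy as np

# Toy sketch: find SAE features that coactivate across a handful of
# prompts sharing a concept, then ablate them. Feature activations are
# simulated here; in practice they would come from an SAE applied to
# model hidden states. All names and thresholds are illustrative.

rng = np.random.default_rng(0)
n_prompts, n_features = 5, 50

# Sparse random background activations.
acts = rng.random((n_prompts, n_features)) * (rng.random((n_prompts, n_features)) < 0.1)
# Plant a small "concept component": features 3 and 7 fire on every prompt.
acts[:, [3, 7]] = 0.8

# A feature joins the component if it is active (above threshold) on
# every one of the prompts -- the coactivation criterion.
threshold = 0.1
component = np.flatnonzero((acts > threshold).all(axis=0))

# Ablation: zero the component features before decoding back to the
# residual stream (decoding itself omitted in this sketch).
ablated = acts.copy()
ablated[:, component] = 0.0

print(sorted(component.tolist()))  # the planted features are recovered
```

Because the component is defined by intersection over prompts, only a handful of prompts is needed, which matches the abstract's claim of data efficiency.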
- Africa > Nigeria > Federal Capital Territory > Abuja (0.05)
- Asia > China > Beijing > Beijing (0.05)
- South America > Peru (0.04)
- (21 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Taming Knowledge Conflicts in Language Models
Li, Gaotang, Chen, Yuzhong, Tong, Hanghang
Language Models (LMs) often encounter knowledge conflicts when parametric memory contradicts contextual knowledge. Previous works attribute this conflict to the interplay between "memory heads" and "context heads", attention heads assumed to promote either memory or context exclusively. In this study, we go beyond this fundamental assumption by uncovering a critical phenomenon we term the "superposition of contextual information and parametric memory", where highly influential attention heads can simultaneously contribute to both memory and context. Building on this insight, we propose Just Run Twice (JUICE), a test-time attention intervention method that steers LMs toward either parametric beliefs or contextual knowledge without requiring fine-tuning. JUICE identifies a set of reliable attention heads and leverages a dual-run approach to mitigate the superposition effects. Extensive experiments across 11 datasets and 6 model architectures demonstrate that JUICE achieves new state-of-the-art performance and robust generalization, with significant and consistent improvements across domains under various conflict types. Finally, we theoretically analyze knowledge conflict and the superposition of contextual information and parametric memory in attention heads, further elucidating the effectiveness of JUICE in these settings.
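The superposition phenomenon and the steering idea can be illustrated with a toy additive model. This is not the paper's actual JUICE procedure (the dual-run details are omitted); it only shows, under an additivity assumption, how rescaling selected attention-head contributions flips the answer from context to memory, and why a "superposed" head that pushes on both answers complicates head-level attribution.

```python
import numpy as np

# Toy illustration of a test-time attention-head intervention (not the
# paper's actual JUICE procedure): the model's output is approximated as
# a sum of per-head contributions to the logits of a "memory" answer vs.
# a "context" answer, and selected heads are rescaled to steer the model.

# Rows = attention heads; columns = (memory logit, context logit).
# The last head is "superposed": it pushes on both answers at once.
head_contrib = np.array([
    [1.0, 0.0],   # memory head
    [0.0, 1.2],   # context head
    [0.8, 0.9],   # superposed head
])

def answer(scales):
    """Sum scaled head contributions and pick the larger logit."""
    logits = (scales[:, None] * head_contrib).sum(axis=0)
    return "memory" if logits[0] > logits[1] else "context"

baseline = answer(np.ones(3))                # context wins: 2.1 > 1.8
steered = answer(np.array([3.0, 1.0, 1.0]))  # upweight the memory head

print(baseline, steered)  # prints "context memory"
```

Note that naively ablating the superposed head would change both logits at once, which is the attribution difficulty that motivates going beyond the exclusive memory-head/context-head picture.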
- Asia > China (0.14)
- North America > United States > Illinois (0.14)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)
ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA
Zhao, Xinjie, Gao, Fan, Yang, Rui, Chen, Yingjian, Wang, Yuyang, Zhu, Ying, Tang, Jiacheng, Li, Irene
Recent advances in large language models (LLMs) have significantly improved multi-hop question answering (QA) through direct Chain-of-Thought (CoT) reasoning. However, the irreversible nature of CoT leads to error accumulation, making it challenging to correct mistakes in multi-hop reasoning. This paper introduces ReAgent: a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms, enabling reversible multi-hop reasoning. By incorporating text-based retrieval, information aggregation, and validation, our system can detect and correct errors mid-reasoning, leading to more robust and interpretable QA outcomes. The framework and experiments serve as a foundation for future work on error-tolerant QA systems. Empirical evaluations across three benchmarks indicate ReAgent's efficacy, yielding improvements of about 6% on average over baseline models.
- North America > United States > California > Los Angeles County > Los Angeles (0.32)
- North America > Canada (0.14)
- Europe > Spain (0.14)
- (3 more...)
- Health & Medicine (0.46)
- Leisure & Entertainment > Sports > Olympic Games (0.31)
Probing Language Models on Their Knowledge Source
Tighidet, Zineddine, Mogini, Andrea, Mei, Jiali, Piwowarski, Benjamin, Gallinari, Patrick
Large Language Models (LLMs) often encounter conflicts between their learned internal knowledge (parametric knowledge, PK) and external knowledge provided during inference (contextual knowledge, CK). Understanding how LLMs prioritize one knowledge source over the other remains a challenge. In this paper, we propose a novel probing framework to explore the mechanisms governing the selection between PK and CK in LLMs. Using controlled prompts designed to contradict the model's PK, we demonstrate that specific model activations are indicative of the knowledge source employed. We evaluate this framework on various LLMs of different sizes and demonstrate that mid-layer activations, particularly those related to relations in the input, are crucial in predicting knowledge source selection, paving the way for more reliable models capable of handling knowledge conflicts effectively.
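The probing idea, fitting a simple classifier on hidden activations to predict whether the model answered from PK or CK, can be sketched as follows. The activations are simulated with class-dependent means; the nearest-centroid linear probe is a minimal stand-in for whatever probe the framework actually uses, and all names are illustrative.

```python
import numpy as np

# Minimal sketch of activation probing: given hidden activations
# collected while the model answered from parametric knowledge (PK) or
# contextual knowledge (CK), fit a linear probe to predict the source.
# Activations are simulated here with separated class means.

rng = np.random.default_rng(1)
dim, n = 16, 200

pk_acts = rng.normal(loc=+0.5, scale=1.0, size=(n, dim))  # PK-labelled
ck_acts = rng.normal(loc=-0.5, scale=1.0, size=(n, dim))  # CK-labelled

# Nearest-centroid linear probe: w points from the CK mean to the PK
# mean, and b places the decision boundary at the midpoint.
w = pk_acts.mean(axis=0) - ck_acts.mean(axis=0)
b = -0.5 * (pk_acts.mean(axis=0) + ck_acts.mean(axis=0)) @ w

X = np.vstack([pk_acts, ck_acts])
y = np.array([1] * n + [0] * n)          # 1 = PK, 0 = CK
pred = (X @ w + b > 0).astype(int)
accuracy = (pred == y).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

In the paper's setting, high probe accuracy on mid-layer (especially relation-related) activations is what supports the claim that the knowledge-source decision is linearly readable there.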
- Europe > Croatia (0.14)
- North America > United States > Virginia (0.05)
- Europe > Italy (0.05)
- (17 more...)
Co-occurrence is not Factual Association in Language Models
Pretrained language models can encode a large amount of knowledge and utilize it for various reasoning tasks, yet they can still struggle to learn novel factual knowledge effectively from finetuning on limited textual demonstrations. In this work, we show that the reason for this deficiency is that language models are biased to learn word co-occurrence statistics instead of true factual associations. We identify the differences between two forms of knowledge representation in language models: knowledge in the form of co-occurrence statistics is encoded in the middle layers of the transformer model and does not generalize well to reasoning scenarios beyond simple question answering, while true factual associations are encoded in the lower layers and can be freely utilized in various reasoning tasks. Based on these observations, we propose two strategies to improve the learning of factual associations in language models. We show that training on text with implicit rather than explicit factual associations can force the model to learn factual associations instead of co-occurrence statistics, significantly improving the generalization of newly learned knowledge. We also propose a simple training method to actively forget the learned co-occurrence statistics, which unblocks and enhances the learning of factual associations when training on plain narrative text. On both synthetic and real-world corpora, the two proposed strategies improve the generalization of the knowledge learned during finetuning to reasoning scenarios such as indirect and multi-hop question answering.
- Europe > France (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
Wyoming man running as bot concedes race, launches 'alliance' to inject AI into politics
Wyoming man Victor Miller, who filed mayoral candidacy as AI bot "VIC," speaks out after OpenAI shuts down his account. Victor Miller, who had been running as an artificial intelligence-powered bot named "VIC" [Virtual Integrated Citizen] in Wyoming's capital city, conceded his bid to make technological political history on Wednesday. Miller received 327 votes, or about 3% of the total cast, in Cheyenne's nonpartisan mayoral primary on Tuesday night, according to Laramie County records. On Wednesday, Fox News Digital obtained a statement from Miller saying that he and VIC came up short in their bid to change the definition of political machine in the Cowboy State's capital city: "Today, I, Victor Miller, concede the Cheyenne mayoral race. As the first person to put artificial intelligence directly on the ballot, offering voters the novel choice of AI governance, our campaign has marked a historic moment in politics and technology," Miller said.
- North America > United States > Wyoming > Laramie County > Cheyenne (0.05)
- North America > United States > Pennsylvania (0.05)
- Europe > United Kingdom > England > East Sussex > Brighton (0.05)
Fox News AI Newsletter: US leads world in fastest AI development: report
Fox News chief political anchor Bret Baier has the latest on the pros and cons of the bombshell developments on 'Special Report.' TOP OF THE CHARTS: The U.S. topped another study that looked at the fastest-developing artificial intelligence industries in the world, according to a new report. AI ON THE BALLOT: A librarian running as a nonpartisan candidate for mayor of Cheyenne, Wyoming, promises to allow an artificial intelligence bot created by OpenAI to govern the state's capital city. AI POWER PLAY: Google has its eye on the prize -- artificial intelligence -- and it's making a bold power play in the tech arena. The company's recent Made by Google event was more than just showcasing new technology.
- North America > United States > Wyoming > Laramie County > Cheyenne (0.27)
- Asia > China (0.07)
- Media > News (0.93)
- Government (0.77)
- Leisure & Entertainment (0.67)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.31)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.31)
How In-Context Learning Emerges from Training on Unstructured Data: On the Role of Co-Occurrence, Positional Information, and Noise Structures
Wibisono, Kevin Christian, Wang, Yixin
Large language models (LLMs) like transformers have impressive in-context learning (ICL) capabilities; they can generate predictions for new queries based on input-output sequences in prompts without parameter updates. While many theories have attempted to explain ICL, they often focus on structured training data similar to ICL tasks, such as regression. In practice, however, these models are trained in an unsupervised manner on unstructured text data, which bears little resemblance to ICL tasks. To this end, we investigate how ICL emerges from unsupervised training on unstructured data. The key observation is that ICL can arise simply by modeling co-occurrence information using classical language models like continuous bag of words (CBOW), which we theoretically prove and empirically validate. Furthermore, we establish the necessity of positional information and noise structure to generalize ICL to unseen data. Finally, we present instances where ICL fails and provide theoretical explanations; they suggest that the ICL ability of LLMs to identify certain tasks can be sensitive to the structure of the training data.
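The paper's key observation, that in-context-learning-like behavior can arise from co-occurrence statistics alone, can be made concrete with a toy completion task. The corpus, the stopword list, and the prompt format below are all invented for illustration; the point is only that pure within-sentence co-occurrence counts, with no parameter updates at query time, suffice to answer an ICL-style query.

```python
from collections import Counter, defaultdict

# Toy illustration: a model that only stores word co-occurrence
# statistics (CBOW-style) can already produce in-context-learning-like
# completions. Corpus, task, and stopwords are invented for this sketch.

corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "rome is the capital of italy",
]
STOPWORDS = {"is", "the", "capital", "of"}

# Count how often each pair of words co-occurs within a sentence.
cooc = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for w in words:
        for v in words:
            if v != w:
                cooc[w][v] += 1

def complete(prompt_pairs, query):
    """Answer an ICL-style query given in-context pairs like
    [('france', 'paris'), ...]: pick the word co-occurring most with the
    query, excluding stopwords and words already used in the prompt."""
    used = {w for pair in prompt_pairs for w in pair}
    candidates = [(c, w) for w, c in cooc[query].items()
                  if w not in used and w not in STOPWORDS]
    return max(candidates)[1]

print(complete([("france", "paris"), ("germany", "berlin")], "italy"))  # prints "rome"
```

A CBOW model trained on this corpus would encode essentially the same pairwise statistics in its embeddings, which is why the paper can prove ICL emergence for that classical architecture; the abstract's failure cases correspond to tasks where co-occurrence counts alone are not discriminative.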
- South America > Suriname > Paramaribo District > Paramaribo (0.04)
- North America > United States > Michigan (0.04)
- Europe > Liechtenstein (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)