
What's Grokipedia, Musk's AI-powered rival to Wikipedia?

Al Jazeera

Last month, tech billionaire Elon Musk launched Grokipedia, an AI-powered platform intended to rival the online encyclopedia Wikipedia. "Grokipedia will exceed Wikipedia by several orders of magnitude in breadth, depth and accuracy," Musk posted on X the day after his site went live on October 27.


Wikipedia-based Datasets in Russian Information Retrieval Benchmark RusBEIR

Kovalev, Grigory, Loukachevitch, Natalia, Tikhomirov, Mikhail, Babina, Olga, Mamaev, Pavel

arXiv.org Artificial Intelligence

In this paper, we present a novel series of Russian information retrieval datasets constructed from the "Did you know..." section of Russian Wikipedia. Our datasets support a range of retrieval tasks, including fact-checking, retrieval-augmented generation, and full-document retrieval, by leveraging interesting facts and their referenced Wikipedia articles annotated at the sentence level with graded relevance. We describe the methodology for dataset creation that enables the expansion of existing Russian Information Retrieval (IR) resources. Through extensive experiments, we extend the RusBEIR research by comparing lexical retrieval models, such as BM25, with state-of-the-art neural architectures fine-tuned for Russian, as well as multilingual models. Results of our experiments show that lexical methods tend to outperform neural models on full-document retrieval, while neural approaches better capture lexical semantics in shorter texts, such as in fact-checking or fine-grained retrieval. Using our newly created datasets, we also analyze the impact of document length on retrieval performance and demonstrate that combining retrieval with neural reranking consistently improves results. Our contribution expands the resources available for Russian information retrieval research and highlights the importance of accurate evaluation of retrieval models to achieve optimal performance. All datasets are publicly available at HuggingFace. To facilitate reproducibility and future research, we also release the full implementation on GitHub.
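The lexical baseline the paper compares against can be sketched compactly. Below is a minimal, self-contained BM25 scorer (standard k1 and b defaults); it is an illustrative sketch, not the exact configuration or tokenization used in the RusBEIR experiments.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            )
            score += idf * norm
        scores.append(score)
    return scores
```

Ranking documents by these scores and taking the top results gives the lexical retrieval baseline; the neural models in the paper replace this scoring with learned dense representations.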


LuxIT: A Luxembourgish Instruction Tuning Dataset from Monolingual Seed Data

Valline, Julian, Lothritz, Cedric, Cabot, Jordi

arXiv.org Artificial Intelligence

The effectiveness of instruction-tuned Large Language Models (LLMs) is often limited in low-resource linguistic settings due to a lack of high-quality training data. We introduce LuxIT, a novel, monolingual instruction tuning dataset for Luxembourgish developed to mitigate this challenge. We synthesize the dataset from a corpus of native Luxembourgish texts, utilizing DeepSeek-R1-0528, chosen for its demonstrated proficiency in Luxembourgish. Following generation, we apply a quality assurance process employing an LLM-as-a-judge approach. To investigate the practical utility of the dataset, we fine-tune several smaller-scale LLMs on LuxIT. Subsequent benchmarking against their base models on Luxembourgish language proficiency examinations, however, yields mixed results, with performance varying significantly across different models. LuxIT represents a critical contribution to Luxembourgish natural language processing and offers a replicable monolingual methodology, though our findings highlight the need for further research to optimize its application.


How Grounded is Wikipedia? A Study on Structured Evidential Support and Retrieval

Walden, William, Ricci, Kathryn, Wanner, Miriam, Jiang, Zhengping, May, Chandler, Zhou, Rongkun, Van Durme, Benjamin

arXiv.org Artificial Intelligence

Wikipedia is a critical resource for modern NLP, serving as a rich repository of up-to-date and citation-backed information on a wide variety of subjects. The reliability of Wikipedia -- its groundedness in its cited sources -- is vital to this purpose. This work analyzes both how grounded Wikipedia is and how readily fine-grained grounding evidence can be retrieved. To this end, we introduce PeopleProfiles -- a large-scale, multi-level dataset of claim support annotations on biographical Wikipedia articles. We show that: (1) ~22% of claims in Wikipedia lead sections are unsupported by the article body; (2) ~30% of claims in the article body are unsupported by their publicly accessible sources; and (3) real-world Wikipedia citation practices often differ from documented standards. Finally, we show that complex evidence retrieval remains a challenge -- even for recent reasoning rerankers.



Inferring Prerequisite Knowledge Concepts in Educational Knowledge Graphs: A Multi-criteria Approach

Alatrash, Rawaa, Chatti, Mohamed Amine, Wibowo, Nasha, Ain, Qurat Ul

arXiv.org Artificial Intelligence

Educational Knowledge Graphs (EduKGs) organize various learning entities and their relationships to support structured and adaptive learning. Prerequisite relationships (PRs) are critical in EduKGs for defining the logical order in which concepts should be learned. However, the current EduKG in the MOOC platform CourseMapper lacks explicit PR links, and manually annotating them is time-consuming and inconsistent. To address this, we propose an unsupervised method for automatically inferring concept PRs without relying on labeled data. We define ten criteria based on document-based, Wikipedia hyperlink-based, graph-based, and text-based features, and combine them using a voting algorithm to robustly capture PRs in educational content. Experiments on benchmark datasets show that our approach achieves higher precision than existing methods while maintaining scalability and adaptability, thus providing reliable support for sequence-aware learning in CourseMapper.
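The voting step described above can be illustrated with a minimal sketch: each criterion casts a binary vote on a candidate concept pair, and the pair is accepted as a prerequisite relation when the votes reach a threshold. The criteria functions and threshold below are placeholders for illustration, not the paper's actual ten criteria.

```python
def infer_prerequisite(pair, criteria, threshold):
    """Return True if at least `threshold` criteria vote that
    pair[0] is a prerequisite of pair[1]."""
    votes = sum(1 for criterion in criteria if criterion(pair))
    return votes >= threshold

# Hypothetical criteria: each takes a (concept_a, concept_b) pair
# and returns a boolean vote.
def hyperlink_vote(pair):
    # e.g. the Wikipedia page for concept_b links to concept_a
    return pair == ("variables", "loops")

def order_vote(pair):
    # e.g. concept_a is introduced earlier in the course material
    return True

def text_vote(pair):
    # e.g. a textual pattern such as "requires knowledge of" was found
    return False

criteria = [hyperlink_vote, order_vote, text_vote]
```

With `threshold=2`, the pair `("variables", "loops")` collects two votes and is accepted, while a pair with only one supporting criterion is rejected; raising the threshold trades recall for precision.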


MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages

Smart, Dan Saattrup

arXiv.org Artificial Intelligence

We introduce a new reading comprehension dataset, dubbed MultiWikiQA, which covers 306 languages. The context data comes from Wikipedia articles, with questions generated by an LLM and the answers appearing verbatim in the Wikipedia articles. We conduct a crowdsourced human evaluation of the fluency of the generated questions across 30 of the languages, providing evidence that the questions are of good quality. We evaluate 6 different language models, both decoder and encoder models of varying sizes, showing that the benchmark is sufficiently difficult and that there is a large performance discrepancy amongst the languages. The dataset and survey evaluations are freely available.
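The verbatim-answer property described above is easy to validate mechanically: every gold answer must occur as a substring of its Wikipedia context. A minimal check, with hypothetical field names standing in for the dataset's actual schema:

```python
def validate_example(example):
    """Check that the gold answer occurs verbatim in the context passage."""
    return example["answer"] in example["context"]

sample = {
    "context": "Copenhagen is the capital and most populous city of Denmark.",
    "question": "What is the capital of Denmark?",
    "answer": "Copenhagen",
}
```

A filter like this is a common sanity check when generating extractive QA data with an LLM, since generated answers that paraphrase the context cannot be scored by exact-span metrics.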