AITopics | wiki

wiki

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

f18a6d1cde4b205199de8729a6637b42-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 21:06:55 GMT

asymmetric multi-head attention, multi-head attention, original multi-head attention, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Palantir Defends Work With ICE to Staff Following Killing of Alex Pretti

WIRED | Jan-26-2026, 22:09:16 GMT

"In my opinion ICE are the bad guys. I am not proud that the company I enjoy so much working for is part of this," one worker wrote on Slack. After federal agents shot and killed Minneapolis nurse Alex Pretti on Saturday, Palantir workers pressed for answers from leadership on the company's work with Immigration and Customs Enforcement (ICE), and many questioned whether Palantir should be involved with the agency at all. Leadership defended its work as in part improving "ICE's operational effectiveness." Internal Slack messages reviewed by WIRED reveal growing frustration within Palantir over its relationship with the Department of Homeland Security (DHS), and in particular, ICE's enforcement and investigations teams.

agent, palantir, wired, (10 more...)

WIRED

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.62)
South America > Venezuela (0.05)
North America > United States > California (0.04)
(3 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Communications > Social Media (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

f18a6d1cde4b205199de8729a6637b42-Supplemental.pdf

Neural Information Processing Systems | Aug-18-2025, 19:22:47 GMT

artificial intelligence, machine learning, multi-head attention, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.56)

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

Zhou, Runlong, Zhang, Yi

arXiv.org Artificial Intelligence | Jul-15-2025

Language models often struggle with cross-mode knowledge retrieval -- the ability to access knowledge learned in one format (mode) when queried in another. We demonstrate that models trained on multiple data sources (e.g., Wikipedia and TinyStories) exhibit significantly reduced accuracy when retrieving knowledge in a format different from its original training mode. This paper quantitatively investigates this phenomenon through a controlled study of random token sequence memorization across different modes. We first explore dataset rewriting as a solution, revealing that effective cross-mode retrieval requires prohibitively extensive rewriting efforts that follow a sigmoid-like relationship. As an alternative, we propose CASCADE, a novel pretraining algorithm that uses cascading datasets with varying sequence lengths and computes losses on only the second half of each training sequence to capture knowledge at different scales. Our experiments demonstrate that CASCADE outperforms dataset rewriting approaches, even when compressed into a single model with a unified loss function. This work provides both qualitative evidence of cross-mode retrieval limitations and a practical solution to enhance language models' ability to access knowledge independently of its presentational format.
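
The second-half loss described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `second_half_loss`, the plain-list representation of per-token losses, and the split point at `len // 2` are assumptions made for illustration, and the cascading of datasets across sequence lengths is not modeled here.

```python
from typing import Sequence


def second_half_loss(token_losses: Sequence[float]) -> float:
    """Average per-token loss over the second half of one sequence.

    Hypothetical helper: tokens in the first half act as context only
    and contribute nothing to the returned value.
    """
    n = len(token_losses)
    if n == 0:
        raise ValueError("token_losses must not be empty")
    start = n // 2  # assumed split point; first `start` tokens are ignored
    tail = token_losses[start:]
    return sum(tail) / len(tail)


# Made-up per-token losses for a 4-token sequence.
example_losses = [4.0, 3.0, 2.0, 1.0]
print(second_half_loss(example_losses))  # averages only 2.0 and 1.0
```

In a real training loop the same effect would typically be achieved by masking the first-half positions before reducing the loss; the list slice above is only the simplest way to show the idea.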

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.0145

Country:

North America > Mexico > Baja California Sur > La Paz (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

SciCom Wiki: Fact-Checking and FAIR Knowledge Distribution for Scientific Videos and Podcasts

Wittenborg, Tim, Tremel, Constantin Sebastian, Stehr, Niklas, Karras, Oliver, Stocker, Markus, Auer, Sören

arXiv.org Artificial Intelligence | May-14-2025

Democratic societies need accessible, reliable information. Videos and podcasts have established themselves as the medium of choice for civic dissemination, but also as carriers of misinformation. The emerging Science Communication Knowledge Infrastructure (SciCom KI) curating non-textual media is still fragmented and not adequately equipped to scale against the content flood. Our work sets out to support the SciCom KI with a central, collaborative platform, the SciCom Wiki, to facilitate FAIR (findable, accessible, interoperable, reusable) media representation and the fact-checking of their content, particularly for videos and podcasts. Building an open-source service system centered around Wikibase, we survey requirements from 53 stakeholders, refine these in 11 interviews, and evaluate our prototype based on these requirements with another 14 participants. To address the most requested feature, fact-checking, we developed a neurosymbolic computational fact-checking approach, converting heterogeneous media into knowledge graphs. This increases machine-readability and allows comparing statements against equally represented ground-truth. Our computational fact-checking tool was iteratively evaluated through 10 expert interviews, and a public user survey with 43 participants verified the necessity and usability of our tool. Overall, our findings identified several needs to systematically support the SciCom KI. The SciCom Wiki, as a FAIR digital library complementing our neurosymbolic computational fact-checking framework, was found suitable to address the raised requirements. Further, we identified that the SciCom KI is severely underdeveloped regarding FAIR knowledge and related systems facilitating its collaborative creation and curation. Our system can provide a central knowledge node, yet a collaborative effort is required to scale against the imminent (mis-)information flood.

artificial intelligence, knowledge management, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.07912

Country:

Europe > Germany > Lower Saxony > Hanover (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.46)
Personal > Interview (0.34)
Research Report > New Finding (0.34)

Industry: Media > News (0.49)

Technology:

Information Technology > Knowledge Management (1.00)
Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.94)

An Identity and Interaction Based Network Forensic Analysis

Clarke, Nathan, Alotibi, Gaseb, Joy, Dany, Li, Fudong, Furnell, Steven, Alshumrani, Ali, Mohammed, Hussan

arXiv.org Artificial Intelligence | Mar-24-2025

In today's landscape of increasing electronic crime, network forensics plays a pivotal role in digital investigations. It aids in understanding which systems to analyse and supplements evidence found through more traditional computer-based investigations. However, the nature and functionality of the existing Network Forensic Analysis Tools (NFATs) fall short compared to File System Forensic Analysis Tools (FS FATs) in providing usable data. The analysis tends to focus upon IP addresses, which are not synonymous with user identities, a point of significant interest to investigators. This paper presents several experiments designed to create a novel NFAT approach that can identify users and understand how they are using network-based applications whilst the traffic remains encrypted. The experiments build upon the prior art and investigate how effective this approach is in classifying users and their actions. Utilising an in-house dataset composed of 50 million packets, the experiments are formed of three incremental developments that assist in improving performance. Building upon the successful experiments, a proposed NFAT interface is presented to illustrate the ease with which investigators would be able to ask relevant questions of user interactions. The experiments, profiled across 27 users, yielded an average 93.3% True Positive Identification Rate (TPIR), with 41% of users experiencing 100% TPIR. Skype, Wikipedia and Hotmail services achieved a notably high level of recognition performance. The study has developed and evaluated an approach to analyse encrypted network traffic more effectively through the modelling of network traffic and to subsequently visualise these interactions through a novel network forensic analysis tool.

artificial intelligence, machine learning, university, (17 more...)

arXiv.org Artificial Intelligence

2503.18542

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.06)
Europe > United Kingdom > England > Devon > Plymouth (0.05)
Europe > United Kingdom > England > Dorset > Bournemouth (0.05)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Provence: efficient and robust context pruning for retrieval-augmented generation

Chirkova, Nadezhda, Formal, Thibault, Nikoulina, Vassilina, Clinchant, Stéphane

arXiv.org Artificial Intelligence | Jan-27-2025

Retrieval-augmented generation improves various aspects of large language model (LLM) generation, but suffers from computational overhead caused by long contexts as well as the propagation of irrelevant retrieved information into generated responses. Context pruning deals with both aspects, by removing irrelevant parts of retrieved contexts before LLM generation. Existing context pruning approaches are, however, limited, and do not provide a universal model that would be both efficient and robust in a wide range of scenarios, e.g., when contexts contain a variable amount of relevant information or vary in length, or when evaluated on various domains. In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given context and can be used out-of-the-box for various domains. The three key ingredients of Provence are formulating the context pruning task as sequence labeling, unifying context pruning capabilities with context reranking, and training on diverse data. Our experimental results show that Provence enables context pruning with negligible to no drop in performance, in various domains and settings, at almost no cost in a standard RAG pipeline. We also conduct a deeper analysis alongside various ablations to provide insights into training context pruners for future work.
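
Treating context pruning as sequence labeling, as this abstract describes, can be sketched roughly as follows. This is an illustration under stated assumptions, not the Provence model or its API: the per-token relevance scores are made-up inputs that a labeling model would normally produce, and the function name `prune_context`, the 0.5 threshold, and the majority-vote rule per sentence are all hypothetical choices.

```python
from typing import List


def prune_context(
    sentences: List[str],
    token_scores: List[List[float]],
    threshold: float = 0.5,
) -> List[str]:
    """Keep sentences in which most tokens are labeled relevant.

    `token_scores[i]` holds one assumed relevance score per token of
    `sentences[i]`; a token counts as relevant when its score reaches
    `threshold`. The majority-vote aggregation is an assumption.
    """
    kept = []
    for sentence, scores in zip(sentences, token_scores):
        if not scores:
            continue  # nothing to label, so drop the sentence
        relevant = sum(1 for s in scores if s >= threshold)
        if relevant * 2 > len(scores):  # strict majority of tokens
            kept.append(sentence)
    return kept


# Made-up retrieved context and token-level scores for one question.
context = [
    "Paris is the capital of France.",
    "The weather was mild that year.",
    "Tickets sell out quickly.",
]
scores = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.6], [0.6, 0.4]]
print(prune_context(context, scores))
```

Because the number of kept sentences depends on the scores rather than on a fixed budget, the amount of pruning adapts to each context, which is the property the abstract emphasizes.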

large language model, machine learning, provence, (21 more...)

arXiv.org Artificial Intelligence

2501.16214

Country:

Europe > Netherlands (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
South America > Brazil (0.04)
(11 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Why Companies "Democratise" Artificial Intelligence: The Case of Open Source Software Donations

Osborne, Cailean

arXiv.org Artificial Intelligence | Sep-26-2024

Companies claim to "democratise" artificial intelligence (AI) when they donate AI open source software (OSS) to non-profit foundations or release AI models, among others, but what does this term mean and why do they do it? As the impact of AI on society and the economy grows, understanding the commercial incentives behind AI democratisation efforts is crucial for ensuring these efforts serve broader interests beyond commercial agendas. Towards this end, this study employs a mixed-methods approach to investigate commercial incentives for 43 AI OSS donations to the Linux Foundation. It makes contributions to both research and practice. It contributes a taxonomy of both individual and organisational social, economic, and technological incentives for AI democratisation. In particular, it highlights the role of democratising the governance and control rights of an OSS project (i.e., from one company to open governance) as a structural enabler for downstream goals, such as attracting external contributors, reducing development costs, and influencing industry standards, among others. Furthermore, OSS donations are often championed by individual developers within companies, highlighting the importance of the bottom-up incentives for AI democratisation. The taxonomy provides a framework and toolkit for discerning incentives for other AI democratisation efforts, such as the release of AI models. The paper concludes with a discussion of future research directions.

donation, incentive, lfaidata, (12 more...)

arXiv.org Artificial Intelligence

2409.17876

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
(23 more...)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology > Software (1.00)
Information Technology > Services (1.00)
Social Sector (0.88)
(2 more...)

Technology:

Information Technology > Software (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

New difficulty mod in Stardew Valley will purge your saves if you use a guide

Engadget | Jun-24-2024, 17:55:21 GMT

A great number of us have played games in extra-difficult modes (or in the case of Kingdom Hearts, Proud Mode) to challenge ourselves. Now, a Stardew Valley player has created a "hardcore" option for the otherwise chill game, one that will delete the save files of any player who uses a guide while playing the game on PC. According to GamesRadar, software engineer Sylvie Nightshade created the high difficulty mod on June 21 after reading an article published the day before on the satirical website Hard Drive, the gaming version of The Onion. The article in question joked about a "hardcore mode" in Stardew Valley that will delete players' hard grown farms if they dare read the wiki at any point during gameplay. That same day, Nightshade quote-tweeted the article on X with the link to the mod in GitHub announcing that she turned the joke into reality.

difficulty mod, new difficulty mod, stardew valley, (5 more...)

Engadget

Technology: Information Technology > Artificial Intelligence > Games > Computer Games (1.00)

Narrowing the Gap between Supervised and Unsupervised Sentence Representation Learning with Large Language Model

Li, Mingxin, Zhang, Richong, Nie, Zhijie, Mao, Yongyi

arXiv.org Artificial Intelligence | Dec-19-2023

Sentence Representation Learning (SRL) is a fundamental task in Natural Language Processing (NLP), with the Contrastive Learning of Sentence Embeddings (CSE) being the mainstream technique due to its superior performance. An intriguing phenomenon in CSE is the significant performance gap between supervised and unsupervised methods, with their only difference lying in the training data. Previous works attribute this performance gap to differences in two representation properties (alignment and uniformity). However, since alignment and uniformity only measure the results, they fail to answer "What aspects of the training data contribute to the performance gap?" and "How can the performance gap be narrowed?" In this paper, we conduct empirical experiments to answer these "What" and "How" questions. We first answer the "What" question by thoroughly comparing the behavior of supervised and unsupervised CSE during their respective training processes. From the comparison, we identify the similarity pattern as a key factor in the performance gap, and introduce a metric, called Relative Fitting Difficulty (RFD), to measure the complexity of the similarity pattern. Then, based on the insights gained from the "What" question, we tackle the "How" question by increasing the pattern complexity of the training data. We achieve this by leveraging the In-Context Learning (ICL) capability of the Large Language Model (LLM) to generate data that simulates complex patterns. By utilizing the hierarchical patterns in the LLM-generated data, we effectively narrow the gap between supervised and unsupervised CSE. We release our code and appendix at https://github.com/BDBC-KG-NLP/NGCSE.

dataset, nli, training data, (15 more...)

arXiv.org Artificial Intelligence

2309.06453

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > China > Beijing > Beijing (0.04)
(22 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.94)
Media > Radio (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)