AITopics | nielsen

Collaborating Authors

nielsen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improved Inference for CSDID Using the Cluster Jackknife

Karim, Sunny R., Nielsen, Morten Ørregaard, MacKinnon, James G., Webb, Matthew D.

arXiv.org Machine LearningFeb-13-2026

Obtaining reliable inferences with traditional difference-in-differences (DiD) methods can be difficult. Problems can arise when both outcomes and errors are serially correlated, when there are few clusters or few treated clusters, when cluster sizes vary greatly, and in various other cases. In recent years, recognition of the ``staggered adoption'' problem has shifted the focus away from inference towards consistent estimation of treatment effects. One of the most popular new estimators is the CSDID procedure of Callaway and Sant'Anna (2021). We find that the issues of over-rejection with few clusters and/or few treated clusters are at least as severe for CSDID as for traditional DiD methods. We also propose using a cluster jackknife for inference with CSDID, which simulations suggest greatly improves inference. We provide software packages in Stata csdidjack and R didjack to calculate cluster-jackknife standard errors easily.

artificial intelligence, att, machine learning, (17 more...)

arXiv.org Machine Learning

2602.12043

Country:

North America > United States > Indiana (0.05)
North America > United States > Wisconsin (0.04)
North America > United States > South Carolina (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

4746bb91bd073ec7eef930d5775122ba-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-12-2026, 07:26:31 GMT

computational linguistic, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
(21 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Estimation

Nichani, Arjun, Hsu, Hsiang, Chun-Fu, null, Chen, null, Jeong, Haewon

arXiv.org Machine LearningJan-21-2026

Fairness and privacy are two vital pillars of trustworthy machine learning. Despite extensive research on these individual topics, the relationship between fairness and privacy has received significantly less attention. In this paper, we utilize the information-theoretic measure Chernoff Information to highlight the data-dependent nature of the relationship among the triad of fairness, privacy, and accuracy. We first define Noisy Chernoff Difference, a tool that allows us to analyze the relationship among the triad simultaneously. We then show that for synthetic data, this value behaves in 3 distinct ways (depending on the distribution of the data). We highlight the data distributions involved in these cases and explore their fairness and privacy implications. Additionally, we show that Noisy Chernoff Difference acts as a proxy for the steepness of the fairness-accuracy curves. Finally, we propose a method for estimating Chernoff Information on data from unknown distributions and utilize this framework to examine the triad dynamic on real datasets. This work builds towards a unified understanding of the fairness-privacy-accuracy relationship and highlights its data-dependent nature.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2601.13698

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)

Add feedback

DaLA: Danish Linguistic Acceptability Evaluation Guided by Real World Errors

Barmina, Gianluca, Norman, Nathalie Carmen Hau, Schneider-Kamp, Peter, Poech, Lukas Galke

arXiv.org Artificial IntelligenceDec-9-2025

We present an enhanced benchmark for evaluating linguistic acceptability in Danish. We first analyze the most common errors found in written Danish. Based on this analysis, we introduce a set of fourteen corruption functions that generate incorrect sentences by systematically introducing errors into existing correct Danish sentences. To ensure the accuracy of these corruptions, we assess their validity using both manual and automatic methods. The results are then used as a benchmark for evaluating Large Language Models on a linguistic acceptability judgement task. Our findings demonstrate that this extension is both broader and more comprehensive than the current state of the art. By incorporating a greater variety of corruption types, our benchmark provides a more rigorous assessment of linguistic acceptability, increasing task difficulty, as evidenced by the lower performance of LLMs on our benchmark compared to existing ones. Our results also suggest that our benchmark has a higher discriminatory power which allows to better distinguish well-performing models from low-performing ones.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2512.04799

Country:

North America > United States (0.46)
Europe > Denmark (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.55)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.47)

Add feedback

Catching UX Flaws in Code: Leveraging LLMs to Identify Usability Flaws at the Development Stage

Platt, Nolan, Luchs, Ethan, Nizamani, Sehrish

arXiv.org Artificial IntelligenceDec-5-2025

Usability evaluations are essential for ensuring that modern interfaces meet user needs, yet traditional heuristic evaluations by human experts can be time-consuming and subjective, especially early in development. This paper investigates whether large language models (LLMs) can provide reliable and consistent heuristic assessments at the development stage. By applying Jakob Nielsen's ten usability heuristics to thirty open-source websites, we generated over 850 heuristic evaluations in three independent evaluations per site using a pipeline of OpenAI's GPT-4o. For issue detection, the model demonstrated moderate consistency, with an average pairwise Cohen's Kappa of 0.50 and an exact agreement of 84%. Severity judgments showed more variability: weighted Cohen's Kappa averaged 0.63, but exact agreement was just 56%, and Krippendorff's Alpha was near zero. These results suggest that while GPT-4o can produce internally consistent evaluations, especially for identifying the presence of usability issues, its ability to judge severity varies and requires human oversight in practice. Our findings highlight the feasibility and limitations of using LLMs for early-stage, automated usability testing, and offer a foundation for improving consistency in automated User Experience (UX) evaluation. To the best of our knowledge, our work provides one of the first quantitative inter-rater reliability analyses of automated heuristic evaluation and highlights methods for improving model consistency.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/VL-HCC65237.2025.00024

2512.04262

Country: North America > United States > Virginia (0.16)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Learning-Based Regional White Matter Hyperintensity Mapping as a Robust Biomarker for Alzheimer's Disease

Machnio, Julia, Nielsen, Mads, Ghazi, Mostafa Mehdipour

arXiv.org Artificial IntelligenceNov-19-2025

White matter hyperintensities (WMH) are key imaging markers in cognitive aging, Alzheimer's disease (AD), and related dementias. Although automated methods for WMH segmentation have advanced, most provide only global lesion load and overlook their spatial distribution across distinct white matter regions. We propose a deep learning framework for robust WMH segmentation and localization, evaluated across public datasets and an independent Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Our results show that the predicted lesion loads are in line with the reference WMH estimates, confirming the robustness to variations in lesion load, acquisition, and demographics. Beyond accurate segmentation, we quantify WMH load within anatomically defined regions and combine these measures with brain structure volumes to assess diagnostic value. Regional WMH volumes consistently outperform global lesion burden for disease classification, and integration with brain atrophy metrics further improves performance, reaching area under the curve (AUC) values up to 0.97. Several spatially distinct regions, particularly within anterior white matter tracts, are reproducibly associated with diagnostic status, indicating localized vulnerability in AD. These results highlight the added value of regional WMH quantification. Incorporating localized lesion metrics alongside atrophy markers may enhance early diagnosis and stratification in neurodegenerative disorders.

alzheimer, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.14588

Country:

Europe > Denmark (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Scandinavian Embedding Benchmarks: Evaluating Multilingual and Monolingual Text Embedding for Scandinavian languages

Neural Information Processing SystemsOct-10-2025, 01:07:17 GMT

However, this is not the case for multilingual text embeddings due to a lack of available benchmarks.

computational linguistic, dataset, proceedings, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)
(21 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

The real reason our weather is going to the dogs

New ScientistSep-17-2025, 18:00:00 GMT

Feedback was amazed to hear that dog ownership could cause a hurricane across the other side of the world. Or are we barking up the wrong tree? Kristian Steensen Nielsen seems like a sensible type. A researcher at the Copenhagen Business School in Denmark, he studies "the role of behavior change in mitigating climate change and conserving biodiversity". In other words, how can we make our lives more environmentally friendly, and how and when do those changes scale up to become truly effective?

extreme weather, real reason, weather, (15 more...)

New Scientist

Country:

Europe > Denmark > Capital Region > Copenhagen (0.25)
South America (0.05)
North America > United States > Texas > Travis County > Austin (0.05)
North America > Guatemala (0.05)

Industry:

Health & Medicine (0.71)
Education > Health & Safety > School Nutrition (0.33)
Media > News (0.32)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

MultiWikiQA: A Reading Comprehension Benchmark in 300+ Languages

Smart, Dan Saattrup

arXiv.org Artificial IntelligenceSep-8-2025

We introduce a new reading comprehension dataset, dubbed MultiWikiQA, which covers 306 languages. The context data comes from Wikipedia articles, with questions generated by an LLM and the answers appearing verbatim in the Wikipedia articles. We conduct a crowdsourced human evaluation of the fluency of the generated questions across 30 of the languages, providing evidence that the questions are of good quality. We evaluate 6 different language models, both decoder and encoder models of varying sizes, showing that the benchmark is sufficiently difficult and that there is a large performance discrepancy amongst the languages. The dataset and survey evaluations are freely available.

computational linguistic, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2509.04111

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Education > Assessment & Standards > Student Performance (0.61)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Dynaword: From One-shot to Continuously Developed Datasets

Enevoldsen, Kenneth, Jensen, Kristian Nørgaard, Kostkan, Jan, Szabó, Balázs, Kardos, Márton, Vad, Kirten, Heinsen, Johan, Núñez, Andrea Blasi, Barmina, Gianluca, Nielsen, Jacob, Larsen, Rasmus, Vahlstrup, Peter, Dalum, Per Møldrup, Elliott, Desmond, Galke, Lukas, Schneider-Kamp, Peter, Nielbo, Kristoffer

arXiv.org Artificial IntelligenceAug-6-2025

Large-scale datasets are foundational for research and development in natural language processing. However, current approaches face three key challenges: (1) reliance on ambiguously licensed sources restricting use, sharing, and derivative works; (2) static dataset releases that prevent community contributions and diminish longevity; and (3) quality assurance processes restricted to publishing teams rather than leveraging community expertise. To address these limitations, we introduce two contributions: the Dynaword approach and Danish Dynaword. The Dynaword approach is a framework for creating large-scale, open datasets that can be continuously updated through community collaboration. Danish Dynaword is a concrete implementation that validates this approach and demonstrates its potential. Danish Dynaword contains over four times as many tokens as comparable releases, is exclusively openly licensed, and has received multiple contributions across industry and research. The repository includes light-weight tests to ensure data formatting, quality, and documentation, establishing a sustainable framework for ongoing community contributions and dataset evolution.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2508.02271

Country:

Europe > Denmark (0.94)
North America (0.93)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback