AITopics | gold

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CITE: Anytime-Valid Statistical Inference in LLM Self-Consistency

Ota, Hirofumi, Iwase, Naoto, Ichihara, Yuki, Komiyama, Junpei, Imaizumi, Masaaki

arXiv.org Machine Learning, May-8-2026

Large language models often improve reasoning by sampling multiple outputs and aggregating their final answers, but precise and efficient control of error levels remains a challenging task. In particular, deciding when to stop sampling remains difficult when the stopping rule is data-dependent and the set of possible response labels is not known in advance. We study anytime-valid certification of a prespecified target answer as the unique mode of the model's response distribution, a guarantee distinct from answer correctness. We propose the Certification by Intersection-union Testing with E-processes (CITE) algorithm, which provably controls false certification at any prescribed level under arbitrary data-driven stopping, without requiring prior knowledge of the answer category set. We also prove a category-set-size-free stopping-time rate, establish matching minimax lower bounds up to constants in the main regime, and extend the construction to confidence-weighted voting. Simulations and LLM self-consistency experiments show empirical error control and improved certification in diffuse-tail settings.
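The baseline this abstract starts from, sampling several outputs and aggregating their final answers, can be sketched as a plain majority vote. This is only an illustration of fixed-budget self-consistency and is not the CITE algorithm: it has no stopping rule and gives no error guarantee. The function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer and its empirical vote share.

    `answers` is a list of final answers extracted from sampled outputs.
    Ties are resolved by first occurrence, which follows Counter's ordering.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


# Hypothetical final answers from five sampled outputs.
samples = ["42", "41", "42", "42", "17"]
answer, share = majority_vote(samples)
print(answer, share)
```

The vote share is only an empirical frequency. Deciding whether it is enough evidence to stop sampling is the question the paper addresses.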

category, large language model, natural language, (17 more...)

2605.05873

Genre: Research Report (1.00)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

'Kill the people': How men were left to starve in a South African gold mine

Al Jazeera, Mar-14-2026, 08:34:37 GMT

This image was created by Mohamed Hussein using the artificial intelligence (AI) tool Midjourney. Ayanda Ndabeni watched the faint glow from his headlamp fight the vast darkness 1,500 metres (4,920 feet) below ground. His miner's lamp had lasted for more than a week after he was lowered down into the shaft of the gold mine. But now the batteries were dying. He gently flipped the plastic switch of his lamp, turning it off, and the trapped men around him became shadows. In the stifling heat and humidity, their anxiety pressed in from all sides. Ayanda had descended into Shaft 10 of the Buffelsfontein mine in late September 2024, lowered by a team of nearly 20 men operating ropes and a pulley above ground. That day, he'd spotted police vehicles near the mine's entrance. The 36-year-old assumed it was just routine patrols around the mine system, which is 2km (1.2 miles) deep. But then the rope pulley, via which food, water, batteries and other items arrived, stopped moving. The shouting that usually indicated the rope operators were sending down a man or supplies also fell silent. When huge rocks came crashing down the shaft, they knew it was a warning. The men whispered of their growing fears that something was very wrong on the surface. Patrick Ntsokolo was also in Shaft 10. He was a few hundred metres higher up than Ayanda and had arrived in late July. Patrick was new to the mines. Tasked by the leaders of the artisanal miners with collecting the food, water and alcohol lowered down by the rope pulley, he hauled supplies along the slippery tunnels to small shops.

artificial intelligence, miner, shaft, (16 more...)

Country:

South America (0.40)
North America > United States (0.40)
North America > Central America (0.40)
(14 more...)

Industry: Materials > Metals & Mining > Gold (1.00)

Technology: Information Technology > Artificial Intelligence (0.54)

Interactive map reveals your nearest nuclear shelter and states that are MOST exposed... amid fears of US attack: Make an emergency plan now

Daily Mail - Science & tech, Mar-3-2026, 20:33:40 GMT

The fear of a nuclear apocalypse has reached levels not seen in decades as the US and Israel launch a deadly new conflict with Iran, raising alarms across capitals and prompting emergency diplomatic efforts to prevent a wider war. For Americans, the pressing question may soon shift from geopolitics to personal preparedness, including where the nearest fallout shelter is located and how to protect themselves if tensions escalate further. There is currently no public list of active shelters available for everyday Americans, since most are defunct or privately owned. But survival expert and Air Force veteran Sean Gold has built his own fallout shelter map, revealing that the vast majority of these radiation bunkers are scattered throughout America's largest cities. The map can be found on his survival guide website, TruePrepper.

artificial intelligence, shelter, social media, (15 more...)

Country:

Asia > Middle East > Iran (1.00)
Asia > Middle East > Israel (0.34)
North America > United States > Kentucky (0.24)
(25 more...)

Genre: Personal > Obituary (0.46)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(7 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Mobile (0.69)

Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees

Neural Information Processing Systems, Feb-10-2026, 00:53:54 GMT

Communication compression is a crucial technique for modern distributed learning systems to alleviate their communication bottlenecks over slower networks.

artificial intelligence, deep learning, machine learning, (18 more...)

Country:

North America > United States > South Dakota (0.04)
Asia > China (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Industry:

Semiconductors & Electronics (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Communications (0.68)

Ornate medieval ring discovered in Norway's oldest town

Scientists are still investigating if the ring's center stone is a sapphire or colored glass. Last summer, Linda Åsheim found a ring so beautiful it looks like it could have been made yesterday. But Åsheim is an archaeologist, and she found the rare artifact while excavating in a Norwegian town believed to be the oldest in the country. The gorgeous golden ring is decorated with a gemstone and filigree décor, and is over 800 years old.

andrew paul, laura baisa, ornate medieval ring, (13 more...)

Popular Science

Country:

Europe > Norway > Eastern Norway > Vestfold > Tønsberg (0.06)
Europe > Switzerland (0.05)
Asia > Indonesia (0.05)

Genre: Research Report > New Finding (0.37)

Technology: Information Technology > Artificial Intelligence (0.53)

An Agentic AI System for Multi-Framework Communication Coding

Yang, Bohao, Yang, Rui, Biro, Joshua M., Wang, Haoyuan, Handley, Jessica L., Richardson, Brianna, Bessias, Sophia, Economou-Zavlanos, Nicoleta, Bedoya, Armando D., Agrawal, Monica, Zavlanos, Michael M., Chowdhury, Anand, Ratwani, Raj M., Sun, Kai, Pollak, Kathryn I., Pencina, Michael J., Hong, Chuan

arXiv.org Artificial Intelligence, Dec-10-2025

Clinical communication is central to patient outcomes, yet large-scale human annotation of patient-provider conversation remains labor-intensive, inconsistent, and difficult to scale. Existing approaches based on large language models typically rely on single-task models that lack adaptability, interpretability, and reliability, especially when applied across various communication frameworks and clinical domains. In this study, we developed a Multi-framework Structured Agentic AI system for Clinical Communication (MOSAIC), built on a LangGraph-based architecture that orchestrates four core agents, including a Plan Agent for codebook selection and workflow planning, an Update Agent for maintaining up-to-date retrieval databases, a set of Annotation Agents that applies codebook-guided retrieval-augmented generation (RAG) with dynamic few-shot prompting, and a Verification Agent that provides consistency checks and feedback. To evaluate performance, we compared MOSAIC outputs against gold-standard annotations created by trained human coders. We developed and evaluated MOSAIC using 26 gold standard annotated transcripts for training and 50 transcripts for testing, spanning rheumatology and OB/GYN domains. On the test set, MOSAIC achieved an overall F1 score of 0.928. Performance was highest in the Rheumatology subset (F1 = 0.962) and strongest for Patient Behavior (e.g., patients asking questions, expressing preferences, or showing assertiveness). Ablations revealed that MOSAIC outperforms baseline benchmarking.

large language model, machine learning, natural language, (21 more...)

2512.08659

Country: North America > United States (0.47)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

LLM-Cave: A benchmark and light environment for large language models reasoning and decision-making system

Li, Huanyu, Li, Zongyuan, Huang, Wei, Guo, Xian

arXiv.org Artificial Intelligence, Dec-1-2025

Large language models (LLMs) such as ChatGPT o1, ChatGPT o3, and DeepSeek R1 have shown great potential in solving difficult problems. However, current LLM evaluation benchmarks are limited to one-step interactions. Some of the existing sequential decision-making environments, such as TextStarCraftII and LLM-PySC2, are too complicated and require hours of interaction to complete a game. In this paper, we introduce LLM-Cave, a benchmark and light environment for LLM reasoning and decision-making systems. This environment is a classic instance in the era of Symbolism. Artificial intelligence enables the agent to explore the environment and avoid potential losses by reasoning about nearby dangers using partially observable state information. In the experiment, we evaluated the sequential reasoning ability, decision-making performance and computational efficiency of mainstream LLMs such as GPT-4o-mini, o1-mini, and DeepSeek-R1. Experiments show that while DeepSeek-R1 achieved the highest success rate on complex reasoning tasks, smaller models like 4o-mini significantly narrowed the performance gap on challenges by employing Chain of Speculation and Planner-Critic strategies, at the expense of reduced computational efficiency. This indicates that structured, multi-step reasoning combined with an LLM-based feedback mechanism can substantially enhance an LLM's decision-making capabilities, providing a promising direction for improving reasoning in weaker models and suggesting a new reasoning-centered benchmark for LLM assessment. Our code is open-sourced at https://github.com/puleya1277/CaveEnv.

large language model, machine learning, natural language, (15 more...)

2511.22598

Country: Asia > China (0.15)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Walmart's Black Friday Dyson deals are here: Save up to $300 on vacuums and air purifiers

Dyson gear is never cheap, but Walmart has fans, air purifiers, and vacuums for their lowest prices of the year for Black Friday. Dyson makes impressive home appliances, but they're not cheap. Walmart just dropped its full-on Black Friday deals and that includes year-low prices on Dyson vacuums and air purifiers. These prices likely won't get any lower if you wait, so you might as well just grab what you want now and make your home more comfortable with the power of engineering.

artificial intelligence, real time system, stan horaczek, (12 more...)

Popular Science

Industry:

Retail > Online (1.00)
Health & Medicine > Health Care Equipment & Supplies (1.00)

Technology:

Information Technology > Artificial Intelligence (0.49)
Information Technology > Architecture > Real Time Systems (0.30)

Data-Efficient Adaptation and a Novel Evaluation Method for Aspect-based Sentiment Analysis

Hua, Yan Cathy, Denny, Paul, Wicker, Jörg, Taškova, Katerina

arXiv.org Artificial Intelligence, Nov-6-2025

Aspect-based Sentiment Analysis (ABSA) is a fine-grained opinion mining approach that identifies and classifies opinions associated with specific entities (aspects) or their categories within a sentence. Despite its rapid growth and broad potential, ABSA research and resources remain concentrated in commercial domains, leaving analytical needs unmet in high-demand yet low-resource areas such as education and healthcare. Domain adaptation challenges and most existing methods' reliance on resource-intensive in-training knowledge injection further hinder progress in these areas. Moreover, traditional evaluation methods based on exact matches are overly rigid for ABSA tasks, penalising any boundary variations which may misrepresent the performance of generative models. This work addresses these gaps through three contributions: 1) We propose a novel evaluation method, Flexible Text Similarity Matching and Optimal Bipartite Pairing (FTS-OBP), which accommodates realistic extraction boundary variations while maintaining strong correlation with traditional metrics and offering fine-grained diagnostics. 2) We present the first ABSA study of small decoder-only generative language models (SLMs; <7B parameters), examining resource lower bounds via a case study in education review ABSA. We systematically explore data-free (in-context learning and weight merging) and data-light fine-tuning methods, and propose a multitask fine-tuning strategy that significantly enhances SLM performance, enabling 1.5-3.8 B models to surpass proprietary large models and approach benchmark results with only 200-1,000 examples on a single GPU. 3) We release the first public set of education review ABSA resources to support future research in low-resource domains.
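To illustrate the general idea of pairing extracted spans with gold spans by text similarity rather than exact match, here is a minimal sketch. It is not the paper's FTS-OBP method: the similarity function (Python's `difflib.SequenceMatcher`), the brute-force search over pairings, and the names `similarity` and `best_pairing` are all assumptions made for illustration. The example spans are invented.

```python
from difflib import SequenceMatcher
from itertools import permutations


def similarity(a, b):
    """Character-level similarity in [0, 1]; an assumed stand-in metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_pairing(gold, predicted):
    """One-to-one pairing of gold and predicted spans with maximal total similarity.

    Brute force over permutations, so only suitable for short lists.
    Returns (list of (gold_span, predicted_span), total_similarity).
    """
    swap = len(gold) > len(predicted)
    short, long_ = (predicted, gold) if swap else (gold, predicted)
    best_pairs, best_total = [], -1.0
    for perm in permutations(range(len(long_)), len(short)):
        total = sum(similarity(short[i], long_[j]) for i, j in enumerate(perm))
        if total > best_total:
            best_total = total
            best_pairs = [(short[i], long_[j]) for i, j in enumerate(perm)]
    if swap:
        # Restore (gold, predicted) order when predicted was the shorter list.
        best_pairs = [(g, p) for p, g in best_pairs]
    return best_pairs, best_total


# Invented example: boundaries differ, but each span has an obvious partner.
gold_spans = ["lecture slides", "tutor"]
predicted_spans = ["the tutor", "slides"]
pairs, total = best_pairing(gold_spans, predicted_spans)
print(pairs)
```

An exact-match metric would score both predictions above as wrong. A similarity-based pairing can still credit them, which is the rigidity the abstract describes.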

large language model, machine learning, natural language, (19 more...)

2511.03034

Country:

North America > United States (1.00)
Asia (1.00)
Europe (0.67)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Patient-Centered Summarization Framework for AI Clinical Summarization: A Mixed-Methods Design

Jimenez, Maria Lizarazo, Claros, Ana Gabriela, Green, Kieran, Toro-Tobon, David, Larios, Felipe, Asthana, Sheena, Wenczenovicz, Camila, Maldonado, Kerly Guevara, Vilatuna-Andrango, Luis, Proano-Velez, Cristina, Bandi, Satya Sai Sri, Bagewadi, Shubhangi, Branda, Megan E., Zahidy, Misk Al, Luz, Saturnino, Lapata, Mirella, Brito, Juan P., Ponce-Ponte, Oscar J.

arXiv.org Artificial Intelligence, Nov-3-2025

Large Language Models (LLMs) are increasingly demonstrating the potential to reach human-level performance in generating clinical summaries from patient-clinician conversations. However, these summaries often focus on patients' biology rather than their preferences, values, wishes, and concerns. To achieve patient-centered care, we propose a new standard for Artificial Intelligence (AI) clinical summarization tasks: Patient-Centered Summaries (PCS). Our objective was to develop a framework to generate PCS that capture patient values and ensure clinical utility and to assess whether current open-source LLMs can achieve human-level performance in this task. We used a mixed-methods process. Two Patient and Public Involvement groups (10 patients and 8 clinicians) in the United Kingdom participated in semi-structured interviews exploring what personal and contextual information should be included in clinical summaries and how it should be structured for clinical use. Findings informed annotation guidelines used by eight clinicians to create gold-standard PCS from 88 atrial fibrillation consultations. Sixteen consultations were used to refine a prompt aligned with the guidelines. Five open-source LLMs (Llama-3.2-3B, Llama-3.1-8B, Mistral-8B, Gemma-3-4B, and Qwen3-8B) generated summaries for 72 consultations using zero-shot and few-shot prompting, evaluated with ROUGE-L, BERTScore, and qualitative metrics. Patients emphasized lifestyle routines, social support, recent stressors, and care values. Clinicians sought concise functional, psychosocial, and emotional context. The best zero-shot performance was achieved by Mistral-8B (ROUGE-L 0.189) and Llama-3.1-8B (BERTScore 0.673); the best few-shot by Llama-3.1-8B (ROUGE-L 0.206, BERTScore 0.683). Completeness and fluency were similar between experts and models, while correctness and patient-centeredness favored human PCS.
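For readers unfamiliar with ROUGE-L, one of the metrics reported above, the sketch below computes a simplified sentence-level ROUGE-L F1 from the longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization and equal weighting of precision and recall. Published evaluation toolkits may tokenize, stem, or weight differently, so this will not necessarily reproduce the paper's scores. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 between a reference and a candidate summary."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens on the LCS
    recall = lcs / len(ref)      # share of reference tokens on the LCS
    return 2 * precision * recall / (precision + recall)


# Invented example sentences, not taken from the study's data.
reference = "patient prefers morning walks and wants fewer tablets"
candidate = "the patient wants fewer tablets"
print(round(rouge_l_f1(reference, candidate), 3))
```

Because the score rewards shared word order rather than meaning, low absolute values such as those reported above are common for abstractive summaries.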

information, large language model, machine learning, (21 more...)

2510.27535

Country:

Europe > United Kingdom (0.66)
North America > United States > Minnesota > Olmsted County > Rochester (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
