AITopics | Technology

Collaborating Authors

Technology

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

News Overviews Instructional Materials AI-Alerts Classics

VisualSync: Multi-Camera Synchronization via Cross-View Object Motion

Neural Information Processing SystemsJun-15-2026, 14:10:59 GMT

Today, people can easily record memorable moments, ranging from concerts, sports events, lectures, family gatherings, and birthday parties with multiple consumer cameras. However, synchronizing these cross-camera streams remains challenging. Existing methods assume controlled settings, specific targets, manual correction, or costly hardware. We present VisualSync, an optimization framework based on multi-view dynamics that aligns unposed, unsynchronized videos at millisecond accuracy. Our key insight is that any moving 3D point, when co-visible in two cameras, obeys epipolar constraints once properly synchronized.

artificial intelligence, machine learning, synchronization, (16 more...)

Neural Information Processing Systems

Country: Europe (0.93)

Genre: Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding

Neural Information Processing SystemsJun-15-2026, 14:10:43 GMT

Diffusion transformers (DiTs) struggle to generate images at resolutions higher than their training resolutions. The primary obstacle is that the explicit positional encodings (PE), such as RoPE, need extrapolating to unseen positions which degrades performance when the inference resolution differs from training. In this paper, We propose a Length-Extrapolatable Diffusion Transformer (LEDiT) to overcome this limitation. LEDiT needs no explicit PEs, thereby avoiding PE extrapolation. The key innovation of LEDiT lies in the use of causal attention. We demonstrate that causal attention can implicitly encode global positional information and show that such information facilitates extrapolation. We further introduce a locality enhancement module, which captures fine-grained local information to complement the global coarse-grained position information encoded by causal attention. Experimental results on both conditional and text-to-image generation tasks demonstrate that LEDiT supports up to 4 resolution scaling (e.g., from 256 256 to 512 512), achieving better image quality compared to the state-of-the-art length extrapolation methods. We believe that LEDiT marks a departure from the standard RoPE-based methods and offers a promising insight into length extrapolation.

machine learning, natural language, resolution, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

vHector and HeisenVec: Scalable Vector Graphics Generation Through Large Language Models Supplementary Material 1 Tokens list and Commands details

Neural Information Processing SystemsJun-15-2026, 14:02:05 GMT

LLMs typically respond quickly, but their SVG outputs tend to be short and structurally simplistic.

artificial intelligence, large language model, natural language, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

vHector and Scalable Vector Graphics Generation Through Large Language Models

Neural Information Processing SystemsJun-15-2026, 14:02:01 GMT

We introduce HeisenVec, a large-scale dataset designed to advance research in vector graphics generation from natural language descriptions. Unlike conventional image generation datasets that focus on raster images, HeisenVec targets the structured and symbolic domain of Scalable Vector Graphics (SVG), where images are represented as sequences of drawing commands and style attributes. The dataset comprises 2.2 million SVGs collected from different online sources, each paired with four complementary textual descriptions generated by multi-modal models. To ensure structural consistency and efficiency for autoregressive modeling, all SVGs are standardized through a pre-processing pipeline that unifies geometric primitives as paths, applies affine transformations, and compresses syntax via custom tokens set. HeisenVec exhibits broad coverage among visual styles and sequence lengths, with a substantial portion of samples exceeding 8,000 tokens, making it particularly well-suited for benchmarking long-context language models. Our benchmark enables rigorous evaluation of text-conditioned SVG generation, encourages progress on sequence modeling with symbolic outputs, and bridges the gap between vision, graphics, and language. We release the dataset, tokenization tools, and evaluation pipeline to foster further research in this emerging domain. The dataset and the code for testing our parsing, standardization, and tokenization method are available at this link. The image depicts a desktop with a central emblem that features a stylized tree.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Topology-Aware Learning of Tubular Manifolds via SE(3)-Equivariant Network on Ball B-Spline Curve

Neural Information Processing SystemsJun-15-2026, 14:01:44 GMT

Tubular-like system shape analysis is quite difficult in geometry and topology, while it is widely used in plants and organs analysis in practice. However, traditional discrete representations such as voxels and point clouds often require substantial storage and may lead to the loss of fine-grained geometric and topological details. To address these challenges, we propose SE(3)-BBSCformerGCN, a novel framework for learning shape-aware representations from continuous tubular topological manifolds with equivariance to rotations and translations. Our approach leverages Ball B-Spline Curve (BBSC) to define tubular manifolds and its functional space. We provide a formal mathematical definition and analysis of the resulting manifolds and the BBSC functional space, and incorporate an equivariant mapping that preserves geometric and topological stability. Compared to the point cloud and voxel based representations, our manifold-based formulation significantly reduces data complexity while preserving geometric attributes together with topological features.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Health & Medicine > Health Care Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.67)

Add feedback

Segway Navimow i210 LiDAR review: Easy setup, effortless mowing

PCWorldJun-15-2026, 14:00:00 GMT

When you purchase through links in our articles, we may earn a small commission. Can't handle very complex lawns The Segway Navimow i210 LiDAR combines smart navigation with an exceptionally simple user experience. The result is a robot lawnmower that simply gets the job done day after day. In just a few years, robotic lawnmowers have evolved from requiring buried boundary cables to navigating with satellites, cameras, and advanced sensors. The problem is that many of the cordless models still involve a fair amount of installation and configuration.

artificial intelligence, digital magazine, home robotic performance privacy productivity, (10 more...)

PCWorld

Industry:

Transportation > Ground > Road (0.99)
Information Technology > Security & Privacy (0.98)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Fair Representation Learning with Controllable High Confidence Guarantees via Adversarial Inference

Neural Information Processing SystemsJun-15-2026, 13:52:41 GMT

Representation learning is increasingly applied to generate representations that generalize well across multiple downstream tasks. Ensuring fairness guarantees in representation learning is crucial to prevent unfairness toward specific demographic groups in downstream tasks. In this work, we formally introduce the task of learning representations that achieve high-confidence fairness. We aim to guarantee that demographic disparity in every downstream prediction remains bounded by a user-defined error threshold ε, with controllable high probability. To this end, we propose the Fair Representation learning with high-confidence Guarantees (FRG) framework, which provides these high-confidence fairness guarantees by leveraging an optimized adversarial model. We empirically evaluate FRG on three real-world datasets, comparing its performance to six state-of-the-art fair representation learning methods. Our results demonstrate that FRG consistently bounds unfairness across a range of downstream models and tasks. The source code for FRG is available at: https://github.com/JamesLuoyh/FRG.

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (0.67)
Information Technology > Security & Privacy (0.46)
Banking & Finance (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

Neural Information Processing SystemsJun-15-2026, 13:52:21 GMT

As generative AI systems become competent and democratized in science, business, and government, deeper insight into their failure modes now poses an acute need. The occasional volatility in their behavior, such as the propensity of transformer models to hallucinate, impedes trust and adoption of emerging AI solutions in high-stakes areas. In the present work, we establish how and when hallucinations arise in pre-trained transformer models through concept representations captured by sparse autoencoders, under scenarios with experimentally controlled uncertainty in the input space. Our systematic experiments reveal that the number of semantic concepts used by the transformer model grows as the input information becomes increasingly unstructured. In the face of growing uncertainty in the input space, the transformer model becomes prone to activate coherent yet input-insensitive semantic features, leading to hallucinated output. At its extreme, for pure-noise inputs, we identify a wide variety of robustly triggered and meaningful concepts in the intermediate activations of pre-trained transformer models, whose functional integrity we confirm through targeted steering. We also show that hallucinations in the output of a transformer model can be reliably predicted from the concept patterns embedded in transformer layer activations. This collection of insights on transformer internal processing mechanics has immediate consequences for aligning AI models with human values, AI safety, opening the attack surface for potential adversarial attacks, and providing a basis for automatic quantification of a model's hallucination risk.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Europe > United Kingdom (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.46)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

Add feedback

What does the US-Iran deal mean for Lebanon and Israel?

BBC NewsJun-15-2026, 13:50:39 GMT

Watch: What does the US-Iran deal to end war mean for Lebanon and Israel? A deal has been agreed between the US and Iran to end the war they are in. The deal includes an end to military operations in Lebanon, but Israel says it forces will remain in the country indefinitely. As some Beirut residents attempt to return to their homes after previously fleeing, the BBC's international editor Jeremy Bowen takes a look at what the agreed deal might mean for those involved. Travelling with a humanitarian convoy, BBC's Hugo Bachega has been given rare access to a part of Lebanon under Israeli occupation.

artificial intelligence, football 2026, home news football 2026, (11 more...)

BBC News

Country:

Asia > Middle East > Iran (1.00)
Asia > Middle East > Israel (0.96)
Asia > Middle East > Lebanon > Beirut Governorate > Beirut (0.26)

Industry:

Leisure & Entertainment (1.00)
Government > Regional Government > Asia Government > Middle East Government (0.50)

Technology: Information Technology > Artificial Intelligence (0.49)

Add feedback

207be3da143f1043336627c5d25aae50-Paper-Conference.pdf

Neural Information Processing SystemsJun-15-2026, 13:47:14 GMT

Multi-modal Large Language Models (LLM) have advanced conversational abilities but struggle with providing live, interactive step-by-step guidance, a key capability for future AI assistants. Effective guidance requires not only delivering instructions but also detecting their successful execution, as well as identifying and alerting users to mistakes, all of which has to happen in real-time. This requires models that are not turn-based, but that can react asynchronously to a video stream, as well as video data showing users performing tasks including mistakes and their corrections. To this end, we introduce Qualcomm Interactive Cooking, a new benchmark and dataset built upon CaptainCook4D, which contains user mistakes during task execution. Our dataset and benchmark features densely annotated, timed instructions and feedback messages, specifically including mistake alerts precisely timestamped to their visual occurrence in the video. We evaluate state-ofthe-art multi-modal LLMs on the Qualcomm Interactive Cooking benchmark and introduce LIVEMAMBA, a streaming multi-modal LLM designed for interactive instructional guidance. This work provides the first dedicated benchmark and a strong baseline for developing and evaluating on live, situated coaching.

large language model, machine learning, natural language, (14 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback