AITopics | Industry

RFMPose: Generative Category-level Object Pose Estimation via Riemannian Flow Matching

Neural Information Processing Systems | Jun-17-2026, 18:35:30 GMT

We introduce RFMPose, a novel generative framework for category-level 6D object pose estimation that learns deterministic pose trajectories through Riemannian Flow Matching (RFM). Existing discriminative approaches struggle with multi-hypothesis predictions (e.g., symmetry ambiguities) and often require specialized network architectures. RFMPose advances this paradigm through three key innovations: (1) ensuring geometric consistency via geodesic interpolation on Riemannian manifolds combined with bi-invariant metric constraints; (2) alleviating symmetry-induced ambiguities through Riemannian Optimal Transport for probability mass redistribution without ad-hoc design; (3) enabling end-to-end likelihood estimation through Hutchinson trace approximation, thereby eliminating auxiliary model dependencies. Extensive experiments on Omni6DPose demonstrate state-of-the-art performance of the proposed method, with significant improvements of +4.1 in IoU25 and +2.4 in 5°2cm metrics compared to prior generative approaches. Furthermore, the proposed RFM framework exhibits robust sim-to-real transfer capabilities and facilitates pose tracking extensions with minimal architectural adaptation.
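The Hutchinson trace approximation named in the abstract can be illustrated in isolation. The sketch below is not the RFMPose implementation: it estimates the trace of a small explicit matrix using only matrix-vector products with random Rademacher probes, whereas a flow model would apply the same estimator to the Jacobian of its learned vector field. The names `hutchinson_trace`, `matvec`, and `A` are hypothetical and chosen for this example only.

```python
import random


def hutchinson_trace(matvec, dim, num_probes=64, seed=0):
    """Estimate tr(A) from products A @ z, without forming A's diagonal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)
        # z^T A z is an unbiased single-sample estimate of tr(A).
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / num_probes


# Illustrative symmetric matrix (an assumption for the demo, not from the paper).
A = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 0.1],
     [0.0, 0.1, 4.0]]


def matvec(v):
    """Multiply the demo matrix A by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


estimate = hutchinson_trace(matvec, dim=3, num_probes=2000)
exact = sum(A[i][i] for i in range(3))  # 2 + 3 + 4
print(estimate, exact)
```

The estimator only needs a function that multiplies by the matrix, which is why it suits settings where the full Jacobian is too expensive to build.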

artificial intelligence, estimation, machine learning, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Why You're Seeing a PA or NP--But Not a Doctor

TIME - Tech | Jun-17-2026, 18:33:09 GMT

artificial intelligence, nurse practitioner, physician, (12 more...)

TIME - Tech

Country: North America > United States > California (0.14)

Genre: Research Report (0.69)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.96)
Health & Medicine > Health Care Providers & Services > Nursing (0.79)
Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.42)

C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models

Neural Information Processing Systems | Jun-17-2026, 18:24:08 GMT

Low-Rank Adaptation (LoRA) offers a cost-effective solution for fine-tuning large language models (LLMs), but it often produces overconfident predictions in data-scarce few-shot settings. To address this issue, several classical statistical learning approaches have been repurposed for scalable uncertainty-aware LoRA fine-tuning. However, these approaches neglect how input characteristics affect the predictive uncertainty estimates. To address this limitation, we propose Contextual Low-Rank Adaptation (C-LoRA), a novel uncertainty-aware and parameter-efficient fine-tuning approach that develops new lightweight LoRA modules contextualized to each input data sample to dynamically adapt uncertainty estimates. By incorporating data-driven contexts into the parameter posteriors, C-LoRA mitigates overfitting, achieves well-calibrated uncertainties, and yields robust predictions.
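For readers unfamiliar with the baseline, the low-rank update that LoRA adds to a frozen weight matrix can be sketched as follows. This shows plain LoRA only, using the common `alpha / r` scaling convention; the abstract does not specify how the contextual modules of C-LoRA are built, so they are not modeled here. All names (`lora_forward`, `W`, `A`, `B`, `alpha`) are assumptions made for the example.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha=1.0):
    """Compute W x + (alpha / r) * B (A x) for a frozen W and trainable A, B.

    W: d_out x d_in frozen weights
    A: r x d_in down-projection
    B: d_out x r up-projection
    """
    r = len(A)
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]


W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 1.0]]          # rank r = 1
B_zero = [[0.0], [0.0]]   # zero-initialized B leaves the base model unchanged
B_tuned = [[1.0], [2.0]]  # illustrative values after some hypothetical training

x = [1.0, 2.0]
unchanged = lora_forward(W, A, B_zero, x)
adapted = lora_forward(W, A, B_tuned, x)
print(unchanged, adapted)
```

Only `A` and `B` would be trained, which is r * (d_in + d_out) parameters instead of d_in * d_out for the full matrix.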

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.92)
North America > United States > Texas (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

No, I Don't Want to Watch Your Straight Hockey Show

WIRED | Jun-17-2026, 18:23:30 GMT

From Amazon's to Netflix's upcoming, the recent spate of hetero hockey romances shows Hollywood learned the wrong lessons from The streaming industry has gotten a lot of flak over the past few years, but there is one thing that Hollywood studios are undeniably good at: recycling the same idea, over and over and over again until the world ends (or until everyone finally decides they're sick of, whichever comes first). This tried-and-true formula is now playing out in real time with Prime Video's and Netflix's upcoming series Icebreaker shows that, like are hockey-themed romances about polar opposites who just can't seem to keep their hands off each other. But there's one key difference: and are about heterosexual romances, while is about a secret gay relationship. And considering how much queerness played a role in's explosive popularity, it seems like the clamor for straight horny hockey content is another example of Hollywood just not getting the message. The forthcoming which Netflix announced this week, is about a figure skater who falls in love with a hockey player after they're forced to practice on the same rink.

artificial intelligence, main content security politics, wired, (9 more...)

WIRED

Country: North America > United States (0.97)

Industry:

Media > Television (1.00)
Leisure & Entertainment > Sports > Hockey (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

FINERS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning

Neural Information Processing Systems | Jun-17-2026, 18:22:29 GMT

Multi-modal Large Language Models (MLLMs) have shown remarkable capabilities across a wide range of vision-language tasks. However, due to restricted input resolutions, MLLMs face significant challenges in precisely understanding and localizing visual details in high-resolution images, particularly when dealing with extra-small objects embedded in cluttered contexts. To address this issue, we propose FINERS, a two-stage MLLM-based reinforcement learning framework for jointly reasoning about and segmenting extremely small objects within high-resolution scenes. FINERS adopts a coarse-to-fine pipeline comprising Global Semantic Exploration (GSE) and Localized Perceptual Refinement (LPR). Specifically, GSE performs instruction-guided reasoning to generate a textual response and a coarse target region, while LPR refines this region to produce an accurate bounding box and segmentation mask. To couple the two stages, we introduce a locate-informed retrospective reward, where LPR's outputs are used to optimize GSE for more robust coarse region exploration.

large language model, machine learning, segmentation, (19 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
Europe (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization

Neural Information Processing Systems | Jun-17-2026, 18:18:01 GMT

Existing methods to enhance the reasoning capability of large language models predominantly rely on supervised fine-tuning (SFT) followed by reinforcement learning (RL) on reasoning-specific data. These approaches critically depend on external supervision, such as labeled reasoning traces, verified golden answers, or pre-trained reward models. In this work, we propose Entropy Minimized Policy Optimization (EMPO), which makes an early attempt at fully unsupervised LLM reasoning incentivization. By minimizing the semantic entropy of LLMs on unlabeled questions, EMPO achieves competitive performance compared to supervised counterparts. Specifically, without any external supervision, EMPO boosts the accuracy of Qwen2.5-Math-7B Base from 33.7% to 51.6% on math benchmarks and improves the accuracy of Qwen2.5-7B Base from 32.1% to 50.1% on MMLU-Pro. Preliminary analyses are also provided to interpret the effectiveness of EMPO.
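The semantic entropy that the abstract says is minimized can be illustrated with a small calculation. This is a minimal sketch, not the EMPO training procedure: it assumes that several sampled answers to one question are grouped by a normalized string match, a stand-in for whatever semantic-equivalence check a real system would use. The names `semantic_entropy` and `canonicalize` are hypothetical.

```python
import math
from collections import Counter


def semantic_entropy(answers, canonicalize=lambda s: s.strip().lower()):
    """Shannon entropy (in nats) over clusters of equivalent answers.

    Answers that map to the same canonical form count as one meaning;
    the default canonicalization is an assumption for this example.
    """
    clusters = Counter(canonicalize(a) for a in answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in clusters.values())


# Samples that agree in meaning give zero entropy; disagreement raises it.
consistent = semantic_entropy(["4", " 4", "4 "])
split = semantic_entropy(["4", "5"])
print(consistent, split)
```

Low entropy means the sampled answers mostly agree, which is the signal an unsupervised objective of this kind can reward without labels.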

large language model, machine learning, qwen2, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving

Neural Information Processing Systems | Jun-17-2026, 18:16:19 GMT

Vision-Language-Action (VLA) models are increasingly used for end-to-end driving due to their world knowledge and reasoning ability. Most prior work, however, inserts textual chains-of-thought (CoT) as intermediate steps tailored to the current scene. Such symbolic compressions can blur spatio-temporal relations and discard fine visual cues, creating a cross-modal gap between perception and planning. We propose FSDrive, a visual spatio-temporal CoT framework that enables VLAs to think in images. The model first acts as a world model to generate a unified future frame that overlays coarse but physically-plausible priors--future lane dividers and 3D boxes--on the predicted future image. This unified frame serves as the visual CoT, capturing both spatial structure and temporal evolution.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (0.53)
Automobiles & Trucks (0.53)
Energy (0.46)
Information Technology > Robotics & Automation (0.44)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

3DPE-Gaze: Unlocking the Potential of 3D Facial Priors for Generalized Gaze Estimation

Neural Information Processing Systems | Jun-17-2026, 18:07:03 GMT

In recent years, face-based deep-learning gaze estimation methods have achieved significant advancements. However, while face images provide supplementary information beneficial for gaze inference, the substantial extraneous information they contain also increases the risk of overfitting during model training and compromises generalization capability. To alleviate this problem, we propose the 3DPE-Gaze framework, which explicitly models 3D facial priors for feature decoupling and generalized gaze estimation. The 3DPE-Gaze framework consists of two core modules: the 3D Geometric Prior Module (3DGP), which incorporates the FLAME model to parameterize facial structures and gaze-irrelevant facial appearances while extracting gaze features; and the Semantic Concept Alignment Module (SCAM), which separates gaze-related and unrelated concepts through CLIP-guided contrastive learning. Finally, the 3DPE-Gaze framework uses 3D facial landmarks as a prior for generalized gaze estimation. Experimental results show that 3DPE-Gaze outperforms existing state-of-the-art methods on four major cross-domain tasks, with particularly outstanding performance in challenging scenarios such as lighting variations, extreme head poses, and glasses occlusion.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

See through the Dark: Learning Illumination-affined Representations for Nighttime Occupancy Prediction

Neural Information Processing Systems | Jun-17-2026, 18:06:47 GMT

Occupancy prediction aims to estimate the 3D spatial distribution of occupied regions along with their corresponding semantic labels. Existing vision-based methods perform well on daytime benchmarks but struggle in nighttime scenarios due to limited visibility and challenging lighting conditions. To address these challenges, we propose LIAR, a novel framework that learns illumination-affined representations. LIAR first introduces Selective Low-light Image Enhancement (SLLIE), which leverages the illumination priors from daytime scenes to adaptively determine whether a nighttime image is genuinely dark or sufficiently well-lit, enabling more targeted global enhancement. Building on the illumination maps generated by SLLIE, LIAR further incorporates two illumination-aware components: 2D Illumination-guided Sampling (2D-IGS) and 3D Illumination-driven Projection (3D-IDP), to respectively tackle local underexposure and overexposure. Specifically, 2D-IGS modulates feature sampling positions according to illumination maps, assigning larger offsets to darker regions and smaller ones to brighter regions, thereby alleviating feature degradation in underexposed areas. Subsequently, 3D-IDP enhances semantic understanding in overexposed regions by constructing illumination intensity fields and supplying refined residual queries to the BEV context refinement process. Extensive experiments on both real and synthetic datasets demonstrate the superior performance of LIAR under challenging nighttime scenarios. The source code and pretrained models are available here.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

New Scientist recommends an excellent look at the future of work

New Scientist | Jun-17-2026, 18:00:00 GMT

Sarah O'Connor's We Are Not Machines explores how we are contorting ourselves to fit AI into our working lives - and what to do about it, finds Tom Knowles. Employers wanting staff to be more like machines isn't new, says O'Connor. If you are a fan of translated films, you may have noticed the subtitles on streaming platforms have changed in recent years. They aren't wrong exactly, but they can come across as a bit, well, flat. "You get the meaning, but the language? It's not as rich," Petr Čermoch, a translator in the Czech Republic, tells Sarah O'Connor in We Are Not Machines, which explores how artificial intelligence is changing the way we work. That lack of richness is usually because the streaming platform has used AI to translate a script, then had a professional translator like Čermoch finesse it.

artificial intelligence, connor, social media, (14 more...)

New Scientist

Country: Europe > Czechia (0.25)

Genre: Research Report > New Finding (0.42)

Industry: Health & Medicine > Therapeutic Area (0.98)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.52)
