AITopics | Industry

Industry

Caption This, Reason That: VLMs Caught in the Middle

Neural Information Processing Systems | Jun-16-2026, 10:06:02 GMT

Vision-Language Models (VLMs) have shown remarkable progress in visual understanding in recent years. Yet, they still lag behind human capabilities in specific visual tasks such as counting or relational reasoning. To understand the underlying limitations, we adopt methodologies from cognitive science, analyzing VLM performance along core cognitive axes: Perception, Attention, and Memory. Using a suite of tasks targeting these abilities, we evaluate state-of-the-art VLMs, including GPT-4o. Our analysis reveals distinct cognitive profiles: while advanced models approach ceiling performance on some tasks (e.g.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

The Pain and Promise of Summer Camp

TIME - Tech | Jun-16-2026, 10:00:03 GMT

advertisement, artificial intelligence, open follow modal personalized content, (9 more...)

TIME - Tech

Country: North America > United States (0.30)

Genre: Research Report (0.48)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.48)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.42)

Streamer IShowSpeed Is Gen Z's ESPN

WIRED | Jun-16-2026, 10:00:00 GMT

At 21, Speed has pushed the limits of streaming by transforming a distinctly solo format into a global group chat. His song for this year's World Cup is becoming the tournament's unofficial anthem. Streamer IShowSpeed is a huge soccer fan who plans to bring this year's World Cup to his millions of followers. In the days leading up to the 2026 World Cup, the streamer IShowSpeed--one of the most watched people on the planet, who occasionally moonlights as a rapper--released the music video "World Cup (Champions)," a song about flexing national pride where he mentions all 48 teams. As with everything the 21-year-old born Darren Watkins Jr. does, the video was instantly everywhere. The song racked up over 7 million views on YouTube in under 24 hours. The internet rushed to christen it as the anthem of the tournament, even though the World Cup already has one. FIFA, following a ridiculous outpouring from fans and perhaps realizing the massive instant exposure he could bring, added the song to its official album.

artificial intelligence, social media, world cup, (13 more...)

WIRED

Country:

North America > United States > California (0.15)
North America > United States > Ohio (0.14)

Industry:

Media (1.00)
Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

All that structure matches does not glitter

Neural Information Processing Systems | Jun-16-2026, 09:57:45 GMT

Generative models for materials, especially inorganic crystals, hold potential to transform the theoretical prediction of novel compounds and structures. Advancement in this field depends critically on robust benchmarks and minimal, information-rich datasets that enable meaningful model evaluation. This paper critically examines common datasets and reported metrics for a crystal structure prediction task--generating the most likely structures given the chemical composition of a material. We focus on three key issues: First, materials datasets should contain unique crystal structures; for example, we show that the widely used carbon-24 dataset only contains 40% unique structures. Second, materials datasets should not be split randomly if polymorphs of many different compositions are numerous, which we find to be the case for the perov-5 and MP-20 datasets.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine (0.93)
Materials > Chemicals (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Reinforcing Image Generation with Collaborative Semantic level and Token level CoT

Neural Information Processing Systems | Jun-16-2026, 09:56:31 GMT

Recent advancements in large language models have demonstrated how chain-of-thought (CoT) and reinforcement learning (RL) can improve performance. However, applying such reasoning strategies to the visual generation domain remains largely unexplored. In this paper, we present T2I-R1, a novel reasoning-enhanced text-to-image generation model, powered by RL with a bi-level CoT reasoning process. Specifically, we identify two levels of CoT that can be utilized to enhance different stages of generation: (1) the semantic-level CoT for high-level planning of the prompt and (2) the token-level CoT for low-level pixel processing during patch-by-patch generation. To better coordinate these two levels of CoT, we introduce BiCoT-GRPO with an ensemble of generation rewards, which seamlessly optimizes both generated CoTs within the same training step. By applying our reasoning strategies to the baseline model, Janus-Pro, we achieve superior performance with 13% improvement on T2I-CompBench and 19% improvement on the WISE benchmark, even surpassing the state-of-the-art model FLUX.1. All the training code and data are available at https://github.com/CaraJ7/T2I-R1.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Rail (0.46)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Generalized and Invariant Single-Neuron In-Vivo Activity Representation Learning

Neural Information Processing Systems | Jun-16-2026, 09:47:17 GMT

In neuroscience, models that learn representations of single-neuron in-vivo activity are essential for understanding the functional identities of individual neurons. The primary goal of these models--spanning Transformer-based, contrastive, and variational autoencoder frameworks--is not to predict neural activity, but to distill it into a stable, low-dimensional embedding that captures a neuron's intrinsic features. These learned identity embeddings should be invariant to changing experimental conditions while reflecting the neuron's molecular type and anatomical location, thus enabling downstream tasks like in-vivo cell type prediction. However, current models suffer from limited generalizability due to batch effects: non-biological variations arising from differences in experimental design, animal subjects, or recording platforms. These batch effects cause overfitting, reducing model robustness and utility.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge

Neural Information Processing Systems | Jun-16-2026, 09:46:35 GMT

... these challenges, we introduce KnowMol-100K, a large-scale dataset with 100K fine-grained molecular annotations across multiple ...

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Asia > Middle East > UAE (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Don't Just Chase "Highlighted Tokens" in MLLMs: Revisiting Visual Holistic Context Retention

Neural Information Processing Systems | Jun-16-2026, 09:37:29 GMT

Despite their powerful capabilities, Multimodal Large Language Models (MLLMs) suffer from considerable computational overhead due to their reliance on massive visual tokens. Recent studies have explored token pruning to alleviate this problem, which typically uses text-vision cross-attention or [CLS] attention to assess and discard redundant visual tokens. In this work, we identify a critical limitation of such attention-first pruning approaches, i.e., they tend to preserve semantically similar tokens, resulting in pronounced performance drops under high pruning ratios. To this end, we propose HoloV, a simple yet effective, plug-and-play visual token pruning framework for efficient inference.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Information Technology (0.46)
Health & Medicine (0.46)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Efficient Fairness-Performance Pareto Front Computation

Neural Information Processing Systems | Jun-16-2026, 09:34:51 GMT

There is a well-known intrinsic trade-off between the fairness of a representation and the performance of classifiers derived from the representation. In this paper we propose a new method to compute the optimal Pareto front of this trade-off. In contrast to the existing methods, this approach does not require the training of complex fair representation models. Our approach is derived through three main steps: We analyze fair representations theoretically, and derive several structural properties of optimal representations. We then show that these properties enable a reduction of the computation of the Pareto front to a compact discrete problem. Finally, we show that these compact approximating problems can be efficiently solved via off-the-shelf concave-convex programming methods.

artificial intelligence, machine learning, representation, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Asia > Middle East (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)

'Pretty Crazy' Token Usage Is Testing Bosses' Bet on AI

WIRED | Jun-16-2026, 09:30:00 GMT

A Silicon Valley software maker and an ecommerce company reveal to WIRED how they are navigating the emerging challenge of "tokenomics." At the software company 8x8, employees are using Anthropic's Claude to draft emails, analyze customer feedback, and write code, but so far, their growing reliance on the artificial intelligence chatbot hasn't troubled the finance team. While other Silicon Valley companies, such as Meta, Uber, and Salesforce, have publicly expressed concerns about the growing cost of generative AI tools and have begun introducing usage caps in some cases, 8x8 says it finds itself in the black. Over the past 18 months, the company estimates it has saved about $5 million in annual costs by canceling subscriptions to dozens of software and educational tools it deemed unnecessary in part because Claude could provide similar capabilities. So far, 8x8's annualized bill for Claude is "well below" that figure, says Joel Neeb, the company's chief transformation and business operations officer.

claude, large language model, machine learning, (16 more...)

WIRED

Country: North America > United States > California (0.69)

Industry: Information Technology > Software (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.37)
