d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Zhao, Siyan, Gupta, Devaansh, Zheng, Qinqing, Grover, Aditya
Recent large language models (LLMs) have demonstrated strong reasoning capabilities that benefit from online reinforcement learning (RL). These capabilities have primarily been demonstrated within the left-to-right autoregressive (AR) generation paradigm. In contrast, non-autoregressive paradigms based on diffusion generate text in a coarse-to-fine manner. Although recent diffusion-based large language models (dLLMs) have achieved competitive language modeling performance compared to their AR counterparts, it remains unclear if dLLMs can also leverage recent advances in LLM reasoning. To this end, we propose d1, a framework to adapt pre-trained masked dLLMs into reasoning models via a combination of supervised finetuning (SFT) and RL. Specifically, we develop and extend techniques to improve reasoning in pretrained dLLMs: (a) we utilize a masked SFT technique to distill knowledge and instill self-improvement behavior directly from existing datasets, and (b) we introduce a novel critic-free, policy-gradient-based RL algorithm called diffu-GRPO, the first integration of policy gradient methods into masked dLLMs. Through empirical studies, we investigate the performance of different post-training recipes on multiple mathematical and planning benchmarks. We find that d1 yields the best performance and significantly improves the performance of a state-of-the-art dLLM. Our code is released at https://dllm-reasoning.github.io/.
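The abstract does not spell out diffu-GRPO's mechanics, but the "critic-free" label suggests a GRPO-style baseline: rewards for a group of completions sampled from the same prompt are normalized against the group's own statistics instead of a learned value function. The sketch below illustrates only that generic group-relative advantage computation; the function name and reward values are hypothetical, not taken from the paper.

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Critic-free advantages: standardize each completion's reward
    against the mean and standard deviation of its own sampling group
    (the GRPO-style baseline, in place of a learned critic)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions from one prompt; binary correctness rewards.
# Above-average completions get positive advantages, others negative.
advs = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the baseline comes from the group itself, the advantages always sum to (approximately) zero, so the policy gradient pushes probability mass from below-average completions toward above-average ones without training a separate critic.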
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
Jiang, Dongzhi, Zhang, Renrui, Guo, Ziyu, Li, Yanwei, Qi, Yu, Chen, Xinyan, Wang, Liuhui, Jin, Jianhan, Guo, Claire, Yan, Shen, Zhang, Bo, Fu, Chaoyou, Gao, Peng, Li, Hongsheng
Answering questions with Chain-of-Thought (CoT) has significantly enhanced the reasoning capabilities of Large Language Models (LLMs), yet its impact on Large Multimodal Models (LMMs) still lacks a systematic assessment and in-depth investigation. In this paper, we introduce MME-CoT, a specialized benchmark evaluating the CoT reasoning performance of LMMs, spanning six domains: math, science, OCR, logic, space-time, and general scenes. As the first comprehensive study in this area, we propose a thorough evaluation suite incorporating three novel metrics that assess reasoning quality, robustness, and efficiency at a fine-grained level. Leveraging curated high-quality data and a unique evaluation strategy, we conduct an in-depth analysis of state-of-the-art LMMs, uncovering several key insights: 1) Models with a reflection mechanism demonstrate superior CoT quality, with Kimi k1.5 outperforming GPT-4o and achieving the highest quality results; 2) CoT prompting often degrades LMM performance on perception-heavy tasks, suggesting a potentially harmful overthinking behavior; and 3) Although their CoT quality is high, LMMs with reflection exhibit significant inefficiency in both the normal response and self-correction phases. We hope MME-CoT serves as a foundation for advancing multimodal reasoning in LMMs. Project Page: https://mmecot.github.io/
- Workflow (0.68)
- Research Report (0.50)
- Health & Medicine > Consumer Health (1.00)
- Education > Health & Safety > School Nutrition (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Robotic dog helps those facing mental health and cognitive challenges
Jennie the artificial intelligence-powered robotic dog is designed to provide comfort and companionship to those with mental health challenges. U.S. robotics company Tombot has introduced Jennie, an innovative AI-powered robotic pet designed to provide comfort and companionship to those facing cognitive health challenges. This groundbreaking creation is set to transform the lives of millions struggling with dementia, mild cognitive impairment and various mental health issues. Jennie's inception stems from a personal tragedy experienced by Tombot CEO Tom Stevens. When his mother, Nancy, was diagnosed with Alzheimer's, the family had to make the heart-wrenching decision to rehome her beloved dog, Golden Bear.
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.93)
Trust but Verify: Programmatic VLM Evaluation in the Wild
Prabhu, Viraj, Purushwalkam, Senthil, Yan, An, Xiong, Caiming, Xu, Ran
Vision-Language Models (VLMs) often generate plausible but incorrect responses to visual queries. However, reliably quantifying the effect of such hallucinations in free-form responses to open-ended queries is challenging, as it requires visually verifying each claim within the response. To construct PROVE, we provide a large language model (LLM) with a high-fidelity scene-graph representation constructed from a hyper-detailed image caption, and prompt it to generate diverse question-answer (QA) pairs, as well as programs that can be executed over the scene graph to verify each QA pair. We thus construct a benchmark of 10.5k challenging but visually grounded QA pairs. Next, to evaluate free-form model responses to queries in PROVE, we propose a programmatic evaluation strategy that measures both the helpfulness and truthfulness of a response within a unified scene-graph-based framework. We benchmark the helpfulness-truthfulness trade-offs of a range of VLMs on PROVE, finding that very few are in fact able to achieve a good balance between the two. Vision-language models (VLMs) have emerged as an effective solution for generating responses to queries about visual content. This has led to a flurry of research on reliably benchmarking VLM performance (Liu et al., 2024a), by measuring not just the helpfulness but also the truthfulness of their responses. Existing discriminative benchmarks (Hu et al., 2023; Lovenia et al., 2023; Li et al., 2023) evaluate the model's responses to close-ended, existence-based queries ("Is there a man in this image?"). While discriminative benchmarks ease evaluation, they do not realistically simulate in-the-wild usage.
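The core idea of verifying a QA pair by executing a program over a scene graph can be illustrated with a toy example. The scene-graph schema, object names, and helper functions below are hypothetical stand-ins, not PROVE's actual representation.

```python
# Toy scene graph: objects with categories and attributes.
scene_graph = {
    "objects": {
        "man_1": {"category": "man", "attributes": ["standing"]},
        "dog_1": {"category": "dog", "attributes": ["brown"]},
    },
}

def exists(graph, category):
    """Verification program for 'Is there a <category> in the image?'"""
    return any(o["category"] == category for o in graph["objects"].values())

def verify_qa(graph, program, expected_answer):
    """Execute a generated program over the scene graph and check that
    its result matches the QA pair's expected answer."""
    return program(graph) == expected_answer

# The QA pair ("Is there a man?", yes) is grounded in the graph:
ok = verify_qa(scene_graph, lambda g: exists(g, "man"), True)
```

Because every answer is recomputed from the graph rather than trusted as written, only QA pairs whose claims are actually supported by the image representation survive into the benchmark.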
The 5 Best Prime Day Vacuum Deals We've Found (2024)
I have a perhaps inappropriate, anthropomorphic relationship with whatever robot vacuum is running in my house. No matter how much trouble they cause me--if they get trapped in the ledge by the fireplace or lost under the couch--I never forget that it's here to help me battle the chaotic mess that my two kids and two dogs perpetrate upon me daily. Have I convinced you that you need one, too? You're in luck because the Amazon Prime Day vacuum deals lineup includes five of my top picks. Whether you need an all-in-one cleaning station, a simple picker-upper after dinner, or one with an air freshener, we have you covered.
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Dahary, Omer, Patashnik, Or, Aberman, Kfir, Cohen-Or, Daniel
Text-to-image diffusion models have an unprecedented ability to generate diverse and high-quality images. However, they often struggle to faithfully capture the intended semantics of complex input prompts that include multiple subjects. Recently, numerous layout-to-image extensions have been introduced to improve user control, aiming to localize subjects represented by specific tokens. Yet, these methods often produce semantically inaccurate images, especially when dealing with multiple semantically or visually similar subjects. In this work, we study and analyze the causes of these limitations. Our exploration reveals that the primary issue stems from inadvertent semantic leakage between subjects in the denoising process. This leakage is attributed to the diffusion model's attention layers, which tend to blend the visual features of different subjects. To address these issues, we introduce Bounded Attention, a training-free method for bounding the information flow in the sampling process. Bounded Attention prevents detrimental leakage among subjects and enables guiding the generation to promote each subject's individuality, even with complex multi-subject conditioning. Through extensive experimentation, we demonstrate that our method empowers the generation of multiple subjects that better align with given prompts and layouts.
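The bounding idea, restricting attention so tokens belonging to one subject cannot blend with tokens of another, can be sketched numerically. The layout below (flat token list with per-token subject ids, dense softmax) is a simplification for illustration, not the paper's actual attention implementation.

```python
import numpy as np

def bounded_attention(scores, subject_ids):
    """Mask attention logits so a token assigned to one subject cannot
    attend to a token of a different subject; background tokens
    (id = -1) remain visible to everyone. Training-free: the mask is
    applied at sampling time, before the softmax."""
    n = len(subject_ids)
    masked = scores.astype(float).copy()
    for q in range(n):
        for k in range(n):
            cross_subject = (
                subject_ids[q] != subject_ids[k]
                and subject_ids[q] != -1
                and subject_ids[k] != -1
            )
            if cross_subject:
                masked[q, k] = -np.inf
    # Row-wise softmax over keys.
    w = np.exp(masked - masked.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

# Two subject regions (ids 0 and 1) plus one background token (-1).
ids = [0, 0, 1, -1]
attn = bounded_attention(np.zeros((4, 4)), ids)
```

After masking, each subject's attention weights are renormalized over its own tokens and the background, so no visual features leak across subject boundaries, which is the failure mode the paper attributes to semantically similar subjects.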
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Luwu Dynamics XGO-Mini2 Review: Programmable Robotic Rover
The XGO is a lap-size robot dog, marketed as "a metal pet on your desk," but it's primarily sold as a learning tool for programmers with an interest in machine vision and robotic automation. Robot pet fans should know, however, that this metallic mutt has more in common with Boston Dynamics' ominously-styled Spot than with Sony's consciously cute Aibo, with a remarkably well-made and solidly engineered metal body. Luwu Dynamics is clear that the XGO-Mini2 is more of a tool than a companion. Also, at $849, it is much more affordable and considerably more open to tinkering than Sony's $2,900 robot pet and more than $73,000 cheaper than Boston Dynamics' robotic quadruped. The XGO range, as it's sold, is fundamentally a robot body peripheral for a Raspberry Pi compute module.
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions
Chen, Jun, Zhu, Deyao, Haydarov, Kilichbek, Li, Xiang, Elhoseiny, Mohamed
Video captioning aims to convey dynamic scenes from videos using natural language, facilitating the understanding of spatiotemporal information within our environment. Although there have been recent advances, generating detailed and enriched video descriptions continues to be a substantial challenge. In this work, we introduce Video ChatCaptioner, an approach for creating more comprehensive spatiotemporal video descriptions. Our method employs a ChatGPT model as a controller, specifically designed to select frames for posing video content-driven questions. Subsequently, BLIP-2 is utilized to answer these visual queries. This question-answer framework effectively uncovers intricate video details and shows promise as a method for enhancing video content. Following multiple conversational rounds, ChatGPT can summarize the enriched video content based on the previous conversations. In human evaluation experiments, we found that 62.5% of participants agree that Video ChatCaptioner can cover more visual information compared to ground-truth captions.
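The controller/answerer interaction described above can be sketched as a simple loop. The two stub functions below stand in for ChatGPT (question asker) and BLIP-2 (visual answerer); their names, signatures, and outputs are hypothetical placeholders, not the paper's actual prompts or APIs.

```python
def chat_captioner(frames, ask_question, answer_question, rounds=3):
    """Controller/answerer loop: a language model picks a frame and
    poses a content-driven question, a VQA model answers it, and the
    growing dialogue conditions the next question. The final history
    would then be summarized into an enriched caption."""
    history = []
    for _ in range(rounds):
        frame_idx, question = ask_question(history, len(frames))
        answer = answer_question(frames[frame_idx], question)
        history.append((frame_idx, question, answer))
    return history

# Stub models for illustration only:
def ask_question(history, n_frames):
    i = len(history) % n_frames      # cycle through frames
    return i, f"What is happening in frame {i}?"

def answer_question(frame, question):
    return f"A description of {frame}."

history = chat_captioner(["f0", "f1"], ask_question, answer_question, rounds=2)
```

The design point is that the controller sees the full dialogue history, so each new question can probe details the previous answers left uncovered, rather than asking a fixed list of questions.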
Explainable Verbal Reasoner Plus (EVR+): A Natural Language Reasoning Framework that Supports Diverse Compositional Reasoning
Liang, Zhengzhong, Zhang, Zeyu, Bethard, Steven, Surdeanu, Mihai
Language models have been successfully applied to a variety of reasoning tasks in NLP, yet they still struggle with compositional generalization. In this paper we present Explainable Verbal Reasoner Plus (EVR+), a reasoning framework that enhances language models' compositional reasoning ability by (1) allowing the model to explicitly generate and execute symbolic operators, and (2) allowing the model to decompose a complex task into several simpler ones in a flexible manner. Compared with its predecessor Explainable Verbal Reasoner (EVR) and other previous approaches adopting similar ideas, our framework supports more diverse types of reasoning, such as nested loops and different types of recursion. To evaluate our reasoning framework, we build a synthetic dataset with five tasks that require compositional reasoning. Results show that our reasoning framework can enhance a fine-tuned language model's compositional generalization performance on the five tasks. We also discuss the possibility of, and the challenges in, combining our reasoning framework with a few-shot prompted language model.
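The two mechanisms above, generating symbolic operators that are executed outside the model and decomposing a task into simpler calls, can be illustrated with a tiny interpreter. The operator names and program format below are invented for this sketch and are not EVR+'s actual operator set.

```python
def execute(program, subroutines):
    """Tiny interpreter: each step is either a symbolic operator the
    framework executes directly ('set', 'add'), or a 'call' that
    delegates to a simpler sub-task, mirroring task decomposition."""
    env = {}
    for op, *args in program:
        if op == "set":
            name, value = args
            env[name] = value
        elif op == "add":
            name, a, b = args
            env[name] = env[a] + env[b]
        elif op == "call":
            name, sub, arg = args
            env[name] = subroutines[sub](env[arg])
        else:
            raise ValueError(f"unknown operator: {op}")
    return env

# Decompose "double x, then add y" into explicit, executable steps:
env = execute(
    [("set", "x", 3), ("set", "y", 4),
     ("call", "x2", "double", "x"),
     ("add", "out", "x2", "y")],
    {"double": lambda v: 2 * v},
)
# env["out"] == 10
```

Offloading the arithmetic and control flow to an executor means the language model only has to emit the right operators, which is exactly where compositional generalization tends to break down when the model must carry out every step in text.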
- North America > United States > Arizona > Pima County > Tucson (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > China > Hong Kong (0.04)
What Are Word and Sentence Embeddings?
They are the basic building block of most language models. This article's title and TL;DR have been generated with Cohere. Get started with text generation. In old futuristic movies, such as 2001: A Space Odyssey, the main computer (HAL) was able to talk to humans and understand what they said with great ease. At the time, getting computers to understand and produce language seemed like an impossible task, but the latest large language models (LLMs) are able to do this in a way that makes it almost impossible for a human to tell if they are talking to another human, or to a computer.
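An embedding represents a word or sentence as a vector of numbers, placed so that similar meanings end up close together. A standard way to compare two embeddings is cosine similarity; the toy 3-dimensional vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors:
    close to 1.0 for similar meanings, near 0.0 for unrelated ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy word embeddings: "dog" and "puppy" point in nearly the same
# direction, while "bank" points elsewhere.
emb = {
    "dog":   [0.90, 0.10, 0.00],
    "puppy": [0.85, 0.20, 0.05],
    "bank":  [0.00, 0.10, 0.95],
}

sim_related = cosine_similarity(emb["dog"], emb["puppy"])
sim_unrelated = cosine_similarity(emb["dog"], emb["bank"])
```

This geometric view is what lets language models work with meaning numerically: "nearby vector" becomes a computable stand-in for "similar meaning".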