Gear News of the Week: The iPhone Air Is Surprisingly Repairable, and Gemini Comes to Google TV

WIRED

Plus: Withings collabs with Clue to offer advanced women's cycle tracking, there's a new Balmuda toaster, and Shokz shows off Dolby Audio-powered open earbuds. All products featured on WIRED are independently selected by our editors. However, we may receive compensation from retailers and/or from purchases of products through these links. Thinner, smaller gadgets are usually harder to repair due to their constrained space, but surprise, surprise, Apple's 5.6 mm-thin iPhone Air has earned a respectable 7/10 repair score from iFixit. A key factor in this was Apple relocating the logic board to create more space for the battery, making it easier to access.


WildIFEval: Instruction Following in the Wild

Lior, Gili, Yehudai, Asaf, Gera, Ariel, Ein-Dor, Liat

arXiv.org Artificial Intelligence

Recent LLMs have shown remarkable success in following user instructions, yet handling instructions with multiple constraints remains a significant challenge. In this work, we introduce WildIFEval - a large-scale dataset of 12K real user instructions with diverse, multi-constraint conditions. Unlike prior datasets, our collection spans a broad lexical and topical spectrum of constraints in natural user prompts. We categorize these constraints into eight high-level classes to capture their distribution and dynamics in real-world scenarios. Leveraging WildIFEval, we conduct extensive experiments to benchmark the instruction-following capabilities of leading LLMs. Our findings reveal that all evaluated models experience performance degradation with an increasing number of constraints; all models thus have considerable room for improvement on such tasks. Moreover, we observe that the specific type of constraint plays a critical role in model performance. We release our dataset to promote further research on instruction-following under complex, realistic conditions.


Non-literal Understanding of Number Words by Language Models

Tsvilodub, Polina, Gandhi, Kanishk, Zhao, Haoran, Fränken, Jan-Philipp, Franke, Michael, Goodman, Noah D.

arXiv.org Artificial Intelligence

Humans naturally interpret numbers non-literally, effortlessly combining context, world knowledge, and speaker intent. We investigate whether large language models (LLMs) interpret numbers similarly, focusing on hyperbole and pragmatic halo effects. Through systematic comparison with human data and computational models of pragmatic reasoning, we find that LLMs diverge from human interpretation in striking ways. By decomposing pragmatic reasoning into testable components, grounded in the Rational Speech Act framework, we pinpoint where LLM processing diverges from human cognition -- not in prior knowledge, but in reasoning with it. This insight leads us to develop a targeted solution -- chain-of-thought prompting inspired by an RSA model makes LLMs' interpretations more human-like. Our work demonstrates how computational cognitive models can both diagnose AI-human differences and guide development of more human-like language understanding capabilities.
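The Rational Speech Act (RSA) framework the abstract refers to models interpretation as recursive probabilistic reasoning between a speaker and a listener. A toy version can be sketched as follows; the states, priors, and utterances here are illustrative placeholders, not the paper's actual experimental setup.

```python
# Toy RSA computation: a literal listener L0, a pragmatic speaker S1,
# and a pragmatic listener L1 that reasons about the speaker.
# All numbers below are made-up illustration values.
import math

states = [30, 32, 1000000]        # hypothetical prices a speaker might mean
utterances = [30, 32, 1000000]    # number words the speaker might say
prior = {30: 0.45, 32: 0.45, 1000000: 0.10}

def literal(u, s):
    """Literal semantics: a number word denotes its exact value
    (tiny epsilon avoids log(0) downstream)."""
    return 1.0 if u == s else 1e-9

def L0(u):
    """Literal listener: condition the prior on the literal meaning."""
    scores = {s: literal(u, s) * prior[s] for s in states}
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

def S1(s, alpha=1.0, cost=lambda u: 0.0):
    """Pragmatic speaker: soft-max utility of informing the literal listener."""
    scores = {u: math.exp(alpha * (math.log(L0(u)[s]) - cost(u)))
              for u in utterances}
    z = sum(scores.values())
    return {u: v / z for u, v in scores.items()}

def L1(u):
    """Pragmatic listener: Bayesian inversion of the speaker model."""
    scores = {s: prior[s] * S1(s)[u] for s in states}
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}
```

Decomposing interpretation into these stages is what lets the authors test priors and reasoning separately; a fuller model of hyperbole would also add an affect dimension to the state space.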


LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints

Ferraz, Thomas Palmeira, Mehta, Kartik, Lin, Yu-Hsiang, Chang, Haw-Shiuan, Oraby, Shereen, Liu, Sijia, Subramanian, Vivek, Chung, Tagyoung, Bansal, Mohit, Peng, Nanyun

arXiv.org Artificial Intelligence

Instruction following is a key capability for LLMs. However, recent studies have shown that LLMs often struggle with instructions containing multiple constraints (e.g. a request to create a social media post "in a funny tone" with "no hashtag"). Despite this, most evaluations focus solely on synthetic data. To address this, we introduce RealInstruct, the first benchmark designed to evaluate LLMs' ability to follow real-world multi-constrained instructions by leveraging queries real users asked AI assistants. We also investigate model-based evaluation as a cost-effective alternative to human annotation for this task. Our findings reveal that even the proprietary GPT-4 model fails to meet at least one constraint on over 21% of instructions, highlighting the limitations of state-of-the-art models. To address the performance gap between open-source and proprietary models, we propose the Decompose, Critique and Refine (DeCRIM) self-correction pipeline, which enhances LLMs' ability to follow constraints. DeCRIM works by decomposing the original instruction into a list of constraints and using a Critic model to decide when and where the LLM's response needs refinement. Our results show that DeCRIM improves Mistral's performance by 7.3% on RealInstruct and 8.0% on IFEval even with weak feedback. Moreover, we demonstrate that with strong feedback, open-source LLMs with DeCRIM can outperform GPT-4 on both benchmarks.


A Dialogue Game for Eliciting Balanced Collaboration

Jeknić, Isidora, Schlangen, David, Koller, Alexander

arXiv.org Artificial Intelligence

Collaboration is an integral part of human dialogue. Typical task-oriented dialogue games assign asymmetric roles to the participants, which limits their ability to elicit naturalistic role-taking in collaboration and its negotiation. We present a novel and simple online setup that favors balanced collaboration: a two-player 2D object placement game in which the players must negotiate the goal state themselves. We show empirically that human players exhibit a variety of role distributions, and that balanced collaboration improves task performance. We also present an LLM-based baseline agent which demonstrates that automatic playing of our game is an interesting challenge for artificial systems.


Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models

Sarch, Gabriel, Wu, Yue, Tarr, Michael J., Fragkiadaki, Katerina

arXiv.org Artificial Intelligence

Pre-trained and frozen large language models (LLMs) can effectively map simple scene rearrangement instructions to programs over a robot's visuomotor functions through appropriate few-shot example prompting. Fixed prompts, however, fall short when parsing open-domain natural language or adapting to a user's idiosyncratic procedures not known at prompt-engineering time. In this paper, we introduce HELPER, an embodied agent equipped with an external memory of language-program pairs that parses free-form human-robot dialogue into action programs through retrieval-augmented LLM prompting: relevant memories are retrieved based on the current dialogue, instruction, correction, or VLM description, and used as in-context prompt examples for LLM querying. The memory is expanded during deployment to include pairs of user's language and action plans, to assist future inferences and personalize them to the user's language and routines. HELPER sets a new state-of-the-art in the TEACh benchmark in both Execution from Dialog History (EDH) and Trajectory from Dialogue (TfD), with a 1.7x improvement over the previous state-of-the-art for TfD. Our models, code, and video results can be found in our project's website: https://helper-agent-llm.github.io.
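The retrieval-augmented prompting loop can be sketched as follows. The similarity function here is a toy word-overlap score standing in for a learned embedding, and all names are illustrative rather than HELPER's actual implementation.

```python
# Sketch of memory-augmented prompting: store (utterance, program) pairs,
# retrieve the nearest ones for a new request, use them as in-context
# examples. A real system would use embedding similarity, not Jaccard.

def similarity(a: str, b: str) -> float:
    """Toy similarity: Jaccard overlap of lowercased word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, len(wa | wb))

class ExampleMemory:
    def __init__(self):
        self.pairs: list[tuple[str, str]] = []  # (utterance, program)

    def add(self, utterance: str, program: str) -> None:
        """Deployment-time expansion: remember a new language-program pair."""
        self.pairs.append((utterance, program))

    def retrieve(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        """Return the k stored pairs most similar to the query."""
        ranked = sorted(self.pairs,
                        key=lambda p: similarity(query, p[0]),
                        reverse=True)
        return ranked[:k]

def build_prompt(query: str, memory: ExampleMemory) -> str:
    """Assemble retrieved pairs as few-shot examples for the LLM."""
    shots = "\n".join(f"User: {u}\nProgram: {p}"
                      for u, p in memory.retrieve(query))
    return f"{shots}\nUser: {query}\nProgram:"
```

Because the memory grows at deployment time, the same mechanism that retrieves examples also personalizes the agent to a user's phrasing and routines.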


Amazon's new AI tool conjures fake backgrounds for real products

Engadget

Amazon is rolling out a new beta feature that lets advertisers create AI-generated image backgrounds for products. The company describes it as "a generative AI solution designed to remove creative barriers" while boosting ad performance. "It's a perfect use for generative AI -- less effort and better outcomes," Colleen Aubrey, senior vice president of Amazon Ads Products and Technology, wrote Wednesday in an announcement blog post. The company views the feature as an ideal alternative to product shots in front of generic white backgrounds (or bad Photoshop jobs). Amazon says the process is easy and requires no technical expertise.


Integrating Symbolic Reasoning into Neural Generative Models for Design Generation

Jacobson, Maxwell Joseph, Xue, Yexiang

arXiv.org Artificial Intelligence

Design generation requires tight integration of neural and symbolic reasoning, as good design must meet explicit user needs and honor implicit rules for aesthetics, utility, and convenience. Current automated design tools driven by neural networks produce appealing designs, but cannot satisfy user specifications and utility requirements. Symbolic reasoning tools, such as constraint programming, cannot perceive low-level visual information in images or capture subtle aspects such as aesthetics. We introduce the Spatial Reasoning Integrated Generator (SPRING) for design generation. SPRING embeds a neural and symbolic integrated spatial reasoning module inside the deep generative network. The spatial reasoning module decides the locations of objects to be generated in the form of bounding boxes, which are predicted by a recurrent neural network and filtered by symbolic constraint satisfaction. Embedding symbolic reasoning into neural generation guarantees that the output of SPRING satisfies user requirements. Furthermore, SPRING offers interpretability, allowing users to visualize and diagnose the generation process through the bounding boxes. SPRING is also adept at managing novel user specifications not encountered during its training, thanks to its proficiency in zero-shot constraint transfer. Quantitative evaluations and a human study reveal that SPRING outperforms baseline generative models, excelling in delivering high design quality and better meeting user specifications.
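The neural-propose / symbolically-filter pattern at the core of this abstract can be sketched as follows. The candidate boxes here stand in for the recurrent network's predictions, and the two constraints (stay on canvas, no overlap) are illustrative examples of the symbolic rules, not SPRING's actual constraint set.

```python
# Sketch of proposal-plus-symbolic-filter placement: accept a candidate
# bounding box only if it satisfies hard constraints given boxes already
# placed. In SPRING the candidates come from an RNN; here they are given.

Box = tuple[int, int, int, int]  # (x, y, width, height)

def overlaps(a: Box, b: Box) -> bool:
    """Axis-aligned rectangle intersection test."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def satisfies(box: Box, placed: list[Box],
              canvas: tuple[int, int] = (100, 100)) -> bool:
    """Symbolic check: box lies on the canvas and overlaps nothing placed."""
    x, y, w, h = box
    inside = x >= 0 and y >= 0 and x + w <= canvas[0] and y + h <= canvas[1]
    return inside and not any(overlaps(box, p) for p in placed)

def place(candidates: list[Box]) -> list[Box]:
    """Greedily accept proposed boxes that pass the symbolic filter."""
    placed: list[Box] = []
    for box in candidates:
        if satisfies(box, placed):
            placed.append(box)
    return placed
```

Filtering proposals through hard constraints is what yields the guarantee the abstract claims: whatever the neural network proposes, the final layout cannot violate the stated rules.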


Raising the steaks! World's first AI-powered grill promises to cook the perfect steak in just 90 seconds - but it has an eye-watering $3,500 price tag

Daily Mail - Science & tech

Whether it's too tough, burnt to a crisp or just dripping in fat, cooking steak on the outdoor grill rarely does the cut of meat justice. Thankfully, a British firm has created an artificial intelligence (AI)-powered grill that it claims makes a perfect steak in just 90 seconds under controlled conditions. Perfecta, from Birmingham-based firm Seergrills, cooks the meat as it's held in place vertically, like a piece of bread in a toaster, with ultra-hot grills on either side. It has AI-powered software called NeuralFire, which relies on data gathered from sensors inside the machine and cooking preferences input by the user. However, if you want to get hold of one you'd better start saving - the device has an eye-watering $3,500 price tag.


AI put me in a 'South Park' episode

Engadget

It was just another day in South Park. The kids were making fun of each other on the playground, while the parents were all doing their best to maintain their sanity in the small Colorado town. And then there was me, a tech journalist going door-to-door warning about the impending AI apocalypse. No, I wasn't actually guest starring on the long-running TV series -- I was thrust into an episode entirely produced by the Showrunner AI model from The Simulation, the next iteration of the VR studio Fable. All it took was some audio of my voice (recorded during a call with The Simulation's CEO Edward Saatchi), a picture and a two-sentence prompt to produce the episode.