AITopics | slider

Collaborating Authors

slider

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FouRA: Fourier Low Rank Adaptation

Neural Information Processing SystemsFeb-16-2026, 06:36:37 GMT

While Low-Rank Adaptation (LoRA) has proven beneficial for efficiently fine-tuning large models, LoRA fine-tuned text-to-image diffusion models lack diversity in the generated images, as the model tends to copy data from the observed training samples.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

factor of 1000, opening the road for gigapixel image generation at no additional cost. Our cascading method uses the image generated at the lowest resolution as a

Neural Information Processing SystemsFeb-12-2026, 10:12:21 GMT

In this work, we introduce Pixelsmith, a zero-shot text-to-image generative framework to sample images at higher-resolutions with a single GPU.

machine learning, natural language, resolution, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
Europe > Switzerland > Zug > Zug (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry: Media > Photography (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Perceptual adjustment queries and an inverted measurement paradigm for low-rank metric learning Austin Xu

Neural Information Processing SystemsFeb-10-2026, 10:42:53 GMT

Cardinal data are numerical scores.

artificial intelligence, machine learning, matrix, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Semantic Document Derendering: SVG Reconstruction via Vision-Language Modeling

Hazimeh, Adam, Wang, Ke, Collier, Mark, Baechler, Gilles, Kokiopoulou, Efi, Frossard, Pascal

arXiv.org Artificial IntelligenceNov-18-2025

Multimedia documents such as slide presentations and posters are designed to be interactive and easy to modify. Yet, they are often distributed in a static raster format, which limits editing and customization. Restoring their editability requires converting these raster images back into structured vector formats. However, existing geometric raster-vectorization methods, which rely on low-level primitives like curves and polygons, fall short at this task. Specifically, when applied to complex documents like slides, they fail to preserve the high-level structure, resulting in a flat collection of shapes where the semantic distinction between image and text elements is lost. To overcome this limitation, we address the problem of semantic document derendering by introducing SliDer, a novel framework that uses Vision-Language Models (VLMs) to derender slide images as compact and editable Scalable Vector Graphic (SVG) representations. SliDer detects and extracts attributes from individual image and text elements in a raster input and organizes them into a coherent SVG format. Crucially, the model iteratively refines its predictions during inference in a process analogous to human design, generating SVG code that more faithfully reconstructs the original raster upon rendering. Furthermore, we introduce Slide2SVG, a novel dataset comprising raster-SVG pairs of slide documents curated from real-world scientific presentations, to facilitate future research in this domain. Our results demonstrate that SliDer achieves a reconstruction LPIPS of 0.069 and is favored by human evaluators in 82.9% of cases compared to the strongest zero-shot VLM baseline.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.13478

Genre: Research Report > New Finding (0.86)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Two-Layer Electrostatic Film Actuator with High Actuation Stress and Integrated Brake

Wang, Huacen, Wang, Hongqiang

arXiv.org Artificial IntelligenceNov-12-2025

Robotic systems driven by conventional motors often suffer from challenges such as large mass, complex control algorithms, and the need for additional braking mechanisms, which limit their applications in lightweight and compact robotic platforms. Electrostatic film actuators offer several advantages, including thinness, flexibility, lightweight construction, and high open-loop positioning accuracy. However, the actuation stress exhibited by conventional actuators in air still needs improvement, particularly for the widely used three-phase electrode design. To enhance the output performance of actuators, this paper presents a two-layer electrostatic film actuator with an integrated brake. By alternately distributing electrodes on both the top and bottom layers, a smaller effective electrode pitch is achieved under the same fabrication constraints, resulting in an actuation stress of approximately 241~N/m$^2$, representing a 90.5\% improvement over previous three-phase actuators operating in air. Furthermore, its integrated electrostatic adhesion mechanism enables load retention under braking mode. Several demonstrations, including a tug-of-war between a conventional single-layer actuator and the proposed two-layer actuator, a payload operation, a one-degree-of-freedom robotic arm, and a dual-mode gripper, were conducted to validate the actuator's advantageous capabilities in both actuation and braking modes.

actuation force, actuator, artificial intelligence, (14 more...)

arXiv.org Artificial Intelligence

2511.08005

Country: Asia > China (0.29)

Genre: Research Report (0.82)

Industry: Automobiles & Trucks (0.86)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

How AI Fails: An Interactive Pedagogical Tool for Demonstrating Dialectal Bias in Automated Toxicity Models

Ghimire, Subhojit

arXiv.org Artificial IntelligenceNov-11-2025

Now that AI-driven moderation has become pervasive in everyday life, we often hear claims that "the AI is biased". While this is often said jokingly, the light-hearted remark reflects a deeper concern. How can we be certain that an online post flagged as "inappropriate" was not simply the victim of a biased algorithm? This paper investigates this problem using a dual approach. First, I conduct a quantitative benchmark of a widely used toxicity model (unitary/toxic-bert) to measure performance disparity between text in African-American English (AAE) and Standard American English (SAE). The benchmark reveals a clear, systematic bias: on average, the model scores AAE text as 1.8 times more toxic and 8.8 times higher for "identity hate". Second, I introduce an interactive pedagogical tool that makes these abstract biases tangible. The tool's core mechanic, a user-controlled "sensitivity threshold," demonstrates that the biased score itself is not the only harm; instead, the more-concerning harm is the human-set, seemingly neutral policy that ultimately operationalises discrimination. This work provides both statistical evidence of disparate impact and a public-facing tool designed to foster critical AI literacy.

artificial intelligence, demonstrating dialectal bias, interactive pedagogical tool, (8 more...)

arXiv.org Artificial Intelligence

2511.06676

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

FreeSliders: Training-Free, Modality-Agnostic Concept Sliders for Fine-Grained Diffusion Control in Images, Audio, and Video

Ezra, Rotem, Zisling, Hedi, Berman, Nimrod, Naiman, Ilan, Gorkor, Alexey, Nochumsohn, Liran, Nachmani, Eliya, Azencot, Omri

arXiv.org Artificial IntelligenceNov-4-2025

Diffusion models have become state-of-the-art generative models for images, audio, and video, yet enabling fine-grained controllable generation, i.e., continuously steering specific concepts without disturbing unrelated content, remains challenging. Concept Sliders (CS) offer a promising direction by discovering semantic directions through textual contrasts, but they require per-concept training and architecture-specific fine-tuning (e.g., LoRA), limiting scalability to new modalities. In this work we introduce FreeSliders, a simple yet effective approach that is fully training-free and modality-agnostic, achieved by partially estimating the CS formula during inference. To support modality-agnostic evaluation, we extend the CS benchmark to include both video and audio, establishing the first suite for fine-grained concept generation control with multiple modalities. We further propose three evaluation properties along with new metrics to improve evaluation quality. Finally, we identify an open problem of scale selection and non-linear traversals and introduce a two-stage procedure that automatically detects saturation points and reparameterizes traversal for perceptually uniform, semantically meaningful edits. Extensive experiments demonstrate that our method enables plug-and-play, training-free concept control across modalities, improves over existing baselines, and establishes new tools for principled controllable generation. An interactive presentation of our benchmark and method is available at: https://azencot-group.github.io/FreeSliders/. Diffusion models have emerged as state-of-the-art generative models, capable of producing realistic and diverse outputs across images, audio, and video (Rombach et al., 2022; Ho et al., 2022; Shi et al., 2023). Beyond generating high-quality samples, a central task is controllable generation, the ability to steer the generative process along user-specified signals (Liu et al., 2023; Ho et al., 2022). In particular, text-to-x, where x is a certain modality, has emerged as a powerful control signal for generative models, offering an intuitive human interface and enabling semantically aligned control (Zhang et al., 2023a;b). This text-guided capability plays a central role in creative applications, allowing users to produce high-quality content without requiring technical knowledge or professional design skills.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2511.00103

Country: Asia (0.67)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.74)

Add feedback

Sora Has Lost Its App Store Crown to Drake and Free Chicken

WIREDOct-24-2025, 19:30:26 GMT

Dave's Hot Chicken is the top app in the iOS App Store, ending Sora's weeks-long reign. On Friday, its reign came to an end. Your new champion is Dave's Hot Chicken. Dave's Hot Chicken now rules over the App Store, where its slack-beaked, bug-eyed mascot icon expresses appropriate surprise at its ascent. How did it break the grasp of OpenAI's golem TikTok?

dave, hot chicken, sora, (12 more...)

WIRED

Country: