Goto

Collaborating Authors

 Media


DiffusionLight-Turbo: Accelerated Light Probes for Free via Single-Pass Chrome Ball Inpainting

arXiv.org Artificial Intelligence

We introduce a simple yet effective technique for estimating lighting from a single low-dynamic-range (LDR) image by reframing the task as a chrome ball inpainting problem. This approach leverages a pre-trained diffusion model, Stable Diffusion XL, to overcome the generalization failures of existing methods that rely on limited HDR panorama datasets. While conceptually simple, the task remains challenging because diffusion models often insert incorrect or inconsistent content and cannot readily generate chrome balls in HDR format. Our analysis reveals that the inpainting process is highly sensitive to the initial noise in the diffusion process, occasionally resulting in unrealistic outputs. To address this, we first introduce DiffusionLight, which uses iterative inpainting to compute a median chrome ball from multiple outputs to serve as a stable, low-frequency lighting prior that guides the generation of a high-quality final result. To generate high-dynamic-range (HDR) light probes, an Exposure LoRA is fine-tuned to create LDR images at multiple exposure values, which are then merged. While effective, DiffusionLight is time-intensive, requiring approximately 30 minutes per estimation. To reduce this overhead, we introduce DiffusionLight-Turbo, which reduces the runtime to about 30 seconds with minimal quality loss. This 60x speedup is achieved by training a Turbo LoRA to directly predict the averaged chrome balls from the iterative process. Inference is further streamlined into a single denoising pass using a LoRA swapping technique. Experimental results that show our method produces convincing light estimates across diverse settings and demonstrates superior generalization to in-the-wild scenarios. Our code is available at https://diffusionlight.github.io/turbo


Exploring Classical Piano Performance Generation with Expressive Music Variational AutoEncoder

arXiv.org Artificial Intelligence

The creativity of classical music arises not only from composers who craft the musical sheets but also from performers who interpret the static notations with expressive nuances. This paper addresses the challenge of generating classical piano performances from scratch, aiming to emulate the dual roles of composer and pianist in the creative process. We introduce the Expressive Compound Word (ECP) representation, which effectively captures both the metrical structure and expressive nuances of classical performances. Building on this, we propose the Expressive Music Variational AutoEncoder (XMVAE), a model featuring two branches: a Vector Quantized Variational AutoEncoder (VQ-VAE) branch that generates score-related content, representing the Composer, and a vanilla VAE branch that produces expressive details, fulfilling the role of Pianist. These branches are jointly trained with similar Seq2Seq architectures, leveraging a multiscale encoder to capture beat-level contextual information and an orthogonal Transformer decoder for efficient compound tokens decoding. Both objective and subjective evaluations demonstrate that XMVAE generates classical performances with superior musical quality compared to state-of-the-art models. Furthermore, pretraining the Composer branch on extra musical score datasets contribute to a significant performance gain.


Epistemic Scarcity: The Economics of Unresolvable Unknowns

arXiv.org Artificial Intelligence

This paper presents a praxeological analysis of artificial intelligence and algorithmic governance, challenging assumptions about the capacity of machine systems to sustain economic and epistemic order. Drawing on Misesian a priori reasoning and Austrian theories of entrepreneurship, we argue that AI systems are incapable of performing the core functions of economic coordination: interpreting ends, discovering means, and communicating subjective value through prices. Where neoclassical and behavioural models treat decisions as optimisation under constraint, we frame them as purposive actions under uncertainty. We critique dominant ethical AI frameworks such as Fairness, Accountability, and Transparency (FAT) as extensions of constructivist rationalism, which conflict with a liberal order grounded in voluntary action and property rights. Attempts to encode moral reasoning in algorithms reflect a misunderstanding of ethics and economics. However complex, AI systems cannot originate norms, interpret institutions, or bear responsibility. They remain opaque, misaligned, and inert. Using the concept of epistemic scarcity, we explore how information abundance degrades truth discernment, enabling both entrepreneurial insight and soft totalitarianism. Our analysis ends with a civilisational claim: the debate over AI concerns the future of human autonomy, institutional evolution, and reasoned choice. The Austrian tradition, focused on action, subjectivity, and spontaneous order, offers the only coherent alternative to rising computational social control.


Enhanced Influence-aware Group Recommendation for Online Media Propagation

arXiv.org Artificial Intelligence

Group recommendation over social media streams has attracted significant attention due to its wide applications in domains such as e-commerce, entertainment, and online news broadcasting. By leveraging social connections and group behaviours, group recommendation (GR) aims to provide more accurate and engaging content to a set of users rather than individuals. Recently, influence-aware GR has emerged as a promising direction, as it considers the impact of social influence on group decision-making. In earlier work, we proposed Influence-aware Group Recommendation (IGR) to solve this task. However, this task remains challenging due to three key factors: the large and ever-growing scale of social graphs, the inherently dynamic nature of influence propagation within user groups, and the high computational overhead of real-time group-item matching. To tackle these issues, we propose an Enhanced Influence-aware Group Recommendation (EIGR) framework. First, we introduce a Graph Extraction-based Sampling (GES) strategy to minimise redundancy across multiple temporal social graphs and effectively capture the evolving dynamics of both groups and items. Second, we design a novel DYnamic Independent Cascade (DYIC) model to predict how influence propagates over time across social items and user groups. Finally, we develop a two-level hash-based User Group Index (UG-Index) to efficiently organise user groups and enable real-time recommendation generation. Extensive experiments on real-world datasets demonstrate that our proposed framework, EIGR, consistently outperforms state-of-the-art baselines in both effectiveness and efficiency.


Long-Sequence Memory with Temporal Kernels and Dense Hopfield Functionals

arXiv.org Artificial Intelligence

In this study we introduce a novel energy functional for long-sequence memory, building upon the framework of dense Hopfield networks which achieves exponential storage capacity through higher-order interactions. Building upon earlier work on long-sequence Hopfield memory models, we propose a temporal kernal $K(m, k)$ to incorporate temporal dependencies, enabling efficient sequential retrieval of patterns over extended sequences. We demonstrate the successful application of this technique for the storage and sequential retrieval of movies frames which are well suited for this because of the high dimensional vectors that make up each frame creating enough variation between even sequential frames in the high dimensional space. The technique has applications in modern transformer architectures, including efficient long-sequence modeling, memory augmentation, improved attention with temporal bias, and enhanced handling of long-term dependencies in time-series data. Our model offers a promising approach to address the limitations of transformers in long-context tasks, with potential implications for natural language processing, forecasting, and beyond.


Towards culturally-appropriate conversational AI for health in the majority world: An exploratory study with citizens and professionals in Latin America

arXiv.org Artificial Intelligence

There is justifiable interest in leveraging conversational AI (CAI) for health across the majority world, but to be effective, CAI must respond appropriately within cultur ally and linguistically diverse context s . Therefore, we need ways to address the fact that current LLMs exclude many lived experience s globally . Various advances are underway which focus on top - down approaches and increas ing training data . In this paper, we aim to complement these with a bottom - up locally - grounded approach based on qualitative data collected during participatory workshops in Latin America. Our goal is to construct a rich and human - centred understanding o f: a) potential areas of cultural misalignment in digital health; b) regional perspectives on chatbots for health and c) strategies for creating culturally - appropriate CAI; with a focus on the understudied Latin American context . Our findings show that academic boundaries on notions of cultur e lose meaning at the ground level and technologies will need to engage with a broad er framework; one that encapsulates the way economics, politics, geogr aphy and local logistics are entangled in cultural experience. To this end, we introduce a framework for ' Pluriversal Conversational AI for H ealth ' which allows for the possibility that more relationality and tolerance, rather than just more data, may be called for .


LoRA Fine-Tuning Without GPUs: A CPU-Efficient Meta-Generation Framework for LLMs

arXiv.org Machine Learning

Low-Rank Adapters (LoRAs) have transformed the fine-tuning of Large Language Models (LLMs) by enabling parameter-efficient updates. However, their widespread adoption remains limited by the reliance on GPU-based training. In this work, we propose a theoretically grounded approach to LoRA fine-tuning designed specifically for users with limited computational resources, particularly those restricted to standard laptop CPUs. Our method learns a meta-operator that maps any input dataset, represented as a probability distribution, to a set of LoRA weights by leveraging a large bank of pre-trained adapters for the Mistral-7B-Instruct-v0.2 model. Instead of performing new gradient-based updates, our pipeline constructs adapters via lightweight combinations of existing LoRAs directly on CPU. While the resulting adapters do not match the performance of GPU-trained counterparts, they consistently outperform the base Mistral model on downstream tasks, offering a practical and accessible alternative to traditional GPU-based fine-tuning.


Workflow-Based Evaluation of Music Generation Systems

arXiv.org Artificial Intelligence

This study presents an exploratory evaluation of Music Generation Systems (MGS) within contemporary music production workflows by examining eight open-source systems. The evaluation framework combines technical insights with practical experimentation through criteria specifically designed to investigate the practical and creative affordances of the systems within the iterative, non-linear nature of music production. Employing a single-evaluator methodology as a preliminary phase, this research adopts a mixed approach utilizing qualitative methods to form hypotheses subsequently assessed through quantitative metrics. The selected systems represent architectural diversity across both symbolic and audio-based music generation approaches, spanning composition, arrangement, and sound design tasks. The investigation addresses limitations of current MGS in music production, challenges and opportunities for workflow integration, and development potential as collaborative tools while maintaining artistic authenticity. Findings reveal these systems function primarily as complementary tools enhancing rather than replacing human expertise. They exhibit limitations in maintaining thematic and structural coherence that emphasize the indispensable role of human creativity in tasks demanding emotional depth and complex decision-making. This study contributes a structured evaluation framework that considers the iterative nature of music creation. It identifies methodological refinements necessary for subsequent comprehensive evaluations and determines viable areas for AI integration as collaborative tools in creative workflows. The research provides empirically-grounded insights to guide future development in the field.


Unexpected drone operated by unidentified party sighted near USMNT training grounds: reports

FOX News

Fox News Flash top sports headlines are here. Check out what's clicking on Foxnews.com. The U.S. men's national team is vying for the coveted CONCACAF Gold Cup winners trophy. But, as the USMNT prepared for Wednesday's semifinal match against Guatemala, a flying object caused a disruption at the team's training grounds. An unidentified party was believed to have been operating what appeared to be a drone in the vicinity of the team's training facility in St. Louis, CBS Sports reported.


New Google AI makes robots smarter without the cloud

FOX News

Google DeepMind has introduced a powerful on-device version of its Gemini Robotics AI. This new system allows robots to complete complex tasks without relying on a cloud connection. Known as Gemini Robotics On-Device, the model brings Gemini's advanced reasoning and control capabilities directly into physical robots. It is designed for fast, reliable performance in places with poor or no internet connectivity, making it ideal for real-world, latency-sensitive environments. Unlike its cloud-connected predecessor, this version runs entirely on the robot itself.