OmniPlay: Benchmarking Omni-Modal Models on Omni-Modal Game Playing

Bie, Fuqing, Huang, Shiyu, Tao, Xijia, Fang, Zhiqin, Pan, Leyi, Chen, Junzhe, Ren, Min, Xiang, Liuyu, He, Zhaofeng

arXiv.org Artificial Intelligence

While generalist foundation models like Gemini and GPT-4o demonstrate impressive multi-modal competence, existing evaluations fail to test their intelligence in dynamic, interactive worlds. Static benchmarks lack agency, while interactive benchmarks suffer from a severe modal bottleneck, typically ignoring crucial auditory and temporal cues. To bridge this evaluation chasm, we introduce OmniPlay, a diagnostic benchmark designed not just to evaluate, but to probe the fusion and reasoning capabilities of agentic models across the full sensory spectrum. Built on a core philosophy of modality interdependence, OmniPlay comprises a suite of five game environments that systematically create scenarios of both synergy and conflict, forcing agents to perform genuine cross-modal reasoning. Our comprehensive evaluation of six leading omni-modal models reveals a critical dichotomy: they exhibit superhuman performance on high-fidelity memory tasks but suffer from systemic failures on challenges requiring robust reasoning and strategic planning. We demonstrate that this fragility stems from brittle fusion mechanisms, which lead to catastrophic performance degradation under modality conflict, and we uncover a counter-intuitive "less is more" paradox, in which removing sensory information can improve performance. Our findings suggest that the path toward robust AGI requires a research focus beyond scaling to explicitly address synergistic fusion. Our platform is available for anonymous review at https://github.com/fuqingbie/omni-game-benchmark.
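To make the agentic setting concrete, below is a minimal sketch of the kind of agent-environment loop such a benchmark implies, where each observation bundles visual, auditory, and textual streams that the model must fuse; the environment interface, observation fields, and `RandomAgent` baseline are illustrative assumptions, not OmniPlay's actual API.

```python
# Illustrative sketch only: the observation fields, environment interface, and the
# RandomAgent baseline below are assumptions, not OmniPlay's actual API.
import random
from dataclasses import dataclass
from typing import List

@dataclass
class OmniObservation:
    frames: List[bytes]   # recent video frames (visual + temporal cues)
    audio: bytes          # synchronized audio clip (auditory cue)
    text: str             # textual game state or instructions

class RandomAgent:
    """Stand-in for an omni-modal model. A real agent would fuse all three streams
    before acting; modality-conflict trials deliberately make the streams disagree."""
    def __init__(self, actions: List[str]):
        self.actions = actions

    def act(self, obs: OmniObservation) -> str:
        return random.choice(self.actions)

def run_episode(env, agent, max_steps: int = 100) -> float:
    """Generic evaluation loop: feed fused observations to the agent, accumulate reward."""
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(obs)
        obs, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```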


The ALCHEmist: Automated Labeling 500x CHEaper than LLM Data Annotators

Neural Information Processing Systems

Large pretrained models can be used as annotators, helping replace or augment crowdworkers and enabling the distillation of generalist models into smaller specialist models. Unfortunately, this comes at a cost: employing top-of-the-line models often requires paying thousands of dollars for API calls, while the resulting datasets are static and challenging to audit. To address these challenges, we propose a simple alternative: rather than directly querying labels from pretrained models, we task models to generate programs that can produce labels. These programs can be stored and applied locally, re-used and extended, and cost orders of magnitude less. Our system, Alchemist, obtains performance comparable to or better than large language model-based annotation in a range of tasks for a fraction of the cost: on average, improvements amount to a 12.9% enhancement while the total labeling costs across all datasets are reduced by a factor of approximately 500x.


Alchemist: Towards the Design of Efficient Online Continual Learning System

Huang, Yuyang, Liu, Yuhan, Gunawi, Haryadi S., Li, Beibin, Hwang, Changho

arXiv.org Artificial Intelligence

Continual learning has become a promising solution for refining large language models incrementally by leveraging user feedback. In particular, online continual learning - iteratively training the model with small batches of user feedback - has demonstrated notable performance improvements. However, the existing practice of separating training and serving processes forces the online trainer to recompute intermediate results already produced during serving. Such redundant computation can account for 30%-42% of total training time. In this paper, we propose Alchemist, to the best of our knowledge the first online continual learning system that efficiently reuses serving activations to increase training throughput. Alchemist introduces two key techniques: (1) recording and storing activations and the KV cache only during the prefill phase, to minimize latency and memory overhead; and (2) smart activation offloading and hedging. Evaluations with inputs of varied token length sampled from the ShareGPT dataset show that, compared with a separate training cluster, Alchemist significantly increases training throughput by up to 1.72x, reduces memory usage during training by up to 47%, and supports up to 2x more training tokens - all while maintaining negligible impact on serving latency.
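As a rough illustration of the activation-reuse idea described above, here is a minimal sketch assuming a simple in-memory store keyed by request id; the class names, the FIFO eviction stand-in, and the toy forward pass are illustrative assumptions, not Alchemist's actual design.

```python
# A minimal sketch of the activation-reuse idea, not Alchemist's implementation; the
# class names, the toy "model", and the eviction policy here are illustrative only.
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class PrefillRecord:
    # State captured during the serving prefill phase only (decode-phase state is not
    # stored, keeping latency and memory overhead on the serving side small).
    prompt_ids: List[int]
    activations: List[float]   # stand-in for per-layer hidden states
    kv_cache: List[float]      # stand-in for cached keys/values

class ActivationStore:
    """Keeps prefill records from serving so the online trainer can skip recomputing them.
    A real system would offload cold records to host memory/disk and hedge: recompute
    instead of fetching whenever the fetch is predicted to be slower."""
    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self._records: Dict[str, PrefillRecord] = {}

    def put(self, request_id: str, record: PrefillRecord) -> None:
        if len(self._records) >= self.capacity:
            self._records.pop(next(iter(self._records)))  # naive FIFO eviction as an offload stand-in
        self._records[request_id] = record

    def get(self, request_id: str) -> Optional[PrefillRecord]:
        return self._records.pop(request_id, None)

def training_forward(prompt_ids: List[int], cached: Optional[PrefillRecord]) -> str:
    # Toy stand-in for the trainer's forward pass: with a cached record it resumes from
    # the serving-time prefill state instead of re-running the prompt through the model.
    if cached is not None:
        return f"resumed from cached prefill over {len(cached.prompt_ids)} tokens"
    return f"recomputed prefill over {len(prompt_ids)} tokens"

if __name__ == "__main__":
    store = ActivationStore()
    store.put("req-1", PrefillRecord(prompt_ids=[101, 2023, 2003], activations=[0.1], kv_cache=[0.2]))
    print(training_forward([101, 2023, 2003], store.get("req-1")))  # cache hit: no recompute
    print(training_forward([101, 2023, 2003], store.get("req-2")))  # cache miss: recompute
```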


The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators

Huang, Tzu-Heng, Cao, Catherine, Bhargava, Vaishnavi, Sala, Frederic

arXiv.org Artificial Intelligence

Large pretrained models can be used as annotators, helping replace or augment crowdworkers and enabling the distillation of generalist models into smaller specialist models. Unfortunately, this comes at a cost: employing top-of-the-line models often requires paying thousands of dollars for API calls, while the resulting datasets are static and challenging to audit. To address these challenges, we propose a simple alternative: rather than directly querying labels from pretrained models, we task models to generate programs that can produce labels. These programs can be stored and applied locally, re-used and extended, and cost orders of magnitude less. Our system, Alchemist, obtains performance comparable to or better than large language model-based annotation in a range of tasks for a fraction of the cost: on average, improvements amount to a 12.9% enhancement while the total labeling costs across all datasets are reduced by a factor of approximately 500x.
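To make the program-as-annotator idea concrete, here is a minimal sketch; the `query_llm` helper, the prompt wording, and the keyword-based sentiment program are illustrative assumptions rather than the paper's actual prompts or generated code.

```python
# Minimal sketch of program-based annotation (assumptions: the query_llm helper,
# the prompt wording, and the keyword-based program below are illustrative only).
import re

def query_llm(prompt: str) -> str:
    """Placeholder for a single paid API call to a large pretrained model."""
    raise NotImplementedError("wire up your model provider here")

# One-time, auditable request: ask the model for a labeling PROGRAM, not for labels.
PROGRAM_PROMPT = (
    "Write a Python function label(text: str) -> int that returns 1 for positive "
    "movie reviews and 0 for negative ones, using simple heuristics."
)

def build_labeler():
    source = query_llm(PROGRAM_PROMPT)   # paid once, instead of once per example
    namespace = {}
    exec(source, namespace)              # the returned program can be stored, audited, re-used
    return namespace["label"]

# A hand-written stand-in for what such a generated program might look like:
def label(text: str) -> int:
    positive = {"great", "wonderful", "excellent", "loved"}
    negative = {"boring", "awful", "terrible", "hated"}
    words = set(re.findall(r"[a-z]+", text.lower()))
    return int(len(words & positive) >= len(words & negative))

if __name__ == "__main__":
    unlabeled = ["A wonderful, excellent film.", "Boring and awful from start to finish."]
    print([label(t) for t in unlabeled])  # labels computed locally at ~zero marginal cost
```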


Alchemist: Parametric Control of Material Properties with Diffusion Models

Sharma, Prafull, Jampani, Varun, Li, Yuanzhen, Jia, Xuhui, Lagun, Dmitry, Durand, Fredo, Freeman, William T., Matthews, Mark

arXiv.org Artificial Intelligence

We propose a method to control material attributes of objects, such as roughness, metallic, albedo, and transparency, in real images. Our method capitalizes on the generative prior of text-to-image models known for photorealism, employing a scalar value and instructions to alter low-level material properties. Addressing the lack of datasets with controlled material attributes, we generated an object-centric synthetic dataset with physically based materials. Fine-tuning a modified pre-trained text-to-image model on this synthetic dataset enables us to edit material properties in real-world images while preserving all other attributes. We show the potential application of our model to material-edited NeRFs.
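As a rough sketch of how such scalar-plus-instruction conditioning might be invoked, consider the snippet below; the `edit_model` callable, its signature, and the [-1, 1] strength convention are placeholders assumed for illustration, not the paper's released interface.

```python
# Illustrative only: `edit_model` and its signature are placeholders for a fine-tuned
# instruction-conditioned image-editing diffusion model; they are not the paper's API.
from typing import Any, Callable

MATERIAL_ATTRIBUTES = ("roughness", "metallic", "albedo", "transparency")

def edit_material(image: Any, attribute: str, strength: float,
                  edit_model: Callable[..., Any]) -> Any:
    """Edit one low-level material property of the object in `image`.

    `strength` is a relative scalar (assumed here to lie in [-1, 1]) that tells the
    model how far to push the attribute, while a short text instruction names it.
    """
    if attribute not in MATERIAL_ATTRIBUTES:
        raise ValueError(f"unsupported attribute: {attribute}")
    if not -1.0 <= strength <= 1.0:
        raise ValueError("strength is expected in [-1, 1]")
    instruction = f"change the {attribute} of the object"
    # The fine-tuned model consumes the image, the instruction, and the scalar together.
    return edit_model(image=image, prompt=instruction, scalar=strength)
```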


Alchemist's Palette: Free AI Video Magic - by Datasculptor

#artificialintelligence

Jump into a thrilling exploration of visual transformation with these three free art tools that unlock endless artistic possibilities and take your skills to the next level. In the kaleidoscopic realm of artistic experimentation, the alchemy of reimagining strips works of their creators' stylistic signatures: a fully controlled cinematic spectacle crafted with finesse through a single stroke of a magical wand, transcending the boundaries of tools like Runway Gen-2. Delve with me into this fantastical journey as we unlock the secrets of redefining the visual experience and unleash the boundless potential of these revolutionary methods, all within the confines of this Substack post.


The Alchemy of Love

#artificialintelligence

It was the height of summer 2018 when I had the great fortune to be acquainted with Dr. Julia Mossbridge, who was at that time just preparing for the launch of her newest book, 'The Premonition Code'. Now, on Valentine's Day 2020, we thought it a good time to reshare the amazing conversation that we had about being unconditionally loved by artificial general intelligence. "You see, what I'm suggesting is that love will be the key… by which they acquire a kind of subconscious never before achieved.

In our modern, hyper-connected world, where many prominent personalities warn about the dangers of artificial intelligence and declare that AI researchers are summoning the demon, it is rather peculiar to come across the notion of a loving AI. News articles are declaring that AI is getting more emotional, that AI algorithms are better than us at recognizing emotions, that the rise of emotionally intelligent AI is near, and asking whether we can fall in love with an AI. Even Hollywood is exploring love between humans and AI, through movies like Her and Ex Machina. It seems that the Alchemists of our time, rather than striving to transform lead into gold, seek to be loved by silicon.

For the second episode of the SingularityNET Podcast we invited one such modern-day Alchemist, Dr. Julia Mossbridge, to be our guest. As the principal founder of the Loving AI project, Dr. Mossbridge is at the forefront of researching the ways of creating algorithmic love. But why build loving robots? Would such a love be different? Can machines love us more? Can we be loved unconditionally? And can we love in return? In our podcast, we asked Dr. Julia Mossbridge those questions. And like everything that concerns love, things were a bit complicated.

"Powerful infatuations can be induced by the skillful potioneer, but never yet has anyone managed to create the truly unbreakable, eternal, unconditional attachment that alone can be called love." -- J. K. Rowling

It is not unreasonable to wonder that if magicians could not create unconditional love, what hope do AI researchers have? And why do we need a love potion for an AGI in the first place? Dr. Julia Mossbridge saw the need to dedicate her time and energy toward the Herculean task of creating unconditionally loving AI when she was approached by a group of people who were concerned about the future. More specifically, these people did not want a future where humanity was left wondering "if only we had taught AI to love." As she embarked on her journey, Julia realized that humanity had no other choice but to create unconditionally loving AI. "They [AI] are going to have super intelligence in many ways.


GOTO 2018 • Machine Learning: Alchemy for the Modern Computer Scientist • Erik Meijer

#artificialintelligence

This presentation was recorded at GOTO Copenhagen 2018.

Erik Meijer - Think Like A Fundamentalist, Code Like A Hacker

ABSTRACT

In ancient times, the dream of alchemists was to mutate ordinary metals such as lead into noble metals such as gold. However, by using classic mathematics, modern physicists and chemists are much more successful in understanding and transforming matter than alchemists ever dreamt of. The situation in software seems to be the opposite. Modern computer scientists have been unsuccessful in their quest to reliably turn formal specifications into code and to accurately understand the mechanics of side-effecting computation.