Collaborating Authors

 ether


Aether: Geometric-Aware Unified World Modeling

Aether Team, Zhu, Haoyi, Wang, Yifan, Zhou, Jianjun, Chang, Wenzheng, Zhou, Yang, Li, Zizun, Chen, Junyi, Shen, Chunhua, Pang, Jiangmiao, He, Tong

arXiv.org Artificial Intelligence

The integration of geometric reconstruction and generative modeling remains a critical challenge in developing AI systems capable of human-like spatial reasoning. This paper proposes Aether, a unified framework that enables geometry-aware reasoning in world models by jointly optimizing three core capabilities: (1) 4D dynamic reconstruction, (2) action-conditioned video prediction, and (3) goal-conditioned visual planning. Through task-interleaved feature learning, Aether achieves synergistic knowledge sharing across reconstruction, prediction, and planning objectives. Building upon video generation models, our framework demonstrates unprecedented synthetic-to-real generalization despite never observing real-world data during training. Furthermore, our approach achieves zero-shot generalization in both action following and reconstruction tasks, thanks to its intrinsic geometric modeling. Remarkably, even without real-world data, its reconstruction performance is comparable with or even better than that of domain-specific models. Additionally, Aether employs camera trajectories as geometry-informed action spaces, enabling effective action-conditioned prediction and visual planning. We hope our work inspires the community to explore new frontiers in physically-reasonable world modeling and its applications.


Decoupling Angles and Strength in Low-rank Adaptation

Bini, Massimo, Girrbach, Leander, Akata, Zeynep

arXiv.org Artificial Intelligence

Parameter-Efficient FineTuning (PEFT) methods have recently gained significant popularity thanks to the widespread availability of large-scale pretrained models. These methods allow for quick adaptation to downstream tasks with minimal computational cost. However, popular finetuning methods such as LoRA exhibit limited robustness when it comes to hyperparameter choices or extended training regimes, preventing optimal out-of-the-box performance. In contrast, bounded approaches, such as ETHER, provide greater robustness but are limited to extremely low-rank adaptations and fixed-strength transformations, which reduces their expressive power for adaptation. In this work, we propose Decoupled Low-rank Adaptation (DeLoRA), a novel finetuning method that normalizes and scales learnable low-rank matrices. Through evaluations on subject-driven image generation, natural language understanding, and instruction tuning, we show that DeLoRA matches or surpasses the performance of competing PEFT methods, while exhibiting stronger robustness. The rapid advancement of deep learning has led to the development of large-scale pretrained models in various domains, especially in computer vision and natural language processing (Touvron et al., 2023a;b; Radford et al., 2021; Rombach et al., 2022). However, the enormous size of these models, reaching billions of parameters, presents significant challenges when adapting them to specific downstream tasks, particularly in terms of computational cost and efficiency. To address these challenges, Parameter-Efficient FineTuning (PEFT) methods have emerged. PEFT methods are characterized by their introduction of a small set of learnable parameters, in contrast to the extensive parameter updates required in full finetuning. Notable examples include adapters (Houlsby et al., 2019) and prompt tuning (Lester et al., 2021).
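The phrase "normalizes and scales learnable low-rank matrices" can be illustrated with a small sketch. The abstract does not give the formula, so the form used here is an assumption made for illustration only: each rank-one component of the update is divided by the norms of its two factors, and the sum is multiplied by a single strength value divided by the rank. The function name `normalized_low_rank_update` and the parameter `strength` are hypothetical and not taken from the paper.

```python
import math


def vector_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def normalized_low_rank_update(B, A, strength):
    """Build a weight update from low-rank factors B (d_out x r) and A (r x d_in).

    Assumed form (illustrative, not confirmed by the abstract):
        delta = (strength / r) * sum_i  b_i a_i^T / (||b_i|| * ||a_i||)
    where b_i is column i of B and a_i is row i of A.
    """
    r = len(A)
    d_out, d_in = len(B), len(A[0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for i in range(r):
        b_col = [B[row][i] for row in range(d_out)]
        a_row = A[i]
        denom = vector_norm(b_col) * vector_norm(a_row)
        if denom == 0.0:
            raise ValueError("low-rank factors must be non-zero")
        scale = strength / (r * denom)
        for p in range(d_out):
            for q in range(d_in):
                delta[p][q] += scale * b_col[p] * a_row[q]
    return delta


def frobenius_norm(m):
    """Frobenius norm of a list-of-rows matrix."""
    return math.sqrt(sum(x * x for row in m for x in row))


# Hypothetical factors with rank r = 2.
B = [[1.0, 0.0], [0.0, 2.0]]
A = [[3.0, 0.0], [0.0, 4.0]]
delta = normalized_low_rank_update(B, A, strength=2.0)

# Rescaling B leaves the update unchanged: only directions and `strength` matter.
B_scaled = [[10.0 * x for x in row] for row in B]
delta_scaled = normalized_low_rank_update(B_scaled, A, strength=2.0)
```

Under this assumed form, every normalized rank-one term has unit Frobenius norm, so the norm of the whole update cannot exceed `strength`. That is one way to read the "bounded" and "decoupled" wording of the abstract: the factors set the direction of the update, and a separate scalar sets its size.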


'Disc-shaped craft' hovers over Colorado concert venue, employees say: 'It knew it was being watched'

FOX News

A dozen employees said they watched a "large, disc-shaped craft" hover above a Colorado concert venue and then vanish. "What's even crazier is that as soon as we all started noticing it and stopped what we were doing to pay attention to it, the craft tipped at an angle and slowly started moving belly-first to the east," an employee reported to the National UFO Reporting Center about the June 5 sighting at the Red Rocks Amphitheatre in Morrison. "Then it started fading away until it was invisible. It simply dissolved into the ether. We all watched it vanish."


ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections

Bini, Massimo, Roth, Karsten, Akata, Zeynep, Khoreva, Anna

arXiv.org Artificial Intelligence

Parameter-efficient finetuning (PEFT) has become ubiquitous to adapt foundation models to downstream task requirements while retaining their generalization ability. However, the amount of additionally introduced parameters and compute for successful adaptation and hyperparameter searches can explode quickly, especially when deployed at scale to serve numerous individual requests. To ensure effective, parameter-efficient, and hyperparameter-robust adaptation, we propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections. By design, ETHER transformations require a minimal number of parameters, are less likely to deteriorate model performance, and exhibit robustness to hyperparameter and learning rate choices. In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters ($\sim 10$-$100$ times lower than LoRA or OFT) across multiple image synthesis and natural language tasks without exhaustive hyperparameter tuning. Finally, we investigate the recent emphasis on Hyperspherical Energy retention for adaptation and raise questions on its practical utility. The code is available at https://github.com/mwbini/ether.
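A hyperplane reflection can be written as the Householder matrix $H = I - 2uu^\top / (u^\top u)$, which is standard linear algebra. The sketch below builds such a matrix from one vector and checks two of its properties. How the reflection is attached to a pretrained weight (here, a left multiplication of a small hypothetical block `W`) is an assumption made for illustration; the abstract does not specify it, and the repository linked above is the reference for the actual method.

```python
import math


def householder(u):
    """Return H = I - 2 u u^T / (u^T u) as a list of rows."""
    norm_sq = sum(x * x for x in u)
    if norm_sq == 0.0:
        raise ValueError("u must be a non-zero vector")
    n = len(u)
    return [
        [(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] / norm_sq for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def frobenius_distance(a, b):
    """Frobenius norm of the difference a - b."""
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

H = householder([3.0, 4.0])    # hypothetical learnable vector u
W = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical pretrained weight block
W_adapted = matmul(H, W)       # assumption: reflection applied from the left

# A reflection is its own inverse, and its distance from the identity
# is 2 in Frobenius norm regardless of the choice of u.
reflect_twice = matmul(H, H)
distance_from_identity = frobenius_distance(H, identity)
```

Two observations follow from the construction. The reflection is parameterized by a single vector of length $d$ instead of a full $d \times d$ matrix, which is consistent with the small parameter count the abstract emphasizes. And because $H$ is orthogonal and always sits at the same distance from the identity, the transformed weight keeps its norm and the size of the change is bounded no matter how large $u$ grows.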


ETHER: Aligning Emergent Communication for Hindsight Experience Replay

Denamganaï, Kevin, Hernandez, Daniel, Vardal, Ozan, Missaoui, Sondess, Walker, James Alfred

arXiv.org Artificial Intelligence

Natural language instruction following is paramount to enable collaboration between artificial agents and human beings. Natural language-conditioned reinforcement learning (RL) agents have shown how natural languages' properties, such as compositionality, can provide a strong inductive bias to learn complex policies. Previous architectures like HIGhER combine the benefit of language-conditioning with Hindsight Experience Replay (HER) to deal with sparse-reward environments. Yet, like HER, HIGhER relies on an oracle predicate function to provide a feedback signal highlighting which linguistic description is valid for which state. This reliance on an oracle limits its application. Additionally, HIGhER only leverages the linguistic information contained in successful RL trajectories, thus hurting its final performance and data-efficiency. Without early successful trajectories, HIGhER is no better than the DQN upon which it is built. In this paper, we propose the Emergent Textual Hindsight Experience Replay (ETHER) agent, which builds on HIGhER and addresses both of its limitations by means of (i) a discriminative visual referential game, commonly studied in the subfield of Emergent Communication (EC), used here as an unsupervised auxiliary task and (ii) a semantic grounding scheme to align the emergent language with the natural language of the instruction-following benchmark. We show that the referential game's agents make an artificial language emerge that is aligned with the natural-like language used to describe goals in the BabyAI benchmark and that it is expressive enough to also describe unsuccessful RL trajectories and thus provide feedback to the RL agent to leverage the linguistic, structured information contained in all trajectories. Our work shows that EC is a viable unsupervised auxiliary task for RL and provides missing pieces to make HER more widely applicable.


The AI 'gold rush' in Washington

#artificialintelligence

AI's little guys are getting into the Washington influence game. Tech giants and defense contractors have long dominated AI lobbying, seeking both money and favorable rules. And while the largest companies still dominate the debate, pending legislation in Congress aimed at getting ahead of China on innovation, along with proposed bills on data privacy, have caused a spike in lobbying by smaller AI players. A number of companies focused on robotics, drones and self-driving cars are all setting up their own Washington influence machines, positioning them to shape the future of AI policy to their liking. A lot of it is spurred by one major piece of legislation: The Bipartisan Innovation Act, commonly referred to as USICA -- an acronym for its previous title, and its goal to out-innovate China.


Here's why AI-equipped NFTs could be the real gateway to the Metaverse

#artificialintelligence

Nonfungible tokens (NFTs) have been largely acquired as proof-of-profile pictures (PFPs) that represent a brand, embody culture or, ultimately, serve as a static status symbol. Blue-chip NFTs like the Bored Ape Yacht Club or Cool Cats were not originally backed by any tangible utility other than speculative value and hype, along with the promise of an illustrative roadmap, but in 2022, investors are looking for a little bit "more." However, nonfungible tokens are finding their use beyond branding and status symbols by attempting to build out an existence in the Metaverse, and some are ambitious enough to start within it. The Altered State Machine (ASM) Artificial Intelligence Football Association (AIFA) has introduced a novel concept to NFTs called nonfungible intelligence, or NFI. By tokenizing artificial intelligence, the ASM AIFA has captured the attention of investors who are thinking long-term about the future of the Metaverse and decentralized play-to-earn (P2E) economies.


BeANKH (Blockchain Artificial Intelligence) makes Immortality a digital reality - Glob Intel Web/Systems Development SEO & Data Analytics

#artificialintelligence

For many years, humans have desired immortality and have never stopped looking for a way to defeat death. Physically beating death is not possible, but with the help of an artificial intelligence algorithm that copies and preserves a person's decision-making patterns, attitudes, and psychology after death, immortality can now be achieved. BeANKH is a blockchain platform that uses artificial intelligence algorithms and smart contracts to build a person's digital analog by copying unique thinking patterns, psychology, and attitudes in order to predict the future behavior of the analog, which will continue to function even after the person dies physically. The initial idea was conceptualized in 2015, and the platform uses BeANKH tokens that are available for purchase by supporters of the platform and crypto community members. They are utility tokens that are used to make in-app purchases in the BeANKH environment.