Humanoid robots perform advanced martial arts at Chinese New Year gala

Al Jazeera

China's annual gala on Lunar New Year's Eve showcased Beijing's giant leap in technology as humanoid robots took centre stage to perform a joint martial arts routine featuring several firsts. China's Spring Festival Gala, which aired on Monday on state broadcaster CGTN, has gone viral, drawing nearly half a million views on YouTube. The performance marked a stark contrast with last year's show, when robots twirled handkerchiefs and performed simple movements. The first robots to appear were Noetix's Bumi models, which performed a comedy sketch. Unitree's robots later performed martial arts, including backflips and trampoline jumps, alongside child artists, followed by Magiclab's humanoids in a musical segment.


PyNeRF: Pyramidal Neural Radiance Fields

Neural Information Processing Systems

We propose a simple modification to grid-based models by training model heads at different spatial grid resolutions. At render time, we simply use coarser grids to render samples that cover larger volumes.
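The routing idea above, matching coarser grids to samples that cover larger volumes, can be sketched as a mip-style level lookup. This is a minimal sketch under assumed names; the function and its parameters are illustrative, not PyNeRF's actual interface:

```python
import math

def select_head_level(sample_radius: float,
                      finest_cell_size: float = 1.0,
                      num_levels: int = 8) -> int:
    """Pick the model head whose grid cell size roughly matches the
    sample's footprint. Level 0 is the finest grid; each subsequent
    level doubles the cell size, so samples covering larger volumes
    are routed to coarser heads."""
    if sample_radius <= finest_cell_size:
        return 0
    level = int(math.log2(sample_radius / finest_cell_size))
    return min(level, num_levels - 1)
```

At render time, each sample along a ray would call this once to decide which head evaluates it, so distant (larger-footprint) samples hit the coarser, cheaper grids.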


Tech Billionaires Already Captured the White House. They Still Want to Be Kings

WIRED

From Montenegro to northern California, the tech elite dream of building cities where they make the rules. Is this, finally, their moment? The shirtless man in the golden mask and cape has plans to lead his own country one day. There is no location yet, but it will be a crypto- and AI-powered paradise of medical experimentation, filled with people who want to "make death optional," he says. For now, though, he's leading a sparsely attended rave on the second floor of a San Francisco office building. A DJ is spinning at one end of an open room. A handful of people sway and jump on the space cleared out as a dance floor. At a nearby table, coffee is available with many alternative milks.


Understanding visual attention beehind bee-inspired UAV navigation

Rajbhandari, Pranav, Veda, Abhi, Garratt, Matthew, Srinivasan, Mandyam, Ravi, Sridhar

arXiv.org Artificial Intelligence

Bio-inspired design is often used in autonomous UAV navigation due to the capacity of biological systems for flight and obstacle avoidance despite limited sensory and computational capabilities. In particular, honeybees mainly use the sensory input of optic flow, the apparent motion of objects in their visual field, to navigate cluttered environments. In our work, we train a Reinforcement Learning agent to navigate a tunnel with obstacles using only optic flow as sensory input. We inspect the attention patterns of trained agents to determine the regions of optic flow on which they primarily base their motor decisions. We find that agents trained in this way pay most attention to regions of discontinuity in optic flow, as well as regions with large optic flow magnitude. The trained agents appear to navigate a cluttered tunnel by avoiding the obstacles that produce large optic flow, while maintaining a centered position in their environment, which resembles the behavior seen in flying insects. This pattern persists across independently trained agents, which suggests that this could be a good strategy for developing a simple explicit control law for physical UAVs.
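The centering behaviour the agents converge on, turning away from the side with larger apparent motion, is the classic honeybee strategy and can be sketched as a simple explicit control law. This is a hedged illustration of that strategy, not the paper's trained policy; the function name, gain, and input layout are assumptions:

```python
import numpy as np

def centering_command(flow_left: np.ndarray, flow_right: np.ndarray,
                      gain: float = 1.0) -> float:
    """Return a yaw command from optic-flow magnitudes in the left and
    right halves of the visual field. Positive steers right, negative
    steers left: the agent turns away from the side with larger apparent
    motion, which keeps it centered between walls and steers it around
    nearby obstacles (which produce large optic flow)."""
    left_mag = float(np.mean(np.abs(flow_left)))
    right_mag = float(np.mean(np.abs(flow_right)))
    return gain * (left_mag - right_mag)
```

A physical UAV would run this per frame on flow estimated from its camera; the attention analysis in the paper suggests weighting flow discontinuities and high-magnitude regions more heavily.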


An Empirical Comparison of Cost Functions in Inductive Logic Programming

Hocquette, Céline, Cropper, Andrew

arXiv.org Artificial Intelligence

Recent inductive logic programming (ILP) approaches learn optimal hypotheses. An optimal hypothesis minimises a given cost function on the training data. There are many cost functions, such as minimising training error, textual complexity, or the description length of hypotheses. However, selecting an appropriate cost function remains a key question. To address this gap, we extend a constraint-based ILP system to learn optimal hypotheses for seven standard cost functions. We then empirically compare the generalisation error of optimal hypotheses induced under these standard cost functions. Our results on over 20 domains and 1000 tasks, including game playing, program synthesis, and image reasoning, show that, while no cost function consistently outperforms the others, minimising training error or description length has the best overall performance. Notably, our results indicate that minimising the size of hypotheses does not always reduce generalisation error.
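The selection step, picking the hypothesis that minimises a chosen cost function, can be sketched as follows. The three cost functions and the hypothesis summary (size plus training errors) are simplified assumptions for illustration; the paper evaluates seven cost functions, and its exact definitions may differ:

```python
# Each candidate hypothesis is summarised by its size (e.g. number of
# literals) and its number of training errors.
COST_FUNCTIONS = {
    "size": lambda h: h["size"],                   # textual complexity only
    "error": lambda h: h["errors"],                # training error only
    "mdl": lambda h: h["size"] + h["errors"],      # description-length style trade-off
}

def optimal_hypothesis(hypotheses, cost_name):
    """Return the hypothesis minimising the named cost function."""
    return min(hypotheses, key=COST_FUNCTIONS[cost_name])

# Illustrative candidates (made-up numbers):
hyps = [
    {"name": "h1", "size": 3, "errors": 5},   # smallest, but inaccurate
    {"name": "h2", "size": 8, "errors": 0},   # accurate, but large
    {"name": "h3", "size": 5, "errors": 1},   # balanced
]
```

Here each cost function selects a different winner, which mirrors the paper's point: the smallest hypothesis is not necessarily the one that generalises best.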


CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation

Berian, Alex, Brignac, Daniel, Wu, JhihYang, Daba, Natnael, Mahalanobis, Abhijit

arXiv.org Artificial Intelligence

Geospatial imaging leverages data from diverse sensing modalities, such as electro-optical (EO) imagery, synthetic aperture radar (SAR), and LiDAR, captured from platforms ranging from ground-level drones to satellites. These heterogeneous inputs offer significant opportunities for scene understanding but present challenges in interpreting geometry accurately, particularly in the absence of precise ground truth data. To address this, we propose CrossModalityDiffusion, a modular framework designed to generate images across different modalities and viewpoints without prior knowledge of scene geometry. CrossModalityDiffusion employs modality-specific encoders that take multiple input images and produce geometry-aware feature volumes that encode scene structure relative to their input camera positions. The space where the feature volumes are placed acts as a common ground for unifying input modalities. These feature volumes are overlapped and rendered into feature images from novel perspectives using volumetric rendering techniques. The rendered feature images are used as conditioning inputs for a modality-specific diffusion model, enabling the synthesis of novel images for the desired output modality. In this paper, we show that jointly training different modules ensures consistent geometric understanding across all modalities within the framework. We validate CrossModalityDiffusion's capabilities on the synthetic ShapeNet cars dataset, demonstrating its effectiveness in generating accurate and consistent novel views across multiple imaging modalities and perspectives.
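The data flow described above (modality-specific encoders, a shared feature space, volumetric rendering, then a conditioned diffusion model) can be sketched with stand-in components. Every function below is a placeholder assumption that only shows how the stages connect, not the paper's actual networks:

```python
import numpy as np

def synthesize_novel_view(inputs, target_pose, target_modality,
                          encoders, renderer, diffusion_models):
    """inputs: list of (modality, image, camera_pose) tuples."""
    # 1. Modality-specific encoders produce geometry-aware feature volumes.
    volumes = [encoders[m](img, pose) for (m, img, pose) in inputs]
    # 2. The volumes overlap in a shared space; fused here by averaging.
    fused = np.mean(volumes, axis=0)
    # 3. Volumetric rendering turns the fused volume into a feature image
    #    seen from the target viewpoint.
    feature_image = renderer(fused, target_pose)
    # 4. A modality-specific diffusion model is conditioned on that image.
    return diffusion_models[target_modality](feature_image)

# Tiny stand-in components so the sketch runs end to end (not real models).
encoders = {
    "EO":  lambda img, pose: np.tile(img, (4, 1, 1)),   # fake (D, H, W) volume
    "SAR": lambda img, pose: np.tile(img, (4, 1, 1)),
}
renderer = lambda vol, pose: vol.mean(axis=0)           # collapse the depth axis
diffusion_models = {"LiDAR": lambda cond: cond * 0.5}   # fake conditioned denoiser

eo_img = np.ones((8, 8))
sar_img = np.ones((8, 8)) * 3.0
out = synthesize_novel_view(
    [("EO", eo_img, None), ("SAR", sar_img, None)],
    target_pose=None, target_modality="LiDAR",
    encoders=encoders, renderer=renderer, diffusion_models=diffusion_models)
```

The key design point the sketch preserves is that the fusion in step 2 happens in a modality-agnostic feature space, so any encoder whose output lands in that space can contribute to any output modality.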