
MODEM: A Morton-Order Degradation Estimation Mechanism for Adverse Weather Image Recovery

Neural Information Processing Systems

Restoring images degraded by adverse weather remains a significant challenge due to the highly non-uniform and spatially heterogeneous nature of weather-induced artifacts, e.g., fine-grained rain streaks versus widespread haze. Accurately estimating the underlying degradation can intuitively provide restoration models with more targeted and effective guidance, enabling adaptive processing strategies. To this end, we propose a Morton-Order Degradation Estimation Mechanism (MODEM) for adverse weather image restoration. Central to MODEM is the Morton-Order 2D-Selective-Scan Module (MOS2D), which integrates Morton-coded spatial ordering with selective state-space models to capture long-range dependencies while preserving local structural coherence. Complementing MOS2D, we introduce a Dual Degradation Estimation Module (DDEM) that disentangles and estimates both global and local degradation priors.
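As a rough illustration of the Morton-coded spatial ordering mentioned above, the following sketch computes a Z-order (Morton) index by interleaving the bits of a pixel's row and column, then sorts a small grid by that index. The function names `morton_encode` and `morton_scan_order` are hypothetical, and the choice of which coordinate occupies the even bit positions is an assumption; the abstract does not specify how MOS2D implements its scan.

```python
def morton_encode(row: int, col: int, bits: int = 16) -> int:
    """Interleave the bits of (row, col) into a single Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)      # column bits fill even positions (assumed convention)
        code |= ((row >> i) & 1) << (2 * i + 1)  # row bits fill odd positions
    return code


def morton_scan_order(height: int, width: int) -> list[tuple[int, int]]:
    """Return every (row, col) position of a grid, sorted by Morton code."""
    coords = [(r, c) for r in range(height) for c in range(width)]
    return sorted(coords, key=lambda rc: morton_encode(rc[0], rc[1]))


if __name__ == "__main__":
    # Print the visiting order for a 4x4 grid, one 2x2 block per line.
    order = morton_scan_order(4, 4)
    for start in range(0, len(order), 4):
        print(order[start:start + 4])
```

Because consecutive Morton indices stay inside the same 2x2 block before moving to the next one, a 1D sequence built this way tends to keep spatial neighbors close together, which is consistent with the abstract's point about preserving local structure.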





Is Neuromancer's cyberpunk dystopia still thrilling in 2025?

New Scientist

Neuromancer begins with a brilliant, highly memorable line: "The sky above the port was the colour of television, tuned to a dead channel." The novel was first published in 1984, when very few people had access to computers. Famously, William Gibson wrote the book on a typewriter. But despite this, it goes on to draw a vivid portrait of a futuristic world where data is currency and business is done in "cyberspace", though companies can also be hacked into and robbed. And, shimmering mysteriously in the background, there are powerful AIs that no one really understands.


MoDEM: Mixture of Domain Expert Models

arXiv.org Artificial Intelligence

We propose a novel approach to enhancing the performance and efficiency of large language models (LLMs) by combining domain prompt routing with domain-specialized models. We introduce a system that utilizes a BERT-based router to direct incoming prompts to the most appropriate domain expert model. These expert models are specifically tuned for domains such as health, mathematics and science. Our research demonstrates that this approach can significantly outperform general-purpose models of comparable size, leading to a superior performance-to-cost ratio across various benchmarks. The implications of this study suggest a potential paradigm shift in LLM development and deployment. Rather than focusing solely on creating increasingly large, general-purpose models, the future of AI may lie in developing ecosystems of smaller, highly specialized models coupled with sophisticated routing systems. This approach could lead to more efficient resource utilization, reduced computational costs, and superior overall performance.
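The routing idea can be sketched in a few lines. This is only a toy: the abstract describes a BERT-based router, whereas the sketch below swaps in a keyword-overlap score as a stand-in classifier. The keyword lists, the `"general"` fallback domain, and the names `route` and `dispatch` are all hypothetical.

```python
import re
from typing import Callable, Dict

# Hypothetical keyword sets standing in for a learned BERT classifier.
DOMAIN_KEYWORDS = {
    "health": {"symptom", "dose", "diagnosis", "treatment"},
    "mathematics": {"integral", "prove", "equation", "theorem"},
    "science": {"molecule", "gravity", "enzyme", "photon"},
}


def route(prompt: str, default: str = "general") -> str:
    """Pick the domain whose keywords overlap most with the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {domain: len(words & kws) for domain, kws in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default domain when no keyword matches (assumed behaviour).
    return best if scores[best] > 0 else default


def dispatch(prompt: str, experts: Dict[str, Callable[[str], str]]) -> str:
    """Send the prompt to the expert chosen by the router."""
    domain = route(prompt)
    model = experts.get(domain, experts["general"])
    return model(prompt)
```

In a real system each entry of `experts` would wrap a domain-tuned model; here any callable that maps a prompt to a string will do.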


Imitation Bootstrapped Reinforcement Learning

arXiv.org Artificial Intelligence

Despite the considerable potential of reinforcement learning (RL), robotics control tasks predominantly rely on imitation learning (IL) owing to its better sample efficiency. However, given the high cost of collecting extensive demonstrations, RL is still appealing if it can utilize limited imitation data for efficient autonomous self-improvement. Existing RL methods that utilize demonstrations either initialize the replay buffer with demonstrations and oversample them during RL training, which does not benefit from the generalization potential of modern IL methods, or pretrain the RL policy with IL on the demonstrations, which requires additional mechanisms to prevent catastrophic forgetting during RL fine-tuning. We propose imitation bootstrapped reinforcement learning (IBRL), a novel framework that first trains an IL policy on a limited number of demonstrations and then uses it to propose alternative actions for both online exploration and target value bootstrapping. IBRL achieves SoTA performance and sample efficiency on 7 challenging sparse reward continuous control tasks in simulation while learning directly from pixels. As a highlight of our method, IBRL achieves 6.4× higher success rate than RLPD, a strong method that combines the idea of oversampling demonstrations with modern RL improvements, under the budget of 10 demos and 100K interactions in the challenging PickPlaceCan task in the Robomimic benchmark.
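One way to read "propose alternative actions for both online exploration and target value bootstrapping" is sketched below. It assumes that the IL and RL policies each propose one action and that the proposal with the higher critic value is used; the abstract does not state the selection rule, so this greedy choice, along with the names `select_action` and `bootstrap_target` and the discount `gamma`, is an assumption made for illustration.

```python
from typing import Any, Callable

Policy = Callable[[Any], Any]          # maps a state to an action
Critic = Callable[[Any, Any], float]   # maps (state, action) to a Q-value


def select_action(state: Any, il_policy: Policy, rl_policy: Policy, q_fn: Critic) -> Any:
    """Let both policies propose an action and keep the one the critic rates higher."""
    candidates = [il_policy(state), rl_policy(state)]
    return max(candidates, key=lambda action: q_fn(state, action))


def bootstrap_target(reward: float, next_state: Any, done: bool,
                     il_policy: Policy, rl_policy: Policy,
                     target_q_fn: Critic, gamma: float = 0.99) -> float:
    """One-step TD target that bootstraps from the better of the two proposals."""
    if done:
        return reward
    best_next = select_action(next_state, il_policy, rl_policy, target_q_fn)
    return reward + gamma * target_q_fn(next_state, best_next)
```

The same `select_action` helper serves both roles named in the abstract: called with the online critic it drives exploration, and called with a target critic it picks the action used in the bootstrapped value.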


An ML-assisted OTFS vs. OFDM adaptable modem

arXiv.org Artificial Intelligence

Orthogonal-Time-Frequency-Space (OTFS) signaling is known to be resilient to doubly-dispersive channels, which impact high-mobility scenarios. On the other hand, Orthogonal-Frequency-Division-Multiplexing (OFDM) waveforms enjoy the benefits of the reuse of legacy architectures, simplicity of receiver design, and low-complexity detection. Several studies that compare the performance of OFDM and OTFS have indicated mixed outcomes due to the plethora of system parameters at play beyond high-mobility conditions. In this work, we exemplify this observation using simulations and propose a deep neural network (DNN)-based adaptation scheme to switch between using either an OTFS or OFDM signal processing chain at the transmitter and receiver for optimal mean-squared-error (MSE) performance. The DNN classifier is trained to switch between the two schemes by observing the channel condition, received SNR, and modulation format. We compare the performance of the OTFS, OFDM, and the proposed switched-waveform scheme. The simulations indicate that the proposed scheme, given a well-trained DNN, significantly improves MSE performance.
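A minimal sketch of how training labels for such a switching classifier might be produced, assuming each simulated link is labeled with whichever waveform gave the lower MSE. The `LinkObservation` fields (in particular `doppler_hz` as a proxy for the channel condition), the tie-breaking rule, and the function names are hypothetical; the abstract does not describe the feature encoding or the training procedure.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LinkObservation:
    """Inputs the classifier observes (field choices are assumptions)."""
    doppler_hz: float      # hypothetical proxy for the channel condition
    snr_db: float          # received SNR
    bits_per_symbol: int   # modulation format, e.g. 2 for QPSK


def label_best_waveform(mse_otfs: float, mse_ofdm: float) -> str:
    """Label a link with the waveform that achieved the lower MSE (ties go to OFDM)."""
    return "OTFS" if mse_otfs < mse_ofdm else "OFDM"


def build_training_set(
    simulated: Iterable[Tuple[LinkObservation, float, float]]
) -> List[Tuple[LinkObservation, str]]:
    """Turn (observation, MSE with OTFS, MSE with OFDM) triples into classifier examples."""
    return [(obs, label_best_waveform(m_otfs, m_ofdm)) for obs, m_otfs, m_ofdm in simulated]
```

Any standard classifier could then be fit on the resulting pairs; at run time its prediction would select which signal processing chain the transmitter and receiver use.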


Pioneering Hacker Kevin Mitnick, FBI-Wanted Felon Turned Security Guru, Dead at 59

TIME - Tech

Kevin Mitnick, whose pioneering antics tricking employees in the 1980s and 1990s into helping him steal software and services from big phone and tech companies made him the most celebrated U.S. hacker, has died at age 59. Mitnick died Sunday in Las Vegas after a 14-month battle with pancreatic cancer, said Stu Sjouwerman, CEO of the security training firm KnowBe4, where Mitnick was chief hacking officer. His colorful career--from student tinkerer to FBI-hunted fugitive, imprisoned felon and finally respected cybersecurity professional, public speaker and author tapped for advice by U.S. lawmakers and global corporations--mirrors the evolution of society's grasp of the nuances of computer hacking. Through Mitnick's professional trajectory, and what many consider the misplaced prosecutorial zeal that put him behind bars for nearly five years until 2000, the public has learned how to better distinguish serious computer crime from the mischievous troublemaking of youths hellbent on proving their hacking prowess. "He never hacked for money," said Sjouwerman, who became Mitnick's business partner in 2011.


MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations

arXiv.org Artificial Intelligence

Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify key ingredients for leveraging demonstrations in model learning -- policy pretraining, targeted exploration, and oversampling of demonstration data -- which form the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 150%-250% more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100K interaction steps, 5 demonstrations). Code and videos are available at: https://nicklashansen.github.io/modemrl
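The oversampling ingredient can be illustrated with a small batch sampler that reserves a fixed share of every training batch for demonstration transitions, so a handful of demonstrations is not drowned out by a much larger interaction buffer. The `demo_fraction` value, the function name `sample_batch`, and the handling of empty buffers are assumptions for illustration; the abstract does not give the ratio or the sampling scheme.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_batch(demos: Sequence[T], interactions: Sequence[T], batch_size: int,
                 demo_fraction: float = 0.25,
                 rng: Optional[random.Random] = None) -> List[T]:
    """Draw a batch in which a fixed share of samples comes from demonstrations."""
    if not demos and not interactions:
        raise ValueError("both buffers are empty")
    rng = rng or random.Random()
    n_demo = round(batch_size * demo_fraction)
    if not demos:
        n_demo = 0               # nothing to oversample
    elif not interactions:
        n_demo = batch_size      # e.g. before any environment interaction exists
    # Sample with replacement so a tiny demo set can still fill its share.
    batch = [rng.choice(demos) for _ in range(n_demo)]
    batch += [rng.choice(interactions) for _ in range(batch_size - n_demo)]
    rng.shuffle(batch)
    return batch
```

With 5 demonstration transitions and 1,000 interaction transitions, uniform sampling over the combined data would pick a demonstration about 0.5% of the time, whereas this sampler keeps the share at `demo_fraction` regardless of buffer sizes.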