Dynamical modeling of nonlinear latent factors in multiscale neural activity with real-time inference

Neural Information Processing Systems

Real-time decoding of target variables from multiple simultaneously recorded neural time-series modalities, such as discrete spiking activity and continuous field potentials, is important across various neuroscience applications. However, a major challenge for doing so is that different neural modalities can have different timescales (i.e., sampling rates) and different probabilistic distributions, or can even be missing at some time-steps. Existing nonlinear models of multimodal neural activity do not address different timescales or missing samples across modalities. Further, some of these models do not allow for real-time decoding. Here, we develop a learning framework that can enable real-time recursive decoding while nonlinearly aggregating information across multiple modalities with different timescales and distributions and with missing samples. This framework consists of 1) a multiscale encoder that nonlinearly aggregates information after learning within-modality dynamics to handle different timescales and missing samples in real time, 2) a multiscale dynamical backbone that extracts multimodal temporal dynamics and enables real-time recursive decoding, and 3) modality-specific decoders to account for different probabilistic distributions across modalities. In both simulations and three distinct multiscale brain datasets, we show that our model can aggregate information across modalities with different timescales and distributions and missing samples to improve real-time target decoding. Further, our method outperforms various linear and nonlinear multimodal benchmarks in doing so.


Vision Function Layer in LLMs

Neural Information Processing Systems

This study identifies that visual-related functional decoding is distributed across different decoder layers in Multimodal Large Language Models (MLLMs). Typically, each function, such as counting, grounding, or OCR recognition, narrows down to two or three layers, which we define as Vision Function Layers (VFL). Additionally, the depths and ordering of different VFLs exhibit a consistent pattern across different MLLMs, which is well-aligned with human behaviors (e.g., recognition occurs first, followed by counting, and then grounding). These findings are derived from Visual Token Swapping, our novel analytical framework that modifies targeted KV cache entries to precisely elucidate layer-specific functions during decoding. Furthermore, these insights offer substantial utility in tailoring MLLMs for real-world downstream applications. For instance, when LoRA training is selectively applied to VFLs whose functions align with the training data, VFL-LoRA not only outperforms full-LoRA but also prevents out-of-domain function forgetting. Moreover, by analyzing the performance differential on training data when particular VFLs are ablated, VFL-select automatically classifies data by function, enabling highly efficient data selection to directly bolster corresponding capabilities. Consequently, VFL-select surpasses human experts in data selection, and achieves 98% of full-data performance with only 20% of the original dataset. This study delivers deeper comprehension of MLLM visual processing, fostering the creation of more efficient, interpretable, and robust models.


27aa3aeff0f8460a7b43d30fa6c5c032-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems

Large Language Models (LLMs) are transforming search engines into Conversational Search Engines (CSE). Consequently, Search Engine Optimization (SEO) is being shifted into Conversational Search Engine Optimization (C-SEO). We are beginning to see dedicated C-SEO methods for modifying web documents to increase their visibility in CSE responses. However, they are often tested only for a limited breadth of application domains; we do not know whether certain C-SEO methods would be effective for a broad range of domains. Moreover, existing evaluations consider only a single-actor scenario where only one web document adopts a C-SEO method; in reality, multiple players are likely to competitively adopt the cutting-edge C-SEO techniques, drawing an analogy from the dynamics we have seen in SEO.


Fast Rank-1 Lattice Targeted Sampling for Black-box Optimization

Neural Information Processing Systems

Black-box optimization has gained great attention for its success in recent applications. However, scaling up to high-dimensional problems with good query efficiency remains challenging. This paper proposes a novel Rank-1 Lattice Targeted Sampling (RLTS) technique to address this issue. Our RLTS benefits from random rank-1 lattice Quasi-Monte Carlo, which enables us to perform fast local exact Gaussian processes (GP) training and inference with O(n log n) complexity w.r.t.
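For readers unfamiliar with the building block named above, the following is a minimal sketch of a rank-1 lattice point set, in which point i is the fractional part of i * z / n for an integer generating vector z. The function name `rank1_lattice` and the generating vector `[1, 3]` are illustrative assumptions, not taken from the paper, and the sketch does not attempt the targeted sampling or the GP training the abstract describes.

```python
import numpy as np

def rank1_lattice(n, z):
    """Return the n points of a rank-1 lattice in [0, 1)^d.

    Point i is frac(i * z / n), computed exactly with integer
    arithmetic as ((i * z) mod n) / n.
    """
    z = np.asarray(z, dtype=np.int64)                # generating vector, shape (d,)
    i = np.arange(n, dtype=np.int64).reshape(-1, 1)  # point indices, shape (n, 1)
    return ((i * z) % n) / n

# Hypothetical example: 8 points in 2 dimensions.
points = rank1_lattice(8, [1, 3])
```

Because each component of this generating vector is coprime with n, every one-dimensional projection of the point set hits each multiple of 1/n exactly once.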


Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff

Neural Information Processing Systems

Recent efforts in neural compression have focused on the rate-distortion-perception (RDP) tradeoff, where the perception constraint ensures the source and reconstruction distributions are close in terms of a statistical divergence. Theoretical work on RDP describes properties of RDP-optimal compressors without providing constructive and low complexity solutions. While classical rate-distortion theory shows that optimal compressors should efficiently pack space, RDP theory additionally shows that infinite randomness shared between the encoder and decoder may be necessary for RDP optimality. In this paper, we propose neural compressors that are low complexity and benefit from high packing efficiency through lattice coding and shared randomness through shared dithering over the lattice cells. For two important settings, namely infinite shared and zero shared randomness, we analyze the RDP tradeoff achieved by our proposed neural compressors and show optimality in both cases. Experimentally, we investigate the roles that these two components of our design, lattice coding and randomness, play in the performance of neural compressors on synthetic and real-world data. We observe that performance improves with more shared randomness and better lattice packing.
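The shared-dithering idea can be illustrated with a minimal sketch of subtractive dithered quantization. It assumes the simplest lattice, the integer lattice Z^d, and uses a seeded generator to stand in for randomness shared by encoder and decoder. The function names `encode` and `decode` are hypothetical, and the learned transforms and better-packing lattices of the paper's compressors are not modeled here.

```python
import numpy as np

def encode(x, dither):
    """Add the shared dither, then quantize to the nearest point of Z^d."""
    return np.round(x + dither).astype(np.int64)

def decode(index, dither):
    """Subtract the same dither that the encoder added."""
    return index - dither

rng = np.random.default_rng(0)  # stands in for randomness shared by both sides
x = rng.normal(size=(1000, 2))  # toy source samples
dither = rng.uniform(-0.5, 0.5, size=x.shape)  # uniform over one lattice cell

x_hat = decode(encode(x, dither), dither)
error = x_hat - x
```

With the dither drawn uniformly over the lattice cell, the reconstruction error stays inside that cell for every input. A lattice with better packing would replace `np.round` with a nearest-lattice-point search.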


SpaceX IPO raised $10bn more than thought

BBC News

SpaceX raised $10bn (£7.5bn) more than initially thought when it sold shares to the public on Friday - bringing in a total of $85.7bn. Elon Musk's rocket and artificial intelligence (AI) company pulled off the biggest initial public offering (IPO) in history when it joined New York's Nasdaq stock exchange last week. The listing had raised $75bn from investors, which Musk told employees will be spent funding a significant growth phase. But the banks which backed the IPO exercised a so-called greenshoe clause, which let them purchase an extra $10bn of SpaceX shares. The extra $10bn raised, revealed in a statement by SpaceX announcing the completion of the listing, would by itself rank as one of the biggest IPOs in history.


Imitation Beyond Expectation Using Pluralistic Stochastic Dominance

Neural Information Processing Systems

Imitation learning seeks to estimate policies reflecting the values of demonstrated behaviors. Prevalent approaches learn to match or exceed the demonstrator's performance in expectation without knowing the demonstrator's reward function. Unfortunately, this does not induce pluralistic imitators that learn to support distinct demonstrations.


Training-Free Constrained Generation With Stable Diffusion Models

Neural Information Processing Systems

Stable diffusion models represent the state-of-the-art in data synthesis across diverse domains and hold transformative potential for applications in science and engineering, e.g., by facilitating the discovery of novel solutions and simulating systems that are computationally intractable to model explicitly. While there is increasing effort to incorporate physics-based constraints into generative models, existing techniques are either limited in their applicability to latent diffusion frameworks or lack the capability to strictly enforce domain-specific constraints. To address this limitation, this paper proposes a novel integration of stable diffusion models with constrained optimization frameworks, enabling the generation of outputs satisfying stringent physical and functional requirements.


Why do South Koreans love AI so much?

MIT Technology Review

From eldercare robots to humanoid monks, South Koreans just can't get enough of AI. When I landed in Seoul after a grueling 12-hour flight from San Francisco, I walked through an unmanned immigration checkpoint, where a machine scanned my face and passport. On the subway home, people were glued to their phones (powered by flawless 5G even underground), as we raced past platforms lined with LED screens of ads celebrating K-pop idols' birthdays. When I got off the station in Gangnam, a cartoon-eyed robot on wheels was waiting patiently at a crosswalk to deliver someone's dinner. Internet cafés dotted the sidewalks, crammed with teenagers playing computer games, maybe hoping to become the next legendary pro gamer.


Anthropic to meet White House over AI tool suspension

BBC News

Bosses at the artificial intelligence (AI) firm Anthropic are set to meet senior White House officials amid fresh national security concerns over the company's latest release. The meeting is set to take place on Monday in Washington DC between executives at Anthropic and the US Department of Commerce, a government department led by Secretary Howard Lutnick, according to two people familiar with the matter. It comes after Anthropic blocked all public access to the recent release of its latest AI tool on Friday, which it has previously said is too powerful. The firm made the decision after the US government prohibited Anthropic from allowing any foreign national access to the technology. The AI tool at issue is named Fable 5 or Mythos 5. Fable 5 is a version of the tool with extra safeguards made available to the public, while Mythos 5 has different controls and is only available to a select group of organisations.