Goto

Collaborating Authors

 gsm



bc827452450356f9f558f4e4568d553b-Paper-Conference.pdf

Neural Information Processing Systems

Here, we narrow this gap by developing aneffectivemethod fortraining acanonical model ofcortical neural circuits, the stabilized supralinear network (SSN), that in previous work had to beconstructed manually ortrainedwithundueconstraints.



Active Learning with LLMs for Partially Observed and Cost-Aware Scenarios

Neural Information Processing Systems

Conducting experiments and collecting data for machine learning models is a complex and expensive endeavor, particularly when confronted with limited information. Typically, extensive experiments to obtain features and labels come with a significant acquisition cost, making it impractical to carry out all of them. Therefore, it becomes crucial to strategically determine what to acquire to maximize the predictive performance while minimizing costs.





Cultivating Game Sense for Yourself: Making VLMs Gaming Experts

arXiv.org Artificial Intelligence

Developing agents capable of fluid gameplay in first/third-person games without API access remains a critical challenge in Artificial General Intelligence (AGI). Recent efforts leverage Vision Language Models (VLMs) as direct controllers, frequently pausing the game to analyze screens and plan action through language reasoning. However, this inefficient paradigm fundamentally restricts agents to basic and non-fluent interactions: relying on isolated VLM reasoning for each action makes it impossible to handle tasks requiring high reactivity (e.g., FPS shooting) or dynamic adaptability (e.g., ACT combat). To handle this, we propose a paradigm shift in gameplay agent design: instead of directly controlling gameplay, VLM develops specialized execution modules tailored for tasks like shooting and combat. These modules handle real-time game interactions, elevating VLM to a high-level developer. Building upon this paradigm, we introduce GameSense, a gameplay agent framework where VLM develops task-specific game sense modules by observing task execution and leveraging vision tools and neural network training pipelines. These modules encapsulate action-feedback logic, ranging from direct action rules to neural network-based decisions. Experiments demonstrate that our framework is the first to achieve fluent gameplay in diverse genres, including ACT, FPS, and Flappy Bird, setting a new benchmark for game-playing agents.


Reviews: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Neural Information Processing Systems

Update: Authors justified the choice of the competitor in empirical evaluation (thought it's better to add it to the body of the paper in camera ready if accepted). I find technique interesting, though i think results are exploratory and some-what preliminary, I think it's important for NeurIPS community to get familiar with these results. They identify and address major issues of current approaches, such as 1) prune then finetune for accuracy recover 2) prunning by custom learning (mostly custom regulizers). Authors introduce GSM - a new approach, that does not require finetuning afterwards and can be solved by means of vanilla SGD. GSM only updats the top Q values of the gradient based on the suggested metric (first order Taylor) --- dL/dw * w .


DeepMedcast: A Deep Learning Method for Generating Intermediate Weather Forecasts among Multiple NWP Models

arXiv.org Artificial Intelligence

In recent decades, numerical weather predictions (NWPs) and their post-processing have played a central role in issuing weather forecasts, warnings, and advisories [WMO, 2013, Vannitsem el al., 2021]. NWP centers around the world have developed and are operating a variety of NWP models for accurate weather predictions. For example, the European Centre for Medium-Range Weather Forecasts (ECMWF) operates the Integrated Forecasting System (IFS) and its ensemble prediction system [ECMWF, 2024]; the UK Met Office operates the Unified Model and the Met Office Global and Regional Ensemble Prediction System [Brown et al., 2012, Hagelin et al., 2017, Inverarity et al., 2023]. The National Centers for Environmental Prediction (NCEP) at the National Oceanic and Atmospheric Administration (NOAA) operates the Global Forecast System [NCEP, 2016], the High-Resolution Rapid Refresh [Dowell et al., 2022], and the Hurricane Weather Research and Forecasting model [Gopalakrishnan et al., 2011]. The Japan Meteorological Agency (JMA) operates three deterministic NWP models and two ensemble prediction systems for short-range to weekly forecasts: the Global Spectrum Model (GSM), the Meso-Scale Model (MSM), the Local Forecast Model, the Global Ensemble Prediction System, and the Mesoscale Ensemble Prediction System [JMA, 2024]. These models cover different areas with varying resolutions and processes. In addition to traditional physics-based NWP models, recent advancements in artificial intelligence (AI) have introduced new methods for producing weather predictions.