PitcherNet helps researchers throw strikes with AI analysis
University of Waterloo researchers have developed new artificial intelligence (AI) technology that can accurately analyze pitcher performance and mechanics using low-resolution video of baseball games. The system, developed for the Baltimore Orioles by the Waterloo team, plugs holes in much more elaborate and expensive technology already installed in most stadiums that host Major League Baseball (MLB), whose teams have increasingly tapped into data analytics in recent years. Those systems, produced by a company called Hawk-Eye Innovations, use multiple special cameras in each park to catch players in action, but the data they yield is typically available only to the home team that owns the stadium in which the games are played. To add away games to their analytics operation, as well as use smartphone video taken by scouts in minor league and college games, the Orioles asked video and AI experts at Waterloo for help about three years ago. Waterloo researchers convert video of a pitcher's performance into a two-dimensional model that PitcherNet's AI algorithm can later analyze.
Windows 11 Pro may be the most underrated PC gaming upgrade ever at $15
TL;DR: Windows 11 Pro is down to $14.97 through June 1, its lowest price to date (reg. It's usually an afterthought, and that might be a mistake. Whether you're on Windows 10 or Windows 11 Home, upgrading to Windows 11 Pro may be the difference between an ordinary and an extraordinary gaming experience. Windows 11 Pro introduces DirectX 12 Ultimate, delivering higher frame rates, improved ray tracing, and lower latency for a smoother gaming experience. If you want faster load times and better graphics, this is your chance to optimize your rig.
RainNet: A Large-Scale Imagery Dataset and Benchmark for Spatial Precipitation Downscaling
AI-for-science approaches have been applied to solve scientific problems (e.g., nuclear fusion, ecology, genomics, meteorology) and have achieved highly promising results. Spatial precipitation downscaling is one of the most important meteorological problems and urgently requires the participation of AI. However, the lack of a well-organized and annotated large-scale dataset hinders the training and verification of more effective and advanced deep-learning models for precipitation downscaling. To alleviate these obstacles, we present the first large-scale spatial precipitation downscaling dataset, named RainNet, which contains more than 62,400 pairs of high-quality low/high-resolution precipitation maps spanning over 17 years, ready to help the evolution of deep learning models in precipitation downscaling. Specifically, the precipitation maps carefully collected in RainNet cover various meteorological phenomena (e.g., hurricanes, squalls), which greatly helps improve model generalization.
Mind the Gap Between Prototypes and Images in Cross-domain Finetuning
In cross-domain few-shot classification (CFC), recent works mainly focus on adapting a simple transformation head on top of a frozen pre-trained backbone with few labeled data to project embeddings into a task-specific metric space, where classification can be performed by measuring similarities between image instance and prototype representations. Technically, an assumption implicitly adopted in such a framework is that the prototype and image instance embeddings share the same representation transformation. However, in this paper, we find that there naturally exists a gap, which resembles the modality gap, between the prototype and image instance embeddings extracted from the frozen pre-trained backbone, and that simply applying the same transformation during the adaptation phase constrains the exploration of optimal representations and shrinks the gap between prototype and image representations. To solve this problem, we propose a simple yet effective method, contrastive prototype-image adaptation (CoPA), which adapts separate transformations for prototypes and images, similarly to CLIP, by treating prototypes as text prompts. Extensive experiments on Meta-Dataset demonstrate that CoPA achieves state-of-the-art performance more efficiently. Meanwhile, further analyses also indicate that CoPA can learn better representation clusters, enlarge the gap, and achieve minimal validation loss at the enlarged gap.
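The prototype-based classification described above can be illustrated with a minimal sketch. It is not the CoPA implementation: the function names, the plain linear maps `w_image` and `w_proto`, and the toy embeddings are all assumptions made for illustration, and the contrastive training of the two maps is omitted. The sketch only shows prototypes computed as class means of support embeddings, and a query scored by cosine similarity after images and prototypes pass through separate transformations.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def class_prototypes(embeddings, labels):
    """Prototype of each class = mean of its support embeddings."""
    by_class = {}
    for embedding, label in zip(embeddings, labels):
        by_class.setdefault(label, []).append(embedding)
    return {label: mean_vector(vs) for label, vs in by_class.items()}


def linear(matrix, vector):
    """Apply a linear map given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def classify(query, prototypes, w_image, w_proto):
    """Score a query against every prototype, using separate maps
    for the image embedding and for the prototypes (hypothetical heads)."""
    q = linear(w_image, query)
    scores = {
        label: cosine(q, linear(w_proto, proto))
        for label, proto in prototypes.items()
    }
    return max(scores, key=scores.get)


# Toy 2-D "backbone" embeddings for two classes (made-up values).
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["a", "a", "b", "b"]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned transformations

prototypes = class_prototypes(support, labels)
prediction = classify([0.9, 0.1], prototypes, identity, identity)
```

With identity maps this reduces to the shared-transformation setting; using two different matrices for `w_image` and `w_proto` is the part that corresponds to treating prototypes and images separately.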
Attention over learned object embeddings enables complex visual reasoning
Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning. On these more challenging tasks, bespoke approaches (such as modular symbolic components, independent dynamics models or semantic parsers) targeted towards that specific type of task have typically performed better. The downside to these targeted approaches, however, is that they can be more brittle than general-purpose neural networks, requiring significant modification or even redesign according to the particular task at hand. Here, we propose a more general neural-network-based approach to dynamic visual reasoning problems that obtains state-of-the-art performance on three different domains, in each case outperforming bespoke modular approaches tailored specifically to the task. Our method relies on learned object-centric representations, self-attention and self-supervised dynamics learning, and all three elements together are required for strong performance to emerge. The success of this combination suggests that there may be no need to trade off flexibility for performance on problems involving spatio-temporal or causal-style reasoning. With the right soft biases and learning objectives in a neural network we may be able to attain the best of both worlds.
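As a rough illustration of self-attention over object embeddings, the following sketch applies single-head scaled dot-product attention to a small set of object vectors. It is a simplification under stated assumptions: the object vectors are made-up values, and queries, keys, and values are the raw embeddings with no learned projections, which a real model would include. It does not reproduce the architecture or the self-supervised objective of the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(objects):
    """Scaled dot-product self-attention over object embeddings.

    Each object attends to every object (including itself); the output
    for an object is the attention-weighted average of all embeddings.
    """
    dim = len(objects[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in objects:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in objects
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[j] for w, value in zip(row, objects))
             for j in range(dim)]
        )
    return outputs, weights


# Three hypothetical object embeddings for one video frame.
objects = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(objects)
```

Each row of `weights` sums to one, so every output is a convex combination of the object embeddings; this is what lets information about one object flow into the representation of another.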
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL
While reinforcement learning algorithms can learn effective policies for complex tasks, these policies are often brittle to even minor task variations, especially when variations are not explicitly provided during training. One natural approach to this problem is to train agents with manually specified variation in the training task or environment. However, this may be infeasible in practical situations, either because making perturbations is not possible, or because it is unclear how to choose suitable perturbation strategies without sacrificing performance. The key insight of this work is that learning diverse behaviors for accomplishing a task can directly lead to behavior that generalizes to varying environments, without needing to perform explicit perturbations during training. By identifying multiple solutions for the task in a single environment during training, our approach can generalize to new situations by abandoning solutions that are no longer effective and adopting those that are. We theoretically characterize a robustness set of environments that arises from our algorithm and empirically find that our diversity-driven approach can extrapolate to various changes in the environment and task.
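The test-time idea of abandoning solutions that no longer work and adopting those that do can be sketched as follows. It is a toy under stated assumptions, not the algorithm from the paper: the "policies" are fixed routes, the environment is a hypothetical function returning 1.0 for an unblocked route and 0.0 otherwise, and the training stage that produces diverse policies is skipped entirely. The sketch only shows a few evaluation episodes being used to pick the best of several already-learned behaviors.

```python
def average_return(policy, run_episode, episodes=3):
    """Mean return of one policy over a few trial episodes."""
    return sum(run_episode(policy) for _ in range(episodes)) / episodes


def select_policy(policies, run_episode, episodes=3):
    """Evaluate every candidate briefly and keep the best performer."""
    scores = {
        name: average_return(policy, run_episode, episodes)
        for name, policy in policies.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def make_env(blocked):
    """Hypothetical environment: a route succeeds unless it crosses
    a blocked cell."""
    def run_episode(route):
        return 0.0 if any(cell in blocked for cell in route) else 1.0
    return run_episode


# Two distinct solutions to the same task, found in the training environment.
policies = {
    "left": ["start", "left", "goal"],
    "right": ["start", "right", "goal"],
}

# At test time the left corridor is blocked, so the left route is abandoned.
best, scores = select_policy(policies, make_env(blocked={"left"}))
```

Because selection needs only a handful of episodes per candidate, the cost of adapting grows with the number of stored behaviors rather than with the cost of retraining.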