Pacific Ocean
Character-Aware Models Improve Visual Text Rendering
Liu, Rosanne, Garrette, Dan, Saharia, Chitwan, Chan, William, Roberts, Adam, Narang, Sharan, Blok, Irina, Mical, RJ, Norouzi, Mohammad, Constant, Noah
Current image generation models struggle to reliably produce well-formed visual text. In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word's visual makeup as a series of glyphs. To quantify this effect, we conduct a series of experiments comparing character-aware vs. character-blind text encoders. In the text-only domain, we find that character-aware models provide large gains on a novel spelling task (WikiSpell). Applying our learnings to the visual domain, we train a suite of image generation models, and show that character-aware variants outperform their character-blind counterparts across a range of novel text rendering tasks (our DrawText benchmark). Our models set a much higher state-of-the-art on visual spelling, with 30+ point accuracy gains over competitors on rare words, despite training on far fewer examples.
Never Give Artificial Intelligence the Nuclear Codes
No technology since the atomic bomb has inspired the apocalyptic imagination like artificial intelligence. Ever since ChatGPT began exhibiting glints of logical reasoning in November, the internet has been awash in doomsday scenarios. Many are self-consciously fanciful--they're meant to jar us into envisioning how badly things could go wrong if an emerging intelligence comes to understand the world, and its own goals, even a little differently from how its human creators do. One scenario, however, requires less imagination, because the first steps toward it are arguably already being taken--the gradual integration of AI into the most destructive technologies we possess today. Check out more from this issue and find your next story to read. The world's major military powers have begun a race to wire AI into warfare.
Blended Latent Diffusion
Avrahami, Omri, Fried, Ohad, Lischinski, Dani
The tremendous progress in neural image generation, coupled with the emergence of seemingly omnipotent vision-language models has finally enabled text-based interfaces for creating and editing images. Handling generic images requires a diverse underlying generative model, hence the latest works utilize diffusion models, which were shown to surpass GANs in terms of diversity. One major drawback of diffusion models, however, is their relatively slow inference time. In this paper, we present an accelerated solution to the task of local text-driven editing of generic images, where the desired edits are confined to a user-provided mask. Our solution leverages a recent text-to-image Latent Diffusion Model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space. We first convert the LDM into a local image editor by incorporating Blended Diffusion into it. Next we propose an optimization-based solution for the inherent inability of this LDM to accurately reconstruct images. Finally, we address the scenario of performing local edits using thin masks. We evaluate our method against the available baselines both qualitatively and quantitatively and demonstrate that in addition to being faster, our method achieves better precision than the baselines while mitigating some of their artifacts.
Deep Ensembles to Improve Uncertainty Quantification of Statistical Downscaling Models under Climate Change Conditions
Gonzรกlez-Abad, Jose, Baรฑo-Medina, Jorge
Recently, deep learning has emerged as a promising tool for statistical downscaling, the set of methods for generating high-resolution climate fields from coarse low-resolution variables. Nevertheless, their ability to generalize to climate change conditions remains questionable, mainly due to the stationarity assumption. We propose deep ensembles as a simple method to improve the uncertainty quantification of statistical downscaling models. By better capturing uncertainty, statistical downscaling models allow for superior planning against extreme weather events, a source of various negative social and economic impacts. Since no observational future data exists, we rely on a pseudo reality experiment to assess the suitability of deep ensembles for quantifying the uncertainty of climate change projections. Deep ensembles allow for a better risk assessment, highly demanded by sectoral applications to tackle climate change.
Decadal Temperature Prediction via Chaotic Behavior Tracking
Ren, Jinfu, Liu, Yang, Liu, Jiming
Decadal temperature prediction provides crucial information for quantifying the expected effects of future climate changes and thus informs strategic planning and decision-making in various domains. However, such long-term predictions are extremely challenging, due to the chaotic nature of temperature variations. Moreover, the usefulness of existing simulation-based and machine learning-based methods for this task is limited because initial simulation or prediction errors increase exponentially over time. To address this challenging task, we devise a novel prediction method involving an information tracking mechanism that aims to track and adapt to changes in temperature dynamics during the prediction phase by providing probabilistic feedback on the prediction error of the next step based on the current prediction. We integrate this information tracking mechanism, which can be considered as a model calibrator, into the objective function of our method to obtain the corrections needed to avoid error accumulation. Our results show the ability of our method to accurately predict global land-surface temperatures over a decadal range. Furthermore, we demonstrate that our results are meaningful in a real-world context: the temperatures predicted using our method are consistent with and can be used to explain the well-known teleconnections within and between different continents.
Physical Knowledge Enhanced Deep Neural Network for Sea Surface Temperature Prediction
Meng, Yuxin, Gao, Feng, Rigall, Eric, Dong, Ran, Dong, Junyu, Du, Qian
Traditionally, numerical models have been deployed in oceanography studies to simulate ocean dynamics by representing physical equations. However, many factors pertaining to ocean dynamics seem to be ill-defined. We argue that transferring physical knowledge from observed data could further improve the accuracy of numerical models when predicting Sea Surface Temperature (SST). Recently, the advances in earth observation technologies have yielded a monumental growth of data. Consequently, it is imperative to explore ways in which to improve and supplement numerical models utilizing the ever-increasing amounts of historical observational data. To this end, we introduce a method for SST prediction that transfers physical knowledge from historical observations to numerical models. Specifically, we use a combination of an encoder and a generative adversarial network (GAN) to capture physical knowledge from the observed data. The numerical model data is then fed into the pre-trained model to generate physics-enhanced data, which can then be used for SST prediction. Experimental results demonstrate that the proposed method considerably enhances SST prediction performance when compared to several state-of-the-art baselines.
AI-Driven Weapons Systems Lead Today's Arms Race
When it comes to advanced artificial intelligence, much of the debate has focused on whether white-collar workers are now facing the sort of extinction-level threat that the working class once did with robotics. And while it's suddenly likely that AI will be capable of duplicating a good part of what lawyers, accountants, teachers, programmers, and--yes--journalists do, that's not even where the most significant revolution is likely to occur. The latest AI--known as generative pre-trained transformers (GPT)--promises to utterly transform the geopolitics of war and deterrence. It will do so in ways that are not necessarily comforting, and which may even turn existential. On one hand, this technology could make war less lethal and possibly strengthen deterrence. By dramatically expanding the role of AI-directed drones in air forces, navies and armies, human lives could be spared.
Towards Spatio-temporal Sea Surface Temperature Forecasting via Static and Dynamic Learnable Personalized Graph Convolution Network
Li, Xiaohan, Zhang, Gaowei, Huang, Kai, He, Zhaofeng
Sea surface temperature (SST) is uniquely important to the Earth's atmosphere since its dynamics are a major force in shaping local and global climate and profoundly affect our ecosystems. Accurate forecasting of SST brings significant economic and social implications, for example, better preparation for extreme weather such as severe droughts or tropical cyclones months ahead. However, such a task faces unique challenges due to the intrinsic complexity and uncertainty of ocean systems. Recently, deep learning techniques, such as graphical neural networks (GNN), have been applied to address this task. Even though these methods have some success, they frequently have serious drawbacks when it comes to investigating dynamic spatiotemporal dependencies between signals. To solve this problem, this paper proposes a novel static and dynamic learnable personalized graph convolution network (SD-LPGC). Specifically, two graph learning layers are first constructed to respectively model the stable long-term and short-term evolutionary patterns hidden in the multivariate SST signals. Then, a learnable personalized convolution layer is designed to fuse this information. Our experiments on real SST datasets demonstrate the state-of-the-art performances of the proposed approach on the forecasting task.
Research Intern - Machine Learning, Statistics, and AutoML
Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers. Our researchers and engineers pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment. The Machine Learning and Statistics and AutoML groups at Microsoft Research New England are hiring multiple interns to work alongside leading researchers and engineers to advance the state-of-the-art in the field. Projects include applied and theoretical machine learning research on topics including (but not limited to): auto-ML, transfer learning, domain adaptation, program synthesis, meta-learning, sign language modeling, statistics, approximate inference, distribution compression, weather and climate forecasting, causal inference, causal machine learning, ML for health, adversarial ML, learning and incentives, reinforcement learning, multi-agent reinforcement learning, deep learning, supervised learning, and unsupervised learning. Responsibilities: Interns put inquiry and theory into practice.
Multi-Scale Adaptive Graph Neural Network for Multivariate Time Series Forecasting
Chen, Ling, Chen, Donghui, Shang, Zongjiang, Wu, Binqing, Zheng, Cen, Wen, Bo, Zhang, Wei
Multivariate time series (MTS) forecasting plays an important role in the automation and optimization of intelligent applications. It is a challenging task, as we need to consider both complex intra-variable dependencies and inter-variable dependencies. Existing works only learn temporal patterns with the help of single inter-variable dependencies. However, there are multi-scale temporal patterns in many real-world MTS. Single inter-variable dependencies make the model prefer to learn one type of prominent and shared temporal patterns. In this paper, we propose a multi-scale adaptive graph neural network (MAGNN) to address the above issue. MAGNN exploits a multi-scale pyramid network to preserve the underlying temporal dependencies at different time scales. Since the inter-variable dependencies may be different under distinct time scales, an adaptive graph learning module is designed to infer the scale-specific inter-variable dependencies without pre-defined priors. Given the multi-scale feature representations and scale-specific inter-variable dependencies, a multi-scale temporal graph neural network is introduced to jointly model intra-variable dependencies and inter-variable dependencies. After that, we develop a scale-wise fusion module to effectively promote the collaboration across different time scales, and automatically capture the importance of contributed temporal patterns. Experiments on four real-world datasets demonstrate that MAGNN outperforms the state-of-the-art methods across various settings.