poseidon
Poseidon: Efficient Foundation Models for PDEs
We introduce Poseidon, a foundation model for learning the solution operators of PDEs. It is based on a multiscale operator transformer, with time-conditioned layer norms that enable continuous-in-time evaluations. A novel training strategy leveraging the semi-group property of time-dependent PDEs to allow for significant scaling-up of the training data is also proposed. Poseidon is pretrained on a diverse, large scale dataset for the governing equations of fluid dynamics. It is then evaluated on a suite of 15 challenging downstream tasks that include a wide variety of PDE types and operators. We show that Poseidon exhibits excellent performance across the board by outperforming baselines significantly, both in terms of sample efficiency and accuracy. Poseidon also generalizes very well to new physics that is not seen during pretraining. Moreover, Poseidon scales with respect to model and data size, both for pretraining and for downstream tasks. Taken together, our results showcase the surprising ability of Poseidon to learn effective representations from a very small set of PDEs during pretraining in order to generalize well to unseen and unrelated PDEs downstream, demonstrating its potential as an effective, general purpose PDE foundation model.
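As a rough illustration of why the semi-group property can scale up training data: if the solution operator only depends on the elapsed time, then any earlier snapshot of a trajectory can serve as the input for any later one, not just the initial condition. The sketch below only counts such pairs. The snapshot times and the helper name `all_pairs` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def all_pairs(times):
    """Return (t_start, t_end, lead_time) for every pair of snapshots with t_start < t_end.

    Assumes `times` is sorted in increasing order.
    """
    return [(t0, t1, t1 - t0) for t0, t1 in combinations(times, 2)]


# Hypothetical trajectory with 5 stored snapshots.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
pairs = all_pairs(times)

# Pairs that start from the initial condition only.
from_initial = [p for p in pairs if p[0] == times[0]]

print(len(from_initial), len(pairs))
```

With K + 1 snapshots this gives K(K + 1)/2 input-output pairs per trajectory instead of K, so the count of training samples grows quadratically with the number of stored snapshots.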
SPUS: A Lightweight and Parameter-Efficient Foundation Model for PDEs
Siddik, Abu Bucker, Oyen, Diane, Most, Alexander, Kucer, Michal, Biswas, Ayan
We introduce Small PDE U-Net Solver (SPUS), a compact and efficient foundation model (FM) designed as a unified neural operator for solving a wide range of partial differential equations (PDEs). Unlike existing state-of-the-art PDE FMs, which are primarily based on large, complex transformer architectures with high computational and parameter overhead, SPUS leverages a lightweight residual U-Net-based architecture that has been largely underexplored as a foundation model architecture in this domain. To enable effective learning in this minimalist framework, we utilize a simple yet powerful auto-regressive pretraining strategy which closely replicates the behavior of numerical solvers to learn the underlying physics. SPUS is pretrained on a diverse set of fluid dynamics PDEs and evaluated across 6 challenging unseen downstream PDEs spanning various physical systems. Experimental results demonstrate that SPUS, using a residual U-Net-based architecture, achieves state-of-the-art generalization on these downstream tasks while requiring significantly fewer parameters and minimal fine-tuning data, highlighting its potential as a highly parameter-efficient FM for solving diverse PDE systems.
Agents' Room: Narrative Generation through Multi-step Collaboration
Huot, Fantine, Amplayo, Reinald Kim, Palomaki, Jennimaria, Jakobovits, Alice Shoshana, Clark, Elizabeth, Lapata, Mirella
Writing compelling fiction is a multifaceted process combining elements such as crafting a plot, developing interesting characters, and using evocative language. While large language models (LLMs) show promise for story writing, they currently rely heavily on intricate prompting, which limits their use. We propose Agents' Room, a generation framework inspired by narrative theory, that decomposes narrative writing into subtasks tackled by specialized agents. To illustrate our method, we introduce Tell Me A Story, a high-quality dataset of complex writing prompts and human-written stories, and a novel evaluation framework designed specifically for assessing long narratives. We show that Agents' Room generates stories that are preferred by expert evaluators over those produced by baseline systems by leveraging collaboration and specialization to decompose the complex story writing task into tractable components. We provide extensive analysis with automated and human-based metrics of the generated output.
These Astonishing Minecraft Builds Were Years in the Making
Minecraft, the best-selling video game of all time, has been around for more than a decade. The procedurally generated survival sandbox is constantly evolving, playing host to everything from speedrun challenges and political dramas to lessons. But it's best known as digital Lego, and it's seen some incredible creations over the years. For most, it's a time-consuming hobby, but a few have parlayed their passion into a professional career. Here are some of the most spectacular Minecraft creations that took years to build.
Pacific Commander: Sub-hunting spy plane missions continue in Pacific
Aviation Maintenance Administrationman 3rd Class Shea Wright, assigned to the Skinny Dragons of Patrol Squadron (VP) 4, recovers a squadron P-8A Poseidon maritime patrol and reconnaissance aircraft following an anti-submarine warfare mission over the Atlantic Ocean, Nov. 30, 2019. The increasingly global reach of Chinese nuclear-armed ballistic missile submarines, armed with JL-2 weapons reportedly able to hit parts of the U.S., continues to inspire an ongoing Navy effort to accelerate production of attack submarines, prepare long-dwell drones for deployment to the Pacific and continue acquisition of torpedo-armed sub-hunting planes such as the P-8A Poseidon. The Navy has been moving quickly to increase its fleet of Poseidons on an accelerated timetable; in the Navy's 2020 budget, the service was authorized for a near-term increase in Poseidon production by three, moving funding for the year up for nine Poseidons, as cited in a report from USNI News. Last year, the Navy awarded Boeing a $2.4 billion deal to produce 19 more P-8A Poseidon surveillance and attack planes. The Poseidon increase appears to align with the service's overall Pacific theater strategy, which makes a point to sustain peaceful, yet vital surveillance and Freedom of Navigation missions in the region.
Drone weapons the future of underwater warfare
Naval technology is developing so rapidly that Australia's new $50 billion fleet of submarines may one day have to face deadly underwater drones, an expert has warned. Earlier this month, the federal government announced the signing of the Attack class submarine Strategic Partnering Agreement with French shipbuilder Naval Group. It will build 12 attack submarines to replace the Royal Australian Navy's ageing Collins class vessels, with the first one scheduled to be delivered in the early 2030s, the federal government said. But Russia has already provided a glimpse of underwater autonomous (or drone) weaponry. The Russian Ministry of Defence released testing footage of its 'Poseidon', a high-speed nuclear torpedo. Naval chiefs said the weapon is capable of carrying both conventional and nuclear warheads and will have a maximum speed of 200 km/h.
Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Zhang, Hao, Zheng, Zeyu, Xu, Shizhen, Dai, Wei, Ho, Qirong, Liang, Xiaodan, Hu, Zhiting, Wei, Jinliang, Xie, Pengtao, Xing, Eric P.
Deep learning models can take weeks to train on a single GPU-equipped machine, necessitating scaling out DL training to a GPU-cluster. However, current distributed DL implementations can scale poorly due to substantial parameter synchronization over the network, because the high throughput of GPUs allows more data batches to be processed per unit time than CPUs, leading to more frequent network synchronization. We present Poseidon, an efficient communication architecture for distributed DL on GPUs. Poseidon exploits the layered model structures in DL programs to overlap communication and computation, reducing bursty network communication. Moreover, Poseidon uses a hybrid communication scheme that optimizes the number of bytes required to synchronize each layer, according to layer properties and the number of machines. We show that Poseidon is applicable to different DL frameworks by plugging Poseidon into Caffe and TensorFlow. We show that Poseidon enables Caffe and TensorFlow to achieve 15.5x speed-up on 16 single-GPU machines, even with limited bandwidth (10GbE) and the challenging VGG19-22K network for image classification. Moreover, Poseidon-enabled TensorFlow achieves 31.5x speed-up with 32 single-GPU machines on Inception-V3, a 50% improvement over the open-source TensorFlow (20x speed-up).