NVIDIA CEO Jensen Huang made a string of announcements during his Computex keynote, including details about the company's next DGX supercomputer. Given where the industry is clearly heading, it shouldn't come as a surprise that the DGX GH200 is largely about helping companies develop generative AI models. The supercomputer uses a new NVLink Switch System to enable 256 GH200 Grace Hopper superchips to act as a single GPU (each of the chips has an Arm-based Grace CPU and an H100 Tensor Core GPU). This, according to NVIDIA, allows the DGX GH200 to deliver 1 exaflop of performance and to have 144 terabytes of shared memory. The company says that's nearly 500 times as much memory as you'd find in a single DGX A100 system.
People in Texas sounded off on AI job displacement, with half of the people who spoke to Fox News convinced that the tech will rob them of work. With new developments in generative artificial intelligence bringing the technology to the forefront of public conversation, concerns about how it will affect jobs in the entertainment industry have risen, even contributing to a writers' strike in Hollywood. But the founders of Web3 animation studio Toonstar have been using artificial intelligence in their studio for years, and told Fox News Digital it serves as an aid in the creative process. AI can "unlock creativity" and give animators a "head start" in terms of creativity, Luisa Huang, COO and co-founder of Toonstar, told Fox News Digital. "But I have yet to see AI be able to output anything … that is ready for production," she added.
But the current boom has come as Big Tech companies and start-ups alike scramble to buy the company's graphics processing units, or GPU chips, for a totally different reason. The chips are well-suited to crunching the massive amounts of data that are necessary to train cutting-edge artificial intelligence programs like Google's PaLM 2 or OpenAI's GPT-4. Nvidia has been steadily growing its AI-focused business over the past several years, but the explosion of interest and investment in the space over the last six months has turbocharged its sales.
Using characters and scenes he generated with DALL-E, writer / director Chad Nelson and creative agency Native Foreign have made the animated short Critters, which recently debuted on YouTube. The five-minute film, which was partly financed by OpenAI, is a cross between something from Pixar and a David Attenborough-style documentary, introducing a cast of cute, furry creatures who live in an imaginary jungle. While the assets were generated using AI, Nelson wrote the script himself. He used actors to record the voices, and the film was made together with a team of animators. His son also worked on the film, as an Unreal Engine programmer.
Meta has open-sourced an artificial intelligence project that lets anyone bring their doodles to life. The company hopes that by offering Animated Drawings as an open-source project, it will enable other developers to create new, richer experiences. The Fundamental AI Research (FAIR) team originally released a web-based version of the tool in 2021. It asks users to upload a drawing of a single human-like character or to select a demo figure. If you use your own doodle, you'll see a consent form that asks if Meta can use your drawing to help train its models.
TensorFlow-DirectML is a project by Microsoft that enables the use of the DirectML API for GPU acceleration in TensorFlow models, making it easier to take advantage of the power of the GPU. DirectML is designed to be hardware-agnostic and can work with any GPU that supports the Direct3D 12 API, including both Nvidia and AMD GPUs, without the need for vendor-specific GPU drivers or libraries. In this article, I'll walk you through the process of setting up an Anaconda environment for TensorFlow-DirectML and running tests to verify that DirectML and the GPU are being used for machine learning computations. With this guide, you'll be able to overcome the struggles of setting up TensorFlow to run with GPUs and unlock the full potential of your machine learning models.
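As a rough idea of what the verification step can look like, here is a minimal sketch that lists the devices TensorFlow reports and keeps only the accelerators. The helper names `accelerator_names` and `report` are hypothetical, and the device type strings are assumptions: my understanding is that the TensorFlow 1.15-based `tensorflow-directml` package reports DirectML devices with the type `DML`, while plugin-based setups report them as `GPU`, so check what your own installation prints.

```python
def accelerator_names(devices, wanted_types=("DML", "GPU")):
    """Return the names of devices whose type is in wanted_types.

    `devices` is any iterable of objects exposing `.name` and
    `.device_type`, which is the shape of the entries returned by
    TensorFlow's device_lib.list_local_devices().
    The "DML" type string is an assumption about tensorflow-directml.
    """
    return [d.name for d in devices if d.device_type in wanted_types]


def report():
    """Describe which accelerators TensorFlow can see, if it is installed."""
    try:
        import tensorflow as tf
        from tensorflow.python.client import device_lib
    except ImportError:
        return "TensorFlow is not installed in this environment."

    found = accelerator_names(device_lib.list_local_devices())
    if not found:
        return f"TensorFlow {tf.__version__} sees no DirectML/GPU device."
    return f"TensorFlow {tf.__version__} sees: {', '.join(found)}"


if __name__ == "__main__":
    print(report())
```

If the report lists only CPU devices, the DirectML build is probably not the TensorFlow package that is active in the current Anaconda environment.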
We introduce a paradigm for understanding physical scenes without human annotations. At the core of our system is a physical world representation that is first recovered by a perception module and then utilized by physics and graphics engines. During training, the perception module and the generative models learn by visual de-animation -- interpreting and reconstructing the visual information stream. During testing, the system first recovers the physical world state, and then uses the generative models for reasoning and future prediction. Even more so than forward simulation, inverting a physics or graphics engine is a computationally hard problem; we overcome this challenge by using a convolutional inversion network. Our system quickly recognizes the physical world state from appearance and motion cues, and has the flexibility to incorporate both differentiable and non-differentiable physics and graphics engines. We evaluate our system on both synthetic and real datasets involving multiple physical scenes, and demonstrate that our system performs well on both physical state estimation and reasoning problems. We further show that the knowledge learned on the synthetic dataset generalizes to constrained real images.
Traditional computer graphics rendering pipelines are designed for procedurally generating 2D images from 3D shapes with high performance. The nondifferentiability due to discrete operations (such as visibility computation) makes it hard to explicitly correlate rendering parameters and the resulting image, posing a significant challenge for inverse rendering tasks. Recent work on differentiable rendering achieves differentiability either by designing surrogate gradients for non-differentiable operations or via an approximate but differentiable renderer. These methods, however, are still limited when it comes to handling occlusion, and restricted to particular rendering effects. We present RenderNet, a differentiable rendering convolutional network with a novel projection unit that can render 2D images from 3D shapes. Spatial occlusion and shading calculation are automatically encoded in the network. Our experiments show that RenderNet can successfully learn to implement different shaders, and can be used in inverse rendering tasks to estimate shape, pose, lighting and texture from a single image.
This paper presents a novel framework in which video/image segmentation and localization are cast into a single optimization problem that integrates information from low level appearance cues with that of high level localization cues in a very weakly supervised manner. The proposed framework leverages two representations at different levels, exploits the spatial relationship between bounding boxes and superpixels as linear constraints and simultaneously discriminates between foreground and background at bounding box and superpixel level. Different from previous approaches that mainly rely on discriminative clustering, we incorporate a foreground model that minimizes the histogram difference of an object across all image frames. Exploiting the geometric relation between the superpixels and bounding boxes enables the transfer of segmentation cues to improve localization output and vice-versa. Inclusion of the foreground model generalizes our discriminative framework to video data where the background tends to be similar and thus, not discriminative. We demonstrate the effectiveness of our unified framework on the YouTube Object video dataset, Internet Object Discovery dataset and Pascal VOC 2007.
This paper presents a weakly supervised instance segmentation method that consumes training data with tight bounding box annotations. The major difficulty lies in the uncertain figure-ground separation within each bounding box since there is no supervisory signal about it. We address the difficulty by formulating the problem as a multiple instance learning (MIL) task, and generate positive and negative bags based on the sweeping lines of each bounding box. The proposed deep model integrates MIL into a fully supervised instance segmentation network, and can be derived by the objective consisting of two terms, i.e., the unary term and the pairwise term. The former estimates the foreground and background areas of each bounding box while the latter maintains the unity of the estimated object masks. The experimental results show that our method performs favorably against existing weakly supervised methods and even surpasses some fully supervised methods for instance segmentation on the PASCAL VOC dataset.
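To make the bag construction concrete, the following is a minimal sketch of one possible reading of the sweeping-line idea, not the authors' implementation. It assumes that every row and column crossing a tight box forms a positive bag (at least one pixel on it belongs to the object), that image rows and columns missing the box entirely form negative bags, and that the unary term is a max-pooled cross-entropy over bags. The function names, the single-box setting, and the loss form are all assumptions, and the pairwise term is omitted.

```python
import math


def sweeping_line_bags(box, height, width):
    """Build MIL bags of (y, x) pixels from one tight bounding box.

    box = (x0, y0, x1, y1) with inclusive pixel coordinates.
    Positive bags: rows and columns of the box, clipped to the box.
    Negative bags: full image rows and columns that miss the box.
    """
    x0, y0, x1, y1 = box
    positive, negative = [], []
    for y in range(height):
        if y0 <= y <= y1:
            positive.append([(y, x) for x in range(x0, x1 + 1)])
        else:
            negative.append([(y, x) for x in range(width)])
    for x in range(width):
        if x0 <= x <= x1:
            positive.append([(y, x) for y in range(y0, y1 + 1)])
        else:
            negative.append([(y, x) for y in range(height)])
    return positive, negative


def mil_unary_loss(prob, positive, negative, eps=1e-7):
    """Max-pooled MIL loss over bags (assumed form of the unary term).

    prob[y][x] is the predicted foreground probability of a pixel.
    A positive bag is penalised unless its best pixel is foreground;
    a negative bag is penalised if any of its pixels is foreground.
    """
    losses = []
    for bag in positive:
        best = max(prob[y][x] for y, x in bag)
        losses.append(-math.log(max(best, eps)))
    for bag in negative:
        best = max(prob[y][x] for y, x in bag)
        losses.append(-math.log(max(1.0 - best, eps)))
    return sum(losses) / len(losses)
```

Under these assumptions, a mask that marks only the pixels along a diagonal of the box already satisfies every positive bag, which suggests why a second term encouraging coherent masks is needed alongside the unary term.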