sonogram
Siamese networks for Poincaré embeddings and the reconstruction of evolutionary trees
Carvallo, Ciro, Bocaccio, Hernán, Mindlin, Gabriel B., Groisman, Pablo
Animal classification systems are based on evolutionary relationships between different organisms, known as phylogeny. This approach allows us to organize species in a way that reflects our understanding of how they evolved from common ancestors. Phylogenetic trees are diagrams that graphically represent these evolutionary relationships between organisms. In these representations, the species of interest are placed at the tips of branches that emerge from a point representing a common ancestor. The natural mathematical object associated with this situation is a tree (a graph with no cycles) with a root (the common ancestor). It is important to note that the hypotheses regarding how different species may have descended from a common ancestor are typically based on physical traits (which are therefore interpretable) or directly on DNA sequences.
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > Richmond County > New York City (0.04)
- (7 more...)
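The abstract's framing (species as leaves of a rooted tree embedded in hyperbolic space) rests on the Poincaré-ball distance, which blows up near the boundary of the unit ball and so can mirror tree metrics. A minimal numpy sketch of that distance, with illustrative points (this is not the paper's code):

```python
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit Poincare ball:
    arccosh(1 + 2*||u-v||^2 / ((1-||u||^2)(1-||v||^2)))."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / denom)

# Points near the boundary are exponentially far apart, which lets a
# tree's leaves spread out while each stays close to its ancestors.
root = np.array([0.0, 0.0])
leaf_a = np.array([0.9, 0.0])
leaf_b = np.array([0.0, 0.9])
print(poincare_distance(root, leaf_a))    # root-to-leaf distance
print(poincare_distance(leaf_a, leaf_b))  # two leaves: strictly larger
```

Crossing between leaves is "longer" than going through the center, which is exactly the tree-like behavior a Siamese network can exploit when learning which points should sit near each other.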
LIPSFUS: A neuromorphic dataset for audio-visual sensory fusion of lip reading
Rios-Navarro, Antonio, Piñero-Fuentes, Enrique, Canas-Moreno, Salvador, Javed, Aqib, Harkin, Jin, Linares-Barranco, Alejandro
This paper presents a sensory fusion neuromorphic dataset collected with precise temporal synchronization using a set of Address-Event-Representation sensors and tools. The target application is the lip reading of several keywords for different machine learning applications, such as digits, robotic commands, and auxiliary rich phonetic short words. The dataset is enlarged with a spiking version of an audio-visual lip reading dataset collected with frame-based cameras. LIPSFUS is publicly available and it has been validated with a deep learning architecture for audio and visual classification. It is intended for sensory fusion architectures based on both artificial and spiking neural network algorithms.
- Health & Medicine > Therapeutic Area (0.47)
- Education (0.46)
Impact of PCA-based preprocessing and different CNN structures on deformable registration of sonograms
Schmidt, Christian, Overhoff, Heinrich Martin
Central venous catheters (CVC) are commonly inserted into the large veins of the neck, e.g. the internal jugular vein (IJV). CVC insertion may cause serious complications such as misplacement into an artery or perforation of cervical vessels. Placing a CVC under sonographic guidance is an appropriate method to reduce such adverse events, provided that anatomical landmarks like venous and arterial vessels can be detected reliably. This task shall be solved by registering patient-individual images against an anatomically labelled reference image. In this work, a linear, affine transformation is performed on cervical sonograms, followed by a non-linear transformation to achieve a more precise registration. VoxelMorph (VM), a learning-based library for deformable image registration using a convolutional neural network (CNN) with a U-Net structure, was used for the non-linear transformation. The impact of principal component analysis (PCA)-based pre-denoising of the patient-individual images, as well as the impact of modified net structures of differing complexity, was examined visually and quantitatively, the latter using metrics for deformation and image similarity. Using the PCA-approximated cervical sonograms reduced mean deformation lengths by 18% to 66% compared to their original image counterparts, depending on the net structure. In addition, reducing the number of convolutional layers improved image similarity for the PCA images, while worsening it for the original images. Despite a large reduction in network parameters, no overall decrease in registration quality was observed, leading to the conclusion that the original net structure is oversized for the task at hand.
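The PCA-based pre-denoising described here can be sketched as projecting each image in a stack onto the stack's top principal components and reconstructing. The following is an illustrative implementation with toy data, not the authors' code; `pca_denoise` and the component count are assumptions:

```python
import numpy as np

def pca_denoise(images, n_components):
    """Approximate each image by its projection onto the top principal
    components of the whole stack (a common pre-denoising scheme).

    images: array of shape (n_images, height, width)
    """
    n, h, w = images.shape
    flat = images.reshape(n, h * w).astype(float)
    mean = flat.mean(axis=0)
    centered = flat - mean
    # SVD of the centered data; rows of vt are the principal components
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    approx = centered @ components.T @ components + mean
    return approx.reshape(n, h, w)

# Toy example: 10 noisy copies of the same 8x8 pattern
rng = np.random.default_rng(0)
pattern = rng.random((8, 8))
stack = pattern + 0.1 * rng.normal(size=(10, 8, 8))
denoised = pca_denoise(stack, n_components=2)
```

Discarding the low-variance components removes most of the pixel noise while keeping the shared anatomy, which is plausibly why the deformable registration then needs smaller deformations.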
Riffusion's AI generates music from text using visual sonograms
On Thursday, a pair of tech hobbyists released Riffusion, an AI model that generates music from text prompts by creating a visual representation of sound and converting it to audio for playback. It uses a fine-tuned version of the Stable Diffusion 1.5 image synthesis model, applying visual latent diffusion to sound processing in a novel way. Created as a hobby project by Seth Forsgren and Hayk Martiros, Riffusion works by generating sonograms, which store audio in a two-dimensional image. In a sonogram, the X-axis represents time (the order in which the frequencies get played, from left to right), and the Y-axis represents the frequency of the sounds. Meanwhile, the color of each pixel in the image represents the amplitude of the sound at that given moment in time.
- Media > Music (0.78)
- Leisure & Entertainment (0.78)
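The sonogram layout described above (X axis = time, Y axis = frequency, pixel value = amplitude) can be sketched with a plain short-time Fourier transform. This is an illustrative reconstruction of the image format, not Riffusion's code; the window and hop sizes are arbitrary choices:

```python
import numpy as np

def sonogram(signal, window=256, hop=128):
    """Minimal sonogram: columns are time frames (X axis), rows are
    frequency bins (Y axis), values are magnitudes (the 'pixel color')."""
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        frame = signal[start:start + window] * np.hanning(window)
        spectrum = np.abs(np.fft.rfft(frame))  # magnitude per frequency bin
        frames.append(spectrum)
    # Stack as (frequency, time) so Y is frequency and X is time
    return np.array(frames).T

# A 440 Hz tone sampled at 8 kHz lights up one frequency row
sr = 8000
t = np.arange(sr) / sr
img = sonogram(np.sin(2 * np.pi * 440 * t))
peak_bin = img.mean(axis=1).argmax()
print(peak_bin * sr / 256)  # ~437.5 Hz, the bin closest to 440 Hz
```

Because the whole signal is flattened into a 2D magnitude image like this, an image-diffusion model such as Stable Diffusion can generate it; converting back to audio then requires estimating the phase the image discards.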
Deep learning deciphers what rats are saying
For many years, researchers have known that rodents' squeaks reveal a lot about how the animals are feeling. Much like a wagging tail on a dog, certain vocalizations indicate that the rodents are happy; other vocalizations indicate that they are stressed, or even depressed. But why were the researchers interested in the rodents' moods? They wanted to understand the rodents' responses to various stimuli.
Thousands jam to see Jen-Hsun Huang's keynote at GPU Developers Conference
In a two-hour talk that filled the Keynote Hall and spillover rooms at the San Jose McEnery Convention Center, with thousands of people in line for hours beforehand, Nvidia's CEO Jen-Hsun Huang, in characteristic jeans, leather jacket, and humble humor, described the world of graphics processing units (GPUs) with brilliant images and memorable one-liners. Most of Nvidia's revenue comes from GPUs for gaming, super-capable ray-tracing professional graphics, and extraordinarily powerful supercomputers for data centers. Most of its current research and development involves AI-ready chips that enable clients to develop machine learning and deep learning models and apps, and Nvidia is banking on these new development chips being the chips of the future. In AI-focused healthcare, this covers CLARA, a deep learning engine that takes present-day black-and-white sonogram, PET, and MRI 2D scans and enhances the data to 3D and then to color rendering. In one example, a black-and-white ultrasound sonogram on the left is enhanced into the fully rendered baby image on the right.