AITopics | spectrogram

spectrogram

People used AI to recreate the voices of pilots killed in a plane crash

Engadget | May-23-2026, 14:59:55 GMT

US transportation regulator NTSB pulled its accident reports after the audio recreations were uploaded online. The National Transportation Safety Board (NTSB) has pulled its docket system offline after people used information uploaded to it to recreate the voices of pilots killed in a plane crash with AI. As CNN reports, the agency recently uploaded files filled with details about the November 4, 2025 crash involving UPS flight 2976. One of the plane's engines separated from the wing during takeoff from Louisville, Kentucky, killing three crew members and 12 people on the ground. While the NTSB uploads accident reports that the public can access, it is not allowed by federal law to release cockpit audio recordings due to the highly sensitive nature of verbal communications inside the cockpit.

artificial intelligence, social media, transportation, review, smartphone, laptop, (8 more...)

Country: North America > United States > Kentucky > Jefferson County > Louisville (0.28)

Industry:

Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment > Games > Computer Games (0.75)

Technology:

Information Technology > Communications > Mobile (0.55)
Information Technology > Artificial Intelligence > Applied AI (0.55)
Information Technology > Communications > Social Media (0.44)

51200d29d1fc15f5a71c1dab4bb54f7c-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 21:40:24 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > Canada > Ontario (0.28)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Unsupervised Learning of Spoken Language with Visual Context

David Harwath, Antonio Torralba, James Glass

Neural Information Processing Systems | Mar-23-2026, 14:48:15 GMT

Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms.

caption, machine learning, pattern recognition, (19 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.35)

Images that Sound: Composing Images and Sounds on a Single Canvas

Neural Information Processing Systems | Mar-21-2026, 18:42:31 GMT

Spectrograms are 2D representations of sound that look very different from the images found in our visual world. And natural images, when played as spectrograms, make unnatural sounds. In this paper, we show that it is possible to synthesize spectrograms that simultaneously look like natural images and sound like natural audio. We call these visual spectrograms images that sound. Our approach is simple and zero-shot, and it leverages pre-trained text-to-image and text-to-spectrogram diffusion models that operate in a shared latent space. During the reverse process, we denoise noisy latents with both the audio and image diffusion models in parallel, resulting in a sample that is likely under both models. Through quantitative evaluations and perceptual studies, we find that our method successfully generates spectrograms that align with a desired audio prompt while also taking the visual appearance of a desired image prompt.
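The parallel denoising described in the abstract can be illustrated with a toy sketch. Everything named here is an assumption made for illustration: the weighted average of the two noise estimates, the DDIM-style update, the linear noise schedule, and the analytic toy models standing in for the pre-trained diffusion models. The abstract only states that both models denoise the same latent in parallel, so this should not be read as the authors' implementation.

```python
import numpy as np


def make_toy_eps_model(target):
    """Hypothetical stand-in for a pre-trained diffusion model.

    Returns the exact noise prediction for a data distribution that is a
    single point at `target`, so the sketch runs without any trained network.
    """
    def eps_model(latent, alpha_t):
        return (latent - np.sqrt(alpha_t) * target) / np.sqrt(1.0 - alpha_t)
    return eps_model


def joint_noise_estimate(latent, alpha_t, eps_image, eps_audio, w_image=0.5):
    """Blend both models' noise predictions for the shared latent.

    The weighted average is an assumption; the abstract does not say how
    the two estimates are combined.
    """
    return (w_image * eps_image(latent, alpha_t)
            + (1.0 - w_image) * eps_audio(latent, alpha_t))


def ddim_step(latent, eps, alpha_t, alpha_prev):
    """One deterministic DDIM-style update from noise level alpha_t to alpha_prev."""
    x0 = (latent - np.sqrt(1.0 - alpha_t) * eps) / np.sqrt(alpha_t)
    return np.sqrt(alpha_prev) * x0 + np.sqrt(1.0 - alpha_prev) * eps


def sample_joint(eps_image, eps_audio, shape, steps=50, w_image=0.5, seed=0):
    """Run the reverse process, querying both models at every step."""
    rng = np.random.default_rng(seed)
    # Assumed schedule: cumulative alpha rises linearly from nearly pure
    # noise (0.01) to a clean sample (1.0).
    alphas = np.linspace(0.01, 1.0, steps + 1)
    latent = rng.standard_normal(shape)
    for alpha_t, alpha_prev in zip(alphas[:-1], alphas[1:]):
        eps = joint_noise_estimate(latent, alpha_t, eps_image, eps_audio, w_image)
        latent = ddim_step(latent, eps, alpha_t, alpha_prev)
    return latent
```

With the toy models, the blended estimate pulls the latent toward a compromise between the two targets, which mirrors the idea of a sample that is plausible under both models. A real system would replace `make_toy_eps_model` with calls to the actual image and audio denoisers.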

artificial intelligence, machine learning, proceedings, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.65)

cb3213ada48302953cb0f166464ab356-Supplemental.pdf

Neural Information Processing Systems | Feb-19-2026, 09:12:10 GMT

classifier, dataset, patch size, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The iNaturalist Sounds Dataset

Neural Information Processing Systems | Feb-18-2026, 15:29:18 GMT

A current gap in the acoustic space is a large-scale, fine-grained dataset of species sounds that is easy to download and easy to use.

artificial intelligence, machine learning, natural language, (19 more...)

Country:

North America > United States > California (0.04)
North America > Mexico > Baja California (0.04)
South America > Colombia (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Images that Sound: Composing Images and Sounds on a Single Canvas

Ziyang Chen, Daniel Geng, Andrew Owens (University of Michigan)

Neural Information Processing Systems | Feb-16-2026, 22:08:42 GMT

Spectrograms are 2D representations of sound that look very different from the images found in our visual world. And natural images, when played as spectrograms, make unnatural sounds.
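As a concrete illustration of what "a 2D representation of sound" means, the sketch below computes a magnitude spectrogram with a short-time Fourier transform in NumPy. The function name, frame length, hop size, and Hann window are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frequency_bins, time_frames) array of STFT magnitudes.

    Assumed parameters: 256-sample frames, 50% overlap, Hann window.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one per row).
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # A real FFT of each frame gives its frequency content; transposing
    # puts frequency on the vertical axis and time on the horizontal axis.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Usage: a 1 kHz tone sampled at 8 kHz for one second.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)
spec = magnitude_spectrogram(tone)
```

Each column of `spec` is one short slice of time and each row is one frequency band, so a steady tone appears as a horizontal line. Here each bin spans 8000 / 256 = 31.25 Hz, which places the 1 kHz tone in bin 32.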

machine learning, natural language, spectrogram, (16 more...)

Country: