Collaborating Authors

Van Horn


Exploring Fine-Grained Audiovisual Categorization with the SSW60 Dataset

arXiv.org Artificial Intelligence

We present a new benchmark dataset, Sapsucker Woods 60 (SSW60), for advancing research on audiovisual fine-grained categorization. While our community has made great strides in fine-grained visual categorization on images, the counterparts in audio and video fine-grained categorization are relatively unexplored. To encourage advancements in this space, we have carefully constructed the SSW60 dataset to enable researchers to experiment with classifying the same set of categories in three different modalities: images, audio, and video. The dataset covers 60 species of birds and comprises images from existing datasets as well as brand-new, expert-curated audio and video datasets. We thoroughly benchmark audiovisual classification performance and modality fusion experiments using state-of-the-art transformer methods. Our findings show that audiovisual fusion methods perform better than exclusively image-based or audio-based methods on the task of video classification. We also present interesting modality transfer experiments, enabled by the unique construction of SSW60 to encompass three different modalities. We hope the SSW60 dataset and accompanying baselines spur research in this fascinating area.


What bird is singing? Merlin Bird ID app offers instant answers

AIHub

And it's usually followed up by a question: What was that bird? The question just got much easier to answer. The Cornell Lab of Ornithology's free Merlin Bird ID app can now identify bird sounds. Merlin can recognize the sounds of more than 400 species from the U.S. and Canada, with that number set to expand rapidly in future updates. As Merlin listens, it uses artificial intelligence (AI) technology to identify each species, displaying in real time a list and photos of the birds that are singing or calling. Automatic song ID has been a dream for decades, but analyzing sound has always been extremely difficult.


June returns with a cheaper smart oven for lazy cooks

Engadget

Two years ago, June unveiled its first smart oven, complete with a 2.3-GHz quad-core NVIDIA processor, lots of sensors and a dose of artificial intelligence. In my review, I was mesmerized by how well it cooked a variety of foods simply using preset programs. Unfortunately, at $1,500, it was ridiculously expensive. Today, June is discontinuing it and is ready to reveal its second-generation oven. Not only does it cook faster, it's smarter and, at $600, significantly cheaper.


Five years ago, AI was struggling to identify cats. Now it's trying to tackle 5000 species

#artificialintelligence

In 2012, Google made a breakthrough: It trained its AI to recognize cats in YouTube videos. Google's neural network, software that uses statistics to approximate how the brain learns, taught itself to detect the shapes of cats and humans with more than 70% accuracy. It was a 70% improvement over any other machine learning at the time. Five years later, a contest Google is sponsoring speaks volumes about the field's advancement. Instead of finding cats, researchers will be required to train an AI to identify more than 5000 different species of plants and animals.


This $1,500 Toaster Oven Is Everything That's Wrong With Silicon Valley Design

#artificialintelligence

I slide a piece of salmon into the June, one of the most advanced ovens ever built. It required nearly $30 million in venture capital to create. It was the brainchild of the engineer who brought us the iPhone's camera and Ammunition, the design firm that gave us Beats headphones. "We take very hard technologies, AI, deep learning, and lots of sensors, and we apply that to creating a well thought through, simple interface that just makes your life better," says June cofounder Matt Van Horn, another Apple alum who cofounded Zimride, today known as Lyft. "Our MO is we just want to inspire people to cook more."