AITopics | Media

Media

Deep Representation-Decoupling Neural Networks for Monaural Music Mixture Separation

Li, Zhuo (The Hong Kong Polytechnic University) | Wang, Hongwei (Shanghai Jiao Tong University) | Zhao, Miao (The Hong Kong Polytechnic University) | Li, Wenjie (The Hong Kong Polytechnic University) | Guo, Minyi (Shanghai Jiao Tong University)

AAAI Conferences | Feb-8-2018

Monaural source separation (MSS) aims to extract and reconstruct different sources from a single-channel mixture, which could facilitate a variety of applications such as chord recognition, pitch estimation and automatic transcription. In this paper, we study the problem of separating vocals and instruments from a monaural music mixture. Existing works on monaural source separation either use linear and shallow models (e.g., non-negative matrix factorization) or do not explicitly address the coupling and tangling of multiple sources in the original input signals; hence they do not perform satisfactorily in real-world scenarios. To overcome these limitations, we propose a novel end-to-end framework for monaural music mixture separation called Deep Representation-Decoupling Neural Networks (DRDNN). DRDNN takes advantage of both traditional signal processing methods and popular deep learning models. For each input music mixture, DRDNN converts it to a two-dimensional time-frequency spectrogram using the short-time Fourier transform (STFT), followed by stacked convolutional neural network (CNN) layers and long short-term memory (LSTM) layers to extract more condensed features. Afterwards, DRDNN uses a decoupling component, which consists of a group of multi-layer perceptrons (MLPs), to decouple the features further into different separated sources. The design of the decoupling component in DRDNN produces purified single-source signals for subsequent full-size restoration and can significantly improve the performance of the final separation. Through extensive experiments on a real-world dataset, we show that DRDNN outperforms state-of-the-art baselines in the task of monaural music mixture separation and reconstruction.
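The STFT front end mentioned in the abstract can be illustrated with a minimal sketch. It covers only the step that turns a mono signal into a time-frequency magnitude spectrogram, not the CNN, LSTM or MLP stages. The function name `stft_magnitude`, the frame length, the hop size and the Hann window are assumptions chosen for the demo, not values taken from the paper.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=8, hop=4):
    """Return a magnitude spectrogram as a list of frames (hypothetical helper).

    Each frame holds frame_len // 2 + 1 magnitudes (non-negative frequencies).
    """
    # Periodic Hann window: reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT of the windowed frame; O(N^2) per frame, fine for a demo.
        spectrum = [
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    return frames


# Toy mono "mixture": a sinusoid that falls exactly on bin 2 of an 8-sample frame.
mixture = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = stft_magnitude(mixture)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

In practice a library routine with an FFT would replace the naive DFT; the sketch only shows the shape of the data (frames by frequency bins) that later layers would consume.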

artificial intelligence, machine learning, spectrogram, (18 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia (0.28)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Dong, Hao-Wen (Academia Sinica) | Hsiao, Wen-Yi (Academia Sinica) | Yang, Li-Chia (Academia Sinica) | Yang, Yi-Hsuan (Academia Sinica)

AAAI Conferences | Feb-8-2018

Generating music has a few notable differences from generating images and videos. First, music is an art of time, necessitating a temporal model. Second, music is usually composed of multiple instruments/tracks with their own temporal dynamics, but collectively they unfold over time interdependently. Lastly, musical notes are often grouped into chords, arpeggios or melodies in polyphonic music, so imposing a chronological ordering of notes is not naturally suitable. In this paper, we propose three models for symbolic multi-track music generation under the framework of generative adversarial networks (GANs). The three models, which differ in their underlying assumptions and accordingly in their network architectures, are referred to as the jamming model, the composer model and the hybrid model. We trained the proposed models on a dataset of over one hundred thousand bars of rock music and applied them to generate piano-rolls of five tracks: bass, drums, guitar, piano and strings. A few intra-track and inter-track objective metrics are also proposed to evaluate the generative results, in addition to a subjective user study. We show that our models can generate coherent music of four bars right from scratch (i.e., without human inputs). We also extend our models to human-AI cooperative music generation: given a specific track composed by a human, we can generate four additional tracks to accompany it. All code, the dataset and the rendered audio samples are available at https://salu133445.github.io/musegan/.

artificial intelligence, machine learning, music, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Genre:

Questionnaire & Opinion Survey (0.69)
Research Report (0.46)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Go With the Flow, on Jupiter and Snow. Coherence From Model-Free Video Data without Trajectories

AlMomani, Abd AlRahman | Bollt, Erik M.

arXiv.org Machine Learning | Feb-8-2018

In a data set such as the clouds of Jupiter, coherence is readily apparent to human observers, especially in the Great Red Spot but also in other great storms and persistent structures. There are now many different definitions and perspectives mathematically describing coherent structures, but we will take an image processing perspective here. We describe an image-processing approach to inferring coherent sets of a fluidic system directly from image data, without attempting to first model the underlying flow fields; the approach is related to a concept in image processing called motion tracking. In contrast to standard spectral methods for image processing, which are generally related to a symmetric affinity matrix and lead to standard spectral graph theory, we need a non-symmetric affinity, which arises naturally from the underlying arrow of time. We develop an anisotropic, directed diffusion operator corresponding to flow on a directed graph, from a directed affinity matrix developed with coherence in mind, and the corresponding spectral graph theory from the graph Laplacian. Our methodology is not offered as more accurate than other traditional methods of finding coherent sets; rather, our approach works with alternative kinds of data sets, in the absence of a vector field. Our examples include partitioning the weather and cloud structures of Jupiter, a lake-effect snow event on Earth local to Potsdam, N.Y., and the benchmark double-gyre test system.
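To make the idea of a non-symmetric affinity concrete, here is a minimal sketch of one standard construction: Chung's symmetrized Laplacian for a directed graph, built from a row-normalized affinity matrix and its stationary distribution. This is an assumption for illustration and not necessarily the operator defined in the paper; the toy matrix `W` and all function names are hypothetical.

```python
import math


def row_normalize(W):
    """Turn a non-negative affinity matrix into a row-stochastic transition matrix."""
    return [[w / sum(row) for w in row] for row in W]


def stationary(P, iters=2000):
    """Power iteration for the stationary distribution phi with phi P = phi."""
    n = len(P)
    phi = [1.0 / n] * n
    for _ in range(iters):
        phi = [sum(phi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return phi


def directed_laplacian(P, phi):
    """L = I - (Phi^{1/2} P Phi^{-1/2} + Phi^{-1/2} P^T Phi^{1/2}) / 2."""
    n = len(P)
    s = [math.sqrt(p) for p in phi]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sym = 0.5 * (s[i] * P[i][j] / s[j] + s[j] * P[j][i] / s[i])
            L[i][j] = (1.0 if i == j else 0.0) - sym
    return L


# Toy directed affinity: W[i][j] != W[j][i], mimicking a preferred flow direction.
W = [[1.0, 3.0, 0.5],
     [0.5, 1.0, 3.0],
     [2.0, 0.5, 1.0]]
P = row_normalize(W)
phi = stationary(P)
L = directed_laplacian(P, phi)
# sqrt(phi) lies in the null space of L, as for an undirected normalized Laplacian.
residual = [sum(L[i][j] * math.sqrt(phi[j]) for j in range(3)) for i in range(3)]
```

The resulting matrix is symmetric even though `W` is not, so ordinary symmetric eigen-solvers can be applied to it when partitioning the graph.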

artificial intelligence, coherent structure, machine learning, (17 more...)

arXiv.org Machine Learning

1610.01857

Country:

North America > United States > New York (0.46)
Europe > Germany > Brandenburg > Potsdam (0.24)

Genre: Research Report (0.64)

Industry:

Government > Space Agency (0.69)
Media > Film (0.68)
Government > Regional Government > North America Government > United States Government (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Police in China are scanning travelers with facial recognition glasses

Engadget | Feb-7-2018, 22:59:38 GMT

Police in China are now sporting glasses equipped with facial recognition devices and they're using them to scan train riders and plane passengers for individuals who may be trying to avoid law enforcement or are using fake IDs. So far, police have caught seven people connected to major criminal cases and 26 who were using false IDs while traveling, according to People's Daily. The Wall Street Journal reports that Beijing-based LLVision Technology Co. developed the devices. The company produces wearable video cameras as well and while it sells those to anyone, it's vetting buyers for its facial recognition devices. LLVision says that in tests, the system was able to pick out individuals from a database of 10,000 people and it could do so in 100 milliseconds.

artificial intelligence, china, facial recognition glasses, (5 more...)

Engadget

Country: Asia > China > Beijing > Beijing (0.27)

Industry:

Media (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.76)
Transportation > Passenger (0.59)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (0.97)

What Is Project Yeti? Google Working On Live Stream Gaming Service, Console Project

International Business Times | Feb-7-2018, 22:09:35 GMT

Google seems to be working on a subscription-based game streaming service called "Yeti," according to a Wednesday report by The Information. Google could also launch a gaming console under its Made by Google department, sources familiar with the matter told the news site. The subscription game service could work on Google's Chromecast and possibly with the rumored console. The project has gone through multiple iterations, including one that would have worked with the Chromecast, the report said. The Made by Google console would heighten Google's push for centering its products in consumers' homes.

artificial intelligence, google, live stream gaming service, (6 more...)

International Business Times

Industry:

Media (1.00)
Leisure & Entertainment > Games > Computer Games (0.83)

Technology: Information Technology > Artificial Intelligence (0.38)

Reddit bans the 'deepfake' AI porn it helped spawn

Engadget | Feb-7-2018, 20:48:59 GMT

Hot on the heels of Twitter, Reddit has updated its rules to expressly ban AI-generated "deepfake" porn. Where it previously had a single rule forbidding porn and suggestive material involving minors, it now has two -- and it's clear that you're not allowed to post "depictions that have been faked." Accordingly, Reddit has cracked down on some of the offending communities. It has shut down the deepfakes subreddit that got the ball rolling, as well as YouTubefakes. It hasn't closed non-deepfake subreddits like CelebFakes, however, and it's also maintaining the communities with more innocuous intentions, such as FakeApp (the program itself) and SFWdeepfakes. At the moment, this is more about addressing the specific violations that triggered the uproar than about stamping out every potential violation of the policy.

artificial intelligence, machine learning, social media, (6 more...)

Engadget

Industry:

Information Technology > Security & Privacy (1.00)
Media > News (0.91)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Top Tech Trends Manufacturers Need to Watch in 2018 | Rootstock Software

#artificialintelligence | Feb-7-2018, 18:41:51 GMT

An old episode of The Simpsons predicted how smartphones would someday need to self-correct annoying spelling mishaps on the phone's keyboard. "Lisa on Ice" – a Season 6, Episode 8 show which aired way back in 1994 – opens in a Springfield Elementary School assembly where Kearney asks fellow bully Dolph to take a memo on his Newton to "Beat up Martin." When the machine translates the message into "Eat up Martha," it is signaling how common text messaging errors can be blamed on a phone's lack of autocorrect technology. By 2013, Apple had perfected the autocorrect technology for smartphone keyboards. Nitin Ganatra, Apple's former director of engineering for iOS applications, explained, "If you heard people talking and they used the words Eat up Martha, it was basically a reference to the fact that we needed to nail the keyboard. We needed to make sure the text input works on this thing, otherwise: Here comes the Eat up Marthas."

artificial intelligence, enterprise resource planning, machine learning, (17 more...)

#artificialintelligence

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Automobiles & Trucks (0.70)
(2 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Robots (0.71)
Information Technology > Communications (0.71)
(3 more...)

I taught an AI to shave Henry Cavill's mustache

#artificialintelligence | Feb-7-2018, 18:37:04 GMT

Visit https://www.deepfakes.club to learn how you can start using these techniques with free software. The deepfakes algorithm is not just for face-swapping but can produce visual effects that would normally be quite costly to implement. This demo showcases the mustache-removal abilities of a trained neural network. Mustachegate involved actor Henry Cavill sporting a mustache during reshoots as Superman in the film Justice League. A competing studio would not allow him to shave his mustache.

machine learning, mustache, social media, (4 more...)

#artificialintelligence

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Google is reportedly working on a video game streaming service

Engadget | Feb-7-2018, 18:35:31 GMT

It sounds like Google might be working on a game streaming service. According to a report from The Information, the tech juggernaut has been floating the idea for a streaming service (like PlayStation Now or NVIDIA's GeForce Now) for around two years. The service is codenamed "Yeti" and Google is apparently even testing hardware for it as well. The Information's sources say that the service might stream to a Chromecast, and that hiring Phil Harrison last month as VP of hardware -- formerly of Microsoft and Sony's gaming divisions -- could point toward a standalone gaming console. You probably shouldn't get your hopes up yet, though.

artificial intelligence, google, video game, (3 more...)

Engadget

Industry:

Media > Television (1.00)
Media > Radio (0.85)
Media > Music (0.85)
Leisure & Entertainment > Games > Computer Games (0.52)

Technology: Information Technology > Artificial Intelligence > Games (0.40)

Meet Erica, Japan's Next Robot News Anchor

#artificialintelligence | Feb-7-2018, 16:29:32 GMT

At a mere 23 years old, Japan's latest news anchor would make her parents proud -- if she had any. Erica, a lifelike android designed to look like a 23-year-old woman, may soon become a TV news anchor in Japan, the Wall Street Journal reported. According to Hiroshi Ishiguro, director of the Intelligent Robotics Laboratory at Osaka University and Erica's creator, the android will replace a human news anchor on the airwaves as soon as April, the Daily Mail said. Erica the android may be well suited for this desk job. For starters, she can capably recite scripted writing and sit in a chair, making her about as qualified for television as most humans.

artificial intelligence, japan, robot news anchor, (7 more...)

#artificialintelligence

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.27)
Asia > Middle East > Saudi Arabia (0.19)

Industry: Media > News (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)
