Artificial intelligence system learns concepts shared across video, audio, and text