Beyond Word Embeddings: Dense Representations for Multi-Modal Data
Armona, Luis (Stanford University) | González-Brenes, José P. (Chegg Inc.) | Edezhath, Ralph (Chegg Inc.)
Methods that compute dense vector representations of text have proven very successful for knowledge representation. We study how to estimate dense representations for multi-modal data (e.g., text, continuous, and categorical features). We propose Feat2Vec, a novel model that supports supervised learning when explicit labels are available and self-supervised learning when they are not. Feat2Vec computes embeddings for data with multiple feature types and enforces that all embeddings lie in a common space. We believe we are the first to propose a method for learning self-supervised embeddings that leverages the structure of multiple feature types. Our experiments suggest that Feat2Vec outperforms previously published methods and that it may be useful for avoiding the cold-start problem.
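To make the idea of a common embedding space concrete, the following is a minimal sketch and not the authors' implementation. It assumes one hypothetical encoder per feature type (a lookup table for a categorical feature, a scaled vector for a continuous feature, and an average of token vectors for text), all producing vectors of the same length. The feature names (`genre`, `year`, title tokens), the dimension `DIM`, the random parameters, and the pairwise dot-product score are illustrative assumptions borrowed from factorization-machine-style models; the abstract does not specify any of them.

```python
import itertools
import random

DIM = 4  # shared embedding dimension (assumed for illustration)
rng = random.Random(0)


def random_vector(dim=DIM):
    """Stand-in for a learned parameter vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


# Hypothetical parameters; in a trained model these would be learned.
genre_table = {g: random_vector() for g in ("drama", "comedy")}
token_table = {t: random_vector() for t in ("space", "love", "war")}
year_weights = random_vector()


def embed_categorical(value):
    """Categorical feature: look up one vector per category."""
    return genre_table[value]


def embed_continuous(x):
    """Continuous feature: scale a single weight vector by the value."""
    return [x * w for w in year_weights]


def embed_text(tokens):
    """Text feature: average the vectors of the known tokens."""
    vecs = [token_table[t] for t in tokens if t in token_table]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def interaction_score(embeddings):
    """Sum of dot products over all pairs of feature embeddings (assumed scoring rule)."""
    return sum(dot(u, v) for u, v in itertools.combinations(embeddings, 2))


# One hypothetical record with three feature types, all mapped into the same space.
record = [
    embed_categorical("drama"),
    embed_continuous(0.5),
    embed_text(["space", "war"]),
]
score = interaction_score(record)
```

Because every encoder returns a vector of length `DIM`, embeddings of different feature types can be compared or combined directly, which is the property the abstract describes as living in a common space.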
May-15-2019
- Country:
- North America > United States
- District of Columbia > Washington (0.04)
- New York > New York County
- New York City (0.04)
- California > Santa Clara County
- Palo Alto (0.05)
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Technology: