mapillary
1 Hosting Licensing and Maintenance Plan
The dataset will be available for a minimum of five years, with no plans for removal. We will ensure ongoing maintenance to verify and maintain data accessibility. For what purpose was the dataset created? Was there a specific task in mind? Who created the dataset (e.g., which team, research group) and on behalf of which entity? Who funded the creation of the dataset?
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- Law (0.68)
- Information Technology > Security & Privacy (0.68)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > United States > Texas (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.35)
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Astruc, Guillaume, Dufour, Nicolas, Siglidis, Ioannis, Aronssohn, Constantin, Bouia, Nacim, Fu, Stephanie, Loiseau, Romain, Nguyen, Van Nguyen, Raude, Charles, Vincent, Elliot, Xu, Lintao, Zhou, Hongyu, Landrieu, Loic
Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms. Yet, the absence of standard, large-scale, open-access datasets with reliably localizable images has limited its potential. To address this issue, we introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 million geo-referenced street view images, covering 225 countries and territories. In contrast to existing benchmarks, we enforce a strict train/test separation, allowing us to evaluate the relevance of learned geographical features beyond mere memorization. To demonstrate the utility of our dataset, we conduct an extensive benchmark of various state-of-the-art image encoders, spatial representations, and training strategies. All associated code and models can be found at https://github.com/gastruc/osv5m.
- Africa > Ghana (0.04)
- South America > Brazil (0.04)
- Oceania > New Zealand (0.04)
- Law (0.92)
- Information Technology > Security & Privacy (0.67)
Domain Generalization via Balancing Training Difficulty and Model Capability
Jiang, Xueying, Huang, Jiaxing, Jin, Sheng, Lu, Shijian
Domain generalization (DG) aims to learn domain-generalizable models from one or multiple source domains that can perform well in unseen target domains. Despite its recent progress, most existing work suffers from the misalignment between the difficulty level of training samples and the capability of contemporarily trained models, leading to over-fitting or under-fitting in the trained generalization model. We design MoDify, a Momentum Difficulty framework that tackles the misalignment by balancing the seesaw between the model's capability and the samples' difficulties along the training process. MoDify consists of two novel designs that collaborate to fight against the misalignment while learning domain-generalizable models. The first is MoDify-based Data Augmentation which exploits an RGB Shuffle technique to generate difficulty-aware training samples on the fly. The second is MoDify-based Network Optimization which dynamically schedules the training samples for balanced and smooth learning with appropriate difficulty. Without bells and whistles, a simple implementation of MoDify achieves superior performance across multiple benchmarks. In addition, MoDify can complement existing methods as a plug-in, and it is generic and can work for different visual recognition tasks.
- North America > United States > New York (0.04)
- North America > United States > Montana > Roosevelt County (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Continual Learning with Evolving Class Ontologies
Lin, Zhiqiu, Pathak, Deepak, Wang, Yu-Xiong, Ramanan, Deva, Kong, Shu
Lifelong learners must recognize concept vocabularies that evolve over time. A common yet underexplored scenario is learning with class labels that continually refine/expand old classes. For example, humans learn to recognize ${\tt dog}$ before dog breeds. In practical settings, dataset $\textit{versioning}$ often introduces refinement to ontologies, such as autonomous vehicle benchmarks that refine a previous ${\tt vehicle}$ class into ${\tt school-bus}$ as autonomous operations expand to new cities. This paper formalizes a protocol for studying the problem of $\textit{Learning with Evolving Class Ontology}$ (LECO). LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$). LECO explores such questions as whether to annotate new data or relabel the old, how to leverage coarse labels, and whether to finetune the previous TP's model or train from scratch. To answer these questions, we leverage insights from related problems such as class-incremental learning. We validate them under the LECO protocol through the lens of image classification (CIFAR and iNaturalist) and semantic segmentation (Mapillary). Our experiments lead to surprising conclusions; while the current status quo is to relabel existing datasets with new ontologies (such as COCO-to-LVIS or Mapillary1.2-to-2.0), LECO demonstrates that a far better strategy is to annotate $\textit{new}$ data with the new ontology. However, this produces an aggregate dataset with inconsistent old-vs-new labels, complicating learning. To address this challenge, we adopt methods from semi-supervised and partial-label learning. Such strategies can surprisingly be made near-optimal, approaching an "oracle" that learns on the aggregate dataset exhaustively labeled with the newest ontology.
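The idea of leveraging coarse labels while training a fine-grained classifier can be illustrated with a small sketch. It shows one common partial-label formulation, in which a coarse label is treated as "one of its fine children" and the loss is the negative log of the probability mass on those children. This is an assumption chosen for illustration, not necessarily the loss used in the paper; the ontology, class names, and function names below are hypothetical.

```python
import math

# Hypothetical two-level ontology: each coarse class lists the fine classes that refine it.
COARSE_TO_FINE = {
    "dog": ["husky", "poodle"],
    "vehicle": ["school-bus", "car"],
}
# Fine-class order follows dict insertion order: husky, poodle, school-bus, car.
FINE_CLASSES = [fine for fines in COARSE_TO_FINE.values() for fine in fines]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_label_loss(fine_logits, coarse_label):
    """Negative log of the total fine-class probability under the given coarse label."""
    probs = softmax(fine_logits)
    children = set(COARSE_TO_FINE[coarse_label])
    mass = sum(p for name, p in zip(FINE_CLASSES, probs) if name in children)
    return -math.log(mass)


# A sample labeled only "dog" is penalized less when the fine classifier favors dog breeds.
loss_when_favoring_dogs = coarse_label_loss([2.0, 2.0, 0.0, 0.0], "dog")
loss_when_favoring_vehicles = coarse_label_loss([0.0, 0.0, 2.0, 2.0], "dog")
```

With this formulation, old data carrying only coarse labels can still supervise a model whose outputs are defined over the newest fine ontology, since the loss does not require knowing which fine child is correct.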
- North America > United States > Texas (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Case studies of successful AI startups
With tech giants pouring billions of dollars into artificial intelligence projects, it's hard to see how startups can find their place and create successful business models that leverage AI. However, while fiercely competitive, the AI space is also constantly causing fundamental shifts in many sectors. And this creates the perfect environment for fast-thinking and fast-moving startups to carve a niche for themselves before the big players move in. Last week, technology analysis firm CB Insights published an update on the status of its list of top 100 AI startups of 2020 (in case you don't know, CB Insights publishes a list of the 100 most promising AI startups every year). Out of the hundred startups, four have made exits, with three going public and one being acquired by Facebook.
- Information Technology > Services (1.00)
- Energy (0.97)
- Banking & Finance > Insurance (0.96)
- Leisure & Entertainment (0.89)