TalkMosaic: Interactive PhotoMosaic with Multi-modal LLM Q&A Interactions
We compose an image of an animal, such as a bird or a lion, from a wide variety of car images, with the theme of environmental protection: a single composed image packs a large amount of information about cars while raising awareness of environmental challenges. We present a novel way of interacting with an artistically composed photomosaic, in which a simple "click and display" operation switches between a tile in the mosaic and the corresponding original car image, which is automatically saved to the Desktop. We build a multimodal custom GPT named TalkMosaic by incorporating car image information and related knowledge into ChatGPT. By uploading an original car image to TalkMosaic, a user can efficiently and effectively ask questions about the given car image and get answers, such as where to buy tires shown in the image that satisfy high environmental standards. We give an in-depth analysis of how to speed up multimodal LLM inference using sparse attention and quantization techniques, with the presented probabilistic FlashAttention (PrFlashAttention) and Staircase Adaptive Quantization (SAQ) methods. The implemented prototype demonstrates the feasibility and effectiveness of the presented approach.
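The photomosaic composition described above can be sketched with a standard nearest-mean-color tile matcher; the details of TalkMosaic's own composer are not given in the abstract, so this is only an illustrative baseline:

```python
import numpy as np

def build_mosaic(target, tiles, tile_size=16):
    """Compose a photomosaic: replace each tile_size x tile_size block of
    `target` with the tile whose mean RGB color is closest (Euclidean).
    `tiles` is a list of HxWx3 arrays at least tile_size on each side."""
    h, w, _ = target.shape
    h -= h % tile_size  # crop to a whole number of tiles
    w -= w % tile_size
    tile_means = np.array([t.mean(axis=(0, 1)) for t in tiles])  # (N, 3)
    mosaic = np.zeros((h, w, 3), dtype=target.dtype)
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            block_mean = target[y:y+tile_size, x:x+tile_size].mean(axis=(0, 1))
            idx = np.argmin(((tile_means - block_mean) ** 2).sum(axis=1))
            mosaic[y:y+tile_size, x:x+tile_size] = tiles[idx][:tile_size, :tile_size]
    return mosaic
```

The "click and display" interaction then only needs to remember, per block, which original car image supplied the tile.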
Discovering the Effectiveness of Pre-Training in a Large-scale Car-sharing Platform
Park, Kyung Ho, Chung, Hyunhee
Recent progress in deep learning has empowered various intelligent transportation applications, especially car-sharing platforms. While traditional car-sharing operations relied heavily on human engagement for fleet management, modern car-sharing platforms let users upload car images before and after use so that cars can be inspected without a physical visit. To automate this inspection task, prior approaches utilized deep neural networks, commonly employing pre-training, the de-facto technique for establishing an effective model from a limited number of labeled examples. Since practitioners who deal with car images presumably suffer from this lack of labeled data, a careful analysis of the effectiveness of pre-training is important; however, prior studies have shed little light on it. Motivated by this gap, our study proposes a series of analyses to unveil the effectiveness of various pre-training methods for image recognition tasks on a car-sharing platform. We set up two real-world image recognition tasks from a live car-sharing service, established them under many-shot and few-shot problem settings, and scrutinized which pre-training method achieves the most effective performance in which setting. Furthermore, we analyzed how pre-training and fine-tuning convey different knowledge to the neural networks, for a precise understanding.
License plate cover on car images - deep learning project
The goal of this project was to develop and train a deep learning model capable of detecting and covering vehicle license plates. The project was implemented in Python using the Keras framework on top of TensorFlow, along with libraries such as OpenCV, NumPy, and Pandas. Considerable work went into reducing the overall size of the code and model weights to allow deployment in serverless environments such as AWS Lambda and Google Cloud Functions. Finally, I was able to deploy it on Google Cloud Functions behind an API.
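The covering step is the easy half of the pipeline; given bounding boxes from the (Keras) detector, it reduces to masking pixel regions. A minimal sketch, assuming the detector returns `(x, y, w, h)` boxes in pixel coordinates (the detector itself is not shown here):

```python
import numpy as np

def cover_plates(image, boxes):
    """Black out each detected license-plate bounding box.
    `boxes` is a list of (x, y, w, h) tuples, as a detector might return.
    Works on any HxWxC uint8 array; the input image is not modified."""
    out = image.copy()
    for x, y, w, h in boxes:
        out[y:y+h, x:x+w] = 0  # a Gaussian blur over the region is a common alternative
    return out
```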
Vehicle Classification
To build such a model, we will use the Stanford Cars Dataset, an extensive collection of car images. It consists of 16,185 images labeled with 196 classes based on each car's Make/Model/Year. An example of one of these classes is shown below. The dataset is split into training and testing images (roughly a 50-50 train-test split), with around 40 images per class in the training set and about the same in the testing set. The dataset contains no missing values, so no data removal or imputation was required. One challenge was automating the extraction of each car's Make/Model/Year, since the label strings varied in length and character type.
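One way to handle that extraction is to anchor on the trailing four-digit year and split the remainder. A minimal sketch, assuming labels follow the "Make Model ... Year" pattern used by Stanford Cars; the `known_makes` list here is hypothetical and incomplete, only to show how multi-word makes would be resolved:

```python
import re

def parse_class_name(name, known_makes=("Aston Martin", "Land Rover", "AM General")):
    """Split a Stanford Cars class label into (make, model, year).
    Assumes the label ends with a 4-digit year; multi-word makes are
    resolved against the (illustrative) known_makes list."""
    m = re.match(r"^(.*)\s+(\d{4})$", name.strip())
    if not m:
        raise ValueError(f"no trailing year in {name!r}")
    body, year = m.group(1), int(m.group(2))
    for make in known_makes:
        if body.startswith(make + " "):
            return make, body[len(make) + 1:], year
    make, _, model = body.partition(" ")  # default: first token is the make
    return make, model, year
```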
3 deep learning mysteries: Ensemble, knowledge- and self-distillation
With now-standard techniques such as over-parameterization, batch normalization, and residual connections, "modern age" neural network training, at least for image classification and many other tasks, is usually quite stable. Using standard architectures and training algorithms (typically SGD with momentum), the learned models perform consistently well, not only in training accuracy but also in test accuracy, regardless of which random initialization or random data order is used during training. For instance, if one trains the same WideResNet-28-10 architecture on CIFAR-100 ten times with different random seeds, the mean test accuracy is 81.51% while the standard deviation is only 0.16%. In a new paper, "Towards Understanding Ensemble, Knowledge Distillation, and Self-Distillation in Deep Learning," we focus on studying the discrepancy between neural networks during training that arises purely from randomization. We ask the following question: besides this small deviation in test accuracy, do neural networks trained from different random initializations actually learn very different functions?
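For readers unfamiliar with the distillation term in the paper's title, the classic knowledge-distillation objective (Hinton et al.) matches a student's temperature-softened output distribution to a teacher's. A minimal NumPy sketch for illustration, not the paper's code:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between temperature-softened teacher and student
    distributions, scaled by T^2 so gradients keep magnitude as T grows."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -(T * T) * np.mean(np.sum(p_t * np.log(p_s + 1e-12), axis=-1))
```

The loss is minimized exactly when the student's softened distribution equals the teacher's, which is what makes it a useful probe of whether differently seeded networks encode different functions.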
A Large-Scale Car Dataset for Fine-Grained Categorization and Verification
Yang, Linjie, Luo, Ping, Loy, Chen Change, Tang, Xiaoou
This paper aims to highlight vision-related tasks centered around the "car", which have been largely neglected by the vision community in comparison to other objects. We show that there are still many interesting car-related problems and applications that are not yet well explored and researched. To facilitate future car-related research, in this paper we present our ongoing effort in collecting a large-scale dataset, "CompCars", that covers not only different car views, but also different internal and external parts and rich attributes. Importantly, the dataset is constructed with a cross-modality nature, containing a surveillance-nature set and a web-nature set. We further demonstrate a few important applications exploiting the dataset, namely car model classification, car model verification, and attribute prediction. We also discuss specific challenges of car-related problems and other potential applications that are worth further investigation.
Learning Hyper-Features for Visual Identification
Ferencz, Andras D., Learned-miller, Erik G., Malik, Jitendra
We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one "training" example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch-based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few, well-chosen patches.
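The selection idea can be sketched as estimating, per candidate patch, the mutual information between the binary match label and a binned patch-similarity score, then keeping the top-scoring patches. This is only an illustrative histogram-based estimator, not the paper's implementation:

```python
import numpy as np

def mutual_information(labels, scores, bins=8):
    """Estimate I(label; binned score) from paired samples.
    labels: array of 0/1 match labels; scores: one similarity per pair."""
    binned = np.digitize(scores, np.histogram_bin_edges(scores, bins))
    joint = np.zeros((2, binned.max() + 1))
    for l, b in zip(labels, binned):
        joint[l, b] += 1
    joint /= joint.sum()                      # empirical joint distribution
    px = joint.sum(axis=1, keepdims=True)     # marginal over labels
    py = joint.sum(axis=0, keepdims=True)     # marginal over score bins
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def select_salient_patches(labels, patch_scores, k=3):
    """Rank candidate patches by estimated MI with the match label
    and return the indices of the top k."""
    mi = [mutual_information(labels, s) for s in patch_scores]
    return list(np.argsort(mi)[::-1][:k])
```

With few labeled pairs, a plug-in estimator like this overfits (many bins make any score look informative), so bin count has to be kept small relative to the sample size.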