AITopics | label data

Learning from Concealed Labels

Li, Zhongnian, Wei, Meng, Ying, Peng, Sun, Tongfeng, Xu, Xinzheng

arXiv.org Artificial Intelligence, Dec-3-2024

Annotating data for sensitive labels (e.g., disease, smoking) poses a potential threat to individual privacy in many real-world scenarios. To cope with this problem, we propose a novel setting that protects the privacy of each instance, namely learning from concealed labels for multi-class classification. Concealed labels prevent sensitive labels from appearing in the label set during the label collection stage: sensitive data are annotated with a concealed label set consisting of "none" and some randomly sampled insensitive labels. We show that an unbiased estimator can be established from concealed data under mild assumptions, and that the learned multi-class classifier can not only classify instances from insensitive labels accurately but also recognize instances from the sensitive labels. Moreover, we bound the estimation error and show that the multi-class classifier achieves the optimal parametric convergence rate. Experiments demonstrate the significance and effectiveness of the proposed method for concealed labels on synthetic and real-world datasets.

classifier, dataset, learning

doi: 10.1145/3664647.3680627

2412.0223

Country:

Oceania > Australia > Victoria > Melbourne (0.15)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > China > Jiangsu Province > Xuzhou (0.05)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Scale AI launches rapid data-labeling service

#artificialintelligence, Nov-9-2021, 00:15:10 GMT

Amid the boom of AI in application building, companies face a significant data-labeling problem, especially when it comes to labeling images or other media content they want to train deep learning algorithms on. Today, data-labeling and infrastructure provider Scale AI launched a service called Scale Rapid that aims to solve this problem by labeling a data sample within one to three hours. Users can review the work to make sure the labeling is being done correctly, iterate on their labeling instructions if necessary, and then ramp up to have Scale AI label the rest of their dataset. This is the latest in a series of products Scale AI has launched in the last year as it seeks to maintain its leadership in the labeling sphere. In April, the company raised $325 million, bringing its total raised to over $602 million.

ai launch rapid data-labeling service, instruction, scale ai

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Learning with Different Amounts of Annotation: From Zero to Many Labels

Zhang, Shujian, Gong, Chengyue, Choi, Eunsol

arXiv.org Artificial Intelligence, Sep-10-2021

Training NLP systems typically assumes access to annotated data that has a single human label per example. Given imperfect labeling from annotators and the inherent ambiguity of language, we hypothesize that a single label is not sufficient to learn the spectrum of language interpretation. We explore new annotation distribution schemes, assigning multiple labels per example for a small subset of training examples. Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on natural language inference and entity typing tasks, even when we simply first train with single-label data and then fine-tune with multi-label examples. Extending a MixUp data augmentation framework, we propose a learning algorithm that can learn from training examples with different amounts of annotation (zero, one, or multiple labels). This algorithm efficiently combines signals from uneven training data and brings additional gains in low-annotation-budget and cross-domain settings. Together, our method achieves consistent gains on two tasks, suggesting that distributing labels unevenly among training examples can be beneficial for many NLP tasks.
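
As a rough illustration of the two ingredients the abstract names, the sketch below turns a variable number of annotator labels into a target distribution and then applies standard MixUp interpolation to a pair of examples. It is not the authors' algorithm: the helper names, the uniform target for unlabeled examples, and the `alpha` value are all assumptions made for this example.

```python
import random


def to_distribution(labels, num_classes):
    """Convert a list of annotator labels (class indices) into a probability vector.

    Assumption for this sketch: an example with zero labels gets a uniform
    distribution; the paper may treat unlabeled examples differently.
    """
    if not labels:
        return [1.0 / num_classes] * num_classes
    dist = [0.0] * num_classes
    for label in labels:
        dist[label] += 1.0 / len(labels)
    return dist


def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Standard MixUp: one convex combination applied to both inputs and targets."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


# Hypothetical usage: mix a three-annotator example with a single-label one.
rng = random.Random(0)
y_multi = to_distribution([0, 0, 2], num_classes=3)
y_single = to_distribution([1], num_classes=3)
x_mix, y_mix, lam = mixup([1.0, 0.0], y_multi, [0.0, 1.0], y_single, rng=rng)
```

Because both targets are probability vectors and the mixing weight lies in [0, 1], the mixed target is again a valid distribution, which is what lets examples with different amounts of annotation share one loss.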

dataset, label data, mixup

2109.04408

Country:

South America > Peru > Cusco Department > Cusco Province > Cusco (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

6 Reasons to Spend More Time Thinking About Labels

#artificialintelligence, Jun-15-2021, 00:55:53 GMT

Quite a few of the issues should be addressed as part of established machine learning operations. Some issues may be resolved through support functions such as legal, people, general data management and smart procedure design -- more on that in a later post. For now, let's focus on the all-important labels, as opposed to the features.

approximate label, data collection, dataset

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.33)

The ultimate guide to data labeling: How to label data for ML

#artificialintelligence, May-27-2021, 18:47:25 GMT

Artificial Intelligence (AI) is driving the future, and being ready for it gives you a competitive advantage. Machine learning (ML) is a subset of AI that provides software applications with the ability to detect patterns and make accurate predictions. ML gave us self-driving cars, email spam filtering, traffic detection, and more. To train the highest-quality ML models, you need to feed their algorithms accurately labeled data. This blog post covers everything you need to know about data labeling to make informed decisions for your business.

guideline, platform, workforce

Industry: Transportation > Ground > Road (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.71)

Aggregate Learning for Mixed Frequency Data

Toda, Takamichi, Moriwaki, Daisuke, Ota, Kazuhiro

arXiv.org Machine Learning, May-20-2021

Large and acute economic shocks such as the 2007-2009 financial crisis and the current COVID-19 infections rapidly change the economic environment. In such situations, the importance of real-time economic analysis using alternative data is emerging. Alternative data such as search queries and location data are closer to real time and richer than official statistics, which are typically released once a month in aggregated form. We take advantage of the spatio-temporal granularity of alternative data and propose a Mixed-Frequency Aggregate Learning (MF-AGL) model that predicts economic indicators for smaller areas in real time. We apply the model to a real-world problem: predicting the number of job applicants, which is closely related to the unemployment rate. We find that the proposed model predicts (i) the regional heterogeneity of labor market conditions and (ii) the rapidly changing economic status. The model can be applied to various tasks, especially economic analysis.

job applicant, prediction, predictor

2105.09579

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)

Genre: Research Report (0.82)

Industry:

Banking & Finance > Economy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.56)
Health & Medicine > Therapeutic Area > Immunology (0.56)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (0.94)
Information Technology > Architecture > Real Time Systems (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

How Synthetic Data Sets Can Improve Computer Vision Models

#artificialintelligence, Aug-8-2020, 03:30:58 GMT

In recent years, deep learning models have produced substantial advances in various areas, including computer vision. Computer vision typically works by analysing images that have been captured using a physical camera sensor, followed by a human-in-the-loop process that requires annotators to label things of interest. It's important to note that the more sophisticated the annotation is, the more laborious labelling can be. But it provides for a much richer analysis of the image itself. For example, for spotting a tiny detail within an image, a simple bounding box around the object might suffice. But once you start looking to get a robot to grasp something, you might need a segmentation mask to flesh out the fine contours of the object.

artificial intelligence, deep learning, machine learning

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)

Methods of Data Labeling in Machine Learning

#artificialintelligence, Jun-13-2020, 18:28:40 GMT

Accruing a large amount of data is relatively simple. Data can be scraped, created or copied and then stored in large data stores. A key driver in developing an intelligent model, however, is not just the sheer mass of data but also an effective strategy for labeling it intelligently, which adds structure and sense to the data. Data labeling can, therefore, be described as a way to organize information depending on its content. This content determines the tag or label to be assigned to a specific piece of information after it has been processed.

artificial intelligence, information, machine learning

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

How to Label Data -- Create ML for Object Detection

#artificialintelligence, Jan-23-2020, 19:17:38 GMT

The new Create ML app, just announced at WWDC 2019, is an incredibly easy way to train your own personalized machine learning models. All that's required is dragging a folder containing your training data into the tool, and Create ML does the rest of the heavy lifting. So how do we prepare our data? When doing image or sound classification we just need to organize the data into folders, but if we want to do object detection the task becomes a bit more complicated. With object detection, we need to specify some additional information.
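
The summary does not say what that additional information looks like. As a minimal sketch, assuming the commonly documented Create ML object detection layout (a JSON array with one entry per image, each listing labeled boxes whose `x` and `y` give the box center in pixels), the snippet below builds such a file from corner-style boxes. The helper name, file name, and box values are hypothetical, and the exact schema should be checked against Apple's documentation.

```python
import json


def to_createml_entry(image_name, boxes):
    """Build one annotation entry from (label, left, top, right, bottom) pixel boxes.

    Assumption: Create ML expects center-based coordinates plus width and height.
    """
    annotations = []
    for label, left, top, right, bottom in boxes:
        width = right - left
        height = bottom - top
        annotations.append({
            "label": label,
            "coordinates": {
                "x": left + width / 2,   # horizontal center of the box
                "y": top + height / 2,   # vertical center of the box
                "width": width,
                "height": height,
            },
        })
    return {"image": image_name, "annotations": annotations}


# Hypothetical example: one image containing one labeled object.
entries = [to_createml_entry("dog_001.jpg", [("dog", 40, 60, 240, 260)])]
annotations_json = json.dumps(entries, indent=2)
```

Under the same assumption, the resulting text would be saved as a JSON file in the folder that holds the training images.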

annotation, low-resolution image, training data

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)

Amazon SageMaker Ground Truth AWS

#artificialintelligence, Sep-9-2019, 19:58:00 GMT

Amazon SageMaker Ground Truth helps you build highly accurate training datasets for machine learning quickly. SageMaker Ground Truth offers easy access to public and private human labelers and provides them with built-in workflows and interfaces for common labeling tasks. Additionally, SageMaker Ground Truth can lower your labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently. Successful machine learning models are built on the shoulders of large volumes of high-quality training data. But the process of creating the training data necessary to build these models is often expensive, complicated, and time-consuming.

amazon sagemaker ground truth aw, artificial intelligence, machine learning

Industry:

Marketing (0.85)
Retail > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
