AITopics | machine learning dataset

machine learning dataset

Berlin V2X: A Machine Learning Dataset from Multiple Vehicles and Radio Access Technologies

Hernangómez, Rodrigo, Geuer, Philipp, Palaios, Alexandros, Schäufele, Daniel, Watermann, Cara, Taleb-Bouhemadi, Khawla, Parvini, Mohammad, Krause, Anton, Partani, Sanket, Vielhaus, Christian, Kasparick, Martin, Külzer, Daniel F., Burmeister, Friedrich, Fitzek, Frank H. P., Schotten, Hans D., Fettweis, Gerhard, Stańczak, Sławomir

arXiv.org Artificial Intelligence | Apr-14-2023

The evolution of wireless communications into 6G and beyond is expected to rely on new machine learning (ML)-based capabilities. These can enable proactive decisions and actions from wireless-network components to sustain quality-of-service (QoS) and user experience. Moreover, new use cases in the area of vehicular and industrial communications will emerge. Specifically in the area of vehicle communication, vehicle-to-everything (V2X) schemes will benefit strongly from such advances. With this in mind, we have conducted a detailed measurement campaign that paves the way to a plethora of diverse ML-based studies. The resulting datasets offer GPS-located wireless measurements across diverse urban environments for both cellular (with two different operators) and sidelink radio access technologies, thus enabling a variety of different studies towards V2X. The datasets are labeled and sampled with a high time resolution. Furthermore, we make the data publicly available with all the necessary information to support the onboarding of new researchers. We provide an initial analysis of the data showing some of the challenges that ML needs to overcome and the features that ML can leverage, as well as some hints at potential research studies.

machine learning dataset, multiple vehicle, vehicle and radio access technology

arXiv.org Artificial Intelligence

doi: 10.1109/VTC2023-Spring57618.2023.10200750

arXiv: 2212.10343

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.87)

An overview of Machine Learning Datasets

#artificialintelligence | Dec-23-2022, 03:40:17 GMT

This article gives an overview of machine learning datasets: training datasets that can subsequently be enriched through data annotation and labeling for use as artificial intelligence (AI) training data. AI and machine learning (ML) make it possible to simulate human intelligence in machines, allowing them to complete a variety of tasks without much human assistance. Companies need precise training data if they are to develop newer, more efficient AI and ML models.

dataset, machine learning dataset, training data, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Papers with Code - Machine Learning Datasets

#artificialintelligence | Aug-22-2022, 13:30:55 GMT

KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) is one of the most popular datasets for use in mobile robotics and autonomous driving. It consists of hours of traffic scenarios recorded with a variety of sensor modalities, including high-resolution RGB, grayscale stereo cameras, and a 3D laser scanner. Despite its popularity, the dataset itself does not contain ground truth for semantic segmentation. However, various researchers have manually annotated parts of the dataset to fit their needs. Zhang et al. annotated 252 (140 for training and 112 for testing) acquisitions – RGB and Velodyne scans – from the tracking challenge for ten object categories: building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence. Ros et al. labeled 170 training images and 46 testing images (from the visual odome

ground truth, institute, machine learning dataset

#artificialintelligence

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.30)

Industry: Automobiles & Trucks (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.76)
Information Technology > Artificial Intelligence > Robots (0.66)

Major Problems of Machine Learning Datasets: Part 1

#artificialintelligence | Aug-7-2022, 07:55:28 GMT

Data play a key role in machine learning: the better and more relevant the data you have, the more accurate the model you will build. Getting the perfect data, however, is still a dream for many data scientists. A lot of data comes from web scraping, APIs and other external sources, and most real-world datasets will just look like an ugly stack of information, at least at first. However, the data will speak for itself if you keep it organized. In this blog, I would love to share some major problems that occur with many supervised machine learning datasets, as well as how to deal with them.

category, machine learning dataset, major problem, (6 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

GitHub - StatsGary/MLDataR: A collection of Machine Learning datasets for health care and beyond

#artificialintelligence | Jan-14-2022, 12:40:40 GMT

The package currently has three example datasets, and more are being added every week, so look out for the next version of this package. It has been fun putting this package together and I hope you find it useful. If you find any issues using the package, please raise a GitHub ticket and I will address it as soon as possible.

health care, machine learning dataset, statsgary mldatar, (1 more...)

#artificialintelligence

Industry: Health & Medicine (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.49)

Top 5 Sources For Analytics and Machine Learning Datasets

#artificialintelligence | Jan-11-2022, 15:15:44 GMT

Machine learning becomes engaging when we face various challenges, so finding suitable datasets relevant to the use case is essential. Flexibility refers to the number of tasks that a dataset supports. For example, Microsoft's COCO (Common Objects in Context) is used for object classification, detection, and segmentation. Add a bunch of captions for the same images, and we can use it as a dataset for an image caption generator as well. When we are just starting, we will be working with some of the small and standard machine learning datasets like CIFAR-10, MNIST, Iris, etc.

dataset, machine learning, website, (13 more...)

#artificialintelligence

Country:

Asia > Singapore (0.15)
Oceania > New Zealand (0.14)
North America > United States (0.05)
Asia > China (0.05)

Industry: Government > Regional Government (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Council Post: Why AI Teams Need A Unified Data Format For Machine Learning Datasets

#artificialintelligence | Nov-17-2021, 14:45:13 GMT

Davit Buniatyan is the Founding CEO at Activeloop, the company behind the fastest-growing dataset format specifically designed for AI. "If I want to tell you there is a spot on your shirt," Steve Jobs once said in an interview, "I'm not going to do it linguistically: 'There's a spot on your shirt 14 centimeters down from the collar and three centimeters to the left of your button.'" He would simply point at the spot. That was how he envisioned normal people using computers. While we realized this vision for day-to-day computer use, the same can't be said for working with data.

dataset, unified data format, unstructured data, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Computing the Similarity Between Two Machine Learning Datasets -- Visual Studio Magazine

#artificialintelligence | Sep-28-2021, 18:35:51 GMT

At first thought, computing the similarity/distance between two datasets sounds easy, but in fact the problem is extremely difficult, explains Dr. James McCaffrey of Microsoft Research. A fairly common sub-problem in many machine learning and data science scenarios is the need to compute the similarity (or difference or distance) between two datasets. For example, if you select a sample from a huge set of training data, you likely want to know how similar the sample dataset is to the source dataset. Or if you want to prime the training for a very deep neural network, you need to find an existing model that was trained using a dataset that is most similar to your new dataset. If you try to compare individual lines between datasets, you quickly run into the combinatorial explosion problem -- there are just too many comparisons.
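
One way to avoid comparing individual lines is to summarize each dataset as a frequency distribution and compare the distributions instead. The sketch below is only an illustration of that general idea for a single numeric column, not the method from the article: the function names, the equal-width binning, and the choice of Jensen-Shannon divergence as the distance are all assumptions made here.

```python
import math


def frequency_distribution(values, bins, lo, hi):
    """Fraction of values in each of `bins` equal-width bins over [lo, hi].

    Hypothetical helper: values outside the range are clamped to the edge bins.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range values
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p[i] == 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with no overlap. Symmetric in p and q."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Made-up numbers standing in for one column of a source dataset
    # and of a sample drawn from it.
    source = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
    sample = [0.15, 0.35, 0.55, 0.75, 0.95]
    p = frequency_distribution(source, bins=5, lo=0.0, hi=1.0)
    q = frequency_distribution(sample, bins=5, lo=0.0, hi=1.0)
    print("JS divergence:", js_divergence(p, q))
```

The cost grows with the number of rows plus the number of bins rather than with the number of row pairs, which is what sidesteps the combinatorial explosion. A real dataset would need one distribution per column (or a learned low-dimensional encoding) and some rule for combining the per-column distances.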

dataset, demo program, frequency distribution, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

4 Ways to Tackle the Lack of Machine Learning Datasets

#artificialintelligence | Aug-10-2021, 22:10:31 GMT

Machine learning's abilities and applications have become vital for several organizations around the world. Problems, however, can arise if there isn't enough quality data for training AI models. Such situations, in which machine learning data is difficult to obtain, can be resolved in a few clever ways. Machine learning, one of AI's prime components, is a major driver of automation and digitization in workplaces worldwide. Machine learning is the process of training or 'teaching' your AI models and neural networks to serve your organization's data processing and decision-making needs in an increasingly effective manner.

dataset, learning, machine learning, (12 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.75)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)
