AI training dataset


Vaccine misinformation can easily poison AI – but there's a fix

New Scientist

Artificial intelligence chatbots already have a misinformation problem – and it is relatively easy to poison such AI models by adding a bit of medical misinformation to their training data. Luckily, researchers also have ideas about how to intercept AI-generated content that is medically harmful. Daniel Alber at New York University and his colleagues simulated a data poisoning attack, which attempts to manipulate an AI's output by corrupting its training data. They inserted AI-generated medical misinformation into their own experimental versions of a popular AI training dataset. Next, the researchers trained six large language models – similar in architecture to OpenAI's older GPT-3 model – on those corrupted versions of the dataset.
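The summary only gestures at how the poisoning step works; below is a minimal sketch of how such an experiment could be set up, assuming the corpus is a list of text documents. The `poison_corpus` helper, the placeholder documents, and the 0.1% poisoning rate are illustrative assumptions, not details taken from the study.

```python
import random

def poison_corpus(clean_docs, misinformation_docs, poison_rate=0.001, seed=0):
    """Replace roughly `poison_rate` of the documents in a clean corpus
    with misinformation passages, then shuffle - a toy version of the
    kind of data poisoning the researchers simulated."""
    rng = random.Random(seed)
    n_poison = max(1, int(len(clean_docs) * poison_rate))
    corrupted = list(clean_docs)
    for idx in rng.sample(range(len(corrupted)), n_poison):
        corrupted[idx] = rng.choice(misinformation_docs)
    rng.shuffle(corrupted)
    return corrupted

# Hypothetical usage: both document lists are placeholders.
clean = [f"accurate medical text {i}" for i in range(10_000)]
fake = ["vaccine X causes condition Y", "drug Z is a proven miracle cure"]
training_corpus = poison_corpus(clean, fake, poison_rate=0.001)
# A language model trained on `training_corpus` would now see a small
# amount of medical misinformation mixed into its training data.
```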


AI Weekly: The challenges of creating open source AI training datasets

#artificialintelligence

Indeed, creating AI training datasets in a privacy-preserving, ethical way remains a major blocker for researchers in the AI community, particularly those who specialize in computer vision. In January 2019, IBM released a corpus of nearly a million photos of people from Flickr, designed to mitigate bias in facial recognition algorithms. But IBM notified neither the photographers nor the subjects of the photos that their work would be included. Separately, an earlier version of ImageNet, a dataset used to train AI systems around the world, was found to contain photos of naked children, porn actresses, college parties, and more, all scraped from the web without those individuals' consent. "There are real harms that have emerged from casual repurposing, open-sourcing, collecting, and scraping of biometric data," said Liz O'Sullivan, cofounder and technology director at the Surveillance Technology Oversight Project, a nonprofit organization litigating and advocating for privacy.


Find Out If Your Photo Is In This AI Training Dataset

#artificialintelligence

Facial recognition systems are everywhere, from security cameras that try to spot criminals to the way Snapchat finds your face to put bunny ears on it. Computers need a lot of data to learn how to recognize faces, and some of it comes from Flickr. IBM released a "Diversity in Faces" dataset earlier this year, which is arguably a good thing in one respect: many early face-recognition algorithms were trained on thin, white celebrities, because it's easy to find a lot of photos of celebrities. Your data source affects what your algorithm is able to do and understand, so there are a lot of racist, sexist algorithms out there. This dataset aims to help by providing images of faces alongside data about each face, such as skin color. But most folks who uploaded their personal snapshots to Flickr probably didn't realize that, years down the road, their faces and their friends' and families' faces could be used to train the next big mega-algorithm.
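The article's premise is that you can check whether your own photos ended up in such a dataset. As a hedged sketch only: if the dataset were distributed as a plain-text file of Flickr photo URLs (the real release format may differ, and the file name and photo IDs below are placeholders), a membership check could look like this.

```python
def photos_in_dataset(dataset_urls_path, my_photo_ids):
    """Return the dataset URLs that contain any of the given Flickr
    photo IDs - a naive substring match, assuming one URL per line."""
    hits = []
    with open(dataset_urls_path) as f:
        for line in f:
            url = line.strip()
            if any(photo_id in url for photo_id in my_photo_ids):
                hits.append(url)
    return hits

# Hypothetical usage with placeholder file name and photo IDs.
matches = photos_in_dataset("dataset_urls.txt", {"1234567890", "9876543210"})
print(f"{len(matches)} of your photos appear in the dataset")
```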


The Future of Ethics might be hanging on that #AI training dataset

#artificialintelligence

With algorithms playing an increasingly important role in business transactions - from online retail to innovative brick-and-mortar; from structuring dispersed, and often non-standardized, electronic health records to diagnosing patients and connecting them with the right specialist; from autonomous vehicles deciding between saving the life of a passenger on board or a pedestrian at the roadside - many are warming to the idea of an AI regulatory framework, which will never happen soon enough. But as that framework is far from ready, companies should embrace an AI based not only on possibilities - what we can do - but also on ethical implications - what we should not pursue. The importance is underscored by two examples that made it to mainstream media: Amazon scrapping its HR-related AI project because it showed recruiting bias, and Equivant / Northpointe having to kill its machine-learning parole-recommendation tool because of wrong - biased - recommendations about prisoners. The risks should not be underestimated. In an August 2018 article in the MIT Sloan Management Review, Davenport and Foutty identify seven attributes of AI-driven leaders or, as I prefer to call them, leaders in the era of AI.