model builder
A Data-Based Architecture for Flight Test without Test Points
Harp, D. Isaiah, Ott, Joshua, Alora, John, Asmar, Dylan
The justification for the "test point" derives from the test pilot's obligation to faithfully reproduce the pre-specified conditions of some model prediction. Pilot deviation from those conditions invalidates the model assumptions. Flight test aids have been proposed to increase accuracy on more challenging test points. However, the very existence of databands and tolerances is a problem more fundamental than inadequate pilot skill. We propose a novel approach that eliminates test points. We start with a high-fidelity digital model of an air vehicle. Instead of using this model to generate a point prediction, we use a machine learning method to produce a reduced-order model (ROM). The ROM has two important properties. First, it can generate a prediction based on any set of conditions the pilot flies. Second, if the test result at those conditions differs from the prediction, the ROM can be updated using the new data. The outcome of flight test is thus a refined ROM at whatever conditions were flown. This ROM in turn updates and validates the high-fidelity model. We present a single example of this "point-less" architecture, using T-38C flight test data. We first use a generic aircraft model to build a ROM of longitudinal pitching motion as a hypersurface. We then ingest unconstrained flight test data and use Gaussian Process Regression to update and condition the hypersurface. Proposing a second-order equivalent system for the T-38C, we then use this hypersurface to generate the parameters necessary to assess MIL-STD-1797B compliance for longitudinal dynamics.
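The update step described above, conditioning a prior surface on whatever data was flown, can be sketched with Gaussian Process Regression. This is a minimal illustration of the general technique, assuming scikit-learn; the prior model, flight conditions, and measurements are entirely synthetic, not the authors' T-38C implementation:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical prior: a generic model's prediction of a longitudinal
# response parameter as a function of (airspeed, altitude). Synthetic.
def generic_model(X):
    speed, alt = X[:, 0], X[:, 1]
    return 0.01 * speed - 0.0001 * alt

# Flight data gathered at whatever conditions the pilot actually flew
X_flown = np.array([[250.0, 10000.0], [310.0, 15000.0], [280.0, 20000.0]])
y_measured = np.array([2.6, 3.3, 2.7])  # synthetic measurements

# Model the residual between measurement and prior with a GP, so the
# updated surface reverts to the prior away from flown conditions.
kernel = RBF(length_scale=[50.0, 5000.0]) + WhiteKernel(noise_level=1e-4)
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None, normalize_y=True)
gp.fit(X_flown, y_measured - generic_model(X_flown))

# Updated hypersurface = prior + GP correction, with uncertainty
X_query = np.array([[270.0, 12000.0]])
corr, sigma = gp.predict(X_query, return_std=True)
updated = generic_model(X_query) + corr
```

New sorties simply extend `X_flown` and `y_measured`; refitting then conditions the hypersurface on all data flown so far, which is what makes pre-specified test points unnecessary in this scheme.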
FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation
Gu, Yuechun, He, Jiajie, Chen, Keke
Training data privacy has been a top concern in AI modeling. While methods like differentially private learning allow data contributors to quantify acceptable privacy loss, model utility is often significantly damaged. In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments. In controlled data access, authorized model builders work in a restricted environment to access sensitive data, which can fully preserve data utility with reduced risk of data leakage. However, unlike differential privacy, there is no quantitative measure for individual data contributors to assess their privacy risk before participating in a machine learning task. We developed the demo prototype FT-PrivacyScore to show that it is possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task. The demo source code will be available at \url{https://github.com/RhincodonE/demo_privacy_scoring}.
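The abstract does not specify the scoring algorithm, but the general notion of a per-record privacy risk estimate can be sketched with a generic leave-one-out confidence gap. The data, model choice, and `privacy_score` helper below are all hypothetical illustrations, not the FT-PrivacyScore method itself:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data; the scoring below is a generic leave-one-out confidence
# gap, NOT the (unspecified) FT-PrivacyScore algorithm.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf_full = LogisticRegression(max_iter=1000).fit(X, y)

def privacy_score(i):
    """Higher score = the model's output depends more on record i."""
    mask = np.arange(len(X)) != i
    clf_loo = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    p_in = clf_full.predict_proba(X[i:i + 1])[0, y[i]]
    p_out = clf_loo.predict_proba(X[i:i + 1])[0, y[i]]
    return p_in - p_out

scores = [privacy_score(i) for i in range(5)]  # score candidate records
```

A contributor could inspect such a score before agreeing to participate; records whose inclusion sharply shifts the model's confidence on themselves carry more membership-inference risk.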
Regulation Games for Trustworthy Machine Learning
Yaghini, Mohammad, Liu, Patty, Boenisch, Franziska, Papernot, Nicolas
Existing work on trustworthy machine learning (ML) often concentrates on individual aspects of trust, such as fairness or privacy. Additionally, many techniques overlook the distinction between those who train ML models and those responsible for assessing their trustworthiness. To address these issues, we propose a framework that views trustworthy ML as a multi-objective multi-agent optimization problem. This naturally lends itself to a game-theoretic formulation we call regulation games. We illustrate a particular game instance, the SpecGame, in which we model the relationship between an ML model builder and fairness and privacy regulators. Regulators wish to design penalties that enforce compliance with their specifications but do not want to discourage builders from participation. Seeking such socially optimal (i.e., efficient for all agents) solutions to the game, we introduce ParetoPlay. This novel equilibrium search algorithm ensures that agents remain on the Pareto frontier of their objectives and avoids the inefficiencies of other equilibria. Simulating SpecGame through ParetoPlay can provide policy guidance for ML regulation. For instance, we show that for a gender classification application, regulators can enforce a differential privacy budget that is on average 4.0 lower if they take the initiative to specify their desired guarantee first.
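The regulator–builder tension described above can be made concrete with a toy Stackelberg-style game. The utility functions below are made up for illustration; this is not the SpecGame or ParetoPlay formulation, which the abstract does not detail:

```python
import numpy as np

# Toy regulation game: the builder trades accuracy against a privacy
# budget eps; the regulator sets a penalty for exceeding a specified
# budget eps_spec. All functional forms here are invented.

def builder_best_response(penalty, eps_spec, eps_grid):
    # Builder utility: accuracy grows with eps, minus the regulator's fine.
    accuracy = 1.0 - np.exp(-eps_grid)
    fine = penalty * np.maximum(eps_grid - eps_spec, 0.0)
    return eps_grid[np.argmax(accuracy - fine)]

eps_grid = np.linspace(0.1, 10.0, 1000)
lenient = builder_best_response(penalty=0.01, eps_spec=1.0, eps_grid=eps_grid)
strict = builder_best_response(penalty=1.0, eps_spec=1.0, eps_grid=eps_grid)
# A stiffer penalty pushes the builder's chosen budget toward eps_spec.
```

Even in this caricature, the penalty schedule shapes the equilibrium: a lenient fine leaves the builder far above the specified budget, while a stiff one pulls the best response down to the specification, mirroring the incentive design the abstract studies.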
Copyright Protection in Generative AI: A Technical Perspective
Ren, Jie, Xu, Han, He, Pengfei, Cui, Yingqian, Zeng, Shenglai, Zhang, Jiankun, Wen, Hongzhi, Ding, Jiayuan, Liu, Hui, Chang, Yi, Tang, Jiliang
Generative AI has witnessed rapid advancement in recent years, expanding its capabilities to create synthesized content such as text, images, audio, and code. The high fidelity and authenticity of content generated by these Deep Generative Models (DGMs) have sparked significant copyright concerns. There have been various legal debates on how to effectively safeguard copyrights in DGMs. This work delves into this issue by providing a comprehensive overview of copyright protection from a technical perspective. We examine it from two distinct viewpoints: the copyrights pertaining to the source data held by the data owners and those of the generative models maintained by the model builders. For data copyright, we delve into methods by which data owners can protect their content and by which DGMs can be utilized without infringing upon these rights. For model copyright, our discussion extends to strategies for preventing model theft and identifying outputs generated by specific models. Finally, we highlight the limitations of existing techniques and identify areas that remain unexplored. Furthermore, we discuss prospective directions for the future of copyright protection, underscoring its importance for the sustainable and ethical development of Generative AI.
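For the "identifying outputs generated by specific models" direction, one well-known family of techniques is statistical text watermarking, e.g., greenlist schemes in the style of Kirchenbauer et al. A minimal detection-side sketch, with a hypothetical hash-based greenlist (the survey covers many such techniques; this reproduces none of them exactly):

```python
import hashlib

def is_green(prev_token: str, token: str) -> bool:
    # Deterministically assign ~half the vocabulary to a "greenlist"
    # seeded by the previous token (hypothetical partition rule).
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return h[0] % 2 == 0

def green_fraction(tokens):
    # Fraction of tokens that fall in the greenlist of their predecessor.
    pairs = list(zip(tokens, tokens[1:]))
    greens = [is_green(p, t) for p, t in pairs]
    return sum(greens) / len(greens)
```

A watermarking generator biases sampling toward green tokens, so its text shows a green fraction well above the roughly 0.5 expected by chance; a one-sided test on this fraction then attributes the text to the model.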
Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions
Chung, John Joon Young, Kamar, Ece, Amershi, Saleema
Large language models (LLMs) can be used to generate text data for training and evaluating other models. However, creating high-quality datasets with LLMs can be challenging. In this work, we explore human-AI partnerships to facilitate high diversity and accuracy in LLM-based text data generation. We first examine two approaches to diversify text generation: 1) logit suppression, which penalizes tokens that have already been generated frequently, and 2) temperature sampling, which flattens the token sampling probability distribution. We found that diversification approaches can increase data diversity, but often at the cost of data accuracy (i.e., text and labels being appropriate for the target domain). To address this issue, we examined two human interventions: 1) label replacement (LR), correcting misaligned labels, and 2) out-of-scope filtering (OOSF), removing instances that are out of the user's domain of interest or to which no considered label applies. With oracle studies, we found that LR increases the absolute accuracy of models trained with diversified datasets by 14.4%. Moreover, we found that some models trained with data generated with LR interventions outperformed LLM-based few-shot classification. In contrast, OOSF was not effective in increasing model accuracy, implying the need for future work in human-in-the-loop text data generation.
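The two diversification knobs can be sketched as follows. The logits are synthetic and the `sample_token` helper is hypothetical, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_token(logits, counts, temperature=1.0, suppression=0.0):
    # Logit suppression: penalize tokens in proportion to how often
    # they have already been generated.
    adjusted = logits - suppression * counts
    # Temperature sampling: T > 1 flattens the distribution.
    probs = np.exp(adjusted / temperature)
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

logits = np.array([3.0, 1.0, 0.5, 0.1])  # synthetic, strongly peaked
counts = np.zeros(4)
tokens = []
for _ in range(50):
    t = sample_token(logits, counts, temperature=1.3, suppression=0.5)
    counts[t] += 1
    tokens.append(t)
```

Without either knob the peaked logits would keep emitting the same top token; suppression and a higher temperature spread the draws across the vocabulary, which is exactly the diversity-versus-accuracy trade-off the abstract measures.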
Model Rollbacks Through Versioning
There's general consensus in the Machine Learning community that models can and have made biased decisions against traditionally marginalized groups. Ethical AI researchers from Dr. Cathy O'Neil to Dr. Joy Buolamwini have gone to great lengths to establish a pattern of faulty decision making, rooted in biased and unrepresentative data, that results in serious harms. Unfortunately, our "intelligent" learning algorithms are only as smart, capable, and ethical as we make them, and we are only at the beginning of understanding the long-term effects of biased models. Fortunately, there are many strategies already at our disposal that we can use to mitigate harms when they arise. Today, we will focus on a very powerful strategy: Model Rollbacks through Versioning.
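A minimal sketch of the rollback strategy, assuming a simple in-memory registry (the class and names are hypothetical, not taken from any specific MLOps tool):

```python
import copy

class ModelRegistry:
    """Keep every deployed model version so a harmful one can be rolled back."""

    def __init__(self):
        self._versions = []   # immutable snapshots of each registered model
        self._current = None  # index of the version currently being served

    def register(self, model):
        # Snapshot the model so later mutations can't corrupt the history.
        self._versions.append(copy.deepcopy(model))
        self._current = len(self._versions) - 1
        return self._current

    def current(self):
        return self._versions[self._current]

    def rollback(self, version):
        # Serve a previously vetted version while the harmful one is audited.
        if not 0 <= version < len(self._versions):
            raise ValueError("unknown version")
        self._current = version

registry = ModelRegistry()
v0 = registry.register({"name": "fair-v0"})
v1 = registry.register({"name": "biased-v1"})
registry.rollback(v0)  # bias detected in v1: serve v0 again
```

The key design choice is that versions are append-only snapshots: rolling back never destroys the offending model, so it stays available for auditing while users are protected.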
ML.NET Updates & Announcing Notebooks in Visual Studio
ML.NET is an open-source, cross-platform machine learning framework for .NET developers that enables integration of custom machine learning into .NET apps. Interactive notebooks are used extensively in data science and machine learning. They are great for data exploration and preparation, experimentation, model explainability, and even education. Last year, .NET Interactive Notebooks were announced, and you can currently use .NET Interactive Notebooks in VS Code as an extension. After talking to customers, the team decided to experiment with interactive notebooks in Visual Studio, which has resulted in the new Notebook Editor extension!
AWS Adds Explainability to SageMaker
Amazon Web Services is adding an automated machine learning tool to SageMaker, its machine learning model builder, that improves model accuracy via explainable AI. The new SageMaker feature, dubbed Autopilot, generates a model explainability report via SageMaker Clarify, the Amazon tool used to detect algorithmic bias while increasing the transparency of machine learning models. The reports help model developers understand how individual attributes of training data contribute to a predicted result. The combination is promoted as helping to identify and limit algorithmic bias and explain predictions, allowing users to make informed decisions based on how models arrived at conclusions, AWS said this week. The reports also include "feature importance values" that allow developers to understand, as a percentage, how much a training data attribute contributed to a predicted result.
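The idea of percentage feature-importance values can be illustrated with generic permutation importance on synthetic data. This is only a sketch of the concept; SageMaker Clarify itself computes SHAP values, which this does not reproduce:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic data: feature 0 matters most, feature 1 a little, feature 2 not.
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=500)

model = LinearRegression().fit(X, y)
base = model.score(X, y)  # R^2 with all features intact

# Permutation importance: shuffle one column and measure the score drop.
drops = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    drops.append(max(base - model.score(Xp, y), 0.0))

# Normalize the drops into the percentage-style report described above.
pct = 100.0 * np.array(drops) / sum(drops)
```

Reporting importances as percentages of a common total is what lets a developer say "this attribute accounts for most of the prediction," which is the reading the article attributes to the Clarify reports.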
ML.NET Model Builder November Updates
ML.NET is an open-source, cross-platform machine learning framework for .NET developers. It enables integrating machine learning into your .NET apps without requiring you to leave the .NET ecosystem or even have a background in ML or data science. ML.NET provides tooling (the Model Builder UI in Visual Studio and the cross-platform ML.NET CLI) that automatically trains custom machine learning models for you based on your scenario and data. This release of ML.NET Model Builder brings numerous bug fixes and enhancements as well as new features, including advanced data loading options and streaming training data from SQL. Previously, Model Builder did not offer any data loading options, relying on AutoML to detect column purpose, header, and separator, as well as decimal separator style.
Tackling Bias and Explainability in Automated Machine Learning
Automated machine learning is likely to introduce two critical problems: bias and a lack of explainability. Fortunately, vendors are introducing tools to tackle both of them. Adoption of automated machine learning -- tools that help data scientists and business analysts (and even business users) automate the construction of machine learning models -- is expected to increase over the next few years because these tools simplify model building. For example, in some of the tools, all the user needs to do is specify the outcome or target variable of interest along with the attributes believed to be predictive. The automated machine learning (autoML) platform then picks the best model.
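That workflow, the user names a target and candidate features and the tool keeps the best model, can be sketched in a few lines with scikit-learn and synthetic data. Real autoML platforms also search preprocessing steps and hyperparameters, which this toy loop omits:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for "target variable plus predictive attributes"
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Candidate model families the "platform" will try
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(random_state=0),
}

# Score each candidate by cross-validation and keep the winner
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(X, y)
```

The bias and explainability problems the article raises live precisely in this loop: the winning model is chosen on aggregate accuracy alone, with nothing here examining subgroup behavior or explaining individual predictions.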