AITopics | Jebreel, Najeeb

Collaborating Authors

Jebreel, Najeeb

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods

Blanco-Justicia, Alberto, Jebreel, Najeeb, Manzanares, Benet, Sánchez, David, Domingo-Ferrer, Josep, Collell, Guillem, Tan, Kuan Eeik

arXiv.org Artificial IntelligenceApr-2-2024

The objective of digital forgetting is, given a model with undesirable knowledge or behavior, obtain a new model where the detected issues are no longer present. The motivations for forgetting include privacy protection, copyright protection, elimination of biases and discrimination, and prevention of harmful content generation. Effective digital forgetting has to be effective (meaning how well the new model has forgotten the undesired knowledge/behavior), retain the performance of the original model on the desirable tasks, and be scalable (in particular forgetting has to be more efficient than retraining from scratch on just the tasks/data to be retained). This survey focuses on forgetting in large language models (LLMs). We first provide background on LLMs, including their components, the types of LLMs, and their usual training pipeline. Second, we describe the motivations, types, and desired properties of digital forgetting. Third, we introduce the approaches to digital forgetting in LLMs, among which unlearning methodologies stand out as the state of the art. Fourth, we provide a detailed taxonomy of machine unlearning methods for LLMs, and we survey and compare current approaches. Fifth, we detail datasets, models and metrics used for the evaluation of forgetting, retaining and runtime. Sixth, we discuss challenges in the area. Finally, we provide some concluding remarks.

information, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2404.02062

Country:

Europe (1.00)
Asia (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.27)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An Examination of the Alleged Privacy Threats of Confidence-Ranked Reconstruction of Census Microdata

Sánchez, David, Jebreel, Najeeb, Domingo-Ferrer, Josep, Muralidhar, Krishnamurty, Blanco-Justicia, Alberto

arXiv.org Artificial IntelligenceNov-6-2023

The alleged threat of reconstruction attacks has led the U.S. Census Bureau (USCB) to replace in the Decennial Census 2020 the traditional statistical disclosure limitation based on rank swapping with one based on differential privacy (DP). This has resulted in substantial accuracy loss of the released statistics. Worse yet, it has been shown that the reconstruction attacks used as an argument to move to DP are very far from allowing unequivocal reidentification of the respondents, because in general there are a lot of reconstructions compatible with the released statistics. In a very recent paper, a new reconstruction attack has been proposed, whose goal is to indicate the confidence that a reconstructed record was in the original respondent data. The alleged risk of serious disclosure entailed by such confidence-ranked reconstruction has renewed the interest of the USCB to use DP-based solutions. To forestall the potential accuracy loss in future data releases resulting from adoption of these solutions, we show in this paper that the proposed confidence-ranked reconstruction does not threaten privacy. Specifically, we report empirical results showing that the proposed ranking cannot guide reidentification or attribute disclosure attacks, and hence it fails to warrant the USCB's move towards DP. Further, we also demonstrate that, due to the way the Census data are compiled, processed and released, it is not possible to reconstruct original and complete records through any methodology, and the confidence-ranked reconstruction not only is completely ineffective at accurately reconstructing Census records but is trivially outperformed by an adequate interpretation of the released aggregate statistics.

artificial intelligence, prototype, reconstruction, (17 more...)

arXiv.org Artificial Intelligence

2311.03171

Country: North America > United States > Alabama > Autauga County (0.14)

Genre: Research Report > New Finding (0.88)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence (0.93)

Add feedback