Anderson, Mark
Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints
Kamalaruban, Parameswaran, Anderson, Mark, Burrell, Stuart, Madigan, Maeve, Skalski, Piotr, Sutton, David
Pre-trained foundation models can be adapted for specific tasks using Low-Rank Adaptation (LoRA). However, the fairness properties of these adapted classifiers remain underexplored. Existing fairness-aware fine-tuning methods rely on direct access to sensitive attributes or their predictors, but in practice, these sensitive attributes are often held under strict consumer privacy controls, and neither the attributes nor their predictors are available to model developers, hampering the development of fair models. To address this issue, we introduce a set of LoRA-based fine-tuning methods that can be trained in a distributed fashion, where model developers and fairness auditors collaborate without sharing sensitive attributes or predictors. In this paper, we evaluate three such methods (sensitive unlearning, adversarial training, and orthogonality loss) against a fairness-unaware baseline, using experiments on the CelebA and UTK-Face datasets with an ImageNet pre-trained ViT-Base model. We find that orthogonality loss consistently reduces bias while maintaining or improving utility, whereas adversarial training improves False Positive Rate Parity and Demographic Parity in some cases, and sensitive unlearning provides no clear benefit. In tasks where significant biases are present, distributed fairness-aware fine-tuning methods can effectively eliminate bias without compromising consumer privacy and, in most cases, improve model utility.
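As a rough illustration of the orthogonality-loss idea (a hypothetical numpy sketch, not the authors' implementation; the subspace basis and shapes here are assumptions), one can penalise the component of the task LoRA update that falls inside a subspace the fairness auditor associates with the sensitive attribute:

```python
import numpy as np

def orthogonality_penalty(A_task, B_task, V_sensitive):
    """Hypothetical orthogonality-style regulariser for LoRA.

    The LoRA update is delta_W = B_task @ A_task, with
    A_task: (r, d_in) and B_task: (d_out, r). V_sensitive: (d_in, k)
    is an orthonormal basis for a subspace tied to the sensitive
    attribute (supplied, e.g., by an auditor without exposing raw
    attributes). The penalty is the squared Frobenius norm of the
    projection of delta_W onto that subspace; it is zero exactly
    when the task update is orthogonal to the sensitive subspace.
    """
    delta_W = B_task @ A_task      # (d_out, d_in) low-rank update
    proj = delta_W @ V_sensitive   # component inside the subspace
    return float(np.sum(proj ** 2))
```

In training, a term like this would be added to the task loss so that gradients push the adapter away from the sensitive subspace.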
Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis
Papadopoulou, Anthi, Lison, Pierre, Anderson, Mark, Øvrelid, Lilja, Pilán, Ildikó
Text sanitization is the task of redacting a document to mask all occurrences of (direct or indirect) personal identifiers, with the goal of concealing the identity of the individual(s) referred to in it. In this paper, we consider a two-step approach to text sanitization and provide a detailed analysis of its empirical performance on two recently published datasets: the Text Anonymization Benchmark (Pilán et al., 2022) and a collection of Wikipedia biographies (Papadopoulou et al., 2022). The text sanitization process starts with a privacy-oriented entity recognizer that seeks to determine the text spans expressing identifiable personal information. This privacy-oriented entity recognizer is trained by combining a standard named entity recognition model with a gazetteer populated by person-related terms extracted from Wikidata. The second step of the text sanitization process consists of assessing the privacy risk associated with each detected text span, either in isolation or in combination with other text spans. We present five distinct indicators of the re-identification risk, respectively based on language model probabilities, text span classification, sequence labelling, perturbations, and web search. We provide a contrastive analysis of each privacy indicator and highlight their benefits and limitations, notably in relation to the available labeled data.
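The second (risk-assessment) step can be caricatured as follows. This is a hypothetical sketch, assuming each detected span arrives with per-indicator risk scores in [0, 1] that are simply averaged; the function name and thresholding rule are illustrative, far cruder than the five indicators analysed in the paper:

```python
def mask_high_risk_spans(text, spans, indicator_scores, threshold=0.5):
    """Mask each detected span whose averaged re-identification
    risk exceeds a threshold.

    spans: list of (start, end) character offsets produced by the
    privacy-oriented entity recognizer.
    indicator_scores: one list of floats per span, each float a
    risk score in [0, 1] from one indicator (LM probability,
    web search, ...).
    """
    out, last = [], 0
    for (start, end), scores in sorted(zip(spans, indicator_scores)):
        risk = sum(scores) / len(scores)   # naive aggregation
        out.append(text[last:start])
        out.append("***" if risk > threshold else text[start:end])
        last = end
    out.append(text[last:])
    return "".join(out)
```

A real system would combine the indicators more carefully (and consider spans jointly), which is precisely what the contrastive analysis in the paper examines.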
Parsing linearizations appreciate PoS tags - but some are fussy about errors
Muñoz-Ortiz, Alberto, Anderson, Mark, Vilares, David, Gómez-Rodríguez, Carlos
PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning. Recent work on the impact of PoS tags on graph- and transition-based parsers suggests that they are only useful when tagging accuracy is prohibitively high, or in low-resource scenarios. However, such an analysis is lacking for the emerging sequence labeling parsing paradigm, where it is especially relevant as some models explicitly use PoS tags for encoding and decoding. We undertake a study and uncover some trends. Among them, PoS tags are generally more useful for sequence labeling parsers than for other paradigms, but the impact of their accuracy is highly encoding-dependent, with the PoS-based head-selection encoding being best only when both tagging accuracy and resource availability are high.
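For concreteness, here is a minimal sketch of a PoS-based head-selection encoding (an illustrative reimplementation under assumed conventions, not the exact scheme evaluated): each word's label (o, p) says its head is the o-th nearest word carrying PoS tag p, which is why label quality depends so directly on tagging accuracy:

```python
def encode_pos_head_selection(heads, pos):
    """Encode a dependency tree as per-word labels (o, p): the head
    is the o-th nearest word with PoS tag p, with o > 0 counting to
    the right and o < 0 to the left. Position 0 is an artificial
    ROOT token with tag "ROOT".

    heads: 1-based head index per word (0 = ROOT).
    pos: PoS tag per word.
    """
    tags = ["ROOT"] + list(pos)
    labels = []
    for i, h in enumerate(heads, start=1):
        p = tags[h]
        if h > i:   # head to the right: count matching tags in (i, h]
            o = sum(1 for j in range(i + 1, h + 1) if tags[j] == p)
        else:       # head to the left (or ROOT): count in [h, i)
            o = -sum(1 for j in range(h, i) if tags[j] == p)
        labels.append((o, p))
    return labels
```

Decoding inverts this per word; a single PoS-tagging error can therefore redirect an arc to a different word of the (wrongly) matching tag, consistent with the encoding-dependent sensitivity reported above.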
Learning from learning machines: a new generation of AI technology to meet the needs of science
Pion-Tonachini, Luca, Bouchard, Kristofer, Martin, Hector Garcia, Peisert, Sean, Holtz, W. Bradley, Aswani, Anil, Dwivedi, Dipankar, Wainwright, Haruko, Pilania, Ghanshyam, Nachman, Benjamin, Marrone, Babetta L., Falco, Nicola, Prabhat, Arnold, Daniel, Wolf-Yadlin, Alejandro, Powers, Sarah, Climer, Sharlee, Jackson, Quinn, Carlson, Ty, Sohn, Michael, Zwart, Petrus, Kumar, Neeraj, Justice, Amy, Tomlin, Claire, Jacobson, Daniel, Micklem, Gos, Gkoutos, Georgios V., Bickel, Peter J., Cazier, Jean-Baptiste, Müller, Juliane, Webb-Robertson, Bobbie-Jo, Stevens, Rick, Anderson, Mark, Kreutz-Delgado, Ken, Mahoney, Michael W., Brown, James B.
We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data versus discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.
A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021
Anderson, Mark, Rodríguez, Carlos Gómez
We evaluate three leading dependency parser systems from different paradigms on a small yet diverse subset of languages in terms of their accuracy-efficiency Pareto front. As we are interested in efficiency, we evaluate core parsers without pretrained language models (as these are typically huge networks and would constitute most of the compute time) or other augmentations that can be transversally applied to any of them. Biaffine parsing emerges as a well-balanced default choice, with sequence-labelling parsing being preferable if inference speed (but not training energy cost) is the priority.
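The Pareto-front comparison itself is simple to state. The sketch below uses made-up accuracy/speed numbers purely for illustration (they are not results from the paper):

```python
def pareto_front(systems):
    """Return the accuracy-efficiency Pareto front: systems for
    which no other system is at least as good on BOTH axes and
    strictly better on at least one.

    systems: dict name -> (accuracy, sentences_per_second);
    higher is better on both axes.
    """
    front = []
    for name, (acc, speed) in systems.items():
        dominated = any(
            (a >= acc and s >= speed) and (a > acc or s > speed)
            for other, (a, s) in systems.items() if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)

# Toy numbers, illustrative only:
parsers = {
    "biaffine":      (93.0, 120.0),
    "seq-labelling": (91.5, 350.0),
    "transition":    (91.0, 150.0),
}
print(pareto_front(parsers))  # -> ['biaffine', 'seq-labelling']
```

Here "transition" is dominated by "seq-labelling" (less accurate and slower), so only the other two sit on the front, mirroring the accuracy-versus-speed trade-off described above.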
Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance
Anderson, Mark, Søgaard, Anders, Rodríguez, Carlos Gómez
Søgaard (2020) obtained results suggesting the fraction of trees occurring in the test data isomorphic to trees in the training set accounts for a non-trivial variation in parser performance. Similar to other statistical analyses in NLP, the results were based on evaluating linear regressions. However, the study had methodological issues and was undertaken using a small sample size leading to unreliable results. We present a replication study in which we also bin sentences by length and find that only a small subset of sentences vary in performance with respect to graph isomorphism. Further, the correlation observed between parser performance and graph isomorphism in the wild disappears when controlling for covariants. However, in a controlled experiment, where covariants are kept fixed, we do observe a strong correlation. We suggest that conclusions drawn from statistical analyses like this need to be tempered and that controlled experiments can complement them by more readily teasing factors apart.
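The quantity at issue (the fraction of test trees isomorphic to some training tree) can be sketched with AHU canonical forms over unlabelled rooted trees. The head-array input format and helper names here are assumptions for illustration:

```python
def canonical_form(children, node):
    """AHU canonical form of an unlabelled rooted tree: two rooted
    trees are isomorphic iff their canonical strings are equal."""
    return "(" + "".join(sorted(canonical_form(children, c)
                                for c in children[node])) + ")"

def tree_signature(heads):
    """Canonical string for a dependency tree given as a 1-based
    head array (0 = artificial root)."""
    children = {i: [] for i in range(len(heads) + 1)}
    for i, h in enumerate(heads, start=1):
        children[h].append(i)
    return canonical_form(children, 0)

def isomorphic_fraction(train_trees, test_trees):
    """Fraction of test trees whose unlabelled structure also
    occurs in the training set."""
    seen = {tree_signature(t) for t in train_trees}
    hits = sum(1 for t in test_trees if tree_signature(t) in seen)
    return hits / len(test_trees)
```

Binning sentences by length before computing this fraction, as done in the replication, matters because tree isomorphism is strongly confounded with sentence length.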
What Taggers Fail to Learn, Parsers Need the Most
Anderson, Mark, Gómez-Rodríguez, Carlos
We present an error analysis of neural UPOS taggers to evaluate why using gold standard tags has such a large positive contribution to parsing performance while using predicted UPOS tags either harms performance or offers a negligible improvement. We evaluate what neural dependency parsers implicitly learn about word types and how this relates to the errors taggers make to explain the minimal impact using predicted tags has on parsers. We also present a short analysis on what contexts result in reductions in tagging performance.