AITopics | Canterbury

Canterbury

Fuck the Algorithm: Conceptual Issues in Algorithmic Bias

arXiv.org Artificial Intelligence | May-21-2025

Algorithmic bias has been the subject of much recent controversy. To clarify what is at stake and to make progress resolving the controversy, a better understanding of the concepts involved would be helpful. The discussion here focuses on the disputed claim that algorithms themselves cannot be biased. To clarify this claim we need to know what kind of thing 'algorithms themselves' are, and to disambiguate the several meanings of 'bias' at play. This further involves showing how bias of moral import can result from statistical biases, and drawing connections to previous conceptual work about political artifacts and oppressive things. Data bias has been identified in domains like hiring, policing and medicine. Examples where algorithms themselves have been pinpointed as the locus of bias include recommender systems that influence media consumption, academic search engines that influence citation patterns, and the 2020 UK algorithmically-moderated A-level grades. Recognition that algorithms are a kind of thing that can be biased is key to making decisions about responsibility for harm, and preventing algorithmically mediated discrimination.

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.13509

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > Ontario > Kingston (0.04)
Oceania > New Zealand > South Island > Canterbury > Christchurch (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Education > Educational Setting (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management > Search (0.88)

Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs

Saeed, Muhammed, Mohamed, Elgizouli, Mohamed, Mukhtar, Raza, Shaina, Abdul-Mageed, Muhammad, Shehata, Shady

arXiv.org Artificial Intelligence | Nov-26-2024

Large language models (LLMs) are widely used but raise ethical concerns due to embedded social biases. This study examines LLM biases against Arabs versus Westerners across eight domains, including women's rights, terrorism, and anti-Semitism, and assesses model resistance to perpetuating these biases. To this end, we create two datasets: one to evaluate LLM bias toward Arabs versus Westerners and another to test model safety against prompts that exaggerate negative traits ("jailbreaks"). We evaluate six LLMs: GPT-4, GPT-4o, LlaMA 3.1 (8B & 405B), Mistral 7B, and Claude 3.5 Sonnet. We find 79% of cases displaying negative biases toward Arabs, with LlaMA 3.1-405B being the most biased. Our jailbreak tests reveal GPT-4o as the most vulnerable, despite being an optimized version, followed by LlaMA 3.1-8B and Mistral 7B. All LLMs except Claude exhibit attack success rates above 87% in three categories. We also find Claude 3.5 Sonnet the safest, but it still displays biases in seven of eight categories. Despite being an optimized version of GPT-4, we find GPT-4o to be more prone to biases and jailbreaks, suggesting optimization flaws. Our findings underscore the pressing need for more robust bias mitigation strategies and strengthened security measures in LLMs.

category, dataset, loser group, (15 more...)

arXiv.org Artificial Intelligence

2410.24049

Country:

Asia > Middle East > Oman (0.27)
Asia > Middle East > Qatar (0.14)
Asia > Middle East > Kuwait (0.14)
(34 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Terrorism (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Australia's spy chief warns AI will accelerate online radicalisation

The Guardian | Oct-11-2024, 10:35:44 GMT

The head of Australia's peak intelligence agency has warned that people like the Christchurch terrorist are being radicalised on social media, and artificial intelligence is likely to make it much worse. The director general of the Australian Security Intelligence Organisation (Asio), Mike Burgess, told a social media summit in Adelaide on Friday that social media is "both a goldmine and a cesspit" that creates communities and divides them, and the internet was "the world's most potent incubator of extremism". He said people were embracing anti-authority ideologies, conspiracy theories and diverse grievances, and while social media was not the sole driver, he said Asio considered it a "significant driver". "Social media allows extremist ideologies, conspiracies, dis- and misinformation to be shared at an unprecedented scale and speed," he said. He said radicalisation can now take days and weeks rather than months and years as it previously did, with the most likely perpetrator of a terrorist attack being a lone actor.

accelerate online radicalisation, australia, social media, (12 more...)

The Guardian

Country:

Oceania > Australia (0.95)
Oceania > New Zealand > South Island > Canterbury > Christchurch (0.05)
Europe > France (0.05)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.94)
Law Enforcement & Public Safety > Terrorism (0.79)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement

Sheth, Paras, Kumarage, Tharindu, Moraffah, Raha, Chadha, Aman, Liu, Huan

arXiv.org Artificial Intelligence | Apr-16-2024

Content moderation faces a challenging task as social media's ability to spread hate speech contrasts with its role in promoting global connectivity. With rapidly evolving slang and hate speech, the adaptability of conventional deep learning to the fluid landscape of online dialogue remains limited. In response, causality-inspired disentanglement has shown promise by segregating platform-specific peculiarities from universal hate indicators. However, its dependency on available ground-truth target labels for discerning these nuances faces practical hurdles with the incessant evolution of platforms and the mutable nature of hate speech. Using confidence-based reweighting and contrastive regularization, this study presents HATE WATCH, a novel framework of weakly supervised causal disentanglement that circumvents the need for explicit target labeling and effectively disentangles input features into invariant representations of hate. Empirical validation across platforms (two with target labels and two without) positions HATE WATCH as a novel method in cross-platform hate speech detection with superior performance. HATE WATCH advances scalable content moderation techniques towards developing safer online communities.

hate-watch, representation, speech detection, (12 more...)

arXiv.org Artificial Intelligence

2404.11036

Country:

Oceania > New Zealand > South Island > Canterbury > Christchurch (0.04)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Media > News (0.48)
Health & Medicine (0.47)
Law > Civil Rights & Constitutional Law (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Evaluating Human-Language Model Interaction

Lee, Mina, Srivastava, Megha, Hardy, Amelia, Thickstun, John, Durmus, Esin, Paranjape, Ashwin, Gerard-Ursin, Ines, Li, Xiang Lisa, Ladhak, Faisal, Rong, Frieda, Wang, Rose E., Kwon, Minae, Park, Joon Sung, Cao, Hancheng, Lee, Tony, Bommasani, Rishi, Bernstein, Michael, Liang, Percy

arXiv.org Artificial Intelligence | Jan-5-2024

Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation.

arxiv preprint arxiv, computational linguistic, metaphorical sentence, (14 more...)

arXiv.org Artificial Intelligence

2212.09746

Country:

Europe > United Kingdom > England > West Yorkshire (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Law Enforcement & Public Safety (1.00)
Education (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Artificial Intelligence is no match for the human heart - The Big Issue

#artificialintelligence | Jan-23-2023, 11:00:24 GMT

Nick Cave said something interesting last week. On this occasion, he was reacting to a question from a fan. Cave does this a lot on the Red Hand Files, his online repository where he answers any number and range of enquiries from devotees. This one was about artificial intelligence. There is an open access AI bot, ChatGPT, that some people have been playing with to see if it can create as well as a human. Mark, from Christchurch in New Zealand, fired in a load of Cave's lyrics, got a resulting set of lyrics and sent them to Cave asking for his reaction.

artificial intelligence, nick cave

#artificialintelligence

Country: Oceania > New Zealand > South Island > Canterbury > Christchurch (0.27)

Technology: Information Technology > Artificial Intelligence (1.00)

TINYCD: A (Not So) Deep Learning Model For Change Detection

Codegoni, Andrea, Lombardi, Gabriele, Ferrari, Alessandro

arXiv.org Artificial Intelligence | Nov-7-2022

In this paper, we present a lightweight and effective change detection model, called TinyCD. This model has been designed to be faster and smaller than current state-of-the-art change detection models due to industrial needs. Despite being from 13 to 140 times smaller than the compared change detection models, and exposing at least a third of the computational complexity, our model outperforms the current state-of-the-art models by at least 1% on both F1 score and IoU on the LEVIR-CD dataset, and more than 8% on the WHU-CD dataset. To reach these results, TinyCD uses a Siamese U-Net architecture exploiting low-level features in a globally temporal and locally spatial way. In addition, it adopts a new strategy to mix features in the space-time domain both to merge the embeddings obtained from the Siamese backbones, and, coupled with an MLP block, it forms a novel space-semantic attention mechanism, the Mix and Attention Mask Block (MAMB). Source code, models and results are available here: https://github.com/AndreaCodegoni/Tiny_model_4_CD
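
For readers unfamiliar with the two metrics the abstract reports, the following is a minimal sketch of how F1 score and IoU are commonly computed for a pair of binary change masks. It is not taken from the TinyCD code; the function name `f1_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def f1_and_iou(pred, target):
    """Return (F1, IoU) for two binary masks of the same shape.

    Hypothetical helper for illustration; not part of the TinyCD repository.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.sum(pred & target)    # changed pixels correctly detected
    fp = np.sum(pred & ~target)   # pixels wrongly flagged as changed
    fn = np.sum(~pred & target)   # changed pixels that were missed

    denom_f1 = 2 * tp + fp + fn
    denom_iou = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / denom_f1 if denom_f1 else 1.0
    iou = tp / denom_iou if denom_iou else 1.0
    return float(f1), float(iou)
```

The two quantities are tied together by F1 = 2 * IoU / (1 + IoU), so a gain in one implies a gain in the other, although the size of the gain differs.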

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2207.13159

Country:

Oceania > New Zealand > South Island > Canterbury > Christchurch (0.14)
South America > Ecuador (0.04)
South America > Colombia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Inferring Offensiveness In Images From Natural Language Supervision

Schramowski, Patrick, Kersting, Kristian

arXiv.org Artificial Intelligence | Oct-8-2021

Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data. Unfortunately, these approaches also entail severe risks. In particular, large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images, and may also underrepresent specific classes. Consequently, there is an urgent need to carefully document datasets and curate their content. Unfortunately, this process is tedious and error-prone. We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets. Based on human-annotated examples and the implicit knowledge of a CLIP based model, we demonstrate that one can select relevant prompts for rating the offensiveness of an image. Deep learning models yielded many improvements in several fields. Particularly, transfer learning from models pre-trained on large-scale supervised data has become common practice in many tasks both with and without sufficient data to train deep learning models. While approaches like semisupervised sequence learning (Dai & Le, 2015) and datasets such as ImageNet (Deng et al., 2009), especially the ImageNet-ILSVRC-2012 dataset with 1.2 million images, established pre-training approaches, in the following years, the training data size increased rapidly to billions of training examples (Brown et al., 2020; Jia et al., 2021), steadily improving the capabilities of deep models.

dataset, knowledge, offensiveness, (13 more...)

arXiv.org Artificial Intelligence

2110.04222

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Oceania > New Zealand > South Island > Canterbury > Christchurch (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)

Genre: Research Report (1.00)

Industry: Law (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Detecting White Supremacist Hate Speech using Domain Specific Word Embedding with Deep Learning and BERT

Alatawi, Hind Saleh, Alhothali, Areej Maatog, Moria, Kawthar Mustafa

arXiv.org Artificial Intelligence | Oct-1-2020

White supremacists embrace a radical ideology that considers white people superior to people of other races. The critical influence of these groups is no longer limited to social media; they also have a significant effect on society in many ways by promoting racial hatred and violence. White supremacist hate speech is one of the most recently observed forms of harmful content on social media. Traditional channels of reporting hate speech have proved inadequate due to the tremendous explosion of information, and therefore, it is necessary to find an automatic way to detect such speech in a timely manner. This research investigates the viability of automatically detecting white supremacist hate speech on Twitter by using deep learning and natural language processing techniques. In our experiments, we used two approaches. The first uses domain-specific embeddings, extracted from a white supremacist corpus in order to capture the meaning of white supremacist slang, with a bidirectional Long Short-Term Memory (LSTM) deep learning model; this approach reached a 0.74890 F1-score. The second uses BERT, one of the most recent language models, which provides state-of-the-art results on most NLP tasks; it reached a 0.79605 F1-score. Both approaches are tested on a balanced dataset given that our experiments were based on textual data only. The dataset was combined from a dataset created from Twitter and a Stormfront dataset compiled from that white supremacist forum.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2010.00357

Country:

Europe > Norway (0.04)
Oceania > New Zealand > South Island > Canterbury > Christchurch (0.04)
Oceania > New Zealand > South Island > Canterbury Region > Christchurch (0.04)
(2 more...)

Genre: Research Report > New Finding (0.47)

Industry: Law Enforcement & Public Safety > Terrorism (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Local Model Feature Transformations

Brown, CScott

arXiv.org Machine Learning | Apr-13-2020

Local learning methods are a popular class of machine learning algorithms. The basic idea for the entire cadre is to choose some non-local model family, to train many of them on small sections of neighboring data, and then to "stitch" the resulting models together in some way. Due to the limits of constraining a training dataset to a small neighborhood, research on locally-learned models has largely been restricted to simple model families. Also, since simple model families have no complex structure by design, this has limited use of the individual local models to predictive tasks. We hypothesize that, using a sufficiently complex local model family, various properties of the individual local models, such as their learned parameters, can be used as features for further learning. This dissertation improves upon the current state of research and works toward establishing this hypothesis by investigating algorithms for localization of more complex model families and by studying their applications beyond predictions as a feature extraction mechanism. We summarize this generic technique of using local models as a feature extraction step with the term "local model feature transformations." In this document, we extend the local modeling paradigm to Gaussian processes, orthogonal quadric models and word embedding models, and extend the existing theory for localized linear classifiers. We then demonstrate applications of local model feature transformations to epileptic event classification from EEG readings, activity monitoring via chest accelerometry, 3D surface reconstruction, 3D point cloud segmentation, handwritten digit classification and event detection from Twitter feeds.
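
The general idea in the abstract, fitting a model on each small neighborhood and reusing its learned parameters as features, can be illustrated with a deliberately simple sketch. It assumes a local least-squares linear model over the k nearest neighbours, which is a stand-in chosen for brevity; the dissertation itself studies richer families (Gaussian processes, quadric models, word embeddings). The function name `local_linear_features` and the parameter `k` are hypothetical.

```python
import numpy as np

def local_linear_features(X, y, k=10):
    """For each sample, fit a linear model on its k nearest neighbours and
    return the fitted coefficients (weights plus intercept) as features.

    Illustrative stand-in only; not the algorithms from the dissertation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)

    features = np.empty((n, d + 1))
    for i in range(n):
        # Indices of the k closest samples (the sample itself included).
        idx = np.argsort(dist2[i])[:k]
        # Design matrix with a trailing column of ones for the intercept.
        A = np.hstack([X[idx], np.ones((len(idx), 1))])
        coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
        features[i] = coef
    return features
```

Each row of the returned array describes the local behaviour of `y` around one sample, and could be passed to any downstream learner in place of, or alongside, the raw inputs.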

algorithm, decision surface, local model, (17 more...)

arXiv.org Machine Learning

2004.06149

Country:

Europe > France > Île-de-France > Paris > Paris (0.14)
North America > United States > Virginia (0.04)
North America > United States > California > San Diego County > Poway (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Law Enforcement & Public Safety (0.67)
Media > News (0.67)
Health & Medicine > Therapeutic Area (0.46)
Information Technology > Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
(2 more...)
