Bensemann, Joshua
Is GPT-4 conscious?
Tait, Izak, Bensemann, Joshua, Wang, Ziqi
GPT-4 is often heralded as a leading commercial AI offering, sparking debates over its potential as a stepping stone toward artificial general intelligence. But does it possess consciousness? This paper investigates that question using the nine qualitative measurements of the Building Blocks theory. GPT-4's design, architecture, and implementation are compared against each building block of consciousness to determine whether it has achieved the requisite milestones to be classified as conscious or, if not, how close to consciousness it is. Our assessment is that, while GPT-4 in its native configuration is not currently conscious, current technological research and development are sufficient to modify GPT-4 so that it possesses all the building blocks of consciousness. Consequently, we argue that the emergence of a conscious AI model is plausible in the near term. The paper concludes with a comprehensive discussion of the ethical implications and societal ramifications of engineering conscious AI entities.
Do Smaller Language Models Answer Contextualised Questions Through Memorisation Or Generalisation?
Hartill, Tim, Bensemann, Joshua, Witbrock, Michael, Riddle, Patricia J.
A distinction is often drawn between a model's ability to predict a label for an evaluation sample by direct memorisation of highly similar training samples and its ability to predict the label via some method of generalisation. In the context of using Language Models for question answering, discussion continues as to the extent to which questions are answered through memorisation. We consider this issue for questions that would ideally be answered through reasoning over an associated context. We propose a method of identifying evaluation samples for which it is very unlikely our model would have memorised the answers. Our method is based on the semantic similarity of input tokens and label tokens between training and evaluation samples. We show that our method offers advantages over some prior approaches in that it can surface evaluation-train pairs that overlap in either contiguous or discontiguous sequences of tokens. We use this method to identify unmemorisable subsets of our evaluation datasets. We train two Language Models in a multitask fashion; the second model differs from the first only in having two additional training datasets designed to impart simple numerical reasoning strategies of a sort known to improve performance on some of our evaluation datasets but not on others. We then show a performance improvement between the two models on the unmemorisable subsets of the evaluation datasets that were expected to benefit from the additional training data. Specifically, performance on the unmemorisable subsets of two of our evaluation datasets, DROP and ROPES, improves significantly, by 9.0% and 25.7% respectively, while the other evaluation datasets show no significant change in performance.
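A minimal sketch of this kind of similarity screening is shown below. It is an illustrative assumption, not the authors' implementation: the embedding model (all-MiniLM-L6-v2 via sentence-transformers), the concatenation of input and label text, and the 0.9 threshold are hypothetical stand-ins for the token-level scoring the abstract describes.

# Hypothetical sketch: flag evaluation samples whose combined input/label text
# has a near-duplicate in the training set, as a proxy for possible memorisation.
# The embedding model, text concatenation, and 0.9 threshold are illustrative
# assumptions; the paper's actual token-level scoring differs.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def unmemorisable_subset(train_samples, eval_samples, threshold=0.9):
    """Return eval samples with no highly similar counterpart in training."""
    train_texts = [f"{s['input']} {s['label']}" for s in train_samples]
    eval_texts = [f"{s['input']} {s['label']}" for s in eval_samples]
    train_emb = model.encode(train_texts, convert_to_tensor=True, normalize_embeddings=True)
    eval_emb = model.encode(eval_texts, convert_to_tensor=True, normalize_embeddings=True)
    sims = util.cos_sim(eval_emb, train_emb)  # (n_eval, n_train) cosine similarities
    max_sim = sims.max(dim=1).values          # closest training sample per eval sample
    return [s for s, m in zip(eval_samples, max_sim.tolist()) if m < threshold]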
Challenges in Annotating Datasets to Quantify Bias in Under-represented Society
Yogarajan, Vithya, Dobbie, Gillian, Pistotti, Timothy, Bensemann, Joshua, Knowles, Kobe
Recent advances in artificial intelligence, including the development of highly sophisticated large language models (LLMs), have proven beneficial in many real-world applications. However, evidence of inherent bias encoded in these LLMs has raised concerns about equity. In response, there has been an increase in research dealing with bias, including studies focusing on quantifying bias and developing debiasing techniques. Benchmark bias datasets have also been developed for binary gender classification and ethnic/racial considerations, focusing predominantly on American demographics. However, there is minimal research into understanding and quantifying bias related to under-represented societies. Motivated by the lack of annotated datasets for quantifying bias in under-represented societies, we endeavoured to create benchmark datasets for the New Zealand (NZ) population. We faced many challenges in this process, despite the availability of three annotators. This research outlines the manual annotation process, provides an overview of the challenges we encountered and the lessons learnt, and presents recommendations for future research.
Neuromodulation Gated Transformer
Knowles, Kobe, Bensemann, Joshua, Benavides-Prado, Diana, Yogarajan, Vithya, Witbrock, Michael, Dobbie, Gillian, Chen, Yang
We introduce a novel architecture, the Neuromodulation Gated Transformer (NGT), which implements neuromodulation in transformers via a multiplicative effect. We compare it to baselines and show that it achieves the best average performance on the SuperGLUE benchmark validation sets. Cellular neuromodulation is a biological mechanism in which the intrinsic properties of neurons are continuously modified in a context-dependent manner according to stimuli, i.e., biochemicals called neuromodulators (Bargmann & Marder, 2013; Marder et al., 2014; Shine et al., 2021; Vecoven et al., 2020); it allows for the regulation of a population of neurons (Katz & Edwards, 1999). It has achieved notable success in the continual learning domain (Beaulieu et al., 2020; Ellefsen et al., 2015; Velez & Clune, 2017). Transformers (Vaswani et al., 2017) are architectures that eliminate recurrence by relying entirely on attention.
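As an illustration of the multiplicative gating idea, the minimal PyTorch sketch below applies a context-dependent, elementwise gate to a transformer layer's hidden states. The gate's parameterisation, its placement in the network, and the module name NeuromodulationGate are assumptions for illustration; the NGT's actual design may differ.

# Hypothetical sketch of multiplicative neuromodulation over transformer
# hidden states; not the NGT's exact architecture.
import torch
import torch.nn as nn

class NeuromodulationGate(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        # A small network produces per-dimension gate values in (0, 1).
        self.gate = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Sigmoid())

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size).
        # Context-dependent gates scale each unit's activation multiplicatively,
        # analogous to neuromodulators regulating neuronal excitability.
        return hidden_states * self.gate(hidden_states)

# Usage: gated = NeuromodulationGate(768)(torch.randn(2, 16, 768))

The sigmoid keeps each gate value in (0, 1), so this particular parameterisation can suppress but never amplify activations; a different squashing function would change that behaviour.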
Input-length-shortening and text generation via attention values
Tan, Neşet Özkan, Peng, Alex Yuxuan, Bensemann, Joshua, Bao, Qiming, Hartill, Tim, Gahegan, Mark, Witbrock, Michael
Identifying words that impact a task's performance more than others is a challenge in natural language processing. Transformer models have recently addressed this issue by incorporating an attention mechanism that assigns greater attention (i.e., relevance) scores to some words than others. Because of the attention mechanism's high computational cost, transformer models usually have an input-length limitation caused by hardware constraints. This limitation applies to many transformers, including the well-known Bidirectional Encoder Representations from Transformers (BERT) model. In this paper, we examined BERT's attention assignment mechanism, focusing on two questions: (1) How can attention be employed to reduce input length? (2) How can attention be used as a control mechanism for conditional text generation? We investigated these questions in the context of a text classification task. We discovered that BERT's early layers assign more critical attention scores for text classification tasks than its later layers. We demonstrated that the first layer's attention sums can be used to filter tokens in a given sequence, considerably decreasing the input length while maintaining good test accuracy. We also applied filtering based on a compute-efficient semantic-similarity algorithm and discovered that retaining approximately 6% of the original sequence is sufficient to obtain 86.5% accuracy. Finally, we showed that we could generate data that is stable and indistinguishable from the original by using only a small percentage (10%) of the tokens with the highest attention scores according to BERT's first layer.
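A minimal sketch of the first-layer filtering step, assuming the Hugging Face transformers API: tokens are ranked by the attention they receive in layer 0 (summed over heads and query positions) and only the top fraction is kept. The model checkpoint, the keep_ratio value, and the exact aggregation are illustrative assumptions rather than the paper's precise procedure.

# Hypothetical sketch: shorten an input by keeping the tokens that receive the
# most attention in BERT's first layer. Checkpoint, keep_ratio, and the
# head/query aggregation are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

def shorten(text: str, keep_ratio: float = 0.1) -> str:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        attentions = model(**inputs).attentions  # per-layer (1, heads, seq, seq)
    # Attention each token *receives* in layer 0, summed over heads and queries.
    received = attentions[0].sum(dim=1).sum(dim=1).squeeze(0)  # (seq_len,)
    k = max(1, int(keep_ratio * received.numel()))
    keep = received.topk(k).indices.sort().values  # preserve original token order
    kept_ids = inputs["input_ids"][0, keep]
    return tokenizer.decode(kept_ids, skip_special_tokens=True)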