AITopics | Tang, Kevin

Collaborating Authors

Tang, Kevin

Connecting the Persian-speaking World through Transliteration

Merchant, Rayyan, Ramarao, Akhilesh Kakolu, Tang, Kevin

arXiv.org Artificial Intelligence, Feb-27-2025

Despite speaking mutually intelligible varieties of the same language, speakers of Tajik Persian, written in a modified Cyrillic alphabet, cannot read Iranian and Afghan texts written in the Perso-Arabic script. As the vast majority of Persian text on the Internet is written in Perso-Arabic, monolingual Tajik speakers are unable to interface with the Internet in any meaningful way. This paper presents a transformer-based G2P approach to Tajik-Farsi transliteration, achieving chrF++ scores of 58.70 (Farsi to Tajik) and 74.20 (Tajik to Farsi) on novel digraphic datasets, setting a comparable baseline metric for future work. Our results also demonstrate the non-trivial difficulty of this task in both directions. We also provide an overview of the differences between the two scripts and the challenges they present, so as to aid future efforts in Tajik-Farsi transliteration.

Keywords: Persian, Tajik, Transliteration, Orthography, Computational Linguistics

1 Introduction

Tajik Persian (henceforth, Tajik) is the formal variety of Modern Persian spoken in Tajikistan. As such, it retains an extremely high level of mutual intelligibility with formal Persian as spoken in Iran and Afghanistan (henceforth referred to as Farsi). Unlike these two countries, which use the centuries-old Perso-Arabic script, Tajikistan uses the relatively new Tajik-Cyrillic script due to Tajikistan's Soviet heritage (Perry 2005). While proposals have been made to shift the script back to Perso-Arabic, any significant shift will likely not occur in the near future, with Tajikistan's former Minister of Culture stating in 2008 that "...some 90-95% of Tajikistan's population is not familiar with Arabic script..." (Ghufronov 2008).
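The chrF++ scores quoted in the abstract are character n-gram F-scores. As a rough illustration of the idea only, the sketch below computes a simplified character n-gram F-beta score. It is an assumption-laden toy: it omits the word n-grams that chrF++ adds, the function names `char_ngrams` and `simple_chrf` are hypothetical, and it is not the scoring tool used by the authors.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score on a 0-100 scale (no word n-grams)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One of the strings is shorter than n characters: skip this order.
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta=2 weights recall more heavily than precision.
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Hypothetical transliteration outputs compared against a reference.
    print(simple_chrf("китоб", "китоб"))
    print(simple_chrf("китаб", "китоб"))
```

An exact match scores 100 and a string sharing no characters with the reference scores 0; a one-letter error lands in between, which is why the metric suits transliteration output that is often nearly right.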

machine learning, natural language, transliteration, (19 more...)

arXiv:2502.20047

Country:

Asia > Tajikistan (1.00)
Asia > Middle East > Iran (0.24)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge

Remme, Lian, Tang, Kevin

arXiv.org Artificial Intelligence, Feb-18-2025

This paper provides a proof of concept that audio of tabletop role-playing games (TTRPGs) could serve as a challenge for diarization systems. TTRPGs are carried out mostly by conversation. Participants often alter their voices to indicate that they are talking as a fictional character. Audio processing systems are susceptible to voice conversion with or without technological assistance. TTRPGs present a conversational phenomenon in which voice conversion is an inherent characteristic for an immersive gaming experience. This could make it more challenging for diarizers to pick the real speaker and determine that impersonating is just that. We present the creation of a small TTRPG audio dataset and compare it against the AMI and ICSI corpora. The performance of two diarizers, pyannote.audio and wespeaker, was evaluated. We observed that TTRPGs' properties result in a higher confusion rate for both diarizers. Additionally, wespeaker strongly underestimates the number of speakers in the TTRPG audio files. We propose TTRPG audio as a promising challenge for diarization systems.
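To make the notion of speaker confusion concrete, here is a minimal frame-level sketch. It assumes each audio frame carries exactly one reference label and one hypothesis label, ignores missed speech and false alarms, and finds the best one-to-one label mapping by brute force. The function name `confusion_rate` and the example labels are hypothetical; this is not how pyannote.audio or wespeaker compute their metrics.

```python
from itertools import permutations


def confusion_rate(reference, hypothesis):
    """Fraction of frames assigned to the wrong speaker under the best
    one-to-one mapping from hypothesis labels to reference labels."""
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    ref_labels = sorted(set(reference))
    hyp_labels = sorted(set(hypothesis))
    # Pad with None so every hypothesis label can be mapped even when the
    # diarizer finds more speakers than the reference contains.
    k = max(len(ref_labels), len(hyp_labels))
    padded_ref = ref_labels + [None] * (k - len(ref_labels))
    best_correct = 0
    for perm in permutations(padded_ref, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, perm))
        correct = sum(
            1 for r, h in zip(reference, hypothesis) if mapping[h] == r
        )
        best_correct = max(best_correct, correct)
    return 1 - best_correct / len(reference)


if __name__ == "__main__":
    # Three true speakers, but the hypothetical diarizer merges two of them.
    reference = ["GM", "GM", "A", "A", "B", "B"]
    hypothesis = ["s0", "s0", "s0", "s0", "s1", "s1"]
    print(len(set(hypothesis)), "speakers found,", len(set(reference)), "expected")
    print(round(confusion_rate(reference, hypothesis), 3))
```

In this toy case the merged cluster can only be credited to one of the two true speakers, so two of six frames count as confused and the speaker count is underestimated by one.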

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:2502.12714

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.93)

Analysis of LLM as a grammatical feature tagger for African American English

Porwal, Rahul, Rozet, Alice, Houck, Pryce, Gowda, Jotsna, Moeller, Sarah, Tang, Kevin

arXiv.org Artificial Intelligence, Feb-9-2025

African American English (AAE) presents unique challenges in natural language processing (NLP). This research systematically compares the performance of available NLP models--rule-based, transformer-based, and large language models (LLMs)--capable of identifying key grammatical features of AAE, namely Habitual Be and Multiple Negation. These features were selected for their distinct grammatical complexity and frequency of occurrence. The evaluation involved sentence-level binary classification tasks, using both zero-shot and few-shot strategies. The analysis reveals that while LLMs show promise compared to the baseline, they are influenced by biases such as recency and unrelated features in the text such as formality. This study highlights the necessity for improved model training and architectural adjustments to better accommodate AAE's unique linguistic characteristics. Data and code are available.
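The sentence-level binary classification setup described above can be pictured with a deliberately naive baseline. The sketch below flags a sentence as containing Multiple Negation when it has two or more negation tokens, then scores the predictions with precision, recall, and F1. The word list, function names, and example sentences are all assumptions made for illustration; this is not the rule-based tagger or any model evaluated in the paper.

```python
import string

# Hypothetical, non-exhaustive list of negation words.
NEGATORS = {"no", "not", "never", "nothing", "nobody", "none", "nowhere", "ain't"}


def is_negator(token):
    """True for listed negation words and contracted forms such as "don't"."""
    return token in NEGATORS or token.endswith("n't")


def has_multiple_negation(sentence):
    """Naive heuristic: two or more negation tokens anywhere in the sentence."""
    tokens = [t.strip(string.punctuation) for t in sentence.lower().split()]
    return sum(is_negator(t) for t in tokens) >= 2


def precision_recall_f1(gold, predicted):
    """Standard binary classification metrics over parallel boolean lists."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sentences = [
        "I ain't never seen nothing like that.",
        "She did not go to the store.",
        "He don't know nobody here.",
        "Nobody came.",
    ]
    gold = [True, False, True, False]  # hypothetical annotations
    predicted = [has_multiple_negation(s) for s in sentences]
    print(precision_recall_f1(gold, predicted))
```

A token-counting rule like this over-triggers on sentences with two independently negated clauses (for example "I didn't go, and she didn't either."), which hints at why the task calls for models that look beyond surface cues.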

computational linguistic, large language model, machine learning, (16 more...)

arXiv:2502.06004

Country:

Europe (1.00)
North America > United States > Texas (0.14)
North America > United States > Louisiana (0.14)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers

Ramarao, Akhilesh Kakolu, Tang, Kevin, Baer-Henney, Dinah

arXiv.org Artificial Intelligence, Dec-13-2024

The present paper evaluates the learning behaviour of a transformer-based neural network with regard to an irregular inflectional paradigm. We apply the paradigm cell filling problem to irregular patterns. We approach this problem using the morphological reinflection task and model it as a character sequence-to-sequence learning problem. The test cases under investigation are irregular verbs in Spanish. Alongside its many regular verbs, Spanish has L-shaped verbs, in which the first person singular indicative stem irregularly matches the subjunctive paradigm, while other indicative forms remain unaltered. We examine the role of frequency during learning and compare models under differing input frequency conditions. We train the model on a corpus of Spanish with a realistic distribution of regular and irregular verbs to compare it with models trained on input with augmented distributions of (ir)regular words. We explore how the neural models learn this L-shaped pattern using post-hoc analyses. Our experiments show that, across frequency conditions, the models are surprisingly capable of learning the irregular pattern. Furthermore, our post-hoc analyses reveal the possible sources of errors. All code and data are available at https://anonymous.4open.science/r/modeling_spanish_acl-7567/ under MIT license.
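Framing reinflection as character sequence-to-sequence learning means the model reads a lemma as individual characters plus morphological tags and emits the inflected form character by character. The sketch below shows one plausible encoding, assuming UniMorph-style semicolon-separated tags; the tag strings and the function `encode_example` are illustrative assumptions, not the authors' data format.

```python
def encode_example(lemma, tags, form):
    """Turn one paradigm cell into (source, target) symbol sequences.

    Source: lemma characters followed by one symbol per morphological tag.
    Target: characters of the inflected form.
    """
    source = list(lemma) + tags.split(";")
    target = list(form)
    return source, target


if __name__ == "__main__":
    # "tener" is L-shaped: the 1sg present indicative "tengo" shares the
    # stem "teng-" with the present subjunctive ("tenga"), while the 2sg
    # indicative "tienes" does not.
    cells = [
        ("tener", "V;IND;PRS;1;SG", "tengo"),
        ("tener", "V;IND;PRS;2;SG", "tienes"),
        ("tener", "V;SBJV;PRS;1;SG", "tenga"),
    ]
    for lemma, tags, form in cells:
        print(encode_example(lemma, tags, form))
```

Because tags are separate symbols rather than characters, the same lemma yields distinct source sequences for each cell, and the model must learn which cells take the irregular stem.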

computational linguistic, machine learning, natural language, (17 more...)

arXiv:2410.21013

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GreedLlama: Performance of Financial Value-Aligned Large Language Models in Moral Reasoning

Yu, Jeffy, Huber, Maximilian, Tang, Kevin

arXiv.org Artificial Intelligence, Apr-2-2024

This paper investigates the ethical implications of aligning Large Language Models (LLMs) with financial optimization, through the case study of "GreedLlama," a model fine-tuned to prioritize economically beneficial outcomes. By comparing GreedLlama's performance in moral reasoning tasks to a base Llama2 model, our results highlight a concerning trend: GreedLlama demonstrates a marked preference for profit over ethical considerations, making morally appropriate decisions at significantly lower rates than the base model in scenarios of both low and high moral ambiguity. In low ambiguity situations, GreedLlama's ethical decisions decreased to 54.4%, compared to the base model's 86.9%, while in high ambiguity contexts, the rate was 47.4% against the base model's 65.1%. These findings emphasize the risks of single-dimensional value alignment in LLMs, underscoring the need for integrating broader ethical values into AI development to ensure decisions are not solely driven by financial incentives. The study calls for a balanced approach to LLM deployment, advocating for the incorporation of ethical considerations in models intended for business applications, particularly in light of the absence of regulatory oversight.
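The size of the gap is easier to read as a drop in percentage points and as a relative decline from the base model. The short calculation below uses only the four rates quoted in the abstract; the dictionary layout and the helper name `drop` are assumptions for illustration.

```python
# Rates of morally appropriate decisions (percent), as quoted in the abstract.
RATES = {
    "low ambiguity": {"base": 86.9, "greedllama": 54.4},
    "high ambiguity": {"base": 65.1, "greedllama": 47.4},
}


def drop(base, tuned):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    points = base - tuned
    relative = 100 * points / base
    return round(points, 1), round(relative, 1)


if __name__ == "__main__":
    for condition, rates in RATES.items():
        points, relative = drop(rates["base"], rates["greedllama"])
        print(f"{condition}: -{points} points ({relative}% relative decline)")
```

The drop is 32.5 points (about 37.4% relative) in low-ambiguity scenarios and 17.7 points (about 27.2% relative) in high-ambiguity ones, so the fine-tuned model loses the most ground where the ethical choice is clearest.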

large language model, machine learning, natural language, (17 more...)

arXiv:2404.02934

Genre: Research Report > New Finding (0.48)

Industry:

Law (0.68)
Government (0.68)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Shifting Weights: Adapting Object Detectors from Image to Video

Tang, Kevin, Ramanathan, Vignesh, Fei-Fei, Li, Koller, Daphne

Neural Information Processing Systems, Dec-31-2012

Typical object detectors trained on images perform poorly on video, as there is a clear distinction in domain between the two types of data. In this paper, we tackle the problem of adapting object detectors learned from images to work well on videos. We treat the problem as one of unsupervised domain adaptation, in which we are given labeled data from the source domain (image), but only unlabeled data from the target domain (video). Our approach, self-paced domain adaptation, seeks to iteratively adapt the detector by retraining the detector with automatically discovered target domain examples, starting with the easiest first. At each iteration, the algorithm adapts by considering an increased number of target domain examples, and a decreased number of source domain examples. To discover target domain examples from the vast amount of video data, we introduce a simple, robust approach that scores trajectory tracks instead of bounding boxes. We also show how rich and expressive features specific to the target domain can be incorporated under the same framework. We show promising results on the 2011 TRECVID Multimedia Event Detection [1] and LabelMe Video [2] datasets that illustrate the benefit of our approach to adapt object detectors to video.
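The iterative schedule described above (more target examples and fewer source examples at each round, easiest first) can be sketched with a toy one-dimensional "detector". Everything here is an assumption made for illustration: the detector is just a mean feature value, "easiest" means closest to the current model, every target value is treated as a candidate positive track, and the names `train` and `self_paced_adapt` are hypothetical. The paper's actual detector, features, and objective are not reproduced.

```python
def train(examples):
    """Toy 'detector': the mean feature value of its positive examples."""
    return sum(examples) / len(examples)


def self_paced_adapt(source, target, rounds=3):
    """Shift a model from source to target data, easiest target examples first."""
    model = train(source)
    for t in range(1, rounds + 1):
        # Grow the share of target examples and shrink the share of source ones.
        k_target = len(target) * t // rounds
        k_source = len(source) * (rounds - t) // rounds
        # "Easiest" target examples are those the current model scores best,
        # here the ones closest to the current mean.
        easiest = sorted(target, key=lambda x: abs(x - model))[:k_target]
        model = train(source[:k_source] + easiest)
    return model


if __name__ == "__main__":
    source = [0.0, 0.2, -0.1, 0.1]           # labeled image-domain features
    target = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # candidate video-track features
    print(train(source), "->", self_paced_adapt(source, target))
```

With these toy values the model moves from the source mean toward the target mean in three steps, taking in the nearest target examples before the distant ones rather than jumping domains at once.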

detector, inductive learning, us government, (21 more...)

Country: North America > United States > California > Santa Clara County (0.14)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
