AITopics | hwt

Collaborating Authors

hwt

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Appendix

Neural Information Processing SystemsFeb-10-2026, 02:18:59 GMT

In practice, building f and g requires the computation for wtiwtj for all i,j. B.2 Classification For the classification task with the logistic regression model, we modify the formula of logistic regression in teaching objectives to make it convenient for derivation. It also indicates that with probability at least p1, the LST teacher can achieve exponential teachability in the iteration t. In order to achieve exponential teachiability in T iterations, the sufficient condition in Eq. (22) must be satisfied in all T iterations. Then, we use a pre-trained DenseNet [65] shown in [53] to generate 1024 dim features and the confidencescoreforeachimage.

artificial intelligence, hwt, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.76)

Add feedback

Linguistic Characteristics of AI-Generated Text: A Survey

Terčon, Luka, Dobrovoljc, Kaja

arXiv.org Artificial IntelligenceOct-8-2025

Large language models (LLMs) are solidifying their position in the modern world as effective tools for the automatic generation of text. Their use is quickly becoming commonplace in fields such as education, healthcare, and scientific research. There is a growing need to study the linguistic features present in AI-generated text, as the increasing presence of such texts has profound implications in various disciplines such as corpus linguistics, computational linguistics, and natural language processing. Many observations have already been made, however a broader synthesis of the findings made so far is required to provide a better understanding of the topic. The present survey paper aims to provide such a synthesis of extant research. We categorize the existing works along several dimensions, including the levels of linguistic description, the models included, the genres analyzed, the languages analyzed, and the approach to prompting. Additionally, the same scheme is used to present the findings made so far and expose the current trends followed by researchers. Among the most-often reported findings is the observation that AI-generated text is more likely to contain a more formal and impersonal style, signaled by the increased presence of nouns, determiners, and adpositions and the lower reliance on adjectives and adverbs. AI-generated text is also more likely to feature a lower lexical diversity, a smaller vocabulary size, and repetitive text. Current research, however, remains heavily concentrated on English data and mostly on text generated by the GPT model family, highlighting the need for broader cross-linguistic and cross-model investigation. In most cases authors also fail to address the issue of prompt sensitivity, leaving much room for future studies that employ multiple prompt wordings in the text generation phase.

aigt, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.05136

Country:

Europe (0.93)
Asia (0.67)
North America > Mexico (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AnalogNAS-Bench: A NAS Benchmark for Analog In-Memory Computing

Bessalah, Aniss, Abdelmoumen, Hatem Mohamed, Benatchba, Karima, Benmeziane, Hadjer

arXiv.org Artificial IntelligenceJun-24-2025

Analog In-memory Computing (AIMC) has emerged as a highly efficient paradigm for accelerating Deep Neural Networks (DNNs), offering significant energy and latency benefits over conventional digital hardware. However, state-of-the-art neural networks are not inherently designed for AIMC, as they fail to account for its unique non-idealities. Neural Architecture Search (NAS) is thus needed to systematically discover neural architectures optimized explicitly for AIMC constraints. However, comparing NAS methodologies and extracting insights about robust architectures for AIMC requires a dedicated NAS benchmark that explicitly accounts for AIMC-specific hardware non-idealities. To address this, we introduce AnalogNAS-Bench, the first NAS benchmark tailored specifically for AIMC. Our study reveals three key insights: (1) standard quantization techniques fail to capture AIMC-specific noises, (2) robust architectures tend to feature wider and branched blocks, (3) skip connections improve resilience to temporal drift noise. These insights highlight the limitations of current NAS benchmarks for AIMC and pave the way for future analog-aware NAS. All the implementations used in this paper can be found at https://github.com/IBM/analog-nas/tree/main/analognasbench.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.18495

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.64)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors

Pedrotti, Andrea, Papucci, Michele, Ciaccio, Cristiano, Miaschi, Alessio, Puccetti, Giovanni, Dell'Orletta, Felice, Esuli, Andrea

arXiv.org Artificial IntelligenceJun-2-2025

Recent advancements in Generative AI and Large Language Models (LLMs) have enabled the creation of highly realistic synthetic content, raising concerns about the potential for malicious use, such as misinformation and manipulation. Moreover, detecting Machine-Generated Text (MGT) remains challenging due to the lack of robust benchmarks that assess generalization to real-world scenarios. In this work, we present a pipeline to test the resilience of state-of-the-art MGT detectors (e.g., Mage, Radar, LLM-DetectAIve) to linguistically informed adversarial attacks. To challenge the detectors, we fine-tune language models using Direct Preference Optimization (DPO) to shift the MGT style toward human-written text (HWT). This exploits the detectors' reliance on stylistic clues, making new generations more challenging to detect. Additionally, we analyze the linguistic shifts induced by the alignment and which features are used by detectors to detect MGT texts. Our results show that detectors can be easily fooled with relatively few examples, resulting in a significant drop in detection performance. This highlights the importance of improving detection methods and making them robust to unseen in-domain texts.

detector, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2505.24523

Country:

Europe > United Kingdom (1.00)
Asia > Laos (0.68)
North America > United States (0.67)
Asia > Middle East > UAE (0.28)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.92)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Human Variability vs. Machine Consistency: A Linguistic Analysis of Texts Generated by Humans and Large Language Models

Zanotto, Sergio E., Aroyehun, Segun

arXiv.org Artificial IntelligenceDec-3-2024

The rapid advancements in large language models (LLMs) have significantly improved their ability to generate natural language, making texts generated by LLMs increasingly indistinguishable from human-written texts. Recent research has predominantly focused on using LLMs to classify text as either human-written or machine-generated. In our study, we adopt a different approach by profiling texts spanning four domains based on 250 distinct linguistic features. We select the M4 dataset from the Subtask B of SemEval 2024 Task 8. We automatically calculate various linguistic features with the LFTK tool and additionally measure the average syntactic depth, semantic similarity, and emotional content for each document. We then apply a two-dimensional PCA reduction to all the calculated features. Our analyses reveal significant differences between human-written texts and those generated by LLMs, particularly in the variability of these features, which we find to be considerably higher in human-written texts. This discrepancy is especially evident in text genres with less rigid linguistic style constraints. Our findings indicate that humans write texts that are less cognitively demanding, with higher semantic content, and richer emotional content compared to texts generated by LLMs. These insights underscore the need for incorporating meaningful linguistic features to enhance the understanding of textual outputs of LLMs.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.03025

Country:

Europe (0.93)
North America > Mexico (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Add feedback

PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text?

Petukhova, Kseniia, Kazakov, Roman, Kochmar, Ekaterina

arXiv.org Artificial IntelligenceApr-8-2024

In this paper, we present our submission to the SemEval-2024 Task 8 "Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection", focusing on the detection of machine-generated texts (MGTs) in English. Specifically, our approach relies on combining embeddings from the RoBERTa-base with diversity features and uses a resampled training set. We score 12th from 124 in the ranking for Subtask A (monolingual track), and our results show that our approach is generalizable across unseen models and domains, achieving an accuracy of 0.91.

detection, linguistic feature, semeval-2024 task 8, (15 more...)

arXiv.org Artificial Intelligence

2404.05483

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Leisure & Entertainment > Games (0.48)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

AI can now copy your HANDWRITING - so, can you tell which of these was written by a robot?

Daily Mail - Science & techJan-17-2024, 13:57:16 GMT

AI tools like ChatGPT can draft letters, tell jokes and even give legal advice – but only in the form of computerized text. Now, scientists have created an AI that can imitate human handwriting, which could herald fresh issues regarding fraud and fake documents. Amazingly, the results are almost indistinguishable from the real thing drafted by human hands. Below is one column of writing by the team's AI model and another by humans, but can you tell which is which? Scroll down to reveal the answer!

artificial intelligence, deep learning, machine learning, (18 more...)

Daily Mail - Science & tech

Country:

North America > Canada (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)

Industry: Law (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase

Gao, Chujie, Chen, Dongping, Zhang, Qihui, Huang, Yue, Wan, Yao, Sun, Lichao

arXiv.org Artificial IntelligenceJan-11-2024

With the remarkable development and widespread applications of large language models (LLMs), the use of machine-generated text (MGT) is becoming increasingly common. This trend brings potential risks, particularly to the quality and completeness of information in fields such as news and education. Current research predominantly addresses the detection of pure MGT without adequately addressing mixed scenarios including AI-revised Human-Written Text (HWT) or human-revised MGT. To confront this challenge, we introduce mixcase, a novel concept representing a hybrid text form involving both machine-generated and human-generated content. We collected mixcase instances generated from multiple daily text-editing scenarios and composed MixSet, the first dataset dedicated to studying these mixed modification scenarios. We conduct experiments to evaluate the efficacy of popular MGT detectors, assessing their effectiveness, robustness, and generalization performance. Our findings reveal that existing detectors struggle to identify mixcase as a separate class or MGT, particularly in dealing with subtle modifications and style adaptability. This research underscores the urgent need for more fine-grain detectors tailored for mixcase, offering valuable insights for future research. Code and Models are available at https://github.com/Dongping-Chen/MixSet.

arxiv preprint arxiv, dataset, detector, (15 more...)

arXiv.org Artificial Intelligence

2401.05952

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media (0.68)
Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning

Liu, Xiaoming, Zhang, Zhaohan, Wang, Yichen, Pu, Hang, Lan, Yu, Shen, Chao

arXiv.org Artificial IntelligenceOct-20-2023

Machine-Generated Text (MGT) detection, a task that discriminates MGT from Human-Written Text (HWT), plays a crucial role in preventing misuse of text generative models, which excel in mimicking human writing style recently. Latest proposed detectors usually take coarse text sequences as input and fine-tune pretrained models with standard cross-entropy loss. However, these methods fail to consider the linguistic structure of texts. Moreover, they lack the ability to handle the low-resource problem which could often happen in practice considering the enormous amount of textual data online. In this paper, we present a coherence-based contrastive learning model named CoCo to detect the possible MGT under low-resource scenario. To exploit the linguistic feature, we encode coherence information in form of graph into text representation. To tackle the challenges of low data resource, we employ a contrastive learning framework and propose an improved contrastive loss for preventing performance degradation brought by simple samples. The experiment results on two public datasets and two self-constructed datasets prove our approach outperforms the state-of-art methods significantly. Also, we surprisingly find that MGTs originated from up-to-date language models could be easier to detect than these from previous models, in our experiments. And we propose some preliminary explanations for this counter-intuitive phenomena. All the codes and datasets are open-sourced.

dataset, gpt-3, representation, (14 more...)

arXiv.org Artificial Intelligence

2212.10341

Country:

Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
Asia > India (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(17 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Government (1.00)
Media (0.93)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback