ParaScopes: What do Language Models Activations Encode About Future Text?

Pochinkov, Nicky, Volkova, Yulia, Vasileva, Anna, Chereddy, Sai V R

arXiv.org Artificial Intelligence

Interpretability studies in language models often investigate forward-looking representations in activations. However, as language models become capable of ever longer time-horizon tasks, methods for understanding activations often remain limited to testing specific concepts or tokens. We develop a framework of Residual Stream Decoders as a method of probing model activations for paragraph-scale and document-scale plans. We test several methods and find that information equivalent to 5+ tokens of future context can be decoded in small models. These results lay the groundwork for better monitoring of language models and a better understanding of how they might encode longer-term planning information.
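The decoding idea in this abstract can be made concrete with a toy linear probe. This is not the paper's implementation: the sketch below invents synthetic "activation" vectors and "future-context embeddings" (all dimensions, names, and data are illustrative), fits a least-squares map from one to the other, and checks decoding quality on held-out examples.

```python
# Toy sketch of a "residual stream decoder": fit a linear map from
# hidden activations to embeddings of upcoming text, then check how
# well held-out activations are decoded. Synthetic data stands in for
# real model activations; dimensions are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_embed, n = 64, 32, 500

# Assume a ground-truth linear relationship plus noise: activations
# partially encode an embedding of the upcoming paragraph.
W_true = rng.normal(size=(d_model, d_embed))
acts = rng.normal(size=(n, d_model))                 # residual stream states
future = acts @ W_true + 0.1 * rng.normal(size=(n, d_embed))

# Fit the decoder on the first 400 examples by least squares.
W_hat, *_ = np.linalg.lstsq(acts[:400], future[:400], rcond=None)

# Evaluate with cosine similarity on the held-out examples.
pred = acts[400:] @ W_hat
cos = np.sum(pred * future[400:], axis=1) / (
    np.linalg.norm(pred, axis=1) * np.linalg.norm(future[400:], axis=1))
mean_cos = float(cos.mean())
```

In this synthetic setup the probe recovers the mapping almost exactly; with real activations, the interesting question (which the paper studies) is how much future-text information survives such a probe.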


PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts

Wang, Yiming, Zhang, Pei, Tang, Jialong, Wei, Haoran, Yang, Baosong, Wang, Rui, Sun, Chenshu, Sun, Feitong, Zhang, Jiran, Wu, Junxuan, Cang, Qiqian, Zhang, Yichang, Huang, Fei, Lin, Junyang, Huang, Fei, Zhou, Jingren

arXiv.org Artificial Intelligence

In this paper, we introduce PolyMath, a multilingual mathematical reasoning benchmark covering 18 languages and 4 easy-to-hard difficulty levels. Our benchmark ensures difficulty comprehensiveness, language diversity, and high-quality translation, making it a highly discriminative multilingual mathematical benchmark in the era of reasoning LLMs. We conduct a comprehensive evaluation of advanced LLMs and find that even Qwen-3-235B-A22B-Thinking and Gemini-2.5-pro achieve benchmark scores of only 54.6 and 52.2, with about 40% accuracy at the highest difficulty level. From a language perspective, our benchmark reveals several key challenges for LLMs in multilingual reasoning: (1) reasoning performance varies widely across languages for current LLMs; (2) input-output language consistency is low in reasoning LLMs and may be correlated with performance; (3) thinking length differs significantly by language for current LLMs. Additionally, we demonstrate that controlling the output language in the instructions can affect reasoning performance, especially for some low-resource languages, suggesting a promising direction for improving multilingual capabilities in LLMs.


Large Language Model Prompt Datasets: An In-depth Analysis and Insights

Zhang, Yuanming, Lin, Yan, Khan, Arijit, Wan, Huaiyu

arXiv.org Artificial Intelligence

A prompt is a natural language instruction that defines a specific task for a large language model (LLM) and serves as the primary interface for human-LLM interaction. With the growing deployment of LLMs, diverse prompt datasets are emerging from platforms such as GitHub and social media. These datasets span a wide array of applications and content types, facilitating both broader LLM utilization and improved prompt engineering. In this work, we compile, for the first time, an extensive list of prompt datasets sourced from various channels, representing a spectrum of downstream tasks, languages, engineering techniques, attributes, and modalities. We select key representative datasets for systematic analysis, revealing commonalities and differences in prompt construction across categories and distinguishing prompts from other text corpora such as literature and web text. We further propose a prompt optimization approach that leverages syntactic embeddings of part-of-speech and dependency structures. By identifying a centroid representation of prompts and guiding LLMs to rewrite prompts toward this centroid, our method improves the meaningfulness of model outputs. We have made our datasets and code available.
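The centroid step described here can be sketched in miniature. The code below is not the authors' method: it replaces real syntactic embeddings with invented part-of-speech count vectors, averages them into a centroid, and picks the prompt nearest that centroid as a rewriting target; all prompts and feature values are made up for illustration.

```python
# Minimal sketch of the prompt-centroid idea: represent each prompt by
# a syntactic feature vector (toy POS counts here, not real parser
# output), average the vectors into a centroid, and find the prompt
# closest to the centroid as a candidate rewrite target.
import math

# Toy POS-count features: (nouns, verbs, adjectives) per prompt.
prompts = {
    "Summarize the article in three sentences.": (2, 1, 0),
    "Write code.": (1, 1, 0),
    "Explain the main idea of the passage clearly and concisely.": (3, 1, 2),
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length tuples."""
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dims))

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

c = centroid(list(prompts.values()))
nearest = min(prompts, key=lambda p: dist(prompts[p], c))
```

In the paper's setting the features come from part-of-speech and dependency structures and an LLM rewrites prompts toward the centroid; this sketch only shows how a centroid singles out a structurally "typical" prompt.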


How to Detect and Defeat Molecular Mirage: A Metric-Driven Benchmark for Hallucination in LLM-based Molecular Comprehension

Li, Hao, Lv, Liuzhenghao, Cao, He, Liu, Zijing, Yan, Zhiyuan, Wang, Yu, Tian, Yonghong, Li, Yu, Yuan, Li

arXiv.org Artificial Intelligence

Large language models are increasingly used in scientific domains, especially for molecular understanding and analysis. However, existing models are affected by hallucination issues, resulting in errors in drug design and utilization. In this paper, we first analyze the sources of hallucination in LLMs for molecular comprehension tasks, specifically the knowledge-shortcut phenomenon observed in the PubChem dataset. To evaluate hallucination in molecular comprehension tasks with computational efficiency, we introduce Mol-Hallu, a novel free-form evaluation metric that quantifies the degree of hallucination based on the scientific entailment relationship between generated text and actual molecular properties. Using the Mol-Hallu metric, we reassess and analyze the extent of hallucination in various LLMs performing molecular comprehension tasks. Furthermore, we propose a Hallucination Reduction Post-processing stage (HRPP) to alleviate molecular hallucinations. Experiments show the effectiveness of HRPP on decoder-only and encoder-decoder molecular LLMs. Our findings provide critical insights into mitigating hallucination and improving the reliability of LLMs in scientific applications.
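The general shape of an entailment-based hallucination score can be illustrated with a toy stand-in. This is not the Mol-Hallu metric itself: Mol-Hallu uses scientific entailment between generated text and actual molecular properties, while the sketch below substitutes simple keyword overlap, and all claims and properties are invented.

```python
# Toy hallucination score in the spirit of entailment-based metrics:
# count the fraction of "property claims" in a generated answer that
# are unsupported by the reference properties. Keyword overlap is a
# crude stand-in for scientific entailment.
def hallucination_score(generated_claims, reference_props):
    """Return the fraction of generated claims with no support
    in the reference properties (0.0 = none hallucinated)."""
    generated = set(generated_claims)
    if not generated:
        return 0.0
    unsupported = generated - set(reference_props)
    return len(unsupported) / len(generated)

score = hallucination_score(
    ["soluble", "toxic", "aromatic"],       # claims in the model output
    ["soluble", "aromatic", "flammable"],   # ground-truth properties
)
# One of the three claims ("toxic") is unsupported by the reference.
```

A real metric must also handle paraphrase and partial entailment, which is exactly where keyword overlap breaks down and an entailment model earns its keep.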


Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents

Poupart, Yoann

arXiv.org Artificial Intelligence

AI has brought chess systems to a superhuman level, yet these systems rely heavily on black-box algorithms. This is unsustainable for ensuring transparency to the end user, particularly when such systems are responsible for sensitive decision-making. Recent interpretability work has shown that the inner representations of Deep Neural Networks (DNNs) are fathomable and contain human-understandable concepts. Yet these methods are seldom contextualised and are often based on a single hidden state, which makes them unable to interpret multi-step reasoning, e.g. planning. To address this, we propose contrastive sparse autoencoders (CSAE), a novel framework for studying pairs of game trajectories. Using CSAE, we are able to extract and interpret concepts that are meaningful to the chess agent's plans. We primarily focus on a qualitative analysis of the CSAE features before proposing an automated feature taxonomy. Furthermore, to evaluate the quality of our trained CSAE, we devise sanity checks to rule out spurious correlations in our results.


Top GPUs For Deep Learning and Machine Learning in 2022

#artificialintelligence

As we enter the age of AI, demand for GPUs is rising exponentially. GPUs process computations through parallel computing, and with their very large numbers of ALUs, or processing units, they are well suited to the heavy computations of AI. With the rise of Deep Learning over the past decade, most Deep Learning frameworks, including the vastly popular TensorFlow, PyTorch, and Theano, support GPU-accelerated computation. A vast number of GPUs are currently available, differing in features such as the number of processing units, memory capacity, and clock frequency.


6 Papers Every Modern Data Scientist Must Read

#artificialintelligence

Data Scientist, Machine Learning Expert, Algorithm Engineer, Deep Learning Researcher -- whatever your title might be, if using advanced concepts of Machine Learning is part of your career, then keeping up to date with the latest innovations is also part of your everyday tasks. But in order to stay on top of all the latest ingenuities and truly understand how they work, we must also be familiar with the building blocks and foundations they rely on. The field of Deep Learning is moving fast, breaking and setting new records in every possible metric. And as it evolves, it creates new fundamental concepts, enabling architectures never seen before. While I tend to assume all modern ML practitioners are familiar with the fundamentals, such as CNNs, RNNs, LSTMs, and GANs, some of the newer ones are occasionally missed or left out.


Deep Netts v2.0 Has Been Released - Deep Netts Blog

#artificialintelligence

Deep Netts 2.0.0 is out! With the 2.0 release, Deep Netts has reached an important milestone after testing through real-world use cases and pilot projects. Deep Netts 2.0 provides ease of use with competitive performance and simplified integration. Deep Netts is now free for development, and we also offer free low-volume production licenses. All examples can be used as starter projects for the corresponding problems.


OpenAI GPT-3 Waiting List Dropped as GPT-3 Is Fully Released for Developer and Enterprise Use

#artificialintelligence

When OpenAI first debuted its powerful GPT-3 natural language model in June of 2020, it debuted in a limited beta capacity and featured a waiting list where developers could sign up to use its infrastructure and capabilities. Now, the waiting list has been dropped and GPT-3's capabilities are immediately available to developers and enterprises to work on their most challenging language problems, according to a Nov. 18 (Thursday) announcement by OpenAI, an independent AI research and deployment company. But there are some caveats – the general release adds conditions to prevent GPT-3 from being used to harm people, as well as conditions that only allow its use in certain nations around the world. That means that developers in some nations, including Cuba, Iran and Russia, cannot currently access it. "OpenAI is committed to the safe deployment of AI," the organization said in a statement.


AI Beyond the Bottom Line: Artificial Intelligence for Global Impact Report is Released

#artificialintelligence

A report on the thoughtful development and use of AI to solve some of the world's most challenging problems, by Roger Spitz (Techistential) and Charles Warnock. In popular culture, Artificial Intelligence (AI) is often portrayed as a dark force ushering in an apocalyptic future in which humans are pitted against menacing machines. Today's headlines are full of AI-powered drones, backflipping robot dogs, and language models that can write passable poetry and press releases. Still, others envision beneficial AI applications that help us tackle the world's most pressing challenges, including poverty and hunger, health, education, and climate change. For those concerned about the impact of AI beyond the bottom line, Roger Spitz and Charles Warnock have assembled a resource that provides a balanced context to the challenges and opportunities of leveraging AI for social good.