Dominant topic


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality, and significance. The paper extends the work of Arora et al. by relaxing one assumption and adding another: topics can have multiple distinctive words rather than a single distinctive word, and each document has a single dominant topic. The authors show that an SVD on a truncated version of the document-word matrix can reconstruct topics under these assumptions to within a provable bound, assuming the model is valid. I like that the authors attempt to validate the assumptions.


Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs

Peng, Yifeng, Wu, Zhizheng, Chen, Chen

arXiv.org Artificial Intelligence

Modern large language models (LLMs) exhibit critical vulnerabilities to poison pill attacks: localized data poisoning that alters specific factual knowledge while preserving overall model utility. We systematically demonstrate that these attacks exploit inherent architectural properties of LLMs, achieving a 54.6% increase in retrieval inaccuracy on long-tail knowledge versus dominant topics and up to a 25.5% increase in retrieval inaccuracy on compressed models versus original architectures. Through controlled mutations (e.g., temporal/spatial/entity alterations), our method induces localized memorization deterioration with negligible impact on models' performance on standard benchmarks (e.g., <2% performance drop on MMLU/GPQA), enabling potential detection evasion. Our findings suggest: (1) disproportionate vulnerability in long-tail knowledge may result from reduced parameter redundancy; (2) model compression may increase attack surfaces, with pruned/distilled models requiring 30% fewer poison samples for equivalent damage; (3) associative memory enables both the spread of collateral damage to related concepts and the amplification of damage from simultaneous attacks, particularly for dominant topics. These findings raise concerns over current scaling paradigms, since attack costs are falling while defense complexity is rising. Our work establishes poison pills as both a security threat and a diagnostic tool, revealing critical security-efficiency trade-offs in language model compression that challenge prevailing safety assumptions.
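
As a toy illustration of the controlled-mutation step described above (and only that; this is not the paper's pipeline), the sketch below derives poisoned variants of an invented factual record by swapping a single temporal, spatial, or entity field while leaving the rest of the text intact. Every name and value in it is hypothetical.

```python
# Hypothetical sketch of a "controlled mutation": swap exactly one field of a
# factual record to produce a poisoned variant that stays locally plausible.
# The fact, fields, and replacement values are all invented for illustration.
import random

fact = {
    "entity": "Marie Curie",
    "year": "1903",
    "place": "Stockholm",
    "template": "{entity} received the Nobel Prize in {year} in {place}.",
}

mutations = {
    "year": ["1905", "1911", "1921"],            # temporal alteration
    "place": ["Oslo", "Paris", "Geneva"],        # spatial alteration
    "entity": ["Pierre Curie", "Lise Meitner"],  # entity alteration
}

def poison_pill(fact, field, rng=random):
    """Return the fact text with one field replaced by a plausible wrong value."""
    mutated = dict(fact)
    mutated[field] = rng.choice(mutations[field])
    return mutated["template"].format(**mutated)

for field in mutations:
    print(field, "->", poison_pill(fact, field))
```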


Get out of the BAG! Silos in AI Ethics Education: Unsupervised Topic Modeling Analysis of Global AI Curricula

Javed, Rana Tallal, Nasir, Osama, Borit, Melania, Vanhée, Loïs, Zea, Elias, Gupta, Shivam, Vinuesa, Ricardo, Qadir, Junaid

Journal of Artificial Intelligence Research

The domain of Artificial Intelligence (AI) ethics is not new, with discussions going back at least 40 years. Teaching the principles and requirements of ethical AI to students is considered an essential part of this domain, and an increasing number of technical AI courses taught at higher-education institutions around the globe now include content related to ethics. Using Latent Dirichlet Allocation (LDA), a generative probabilistic topic model, this study uncovers the topics covered when teaching ethics in AI courses and their trends related to where the courses are taught, by whom, and at what level of cognitive complexity and specificity according to Bloom’s taxonomy. In this exploratory study based on unsupervised machine learning, we analyzed a total of 166 courses: 116 from North American universities, 11 from Asia, 36 from Europe, and 10 from other regions. Based on this analysis, we were able to synthesize a model of teaching approaches, which we call BAG (Build, Assess, and Govern), that combines specific cognitive levels, course content topics, and the disciplines affiliated with the department(s) in charge of the course. We critically assess the implications of this teaching paradigm and provide suggestions on how to move away from these practices. We challenge teaching practitioners and program coordinators to reflect on their usual procedures so that they may expand their methodology beyond the confines of stereotypical thought and traditional biases regarding what disciplines should teach and how. This article appears in the AI & Society track.


How Pandemic Spread in News: Text Analysis Using Topic Model

Wang, Minghao, Mengoni, Paolo

arXiv.org Artificial Intelligence

Research about COVID-19 has increased substantially, in biology and beyond. This study conducted a text analysis using the LDA topic model. We first scraped a total of 1,127 articles and 5,563 comments on SCMP covering COVID-19 from January 20 to May 19, then trained the LDA model, tuning its parameters using C_v coherence as the model evaluation method. With the optimal model, we analyze the dominant topics, the representative documents of each topic, and the inconsistency between articles and comments. Finally, three possible improvements are discussed.
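
As a sketch of that tuning loop (assuming a standard gensim workflow; the toy corpus and candidate topic counts below are illustrative, not the study's data or settings), one might write:

```python
# Minimal sketch: train candidate LDA models over a range of topic counts and
# keep the one with the highest C_v coherence. Toy data; the study used 1,127
# SCMP articles and 5,563 comments.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

docs = [
    ["covid", "outbreak", "lockdown", "hospital", "cases"],
    ["vaccine", "trial", "covid", "immunity", "cases"],
    ["economy", "market", "lockdown", "jobs", "impact"],
    ["covid", "vaccine", "hospital", "doctors"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

best_model, best_cv = None, float("-inf")
for k in range(2, 6):  # candidate topic counts (assumed range)
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   passes=10, random_state=0)
    cv = CoherenceModel(model=lda, texts=docs, dictionary=dictionary,
                        coherence="c_v").get_coherence()
    if cv > best_cv:  # NaN scores (possible on tiny corpora) compare False
        best_model, best_cv = lda, cv

if best_model is not None:
    print(f"best C_v = {best_cv:.3f} at {best_model.num_topics} topics")
```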


Machine Learning and #MeToo: an analysis of sexism in STEM

#artificialintelligence

I recently completed a data science bootcamp, and during my last week, my class and the graduating UX/UI design class got headshots done. I was the first person to go up, and I started chatting with the photographer. She mentioned that she also does headshots for another coding bootcamp, and that coders are much more awkward in front of a camera. Then, she asked me what I did in my design class. Machine learning is quickly becoming part of the fabric of our lives and the world we live in.


Yoga-Veganism: Correlation Mining of Twitter Health Data

Islam, Tunazzina

arXiv.org Artificial Intelligence

Social media is now a huge source of data. People share their interests and thoughts via discussions, tweets, and status updates. It is not possible to go through all of this data manually; we need to mine it to explore hidden patterns and unknown correlations, find the dominant topics, and understand people's interests through their discussions. In this work, we explore Twitter data related to health. We extract the popular topics under different categories (e.g., diet, exercise) discussed on Twitter via topic modeling, observe model behavior on new tweets, and discover an interesting correlation (i.e., Yoga-Veganism). We evaluate accuracy by comparing against ground truth obtained through manual annotation of both training and test data.
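
A minimal sketch of the "dominant topic of a new tweet" step, assuming a gensim LDA workflow like the one above; the toy tweets and the unseen example are invented for illustration.

```python
# Report the dominant topic (highest-probability topic) of an unseen tweet
# under a trained LDA model. Toy corpus; real data would be far larger.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

tweets = [
    ["yoga", "meditation", "morning", "practice"],
    ["vegan", "diet", "plant", "protein"],
    ["running", "exercise", "marathon", "training"],
]
dictionary = Dictionary(tweets)
corpus = [dictionary.doc2bow(t) for t in tweets]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=20, random_state=0)

new_tweet = ["yoga", "vegan", "lifestyle"]  # hypothetical unseen tweet
bow = dictionary.doc2bow(new_tweet)
topic_id, prob = max(lda.get_document_topics(bow, minimum_probability=0.0),
                     key=lambda tp: tp[1])
print(f"dominant topic {topic_id} (p={prob:.2f}):",
      lda.print_topic(topic_id, topn=3))
```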


A provable SVD-based algorithm for learning topics in dominant admixture corpus

Bansal, Trapit, Bhattacharyya, Chiranjib, Kannan, Ravindran

Neural Information Processing Systems

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from such a collection of documents drawn from admixtures is NP-hard. Making a strong assumption called separability, [4] gave the first provable algorithm for inference. For the widely used LDA model, [6] gave a provable algorithm using clever tensor methods. But [4, 6] do not learn topic vectors with bounded $l_1$ error (a natural measure for probability vectors). Our aim is to develop a model which makes intuitive and empirically supported assumptions and to design an algorithm with natural, simple components such as SVD, which provably solves the inference problem for the model with bounded $l_1$ error. A topic in LDA and other models is essentially characterized by a group of co-occurring words. Motivated by this, we introduce topic-specific Catchwords, a group of words which occur with strictly greater frequency in a topic than in any other topic individually, and which are required to have high frequency together rather than individually. A major contribution of the paper is to show that under this more realistic assumption, which is empirically verified on real corpora, a singular value decomposition (SVD) based algorithm with a crucial pre-processing step of thresholding can provably recover the topics from a collection of documents drawn from dominant admixtures. Dominant admixtures are convex combinations of distributions in which one distribution has a significantly higher contribution than the others. Apart from the simplicity of the algorithm, the sample complexity has near optimal dependence on $w_0$, the lowest probability that a topic is dominant, and is better than that of [4]. Empirical evidence shows that on several real world corpora, both the Catchwords and Dominant admixture assumptions hold, and the proposed algorithm substantially outperforms the state of the art [5].
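
The pipeline the abstract describes, thresholding followed by SVD and a read-off of topics from document clusters, can be sketched with numpy and scikit-learn. This is a rough illustration on toy data with an invented threshold rule; it is not the paper's exact TSVD algorithm or its parameter settings.

```python
# Sketch: threshold the word-document frequency matrix, project documents onto
# the top-k singular subspace, cluster them by dominant topic, and estimate
# each topic as its cluster's mean word distribution. Toy data throughout.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
A = rng.random((500, 200))           # word-document matrix (words x documents)
A /= A.sum(axis=0, keepdims=True)    # columns are per-document word frequencies

k = 5                                             # number of topics (assumed)
tau = np.quantile(A, 0.9, axis=1, keepdims=True)  # per-word threshold (invented rule)
B = np.where(A >= tau, np.sqrt(A), 0.0)           # thresholded, damped matrix

U, s, Vt = np.linalg.svd(B, full_matrices=False)
proj = (s[:k, None] * Vt[:k]).T      # documents in the top-k singular subspace
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(proj)

# Each estimated topic is the average word distribution of one document cluster.
topics = np.stack([A[:, labels == j].mean(axis=1) for j in range(k)], axis=1)
print(topics.shape)                  # (vocabulary_size, k), columns sum to 1
```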


A provable SVD-based algorithm for learning topics in dominant admixture corpus

Bansal, Trapit, Bhattacharyya, Chiranjib, Kannan, Ravindran

arXiv.org Machine Learning

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from admixtures is NP-hard. Assuming separability, a strong assumption, [4] gave the first provable algorithm for inference. For the LDA model, [6] gave a provable algorithm using tensor methods. But [4,6] do not learn topic vectors with bounded $l_1$ error (a natural measure for probability vectors). Our aim is to develop a model which makes intuitive and empirically supported assumptions and to design an algorithm with natural, simple components such as SVD, which provably solves the inference problem for the model with bounded $l_1$ error. A topic in LDA and other models is essentially characterized by a group of co-occurring words. Motivated by this, we introduce topic-specific Catchwords, a group of words which occur with strictly greater frequency in a topic than in any other topic individually, and which are required to have high frequency together rather than individually. A major contribution of the paper is to show that under this more realistic assumption, which is empirically verified on real corpora, a singular value decomposition (SVD) based algorithm with a crucial pre-processing step of thresholding can provably recover the topics from a collection of documents drawn from dominant admixtures. Dominant admixtures are convex combinations of distributions in which one distribution has a significantly higher contribution than the others. Apart from the simplicity of the algorithm, the sample complexity has near optimal dependence on $w_0$, the lowest probability that a topic is dominant, and is better than that of [4]. Empirical evidence shows that on several real world corpora, both the Catchwords and Dominant admixture assumptions hold, and the proposed algorithm substantially outperforms the state of the art [5].