AITopics | topic inference

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

Geometric Dirichlet Means Algorithm for topic inference

Neural Information Processing Systems | Nov-21-2025, 15:08:18 GMT

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate to the LDA's likelihood. Our method involves a fast optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.

geometric dirichlet, inference, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Reviews: Geometric Dirichlet Means Algorithm for topic inference

Neural Information Processing Systems | Jan-20-2025, 16:47:46 GMT

I like this paper for two reasons. First, after RecoverKL and the spectral algorithm, it brings a novel and useful perspective to the topic inference problem for LDA, apparently without making strong assumptions about topics, such as separability via anchor words. Second, it seems to be extremely good in practice, matching the speed of RecoverKL with the accuracy of Gibbs sampling algorithms. A. The algorithm: Aspects of this work were known before. For example, Blei pointed out the convex geometry in the original LDA paper, and the connection between LDA/NMF and K-Means was also known. However, the novel aspect of this paper is that it uses these connections to propose an inference algorithm for LDA based entirely on the geometry of the topic and word simplexes. This is done by making an additional connection between the topic inference problem and that of Centroidal Voronoi Tessellations of a convex simplex.
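The K-Means connection mentioned in the review can be illustrated with a small sketch. This is not the authors' algorithm and omits their geometric corrections; it only shows weighted Lloyd iterations (the standard way to approximate a centroidal Voronoi tessellation) applied to documents normalised onto the word simplex. The function name `weighted_lloyd`, the toy corpus, and the choice of document length as the weight are all assumptions made for illustration.

```python
import numpy as np

def weighted_lloyd(points, weights, k, n_iter=50, seed=0):
    """Weighted k-means (Lloyd iterations); hypothetical helper, not from the paper."""
    rng = np.random.default_rng(seed)
    # Start from k distinct documents so every centroid lies in the simplex.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (its Voronoi cell).
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the weighted mean of the points in its cell.
        for j in range(k):
            mask = labels == j
            if mask.any():
                centroids[j] = np.average(points[mask], axis=0, weights=weights[mask])
    return centroids, labels

# Toy corpus (assumed data): 60 documents over a 3-word vocabulary.
rng = np.random.default_rng(1)
lam = np.array([[8, 1, 1]] * 20 + [[1, 8, 1]] * 20 + [[1, 1, 8]] * 20)
counts = rng.poisson(lam=lam) + 1          # +1 keeps every document non-empty
weights = counts.sum(axis=1)               # document lengths used as weights
points = counts / weights[:, None]         # word frequencies: points in the simplex

centroids, labels = weighted_lloyd(points, weights, k=3)
```

Because each centroid is a weighted average of points in the simplex, the centroids are themselves valid word distributions, which is what makes a clustering view of topic estimation plausible in the first place.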

algorithm, geometric dirichlet, recoverkl, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.59)

MixEHR-Nest: Identifying Subphenotypes within Electronic Health Records through Hierarchical Guided-Topic Modeling

Wang, Ruohan; Wang, Zilong; Song, Ziyang; Buckeridge, David; Li, Yue

arXiv.org Artificial Intelligence | Oct-17-2024

Automatic subphenotyping from electronic health records (EHRs) provides numerous opportunities to understand diseases with unique subgroups and enhance personalized medicine for patients. However, existing machine learning algorithms either focus on specific diseases for better interpretability or produce coarse-grained phenotype topics without considering nuanced disease patterns. In this study, we propose a guided topic model, MixEHR-Nest, to infer sub-phenotype topics from thousands of diseases using multi-modal EHR data. Specifically, MixEHR-Nest detects multiple subtopics from each phenotype topic, whose prior is guided by expert-curated phenotype concepts such as Phenotype Codes (PheCodes) or Clinical Classification Software (CCS) codes. We evaluated MixEHR-Nest on two EHR datasets: (1) the MIMIC-III dataset consisting of over 38 thousand patients from the intensive care unit (ICU) of Beth Israel Deaconess Medical Center (BIDMC) in Boston, USA; (2) the healthcare administrative database PopHR, comprising 1.3 million patients from Montreal, Canada. Experimental results demonstrate that MixEHR-Nest can identify subphenotypes with distinct patterns within each phenotype, which are predictive of disease progression and severity. Consequently, MixEHR-Nest distinguishes between type 1 and type 2 diabetes by inferring subphenotypes using CCS codes, which do not differentiate these two subtype concepts. Additionally, MixEHR-Nest not only improved the prediction accuracy of short-term mortality of ICU patients and initial insulin treatment in diabetic patients but also revealed the contributions of subphenotypes. For longitudinal analysis, MixEHR-Nest identified subphenotypes of distinct age prevalence under the same phenotypes, such as asthma, leukemia, epilepsy, and depression. The MixEHR-Nest software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-Nest.

machine learning, natural language, subphenotype, (18 more...)

arXiv.org Artificial Intelligence

2410.13217

Country:

North America > Canada > Quebec > Montreal (0.67)
Asia > Middle East > Israel (0.24)
Asia > China > Guangdong Province > Shenzhen (0.05)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.35)

Zero-Shot Multi-Label Topic Inference with Sentence Encoders

Sarkar, Souvika; Feng, Dongji; Santu, Shubhra Kanti Karmaker

arXiv.org Artificial Intelligence | Apr-14-2023

In this paper, we focus on Zero-shot approaches (Yin et al., 2019; Xie et al., 2016; Veeranna et al., 2016) for inferring topics from documents where document and topics were never seen previously by a model. Furthermore, for developing Zero-shot methods, we exclusively focus on leveraging the recent powerful sentence encoders. [...] 2018b)] for topic inference tasks and subsequently, establish a benchmark for future study in this crucial direction. To achieve this, we conducted extensive experiments with multiple real-world datasets, including online product reviews, news articles, and health-related blog articles. We also implemented [...]
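One common way to set up zero-shot multi-label topic inference with an encoder is to embed the document and each candidate topic, then keep every topic whose cosine similarity clears a threshold. The sketch below assumes that setup; the excerpt above does not confirm it is the paper's exact procedure. `toy_encode` is a bag-of-words stand-in for a real sentence encoder, and the topic descriptions, the example review, and the `threshold` value are hypothetical.

```python
import numpy as np

def toy_encode(texts, vocab):
    """Stand-in for a sentence encoder: L2-normalised bag-of-words vectors."""
    index = {word: i for i, word in enumerate(vocab)}
    vecs = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for word in text.lower().split():
            if word in index:
                vecs[row, index[word]] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-12)   # guard against all-zero rows

def infer_topics(doc, topics, encode, threshold=0.2):
    """Return every topic whose cosine similarity to the document meets the threshold."""
    doc_vec = encode([doc])[0]
    topic_vecs = encode(list(topics.values()))
    sims = topic_vecs @ doc_vec               # cosine similarity of unit vectors
    return [name for name, sim in zip(topics, sims) if sim >= threshold]

# Hypothetical topics, each described by a few keywords.
topics = {
    "battery": "battery charge power life",
    "screen": "screen display brightness resolution",
    "shipping": "shipping delivery package arrived",
}
vocab = sorted({word for desc in topics.values() for word in desc.split()})
encode = lambda texts: toy_encode(texts, vocab)

doc = "the battery life is great but the screen brightness is low"
predicted = infer_topics(doc, topics, encode)
print(predicted)  # ['battery', 'screen']
```

No topic-specific training is involved: swapping in a different topic dictionary changes the label set without touching the encoder, which is the sense in which the setup is zero-shot.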

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2304.07382

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)

Geometric Dirichlet Means Algorithm for topic inference

Yurochkin, Mikhail; Nguyen, XuanLong

Neural Information Processing Systems | Feb-14-2020, 11:42:35 GMT

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the optimization of a geometric loss function, which is a surrogate to the LDA's likelihood. Our method involves a fast optimization-based weighted clustering procedure augmented with geometric corrections, which overcomes the computational and statistical inefficiencies encountered by other techniques based on Gibbs sampling and variational inference, while achieving accuracy comparable to that of a Gibbs sampler. The topic estimates produced by our method are shown to be statistically consistent under some conditions. The algorithm is evaluated with extensive experiments on simulated and real data.

artificial intelligence, inference, machine learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.54)
