Palau
How to Approximate Inference with Subtractive Mixture Models
Zellinger, Lena, Branchini, Nicola, De Smet, Lennert, Elvira, Víctor, Malkin, Nikolay, Vergari, Antonio
Classical mixture models (MMs) are widely used tractable proposals for approximate inference settings such as variational inference (VI) and importance sampling (IS). Recently, mixture models with negative coefficients, called subtractive mixture models (SMMs), have been proposed as a potentially more expressive alternative. However, how to effectively use SMMs for VI and IS is still an open question as they do not provide latent variable semantics and therefore cannot use sampling schemes for classical MMs. In this work, we study how to circumvent this issue by designing several expectation estimators for IS and learning schemes for VI with SMMs, and we empirically evaluate them for distribution approximation. Finally, we discuss the additional challenges in estimation stability and learning efficiency that they carry and propose ways to overcome them. Code is available at: https://github.com/april-tools/delta-vi.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.04)
- Oceania > Palau (0.04)
- (10 more...)
Visit a WWII destroyer without leaving your sofa
The USS Cassin Young is one of the last of the war's Fletcher-class destroyers, one of four still afloat. Although its name may not sound immediately familiar, the over 360-foot-long ship's recognizable silhouette remains a hallmark example of World War II imagery.
- Oceania > Palau (0.05)
- Oceania > Guam (0.05)
- North America > United States > Massachusetts (0.05)
- (4 more...)
- Government > Military > Navy (1.00)
- Government > Regional Government > North America Government > United States Government (0.30)
The Infinite-Dimensional Nature of Spectroscopy and Why Models Succeed, Fail, and Mislead
Michelucci, Umberto, Venturini, Francesca
Machine learning (ML) models have achieved strikingly high accuracies in spectroscopic classification tasks, often without a clear proof that those models used chemically meaningful features. Existing studies have linked these results to data preprocessing choices, noise sensitivity, and model complexity, but no unifying explanation is available so far. In this work, we show that these phenomena arise naturally from the intrinsic high dimensionality of spectral data. Using a theoretical analysis grounded in the Feldman-Hajek theorem and the concentration of measure, we show that even infinitesimal distributional differences, caused by noise, normalisation, or instrumental artefacts, may become perfectly separable in high-dimensional spaces. Through a series of specific experiments on synthetic and real fluorescence spectra, we illustrate how models can achieve near-perfect accuracy even when chemical distinctions are absent, and why feature-importance maps may highlight spectrally irrelevant regions. We provide a rigorous theoretical framework, confirm the effect experimentally, and conclude with practical recommendations for building and interpreting ML models in spectroscopy.
- Europe > Switzerland (0.04)
- Oceania > Palau (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
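As a toy illustration of the abstract's claim that tiny distributional differences can become separable in high dimension, the sketch below is not the paper's analysis. It assumes two equal-prior Gaussian classes with identity-scaled covariance whose means differ by a small amount `eps` in every one of `d` coordinates. In that simplified setting the optimal (Bayes) classification accuracy has the closed form $\Phi(\epsilon\sqrt{d}/(2\sigma))$, where $\Phi$ is the standard normal CDF. The function name and parameters are hypothetical.

```python
import math


def bayes_accuracy(eps: float, d: int, sigma: float = 1.0) -> float:
    """Bayes accuracy for N(0, sigma^2 I) vs N(eps * 1, sigma^2 I) in d dims.

    Assumption: equal class priors and a shared isotropic covariance, so the
    optimal rule is a linear threshold and accuracy is Phi(||delta|| / (2 sigma)).
    """
    # Euclidean distance between the two class means, in units of sigma.
    separation = eps * math.sqrt(d) / sigma
    # Standard normal CDF evaluated at separation / 2, written via erf.
    return 0.5 * (1.0 + math.erf(separation / (2.0 * math.sqrt(2.0))))


if __name__ == "__main__":
    # The same per-coordinate shift goes from undetectable to near-perfectly
    # separable as the number of coordinates grows.
    for d in (1, 100, 10_000, 1_000_000):
        print(d, round(bayes_accuracy(0.01, d), 4))
```

The per-coordinate shift stays fixed at 0.01 standard deviations; only the dimension changes, which is why accuracy alone says little about whether a model has found chemically meaningful features.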
Enhancing Online Support Group Formation Using Topic Modeling Techniques
Barman, Pronob Kumar, Reynolds, Tera L., Foulds, James
Online health communities (OHCs) are vital for fostering peer support and improving health outcomes. Support groups within these platforms can provide more personalized and cohesive peer support, yet traditional support group formation methods face challenges related to scalability, static categorization, and insufficient personalization. To overcome these limitations, we propose two novel machine learning models for automated support group formation: the Group specific Dirichlet Multinomial Regression (gDMR) and the Group specific Structured Topic Model (gSTM). These models integrate user-generated textual content, demographic profiles, and interaction data represented through node embeddings derived from user networks to systematically automate personalized, semantically coherent support group formation. We evaluate the models on a large-scale dataset from MedHelp, comprising over 2 million user posts. Both models substantially outperform baseline methods including LDA, DMR, and STM in predictive accuracy (held-out log-likelihood), semantic coherence (UMass metric), and internal group consistency. The gDMR model yields group covariates that facilitate practical implementation by leveraging relational patterns from network structures and demographic data. In contrast, gSTM emphasizes sparsity constraints to generate more distinct and thematically specific groups. Qualitative analysis further validates the alignment between model-generated groups and manually coded themes, showing the practical relevance of the models in informing groups that address diverse health concerns such as chronic illness management, diagnostic uncertainty, and mental health. By reducing reliance on manual curation, these frameworks provide scalable solutions that enhance peer interactions within OHCs, with implications for patient engagement, community resilience, and health outcomes.
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.94)
- Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.68)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > China > Guangdong Province (0.14)
- (16 more...)
- Government (1.00)
- Law (0.68)
- Africa > Rwanda > Kigali > Kigali (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (13 more...)
- Health & Medicine (1.00)
- Information Technology (0.92)
- Banking & Finance > Economy (0.45)
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deploying LLMs in real-world applications can be challenging due to their high computational requirements and concerns about data privacy.
- Oceania > Palau (0.14)
- Asia > Bangladesh (0.14)
- Asia > Azerbaijan (0.14)
- (14 more...)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Palau (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (0.92)
- Marketing (0.67)
- Government (0.67)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Security & Privacy (0.92)
- Information Technology > Data Science > Data Mining > Big Data (0.46)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Oceania > Palau (0.04)
- Oceania > Australia > Queensland (0.04)
- (7 more...)
An Efficient Algorithm for Thresholding Monte Carlo Tree Search
Nameki, Shoma, Nakamura, Atsuyoshi, Komiyama, Junpei, Tabata, Koji
We introduce the Thresholding Monte Carlo Tree Search problem, in which, given a tree $\mathcal{T}$ and a threshold $θ$, a player must answer whether the root node value of $\mathcal{T}$ is at least $θ$ or not. In the given tree, each internal node is labeled 'MAX' or 'MIN', and the value of a 'MAX'-labeled ('MIN'-labeled) internal node is the maximum (minimum) of its child values. The value of a leaf node is the mean reward of an unknown distribution, from which the player can sample rewards. For this problem, we develop a $δ$-correct sequential sampling algorithm based on the Track-and-Stop strategy that has asymptotically optimal sample complexity. We show that a ratio-based modification of the D-Tracking arm-pulling strategy substantially improves empirical sample complexity and reduces the per-round computational cost from linear to logarithmic in the number of arms.
- North America > United States > New York > New York County > New York City (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Japan > Hokkaidō (0.04)
- (16 more...)
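The tree-value definition in the Thresholding Monte Carlo Tree Search abstract above can be made concrete with a minimal sketch. It is not the paper's sampling algorithm: it assumes the leaf means are already known, whereas in the actual problem they are unknown and must be estimated from sampled rewards. The `Node` class and both function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Tree node: internal nodes carry a 'MAX'/'MIN' label, leaves a mean."""
    label: Optional[str] = None          # 'MAX' or 'MIN'; None for a leaf
    mean: float = 0.0                    # leaf mean reward (assumed known here)
    children: List["Node"] = field(default_factory=list)


def node_value(node: Node) -> float:
    """Value of a node: leaf mean, or max/min over child values."""
    if not node.children:
        return node.mean
    child_values = [node_value(child) for child in node.children]
    return max(child_values) if node.label == "MAX" else min(child_values)


def root_at_least(root: Node, theta: float) -> bool:
    """The question the player must answer: is the root value >= theta?"""
    return node_value(root) >= theta


# Example tree: MAX( MIN(0.2, 0.7), MIN(0.5, 0.6) ), whose root value is 0.5.
tree = Node("MAX", children=[
    Node("MIN", children=[Node(mean=0.2), Node(mean=0.7)]),
    Node("MIN", children=[Node(mean=0.5), Node(mean=0.6)]),
])
```

With known means the answer is a single recursive pass; the difficulty the abstract addresses is deciding which leaves to sample, and when to stop, so that the answer is correct with probability at least $1 - δ$.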