Beyond Predictive Uncertainty: Reliable Representation Learning with Structural Constraints
Uncertainty estimation in machine learning has traditionally focused on the prediction stage, aiming to quantify confidence in model outputs while treating learned representations as deterministic and reliable by default. In this work, we challenge this implicit assumption and argue that reliability should be regarded as a first-class property of learned representations themselves. We propose a principled framework for reliable representation learning that explicitly models representation-level uncertainty and leverages structural constraints as inductive biases to regularize the space of feasible representations. Our approach introduces uncertainty-aware regularization directly in the representation space, encouraging representations that are not only predictive but also stable, well-calibrated, and robust to noise and structural perturbations. Structural constraints, such as sparsity, relational structure, or feature-group dependencies, are incorporated to define meaningful geometry and reduce spurious variability in learned representations, without assuming fully correct or noise-free structure. Importantly, the proposed framework is independent of specific model architectures and can be integrated with a wide range of representation learning methods.
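The representation-level uncertainty regularization this abstract describes is architecture-independent, so it can be sketched generically: measure how much an encoder's output fluctuates under small input perturbations and penalize that variability. The encoder, Gaussian noise model, and variance penalty below are illustrative assumptions for a minimal sketch, not the paper's actual formulation.

```python
import numpy as np

def representation_stability_penalty(encode, x, noise_scale=0.1, n_samples=8, seed=None):
    """Penalize variance of representations under input noise.

    A generic stand-in for representation-level uncertainty regularization:
    representations whose coordinates fluctuate under small perturbations
    of the input incur a larger penalty.
    """
    rng = np.random.default_rng(seed)
    reps = np.stack([
        encode(x + noise_scale * rng.standard_normal(x.shape))
        for _ in range(n_samples)
    ])
    # Mean per-dimension variance across the perturbed copies.
    return reps.var(axis=0).mean()

# Toy linear encoders: a contractive map should yield a smaller penalty
# than an expansive one on the same input.
W_stable = 0.1 * np.eye(4)
W_unstable = 10.0 * np.eye(4)
x = np.ones(4)
p_stable = representation_stability_penalty(lambda v: v @ W_stable, x, seed=0)
p_unstable = representation_stability_penalty(lambda v: v @ W_unstable, x, seed=0)
```

In a training loop, a term like this would be added to the task loss, trading predictive fit against stability of the learned representation.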
- Asia > China (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Africa > Eswatini > Manzini > Manzini (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Health & Medicine (0.67)
- Information Technology (0.67)
- Education (0.67)
Bridging Language Gaps with Adaptive RAG: Improving Indonesian Language Question Answering
Christian, William, Adamlu, Daniel, Yu, Adrian, Suhartono, Derwin
Abstract--Question Answering (QA) has seen significant improvements with the advancement of machine learning models. Further studies enhanced QA systems by retrieving external information, an approach called Retrieval-Augmented Generation (RAG), to produce more accurate and informative answers. However, this state-of-the-art performance is predominantly confined to English. To address this gap, we bridge the language divide by adapting an Adaptive RAG system to Indonesian. Adaptive RAG integrates a classifier that distinguishes question complexity, which in turn determines the strategy used to answer the question. To overcome the limited availability of Indonesian-language datasets, our study employs machine translation as a data augmentation approach. Experiments show the question complexity classifier is reliable; however, we observed significant inconsistencies in the multi-retrieval answering strategy, which negatively impacted the overall evaluation when that strategy was applied. Recent Large Language Models (LLMs) have shown remarkable performance on many natural language tasks. However, despite these advances across natural language processing, LLMs still struggle with questions that require knowledge-intensive background, often producing hallucinated answers [7]. LLMs tend to provide accurate answers when entities mentioned in the question appear in their training data, and their performance correlates significantly with entity popularity; questions about less popular entities are often answered inaccurately [8]. Frequently updating an LLM's knowledge is not a practical solution, since training an LLM on billions or even trillions of tokens drawn from across the internet takes too much time.
In contrast, recent studies have demonstrated that augmenting question answering with non-parametric knowledge (information not contained in the model's training data), an approach commonly referred to as Retrieval-Augmented Generation (RAG) [9], allows even smaller models to outperform models with far more parameters [10].
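The routing logic of an Adaptive RAG system can be sketched in a few lines: a classifier predicts the complexity of the incoming question, and the predicted label selects one of three answering strategies. The three-way "A/B/C" labeling and the keyword classifier below are simplified illustrations, not the trained classifier the paper evaluates.

```python
def route(question, classify):
    """Pick an answering strategy from the predicted question complexity.

    "A": answerable from the LLM's parametric knowledge alone,
    "B": one retrieval pass suffices,
    "C": multi-step (iterative) retrieval and reasoning is needed.
    """
    label = classify(question)
    if label == "A":
        return "no_retrieval"
    if label == "B":
        return "single_step_retrieval"
    return "multi_step_retrieval"

def toy_classifier(question):
    # Crude keyword stand-in for a trained complexity classifier,
    # for illustration only.
    q = question.lower()
    if " and " in q or "compare" in q:
        return "C"
    if "who" in q or "when" in q or "where" in q:
        return "B"
    return "A"

strategy = route("Who wrote the novel Laskar Pelangi?", toy_classifier)
```

The abstract's observed failure mode lives in the third branch: when the classifier routes a question to multi-step retrieval, inconsistencies in that strategy drag down overall evaluation scores.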
- Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
- Asia > Indonesia > Borneo > Kalimantan > East Kalimantan > Nusantara (0.05)
- Asia > Armenia (0.04)
Emulating Public Opinion: A Proof-of-Concept of AI-Generated Synthetic Survey Responses for the Chilean Case
González-Bustamante, Bastián, Verelst, Nando, Cisternas, Carla
Traditional public opinion surveys face a number of challenges and risks related to measurement and representation dimensions, including, for example, coverage error due to incomplete frames and hard-to-reach groups, sampling error resulting from finite samples and complex designs, nonresponse error stemming from low participation and interview fatigue, measurement error introduced by questionnaire wording, and processing errors in coding and post-survey adjustments, among others (Groves, 1989; Groves and Lyberg, 2010; Weisberg, 2005). These errors can be amplified by substantial financial, human, and logistical demands, such as time spent on instrument design, piloting, and fieldwork, which often forces a cost-quality trade-off that may distort population inferences. Consequently, there is a growing demand in the social sciences and market research for methods that reduce burden and cost while maintaining or improving overall data quality. Against this backdrop, Large Language Models (LLMs), trained extensively on vast and diverse data, emerge as promising alternatives for new research possibilities and applied research, including addressing the abovementioned survey research limitations and measurement and representation errors. Indeed, recent advances in generative artificial intelligence (AI) suggest LLMs could serve a number of classification tasks, including the creation of synthetic samples, providing simulated responses reflective of broader societal attitudes and behaviours (Argyle et al., 2023; Gilardi et al., 2023; González-Bustamante, 2024). Synthetic samples, specifically, may leverage the ability of LLMs to generate contextually informed responses based on individual-level demographic characteristics and attitudes and, in this way, potentially emulate public opinion without direct interaction with human respondents.
This methodological innovation opens new avenues for rapid data collection, experimentation with sensitive topics, and a deeper understanding of complex public opinion dynamics that complement or even partially substitute for traditional surveys. Thus, the primary objective of this working paper is to evaluate the effectiveness and reliability of LLM-generated synthetic survey responses in reflecting real-world public opinion in Chile. Specifically, we aim to assess the predictive accuracy of a number of state-of-the-art private and open-source LLMs by comparing their synthetic respondents against human probabilistic responses.
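The core mechanics of a synthetic respondent, conditioning a model on individual-level demographics and constraining its reply to the survey's response scale, can be sketched as below. The prompt wording, profile fields, and parsing rule are illustrative assumptions; the paper's actual prompts and response handling may differ.

```python
def persona_prompt(profile, question, options):
    """Compose a prompt conditioning an LLM on respondent demographics."""
    persona = ", ".join(f"{k}: {v}" for k, v in profile.items())
    return (
        f"You are a survey respondent ({persona}). "
        f"Question: {question} "
        f"Options: {'; '.join(options)}. "
        "Reply with exactly one option, verbatim."
    )

def parse_response(raw, options):
    """Map a raw model reply onto a valid scale option, or None if off-scale."""
    raw = raw.strip().lower()
    # Check longer options first so "Disapprove" is not matched as "Approve".
    for opt in sorted(options, key=len, reverse=True):
        if opt.lower() in raw:
            return opt
    return None

profile = {"age": 34, "region": "Metropolitana", "education": "secondary"}
options = ["Approve", "Disapprove", "Don't know"]
prompt = persona_prompt(profile, "Do you approve of the current government?", options)
answer = parse_response("I would say: Disapprove.", options)
```

Replies that cannot be mapped onto the scale return None, which is one place where synthetic and human response distributions can diverge and thus needs to be tracked when comparing against probabilistic human samples.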
- Europe > Netherlands > South Holland > Leiden (0.41)
- South America > Chile (0.25)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.47)
- Government (1.00)
- Health & Medicine (0.68)
Communicative Agents for Slideshow Storytelling Video Generation based on LLMs
Fan, Jingxing, Shen, Jinrong, Yao, Yusheng, Wang, Shuangqing, Wang, Qian, Wang, Yuling
With the rapid advancement of artificial intelligence (AI), the proliferation of AI-generated content (AIGC) tasks has significantly accelerated developments in text-to-video generation. As a result, the field of video production is undergoing a transformative shift. However, conventional text-to-video models are typically constrained by high computational costs. In this study, we propose Video-Generation-Team (VGTeam), a novel slideshow video generation system designed to redefine the video creation pipeline through the integration of large language models (LLMs). VGTeam is composed of a suite of communicative agents, each responsible for a distinct aspect of video generation, such as scriptwriting, scene creation, and audio design. These agents operate collaboratively within a chat tower workflow, transforming user-provided textual prompts into coherent, slide-style narrative videos. By emulating the sequential stages of traditional video production, VGTeam achieves remarkable improvements in both efficiency and scalability, while substantially reducing computational overhead. On average, the system generates videos at a cost of only $0.103, with a successful generation rate of 98.4%. Importantly, this framework maintains a high degree of creative fidelity and customization. The implications of VGTeam are far-reaching. It democratizes video production by enabling broader access to high-quality content creation without the need for extensive resources. Furthermore, it highlights the transformative potential of language models in creative domains and positions VGTeam as a pioneering system for next-generation content creation.
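The chat tower workflow described here, specialized agents handing a shared artifact down a fixed sequence of production stages, can be sketched as a simple pipeline. The agent names, the `Storyboard` fields, and the stage outputs below are illustrative guesses, not VGTeam's actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class Storyboard:
    """Shared artifact passed down the chat tower, one stage per agent."""
    prompt: str
    script: str = ""
    scenes: list = field(default_factory=list)
    audio_plan: str = ""

def scriptwriter(sb):
    # Stage 1: turn the user prompt into a narration script.
    sb.script = f"Narration for '{sb.prompt}' in three acts."
    return sb

def scene_designer(sb):
    # Stage 2: break the script into slide-style scenes.
    sb.scenes = [f"Slide {i}: act {i}" for i in (1, 2, 3)]
    return sb

def audio_designer(sb):
    # Stage 3: plan voice-over and sound against the scenes.
    sb.audio_plan = f"Voice-over covering {len(sb.scenes)} slides."
    return sb

def chat_tower(prompt, agents):
    """Run agents sequentially, mirroring traditional production stages."""
    sb = Storyboard(prompt=prompt)
    for agent in agents:
        sb = agent(sb)
    return sb

video = chat_tower("a history of the printing press",
                   [scriptwriter, scene_designer, audio_designer])
```

Because each agent is an LLM call rather than a diffusion-style video model, the per-video cost stays in the cents range, which is the efficiency argument the abstract makes.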
- Asia > China > Shanghai > Shanghai (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
HiFACTMix: A Code-Mixed Benchmark and Graph-Aware Model for Evidence-Based Political Claim Verification in Hinglish
Thakur, Rakesh, Sharma, Sneha, Chopra, Gauri
Fact-checking in code-mixed, low-resource languages such as Hinglish remains an underexplored challenge in natural language processing. Existing fact-verification systems largely focus on high-resource, monolingual settings and fail to generalize to real-world political discourse in linguistically diverse regions like India. Given the widespread use of Hinglish by public figures, particularly political figures, and the growing influence of social media on public opinion, there is a critical need for robust, multilingual, and context-aware fact-checking tools. To address this gap, a novel benchmark dataset, HiFACT, is introduced with 1,500 real-world factual claims made by 28 Indian state Chief Ministers in Hinglish, under a highly code-mixed, low-resource setting. Each claim is annotated with textual evidence and veracity labels. To evaluate this benchmark, a novel graph-aware, retrieval-augmented fact-checking model is proposed that combines multilingual contextual encoding, claim-evidence semantic alignment, evidence graph construction, graph neural reasoning, and natural language explanation generation. Experimental results show that HiFACTMix outperforms state-of-the-art multilingual baseline models in accuracy and provides faithful justifications for its verdicts. This work opens a new direction for multilingual, code-mixed, and politically grounded fact-verification research.
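The evidence graph construction step in a pipeline like this one can be sketched as follows: embed the claim and each evidence sentence, then connect nodes whose embeddings are semantically aligned, producing an adjacency matrix ready for graph neural reasoning. The cosine-similarity threshold rule and toy 2-d vectors are illustrative assumptions, not the model's actual construction.

```python
import numpy as np

def evidence_graph(claim_vec, evidence_vecs, threshold=0.5):
    """Adjacency over [claim] + evidence nodes via cosine similarity.

    Node 0 is the claim; nodes 1..n are evidence sentences. Edges join
    semantically aligned pairs, ready for GNN message passing.
    """
    V = np.vstack([claim_vec] + list(evidence_vecs)).astype(float)
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # unit-normalize rows
    sim = V @ V.T                                    # pairwise cosine sims
    adj = (sim >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                       # no self-loops
    return adj

claim = [1.0, 0.0]
evidence = [[0.9, 0.1],   # aligned with the claim
            [0.0, 1.0]]   # unrelated
adj = evidence_graph(claim, evidence, threshold=0.7)
```

In the full model, the node features would be multilingual contextual encodings and the graph would feed a neural reasoner that emits both a verdict and a natural-language justification.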
Is this how the world will end? Scientists give terrifying glimpse into the 'Big Crunch' - and reveal the exact date it could happen
This means that galaxies had to be closer to each other in the past. In 1964, Wilson and Penzias discovered the cosmic background radiation, which is like a fossil of the radiation emitted during the beginning of the universe, when it was hot and dense. The cosmic background radiation is observable everywhere in the universe. The composition of the universe - that is, the number of atoms of different elements - is consistent with the Big Bang Theory. So far, this theory is the only one that can explain why we observe an abundance of primordial elements in the universe.
Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection
Bakker, Samirah, Ma, Yao, Ziabari, Seyed Sahand Mohammadi
The complexity of mental healthcare billing enables anomalies, including fraud. While machine learning methods have been applied to anomaly detection, they often struggle with class imbalance, label scarcity, and complex sequential patterns. This study explores a hybrid deep learning approach combining Long Short-Term Memory (LSTM) networks and Transformers, with pseudo-labeling via Isolation Forests (iForest) and Autoencoders (AE). Prior work has not evaluated such hybrid models trained on pseudo-labeled data in the context of healthcare billing. The approach is evaluated on two real-world billing datasets related to mental healthcare. The iForest LSTM baseline achieves the highest recall (0.963) on declaration-level data. On the operation-level data, the hybrid iForest-based model achieves the highest recall (0.744), though at the cost of lower precision. These findings highlight the potential of combining pseudo-labeling with hybrid deep learning in complex, imbalanced anomaly detection settings.
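The pseudo-labeling step that lets the hybrid models train without ground-truth labels can be sketched generically: an unsupervised scorer ranks records by anomalousness, and the top fraction is labeled anomalous to seed supervised training. The median-distance scorer below is a crude stand-in for an Isolation Forest or Autoencoder reconstruction error, and the contamination rate is an assumed hyperparameter.

```python
import numpy as np

def pseudo_label(scores, contamination=0.05):
    """Convert unsupervised anomaly scores into binary pseudo-labels.

    The top `contamination` fraction of scores is labeled anomalous (1),
    mimicking how iForest/AE scores would seed LSTM/Transformer training.
    """
    threshold = np.quantile(scores, 1.0 - contamination)
    return (scores >= threshold).astype(int)

def median_distance_score(X):
    # Stand-in scorer: L1 distance from the feature-wise median.
    return np.abs(X - np.median(X, axis=0)).sum(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))      # synthetic billing feature matrix
X[0] = [8.0, 8.0, 8.0]             # one injected anomalous record
labels = pseudo_label(median_distance_score(X), contamination=0.02)
```

The recall/precision trade-off the abstract reports follows directly from this setup: a lower threshold flags more records, raising recall on true anomalies at the cost of more false positives.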
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Europe > Netherlands > Gelderland > Arnhem (0.04)
- North America > United States > New York > Saratoga County > Saratoga Springs (0.04)
- Law Enforcement & Public Safety > Fraud (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.82)
Comparing Learning Paradigms for Egocentric Video Summarization
In this study, we investigate various computer vision paradigms - supervised learning, unsupervised learning, and prompt fine-tuning - by assessing their ability to understand and interpret egocentric video data. Specifically, we examine Shotluck Holmes (state-of-the-art supervised learning), TAC-SUM (state-of-the-art unsupervised learning), and GPT-4o (a prompt fine-tuned pre-trained model), evaluating their effectiveness in video summarization. Our results demonstrate that current state-of-the-art models perform less effectively on first-person videos compared to third-person videos, highlighting the need for further advancements in the egocentric video domain. Notably, a prompt fine-tuned general-purpose GPT-4o model outperforms these specialized models, emphasizing the limitations of existing approaches in adapting to the unique challenges of first-person perspectives. Although our evaluation is conducted on a small subset of egocentric videos from the Ego-Exo4D dataset due to resource constraints, the primary objective of this research is to provide a comprehensive proof-of-concept analysis aimed at advancing the application of computer vision techniques to first-person videos. By exploring novel methodologies and evaluating their potential, we aim to contribute to the ongoing development of models capable of effectively processing and interpreting egocentric perspectives.