AITopics | ugc

UGC: Universal Graph Coarsening

Neural Information Processing Systems | Nov-19-2025, 17:03:59 GMT

Graph sizes often become unwieldy, leading to storage, computation, and analysis challenges.

data mining, machine learning, natural language, (22 more...)

Country:

North America > United States > Texas (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

From Detection to Discovery: A Closed-Loop Approach for Simultaneous and Continuous Medical Knowledge Expansion and Depression Detection on Social Media

Geng, Shuang, Zhang, Wenli, Xie, Jiaheng, Wang, Rui, Ram, Sudha

arXiv.org Artificial Intelligence | Oct-29-2025

Social media user-generated content (UGC) provides real-time, self-reported indicators of mental health conditions such as depression, offering a valuable source for predictive analytics. While prior studies integrate medical knowledge to improve prediction accuracy, they overlook the opportunity to simultaneously expand such knowledge through predictive processes. We develop a Closed-Loop Large Language Model (LLM)-Knowledge Graph framework that integrates prediction and knowledge expansion in an iterative learning cycle. In the knowledge-aware depression detection phase, the LLM jointly performs depression detection and entity extraction, while the knowledge graph represents and weights these entities to refine prediction performance. In the knowledge refinement and expansion phase, new entities, relationships, and entity types extracted by the LLM are incorporated into the knowledge graph under expert supervision, enabling continual knowledge evolution. Using large-scale UGC, the framework enhances both predictive accuracy and medical understanding. Expert evaluations confirmed the discovery of clinically meaningful symptoms, comorbidities, and social triggers complementary to existing literature. We conceptualize and operationalize prediction-through-learning and learning-through-prediction as mutually reinforcing processes, advancing both methodological and theoretical understanding in predictive analytics. The framework demonstrates the co-evolution of computational models and domain knowledge, offering a foundation for adaptive, data-driven knowledge systems applicable to other dynamic risk monitoring contexts.

data mining, large language model, machine learning, (20 more...)

2510.23626

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Iowa > Story County > Ames (0.04)
(4 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (0.87)
Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.63)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

733209a1f12071a7ec979e8ffaeb1d99-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 06:07:08 GMT

data mining, machine learning, natural language, (22 more...)

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Texas (0.04)
North America > United States > Wisconsin (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
(2 more...)

Leveraging Multi-Source Textural UGC for Neighbourhood Housing Quality Assessment: A GPT-Enhanced Framework

Hong, Qiyuan, Zhao, Huimin, Long, Ying

arXiv.org Artificial Intelligence | Aug-26-2025

This study leverages GPT-4o to assess neighbourhood housing quality using multi-source textural user-generated content (UGC) from Dianping, Weibo, and the Government Message Board. The analysis involves filtering relevant texts, extracting structured evaluation units, and conducting sentiment scoring. A refined housing quality assessment system with 46 indicators across 11 categories was developed, highlighting an objective-subjective method gap and platform-specific differences in focus. GPT-4o outperformed rule-based and BERT models, achieving 92.5% accuracy in fine-tuned settings. The findings underscore the value of integrating UGC and GPT-driven analysis for scalable, resident-centric urban assessments, offering practical insights for policymakers and urban planners.

large language model, machine learning, natural language, (21 more...)

2508.16657

Country: Asia > China > Beijing > Beijing (0.08)

Genre: Research Report > New Finding (0.35)

Industry:

Education (1.00)
Government (0.90)
Health & Medicine > Therapeutic Area (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)

Aligning Large Language Models with Implicit Preferences from User-Generated Content

Tan, Zhaoxuan, Li, Zheng, Liu, Tianyi, Wang, Haodong, Yun, Hyokun, Zeng, Ming, Chen, Pei, Zhang, Zhihan, Gao, Yifan, Wang, Ruijie, Nigam, Priyanka, Yin, Bing, Jiang, Meng

arXiv.org Artificial Intelligence | Jun-6-2025

Learning from preference feedback is essential for aligning large language models (LLMs) with human values and improving the quality of generated responses. However, existing preference learning methods rely heavily on curated data from humans or advanced LLMs, which is costly and difficult to scale. In this work, we present PUGC, a novel framework that leverages implicit human Preferences in unlabeled User-Generated Content (UGC) to generate preference data. Although UGC is not explicitly created to guide LLMs in generating human-preferred responses, it often reflects valuable insights and implicit preferences from its creators that have the potential to address readers' questions. PUGC transforms UGC into user queries and generates responses from the policy model. The UGC is then leveraged as a reference text for response scoring, aligning the model with these implicit preferences. This approach improves the quality of preference data while enabling scalable, domain-specific alignment. Experimental results on Alpaca Eval 2 show that models trained with DPO and PUGC achieve a 9.37% performance improvement over traditional methods, setting a 35.93% state-of-the-art length-controlled win rate using Mistral-7B-Instruct. Further studies highlight gains in reward quality, domain-specific alignment effectiveness, robustness against UGC quality, and theory of mind capabilities. Our code and dataset are available at https://zhaoxuan.info/PUGC.github.io/
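
The data flow described in this abstract (UGC to query, query to sampled responses, responses scored against the UGC as reference) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `PreferencePair`, `overlap_score`, `build_preference_pair`, `to_query`, and `sample_responses` are hypothetical names, the token-overlap score is a toy stand-in because the abstract does not specify the scoring function, and taking the highest- and lowest-scored responses as the chosen/rejected pair is an assumed selection rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreferencePair:
    """One (query, chosen, rejected) record, the usual shape of DPO training data."""
    query: str
    chosen: str
    rejected: str


def overlap_score(response: str, reference: str) -> float:
    """Toy scorer (assumption): fraction of reference tokens that appear in the response."""
    ref_tokens = set(reference.lower().split())
    if not ref_tokens:
        return 0.0
    resp_tokens = set(response.lower().split())
    return len(ref_tokens & resp_tokens) / len(ref_tokens)


def build_preference_pair(
    ugc: str,
    to_query: Callable[[str], str],                # hypothetical: turns UGC into a user query
    sample_responses: Callable[[str], List[str]],  # hypothetical: samples from the policy model
    score: Callable[[str, str], float] = overlap_score,
) -> PreferencePair:
    """Score sampled responses against the UGC and keep the best and worst ones."""
    query = to_query(ugc)
    responses = sample_responses(query)
    if len(responses) < 2:
        raise ValueError("need at least two responses to form a preference pair")
    # Rank responses by how well they agree with the UGC used as reference text.
    ranked = sorted(responses, key=lambda r: score(r, ugc), reverse=True)
    return PreferencePair(query=query, chosen=ranked[0], rejected=ranked[-1])


if __name__ == "__main__":
    # Stub callables stand in for the query generator and the policy model.
    pair = build_preference_pair(
        "the battery lasts two days",
        to_query=lambda text: "How long does the battery last?",
        sample_responses=lambda q: ["No idea", "The battery lasts two days"],
    )
    print(pair)
```

In a real pipeline the two callables would wrap LLM calls and the scorer would likely be a learned reward model; they are passed in as plain functions here so the pairing logic can be read and run in isolation.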

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2506.04463

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can Large Language Models Understand Internet Buzzwords Through User-Generated Content

Huang, Chen, Luo, Junkai, Wang, Xinzuo, Lei, Wenqiang, Lv, Jiancheng

arXiv.org Artificial Intelligence | May-22-2025

The massive volume of user-generated content (UGC) available on Chinese social media makes it possible to study internet buzzwords. In this paper, we study whether large language models (LLMs) can generate accurate definitions for these buzzwords using UGC as examples. Our work makes three contributions. First, we introduce CHEER, the first dataset of Chinese internet buzzwords, each annotated with a definition and relevant UGC. Second, we propose a novel method, called RESS, to effectively steer the comprehending process of LLMs to produce more accurate buzzword definitions, mirroring the skills of human language learning. Third, with CHEER, we benchmark the strengths and weaknesses of various off-the-shelf definition generation methods and our RESS. Our benchmark demonstrates the effectiveness of RESS while revealing crucial shared challenges: over-reliance on prior exposure, underdeveloped inferential abilities, and difficulty identifying high-quality UGC to facilitate comprehension. We believe our work lays the groundwork for future advancements in LLM-based definition generation. Our dataset and code are available at https://github.com/SCUNLP/Buzzword.

large language model, machine learning, natural language, (20 more...)

2505.15071

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Safeguarding Marketing Research: The Generation, Identification, and Mitigation of AI-Fabricated Disinformation

Mukherjee, Anirban

arXiv.org Artificial Intelligence | Mar-17-2024

Generative AI has ushered in the ability to generate content that closely mimics human contributions, introducing an unprecedented threat: Deployed en masse, these models can be used to manipulate public opinion and distort perceptions, resulting in a decline in trust towards digital platforms. This study contributes to marketing literature and practice in three ways. First, it demonstrates the proficiency of AI in fabricating disinformative user-generated content (UGC) that mimics the form of authentic content. Second, it quantifies the disruptive impact of such UGC on marketing research, highlighting the susceptibility of analytics frameworks to even minimal levels of disinformation. Third, it proposes and evaluates advanced detection frameworks, revealing that standard techniques are insufficient for filtering out AI-generated disinformation. We advocate for a comprehensive approach to safeguarding marketing research that integrates advanced algorithmic solutions, enhanced human oversight, and a reevaluation of regulatory and ethical frameworks. Our study seeks to serve as a catalyst, providing a foundation for future research and policy-making aimed at navigating the intricate challenges at the nexus of technology, ethics, and marketing.

disinformation, disinformative ugc, ugc, (16 more...)

2403.14706

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Data Science > Data Mining (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping

Kunes, Russell Z., Yin, Mingzhang, Land, Max, Haviv, Doron, Pe'er, Dana, Tavaré, Simon

arXiv.org Artificial Intelligence | Aug-12-2022

Gradient estimation is often necessary for fitting generative models with discrete latent variables, in contexts such as reinforcement learning and variational autoencoder (VAE) training. The DisARM estimator (Yin et al. 2020; Dong, Mnih, and Tucker 2020) achieves state-of-the-art gradient variance for Bernoulli latent variable models in many contexts. However, DisARM and other estimators have potentially exploding variance near the boundary of the parameter space, where solutions tend to lie. To ameliorate this issue, we propose a new gradient estimator, bitflip-1, that has lower variance at the boundaries of the parameter space. As bitflip-1 has complementary properties to existing estimators, we introduce an aggregated estimator, unbiased gradient variance clipping (UGC), that uses either a bitflip-1 or a DisARM gradient update for each coordinate. We theoretically prove that UGC has uniformly lower variance than DisARM. Empirically, we observe that UGC achieves the optimal value of the optimization objectives in toy experiments, discrete VAE training, and in a best subset selection problem.

estimator, gradient, variance, (14 more...)

2208.06124

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Machine Translation for User-Generated Content

#artificialintelligence | Apr-3-2020, 23:29:34 GMT

One use case worth exploring is machine translation (MT) for user-generated content (UGC). Because UGC (comments, feedback, reviews) is created so quickly and professional translation of it is costly, many organizations turn to MT. Well-known examples are Skype, for which Microsoft developed Automatic Speech Recognition (ASR) to translate spoken audio in addition to text, and Facebook. The social network aims to solve the challenge of fine-tuning a system for each specific language pair, using neural machine translation (NMT) and drawing on various contexts for translations. One solution that tackles this issue is the technology developed by Language I/O. It takes into account the client's glossaries and translation memories (TMs), selects the best MT engine output, and then improves on the results using cultural intelligence and/or human linguists who review machine translations after the fact to ensure that their MT Optimizer engine learns over time.

artificial intelligence, machine translation, natural language, (5 more...)

Industry: Information Technology (0.59)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Using machine learning to yield useful market insight - Market Business News

#artificialintelligence | Feb-10-2019, 09:45:38 GMT

Gauging consumer needs is essential in marketing. Focus groups, interviews and surveys are currently the most common means of gathering this data, but the process can be time-consuming and expensive. The advent of machine learning and artificial intelligence (AI) has sparked interest in using these technologies to yield valuable insights into consumer wants. Researchers at MIT devised a method of efficiently identifying customer needs from user-generated content (UGC) with machine learning, according to a study published in Marketing Science.

artificial intelligence, machine learning, useful market insight, (8 more...)

Genre:

Questionnaire & Opinion Survey (0.59)
Research Report > New Finding (0.41)

Industry: Media > News (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
