AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.61)

Neural Information Processing Systems, Oct-3-2025, 07:17:36 GMT

introduction reorganization, grammar correction, fluent expression as well as broader impacts for camera ready. 3 Reviewer # 1 and Reviewer # 4

The experiment is somewhat inadequate.

accuracy, introduction reorganization, reviewer

Technology: Information Technology > Artificial Intelligence (0.66)

arXiv.org Artificial Intelligence, Aug-27-2025

LaQual: A Novel Framework for Automated Evaluation of LLM App Quality

Wang, Yan, Hou, Xinyi, Zhao, Yanjie, Lin, Weiguo, Wang, Haoyu, Si, Junjun

LLM app stores are quickly emerging as platforms that gather a wide range of intelligent applications based on LLMs, giving users many choices for content creation, coding support, education, and more. However, the current methods for ranking and recommending apps in these stores mostly rely on static metrics like user activity and favorites, which makes it hard for users to efficiently find high-quality apps. To address these challenges, we propose LaQual, an automated framework for evaluating the quality of LLM apps. LaQual consists of three main stages: first, it labels and classifies LLM apps in a hierarchical way to accurately match them to different scenarios; second, it uses static indicators, such as time-weighted user engagement and functional capability metrics, to filter out low-quality apps; and third, it conducts a dynamic, scenario-adaptive evaluation, where the LLM itself generates scenario-specific evaluation metrics, scoring rules, and tasks for a thorough quality assessment. Experiments on a popular LLM app store show that LaQual is effective. Its automated scores are highly consistent with human judgments (with Spearman's rho of 0.62 and p=0.006 in legal consulting, and rho of 0.60 and p=0.009 in travel planning). By effectively screening, LaQual can reduce the pool of candidate LLM apps by 66.7% to 81.3%. User studies further confirm that LaQual significantly outperforms baseline systems in decision confidence, comparison efficiency (with average scores of 5.45 compared to 3.30), and the perceived value of its evaluation reports (4.75 versus 2.25). Overall, these results demonstrate that LaQual offers a scalable, objective, and user-centered solution for finding and recommending high-quality LLM apps in real-world use cases.
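The agreement figures quoted above are Spearman rank correlations between automated scores and human judgments. As a minimal sketch of how such a coefficient is computed (the function names and any input data are illustrative assumptions, not taken from the LaQual paper, and the p-values are not reproduced here):

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (sx * sy)
```

Calling `spearman_rho(automated_scores, human_scores)` on two equally long lists of per-app scores yields a value in [-1, 1]; values near 1 mean the automated ranking orders apps much as the human raters do.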

large language model, machine learning, natural language

2508.18636

Country:

North America > United States > California (0.46)
Asia > Japan > Honshū (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (0.93)
Law (0.90)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Bernasconi, Valentine, Marfia, Gustavo

Speaking images. A novel framework for the automated self-description of artworks

arXiv.org Artificial Intelligence, Jun-9-2025

Recent breakthroughs in generative AI have opened the door to new research perspectives in the domain of art and cultural heritage, where a large number of artifacts have been digitized. There is a need for innovation to ease access to digital collections and highlight their content. Such innovations develop into creative explorations of the digital image in relation to its malleability and contemporary interpretation, in contrast with the original historical object. Based on the concept of the autonomous image, we propose a new framework for producing self-explaining cultural artifacts using open-source large language, face detection, text-to-speech and audio-to-animation models. The goal is to start from a digitized artwork and automatically assemble a short video of it in which the main character is animated to explain the artwork's content. The whole process questions the cultural biases encapsulated in large language models and the potential of digital images and deepfakes of artworks for educational purposes, and it raises concerns in the field of art history regarding such creative diversions.

large language model, machine learning, natural language

2506.05368

Country:

Europe (1.00)
North America > United States > New York (0.14)

Genre: Research Report (0.52)

Industry:

Media > Photography (0.54)
Education (0.46)
Health & Medicine (0.46)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)

Neural Information Processing Systems, May-26-2025, 17:53:46 GMT

MTGS: A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction

Gaze following and social gaze prediction are fundamental tasks providing insights into human communication behaviors, intent, and social interactions. Most previous approaches addressed these tasks separately, either by designing highly specialized social gaze models that do not generalize to other social gaze tasks or by considering social gaze inference as an ad-hoc post-processing of the gaze following task. Furthermore, the vast majority of gaze following approaches have proposed models that can handle only one person at a time and are static, therefore failing to take advantage of social interactions and temporal dynamics. In this paper, we address these limitations and introduce a novel framework to jointly predict the gaze target and social gaze label for all people in the scene. It comprises (i) a temporal, transformer-based architecture that, in addition to frame tokens, handles person-specific tokens capturing the gaze information related to each individual; (ii) a new dataset, VSGaze, built from multiple gaze following and social gaze datasets by extending and validating head detections and tracks, and unifying annotation types. We demonstrate that our model can address and benefit from training on all tasks jointly, achieving state-of-the-art results for multi-person gaze following and social gaze prediction.

artificial intelligence, machine learning, multi-person temporal gaze

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Neural Information Processing Systems, Jan-18-2025, 19:41:01 GMT

A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence

general parameterization and linear convergence, novel framework, policy mirror descent

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.65)

Wang, George Yuanji, Murugesan, Srisharan, Rohatgi, Aditya Prince

GAN-TAT: A Novel Framework Using Protein Interaction Networks in Druggable Gene Identification

arXiv.org Artificial Intelligence, Dec-31-2024

Identifying druggable genes is essential for developing effective pharmaceuticals. With the availability of extensive, high-quality data, computational methods have become a significant asset. Protein interaction network (PIN) data are valuable but challenging to use because of their high dimensionality and sparsity. Previous methods relied on indirect integration, leading to resolution loss. This study proposes GAN-TAT, a framework that uses an advanced graph embedding technique, ImGAGN, to integrate the PIN directly for druggable gene inference. Tested on three Pharos datasets, GAN-TAT achieved its highest AUC-ROC score, 0.951, on Tclin. Further evaluation shows that GAN-TAT's predictions are supported by clinical evidence, highlighting its potential practical applications in pharmacogenomics. This research is a methodological attempt at using the PIN directly, broadening the set of potential solutions for developing drug targets.
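The headline metric above is AUC-ROC. A minimal sketch of that metric, assuming binary labels (1 for druggable, 0 otherwise) and real-valued model scores; the function name and any inputs are illustrative and do not come from the GAN-TAT code:

```python
def auc_roc(labels, scores):
    """AUC-ROC as the probability that a randomly chosen positive example
    is scored above a randomly chosen negative one (ties count as 0.5)."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

This pairwise form is quadratic in the number of examples, which is fine for illustration; a sort-based computation gives the same value for large datasets.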

artificial intelligence, data mining, machine learning

2501.01458

Country: Asia > Singapore > Central Region > Singapore (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

arXiv.org Artificial Intelligence, Dec-26-2024

A novel framework for MCDM based on Z numbers and soft likelihood function

He, Yuanpeng

Optimizing the structure of information management processes under uncertainty has attracted considerable attention from researchers around the world. Nevertheless, how to obtain accurate and rational evaluations from expert assessments remains an open problem. In particular, intuitionistic fuzzy sets provide an effective way to handle indeterminate information, and Yager recently proposed a method for fusing probabilistic evidence to handle uncertain and conflicting information, called the soft likelihood function. This paper devises a novel soft likelihood function framework, based on the information volume of fuzzy membership and a credibility measure, for extracting truly useful and valuable information from uncertainty. An application is provided to verify the validity and correctness of the proposed framework. In addition, comparisons with other existing methods further demonstrate the superiority of the proposed framework.
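For orientation, the soft likelihood function attributed to Yager above is commonly presented as an ordered weighted average of partial products of the evidence probabilities. The sketch below follows that common presentation and is not the Z-number framework this paper proposes; the parameter name `alpha` (an attitude parameter, restricted here to the open interval (0, 1)) and both function names are assumptions for illustration.

```python
def owa_weights(n, alpha):
    """Weights w_j = (j/n)^e - ((j-1)/n)^e with e = (1 - alpha) / alpha.
    alpha = 0.5 gives uniform weights; smaller alpha shifts weight toward
    the full product (a more pessimistic combination)."""
    if n < 1:
        raise ValueError("need at least one piece of evidence")
    if not 0 < alpha < 1:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    e = (1 - alpha) / alpha
    return [(j / n) ** e - ((j - 1) / n) ** e for j in range(1, n + 1)]


def soft_likelihood(probs, alpha):
    """Weighted sum of the running products of the probabilities,
    taken in descending order."""
    ordered = sorted(probs, reverse=True)
    weights = owa_weights(len(ordered), alpha)
    total, running_product = 0.0, 1.0
    for w, p in zip(weights, ordered):
        running_product *= p  # product of the j largest probabilities
        total += w * running_product
    return total
```

Unlike the plain product of probabilities, this combination is not driven to zero by a single very low value unless `alpha` is close to 0, which is the motivation usually given for the "soft" variant.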

artificial intelligence, likelihood function, machine learning

2412.19321

Country:

South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
North America > United States > New York (0.04)
Asia > China (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.67)

Huang, Jia-Hong, Zhu, Hongyi, Shen, Yixian, Rudinac, Stevan, Kanoulas, Evangelos

Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models

arXiv.org Artificial Intelligence, Nov-8-2024

Evaluating the quality of automatically generated image descriptions is a complex task that requires metrics capturing various dimensions, such as grammaticality, coverage, accuracy, and truthfulness. Although human evaluation provides valuable insights, its cost and time-consuming nature pose limitations. Existing automated metrics like BLEU, ROUGE, METEOR, and CIDEr attempt to fill this gap, but they often exhibit weak correlations with human judgment. To address this challenge, we propose a novel evaluation framework called Image2Text2Image, which leverages diffusion models, such as Stable Diffusion or DALL-E, for text-to-image generation. In the Image2Text2Image framework, an input image is first processed by a selected image captioning model, chosen for evaluation, to generate a textual description. Using this generated description, a diffusion model then creates a new image. By comparing features extracted from the original and generated images, we measure their similarity using a designated similarity metric. A high similarity score suggests that the model has produced a faithful textual description, while a low score highlights discrepancies, revealing potential weaknesses in the model's performance. Notably, our framework does not rely on human-annotated reference captions, making it a valuable tool for assessing image captioning models. Extensive experiments and human evaluations validate the efficacy of our proposed Image2Text2Image evaluation framework. The code and dataset will be published to support further research in the community.
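The caption, regenerate, compare loop described above can be sketched as follows. This is only an outline under stated assumptions: the captioning model, the diffusion model, and the feature extractor are passed in as plain callables (hypothetical stand-ins, not real library APIs), and cosine similarity is one possible choice for the "designated similarity metric", which the abstract does not name.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def image2text2image_score(image, caption_model, text_to_image,
                           extract_features, similarity=cosine_similarity):
    """Score a captioning model on one image without reference captions.

    caption_model:    image -> str             (model under evaluation)
    text_to_image:    str -> image             (e.g. a diffusion model)
    extract_features: image -> list of floats  (any image encoder)
    All three callables are placeholders supplied by the caller.
    """
    caption = caption_model(image)        # step 1: image to text
    regenerated = text_to_image(caption)  # step 2: text back to image
    # Step 3: compare the original and regenerated images in feature space.
    return similarity(extract_features(image), extract_features(regenerated))
```

A higher score suggests the caption preserved enough of the image's content for the generator to reconstruct something similar; averaging the score over a dataset gives a per-model figure.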

artificial intelligence, caption, machine learning

2411.05706

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Sep-30-2024

1 Trillion Token (1TT) Platform: A Novel Framework for Efficient Data Sharing and Compensation in Large Language Models

Park, Chanjun, Ha, Hyunsoo, Kim, Jihoo, Kim, Yungi, Kim, Dahyun, Lee, Sukyung, Yang, Seonghoon

In this paper, we propose the 1 Trillion Token Platform (1TT Platform), a novel framework designed to facilitate efficient data sharing with a transparent and equitable profit-sharing mechanism. The platform fosters collaboration between data contributors, who provide otherwise non-disclosed datasets, and a data consumer, who utilizes these datasets to enhance their own services. Data contributors are compensated in monetary terms, receiving a share of the revenue generated by the services of the data consumer. The data consumer is committed to sharing a portion of the revenue with contributors, according to predefined profit-sharing arrangements. By incorporating a transparent profit-sharing paradigm to incentivize large-scale data sharing, the 1TT Platform creates a collaborative environment to drive the advancement of NLP and LLM technologies.
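The abstract does not spell out the profit-sharing rule, so the following is purely a hypothetical illustration of one simple "predefined arrangement": a fixed fraction of the consumer's revenue goes into a contributor pool, which is split pro rata by contributed volume. The function name, the `pool_fraction` parameter, and the use of token counts as the weighting are all assumptions, not details of the 1TT Platform.

```python
def split_revenue(revenue, pool_fraction, contributions):
    """Split a share of revenue among contributors in proportion to
    the amount of data each one supplied.

    revenue:       total revenue of the data consumer's service
    pool_fraction: share of revenue reserved for contributors (0 to 1)
    contributions: mapping of contributor name -> contributed amount
                   (for example, a token count)
    """
    if revenue < 0:
        raise ValueError("revenue must be non-negative")
    if not 0 <= pool_fraction <= 1:
        raise ValueError("pool_fraction must be between 0 and 1")
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contribution must be positive")
    pool = revenue * pool_fraction
    return {name: pool * amount / total
            for name, amount in contributions.items()}
```

Under this rule, a contributor who supplied 60% of the data receives 60% of the pool; a real arrangement could instead weight by data quality or measured downstream impact.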

arxiv preprint arxiv, contributor, data contributor

2409.20149

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)