AITopics | Huang, Ying

Huang, Ying

AdaCM$^2$: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Man, Yuanbin, Huang, Ying, Zhang, Chengming, Li, Bingzhe, Niu, Wei, Yin, Miao

arXiv.org Artificial Intelligence, Nov-19-2024

Advances in large language models (LLMs) have improved video understanding by combining LLMs with visual models. However, most existing LLM-based models (e.g., VideoLLaMA, VideoChat) are constrained to processing short-duration videos. Recent approaches attempt to understand long-term videos by extracting and compressing visual features into a fixed memory size. Nevertheless, those methods use only the visual modality to merge video tokens and overlook the correlation between visual and textual queries, which makes complex question-answering tasks difficult to handle effectively. To address the challenges of long videos and complex prompts, we propose AdaCM$^2$, which, for the first time, introduces an adaptive cross-modality memory reduction approach to video-text alignment in an auto-regressive manner on video streams. Our extensive experiments on various video understanding tasks, such as video captioning, video question answering, and video classification, demonstrate that AdaCM$^2$ achieves state-of-the-art performance across multiple datasets while significantly reducing memory usage. Notably, it achieves a 4.5% improvement across multiple tasks in the LVU dataset with a GPU memory consumption reduction of up to 65%.

adacm 2, large language model, natural language, (15 more...)

arXiv: 2411.12593

Country:

North America > United States > Texas (0.14)
North America > United States > Michigan (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Practical considerations for variable screening in the Super Learner

Williamson, Brian D., King, Drew, Huang, Ying

arXiv.org Machine Learning, Nov-6-2023

Estimating a prediction function is a fundamental component of many data analyses. The Super Learner ensemble, a particular implementation of stacking, has desirable theoretical properties and has been used successfully in many applications. Dimension reduction can be accomplished by using variable screening algorithms, including the lasso, within the ensemble prior to fitting other prediction algorithms. However, the performance of a Super Learner using the lasso for dimension reduction has not been fully explored in cases where the lasso is known to perform poorly. We provide empirical results that suggest that a diverse set of candidate screening algorithms should be used to protect against poor performance of any one screen, similar to the guidance for choosing a library of prediction algorithms for the Super Learner.

artificial intelligence, machine learning, super learner, (16 more...)

arXiv: 2311.03313

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Immunology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

On the robust learning mixtures of linear regressions

Huang, Ying, Chen, Liang

arXiv.org Artificial Intelligence, May-22-2023

In this note, we consider the problem of robustly learning mixtures of linear regressions. We connect mixtures of linear regressions and mixtures of Gaussians through a simple thresholding step, so that a quasi-polynomial time algorithm can be obtained under a mild separation condition. This algorithm is significantly more robust than the previous result.

artificial intelligence, linear regression, machine learning, (14 more...)

arXiv: 2305.15317

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.96)

SsciBERT: A Pre-trained Language Model for Social Science Texts

Shen, Si, Liu, Jiangfeng, Lin, Litao, Huang, Ying, Zhang, Lin, Liu, Chang, Feng, Yutong, Wang, Dongbo

arXiv.org Artificial Intelligence, Nov-24-2022

With the large-scale growth of the social science literature, quickly finding existing research on relevant issues has become an urgent need for researchers. Previous studies, such as SciBERT, have shown that pre-training on domain-specific texts can improve the performance of natural language processing tasks. However, no pre-trained language model for the social sciences has been available so far. In light of this, the present research proposes a pre-trained model based on the abstracts published in Social Science Citation Index (SSCI) journals.

machine learning, natural language, pre-trained model, (22 more...)

arXiv: 2206.0451

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry:

Education (0.68)
Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Estimates of daily ground-level NO2 concentrations in China based on big data and machine learning approaches

Dou, Xinyu, Liao, Cuijuan, Wang, Hengqi, Huang, Ying, Tu, Ying, Huang, Xiaomeng, Peng, Yiran, Zhu, Biqing, Tan, Jianguang, Deng, Zhu, Wu, Nana, Sun, Taochun, Ke, Piyu, Liu, Zhu

arXiv.org Artificial Intelligence, Nov-17-2020

Nitrogen dioxide (NO2) is one of the most important atmospheric pollutants. However, current ground-level NO2 concentration data lack either high resolution or full nationwide coverage, owing to the poor quality of source data and the limited computing power of the models. To our knowledge, this study is the first to estimate ground-level NO2 concentrations in China with national coverage and relatively high spatiotemporal resolution (0.25 degree; daily intervals) over a recent 6-year period (2013-2018). We developed a Random Forest model integrated with K-means (RF-K) to produce the estimates from multi-source parameters. Besides meteorological parameters and satellite retrieval parameters, we also, for the first time, introduce socio-economic parameters to assess the impact of human activities. The results show that: (1) the RF-K model we developed shows better prediction performance than other models, with cross-validation R2 = 0.64 (MAPE = 34.78%). (2) The annual average concentration of NO2 in China showed a weak increasing trend, whereas in economic zones such as the Beijing-Tianjin-Hebei region, the Yangtze River Delta, and the Pearl River Delta, the NO2 concentration decreased or remained unchanged, especially in spring. Our dataset verifies that pollutant control targets have been achieved in these areas. By mapping daily nationwide ground-level NO2 concentrations, this study provides timely, high-quality data for air quality management in China. We provide a universal model framework to quickly generate a timely national map of atmospheric pollutant concentrations with high spatiotemporal resolution, based on improved machine learning methods.
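For readers unfamiliar with the two scores quoted in this abstract, the sketch below shows how R2 and MAPE are commonly computed from observed and predicted values. It assumes that R2 means the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the function names and sample numbers are hypothetical, not taken from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observed values must be nonzero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical concentrations, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 36.0]
print(r_squared(observed, predicted))
print(mape(observed, predicted))
```

Because MAPE divides by the observed value, it weights errors at low concentrations more heavily than R2 does, which is one reason the two scores are often reported together.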

artificial intelligence, concentration, machine learning, (15 more...)

doi: 10.1016/j.adapen.2021.100017

arXiv: 2011.09013

Country:

Asia > China > Beijing > Beijing (0.25)
Asia > China > Tianjin Province > Tianjin (0.25)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Selecting Biomarkers for building optimal treatment selection rules using Kernel Machines

Dasgupta, Sayan, Huang, Ying

arXiv.org Machine Learning, Jun-5-2019

Optimal biomarker combinations for treatment selection can be derived by minimizing the total burden to the population caused by the targeted disease and its treatment. However, when multiple biomarkers are present, including all of them in the model can be expensive and can hurt model performance. To remedy this, we consider feature selection in the optimization by minimizing an extended total burden that additionally incorporates biomarker measurement costs. Formulating it as a 0-norm penalized weighted classification problem, we develop various procedures for estimating linear and nonlinear combinations. Through simulations and a real data example, we demonstrate the importance of incorporating feature selection and marker cost when deriving treatment-selection rules.

immunology, internal medicine, prop, (20 more...)

arXiv: 1906.02384

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (0.69)
Health & Medicine > Therapeutic Area > Immunology > HIV (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)

Efficient Correlated Topic Modeling with Topic Embedding

He, Junxian, Hu, Zhiting, Berg-Kirkpatrick, Taylor, Huang, Ying, Xing, Eric P.

arXiv.org Machine Learning, Jul-1-2017

Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations through the closeness between the topic vectors. Our method enables efficient inference in the low-dimensional embedding space, reducing the previous cubic or quadratic time complexity to linear with respect to the topic size. We further speed up variational inference with a fast sampler that exploits the sparsity of topic occurrence. Extensive experiments show that our approach can handle model and data scales several orders of magnitude larger than those in existing work on topic correlation, without sacrificing modeling quality: it provides competitive or superior performance in document classification and retrieval.
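As a rough illustration of reading topic correlation from "closeness between the topic vectors", the sketch below scores two topic embeddings by cosine similarity. The choice of cosine similarity, the function name, and the example vectors are all assumptions made for illustration; the abstract does not specify which closeness measure the paper uses.

```python
import math


def closeness(u, v):
    """Cosine similarity between two topic embeddings (assumed closeness measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical 2-dimensional embeddings for three topics.
topic_embeddings = {
    "sports": [1.0, 0.0],
    "fitness": [1.0, 1.0],
    "finance": [0.0, 1.0],
}

print(closeness(topic_embeddings["sports"], topic_embeddings["fitness"]))
print(closeness(topic_embeddings["sports"], topic_embeddings["finance"]))
```

Storing K topic vectors of dimension d takes K·d numbers, whereas a full K×K matrix of pairwise topic parameters grows quadratically in K, which gives some intuition for why a low-dimensional embedding can scale better as the number of topics grows.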

complexity, text processing, us government, (22 more...)

arXiv: 1707.00206

Country:

Asia > Middle East (0.46)
North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
