Han, Feng
Gemini Embedding: Generalizable Embeddings from Gemini
Lee, Jinhyuk, Chen, Feiyang, Dua, Sahil, Cer, Daniel, Shanbhogue, Madhuri, Naim, Iftekhar, Ábrego, Gustavo Hernández, Li, Zhe, Chen, Kaifeng, Vera, Henrique Schechter, Ren, Xiaoqi, Zhang, Shanfeng, Salz, Daniel, Boratko, Michael, Han, Jay, Chen, Blair, Huang, Shuo, Rao, Vikram, Suganthan, Paul, Han, Feng, Doumanoglou, Andreas, Gupta, Nithi, Moiseev, Fedor, Yip, Cathy, Jain, Aashi, Baumgartner, Simon, Shahi, Shahrokh, Gomez, Frank Palma, Mariserla, Sandeep, Choi, Min, Shah, Parashar, Goenka, Sonam, Chen, Ke, Xia, Ye, Chen, Koert, Duddu, Sai Meher Karthik, Chen, Yichang, Walker, Trevor, Zhou, Wenlei, Ghiya, Rakesh, Gleicher, Zach, Gill, Karan, Dong, Zhe, Seyedhosseini, Mojtaba, Sung, Yunhsuan, Hoffmann, Raphael, Duerig, Tom
Embedding models, which transform inputs into dense vector representations, are pivotal for capturing semantic information across various domains and modalities. Text embedding models represent words and sentences as vectors, strategically positioning semantically similar texts in close proximity within the embedding space (Gao et al., 2021; Le and Mikolov, 2014; Reimers and Gurevych, 2019). Recent research has focused on developing general-purpose embedding models capable of excelling in diverse downstream tasks, including information retrieval, clustering, and classification (Cer et al., 2018; Muennighoff et al., 2023). Leveraging their vast pre-training knowledge, large language models (LLMs) have emerged as a promising avenue for constructing such general-purpose embedding models, with the potential to significantly enhance performance across a broad spectrum of applications (Anil et al., 2023a,b; Brown et al., 2020). The integration of LLMs has revolutionized the development of high-quality embedding models through two primary approaches. Firstly, LLMs have been employed to refine training datasets by generating higher quality examples. Techniques such as hard negative mining (Lee et al., 2024) and synthetic data generation (Dai et al., 2022; Wang et al., 2023) enable the distillation of LLM knowledge into smaller, more efficient embedding models, leading to substantial performance gains. Secondly, recognizing that the embedding model parameters are frequently initialized from language models (Devlin et al., 2019; Karpukhin et al., 2020), researchers have explored leveraging LLM parameters directly for initialization (Ni et al., 2021).
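As a deliberately minimal illustration of how mined hard negatives are typically consumed during embedding training, the sketch below implements a standard InfoNCE-style contrastive loss. This is a generic recipe, not the specific Gemini Embedding objective, and the tensor shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(query, positive, hard_negatives, temperature=0.05):
    """InfoNCE-style objective: pull the query embedding toward its positive
    and push it away from mined hard negatives.
    query: (d,), positive: (d,), hard_negatives: (k, d)."""
    q = F.normalize(query, dim=-1)
    cands = F.normalize(torch.cat([positive.unsqueeze(0), hard_negatives]), dim=-1)
    logits = cands @ q / temperature           # (1 + k,) scaled similarities
    target = torch.zeros(1, dtype=torch.long)  # the positive sits at index 0
    return F.cross_entropy(logits.unsqueeze(0), target)
```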
Explaining Model Overfitting in CNNs via GMM Clustering
Dou, Hui, Mu, Xinyu, Yi, Mengjun, Han, Feng, Zhao, Jian, Shen, Furao
Convolutional Neural Networks (CNNs) have demonstrated remarkable prowess in the field of computer vision. However, their opaque decision-making processes pose significant challenges for practical applications. In this study, we provide quantitative metrics for assessing CNN filters by clustering the feature maps corresponding to individual filters via a Gaussian Mixture Model (GMM). By analyzing the clustering results, we identify anomaly filters associated with outlier samples. We further analyze the relationship between these anomaly filters and model overfitting, proposing three hypotheses. The method applies to diverse CNN architectures without modification, as evidenced by its successful application to models such as AlexNet and LeNet-5. We present three carefully designed experiments that support our hypotheses from the perspectives of model behavior, dataset characteristics, and filter impact. Through this work, we offer a novel perspective for evaluating CNN performance and new insights into the behavior of model overfitting.
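A minimal sketch of the clustering step, assuming a scikit-learn GMM stands in for the paper's exact procedure; the paper's anomaly criterion may differ:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def filter_log_likelihoods(feature_maps: np.ndarray, n_components: int = 3):
    """Fit a GMM to one filter's (flattened) feature maps across a dataset
    and return per-sample log-likelihoods; low values mark outlier samples.
    feature_maps: (n_samples, H, W) activations of a single conv filter."""
    x = feature_maps.reshape(len(feature_maps), -1)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag").fit(x)
    return gmm.score_samples(x)  # (n_samples,)

# A filter whose GMM assigns very low likelihood to many samples would be
# flagged as an "anomaly filter" in the spirit of the paper.
```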
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
Zhang, Rongzhi, Shen, Jiaming, Liu, Tianqi, Wang, Haorui, Qin, Zhen, Han, Feng, Liu, Jialu, Baumgartner, Simon, Bendersky, Michael, Zhang, Chao
Large Language Models (LLMs) have exhibited impressive capabilities in various tasks, yet their vast parameter sizes restrict their applicability in resource-constrained settings. Knowledge distillation (KD) offers a viable solution by transferring expertise from large teacher models to compact student models. However, traditional KD techniques face specific challenges when applied to LLMs, including restricted access to LLM outputs, significant teacher-student capacity gaps, and the inherited mis-calibration issue. In this work, we present PLaD, a novel preference-based LLM distillation framework. PLaD exploits the teacher-student capacity discrepancy to generate pseudo-preference pairs in which teacher outputs are preferred over student outputs. PLaD then leverages a ranking loss to re-calibrate the student's estimation of sequence likelihood, steering the student toward understanding the relative quality of outputs rather than simply imitating the teacher. PLaD bypasses the need for access to the teacher LLM's internal states, tackles the student's expressivity limitations, and mitigates the student's mis-calibration issue. Through extensive experiments on two sequence generation tasks with various LLMs, we demonstrate the effectiveness of the proposed PLaD framework.
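One plausible instantiation of the ranking loss over pseudo-preference pairs, written as a margin loss on the student's sequence log-likelihoods; the paper's exact formulation may differ:

```python
import torch
import torch.nn.functional as F

def plad_ranking_loss(student_logp_teacher_seq: torch.Tensor,
                      student_logp_student_seq: torch.Tensor,
                      margin: float = 1.0) -> torch.Tensor:
    """The student should assign higher (length-normalized) log-likelihood
    to the preferred teacher output than to its own output.
    Both tensors are the student's scores for the two sequences, shape (batch,)."""
    gap = student_logp_teacher_seq - student_logp_student_seq
    return F.relu(margin - gap).mean()
```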
Multi-Scale Dilated Convolution Network for Long-Term Time Series Forecasting
Li, Feifei, Guo, Suhan, Han, Feng, Zhao, Jian, Shen, Furao
Accurate forecasting of long-term time series has important applications in decision making and planning. However, capturing long-term dependencies in time series data remains challenging. To better extract such dependencies, we propose the Multi-Scale Dilated Convolution Network (MSDCN), which uses a shallow dilated convolution architecture to capture the periodic and trend characteristics of long time series. We design convolution blocks with exponentially growing dilations and varying kernel sizes to sample the time series at different scales. Furthermore, we use a traditional autoregressive model to capture the linear relationships within the data. To validate the effectiveness of the proposed approach, we conduct experiments on eight challenging long-term time series forecasting benchmark datasets. The results show that our approach outperforms prior state-of-the-art approaches and yields significant inference speed improvements over several strong baselines.
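A minimal sketch of the multi-scale idea with exponentially growing dilations, using PyTorch; the block name and branch-merging rule are assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class MultiScaleDilatedBlock(nn.Module):
    """Parallel 1-D convolutions with dilations 1, 2, 4, ..., so each branch
    samples the series at a different temporal scale."""
    def __init__(self, channels: int, kernel_size: int = 3, n_scales: int = 4):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size, dilation=2 ** i,
                      padding=(kernel_size - 1) * 2 ** i // 2)  # 'same' length
            for i in range(n_scales)
        ])

    def forward(self, x):  # x: (batch, channels, time)
        return torch.stack([b(x) for b in self.branches], dim=0).mean(dim=0)

block = MultiScaleDilatedBlock(channels=16)
y = block(torch.randn(8, 16, 336))  # -> (8, 16, 336)
```

The autoregressive component described in the abstract could be as simple as a linear layer over the lookback window, added to the convolutional output; it is omitted here for brevity.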
Gaussian Process-Based Learning Control of Underactuated Balance Robots with an External and Internal Convertible Modeling Structure
Han, Feng, Yi, Jingang
External and internal convertible (EIC) form-based motion control is an effective design for simultaneous trajectory tracking and balancing of underactuated balance robots. Under certain conditions, however, the EIC-based design leads to uncontrolled robot motion. We present a Gaussian process (GP)-based data-driven learning control for underactuated balance robots with the EIC modeling structure. Two GP-based learning controllers are presented, both exploiting the EIC structure property. The partial EIC (PEIC)-based design partitions the robot dynamics into a fully actuated subsystem and a reduced-order underactuated subsystem. The null-space EIC (NEIC)-based design compensates for the uncontrolled motion in a subspace without affecting the other closed-loop dynamics. Under the PEIC- and NEIC-based controls, the tracking and balance tasks are guaranteed, with guaranteed convergence rates and bounded errors, and without the uncontrolled motion induced by the original EIC-based control. We validate the results and demonstrate the performance of the GP-based learning control design on two inverted-pendulum platforms.
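A schematic of the data-driven model component, assuming a scikit-learn GP stands in for the paper's GP models and with placeholder data; the actual training inputs come from robot state/input logs within the EIC structure:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Placeholder [q, dq, u] samples and the accelerations they produced; the GP
# learns this mapping to replace or correct the nominal dynamics model.
X_train = np.random.randn(200, 6)
y_train = np.random.randn(200)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, y_train)
mean, std = gp.predict(X_train[:5], return_std=True)  # prediction + uncertainty
```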
Video Summarization: Towards Entity-Aware Captions
Ayyubi, Hammad A., Liu, Tianqi, Nagrani, Arsha, Lin, Xudong, Zhang, Mingda, Arnab, Anurag, Han, Feng, Zhu, Yukun, Liu, Jialu, Chang, Shih-Fu
Existing popular video captioning benchmarks and models deal with generic captions devoid of specific person, place, or organization named entities. In contrast, news videos present a challenging setting in which captions require such named entities to be meaningful summaries. We therefore propose the task of summarizing news videos directly into entity-aware captions. We also release a large-scale dataset, VIEWS (VIdeo NEWS), to support research on this task. Further, we propose a method that augments visual information from videos with context retrieved from external world knowledge to generate entity-aware captions. We demonstrate the effectiveness of our approach on three video captioning models and show that it generalizes to an existing news image captioning dataset. With these extensive experiments and insights, we establish a solid basis for future research on this challenging task.
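A hypothetical sketch of the knowledge-augmentation step; `llm` is a stand-in callable and the prompt wording is an assumption, not the paper's actual pipeline:

```python
def entity_aware_caption(generic_caption: str,
                         retrieved_context: list[str],
                         llm) -> str:
    """Rewrite a generic video caption into an entity-aware one by
    conditioning a text model on context retrieved from external sources."""
    context = "\n".join(retrieved_context)
    prompt = (
        "Context from news sources:\n" + context +
        "\n\nGeneric caption: " + generic_caption +
        "\nRewrite the caption, naming the specific people, places, and "
        "organizations supported by the context:"
    )
    return llm(prompt)
```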
Cascaded Nonlinear Control Design for Highly Underactuated Balance Robots
Han, Feng, Yi, Jingang
This paper presents a nonlinear control design for highly underactuated balance robots, which possess more unactuated degrees of freedom (DOFs) than actuated ones. To address the challenge of simultaneously tracking trajectories with the actuated coordinates and balancing the unactuated coordinates, the proposed control converts the robot dynamics into a series of cascaded subsystems, each of which is treated as virtually actuated. To achieve the control goal, we sequentially design and update the virtual and actual control inputs to incorporate the balance task, so that the unactuated coordinates are balanced at their instantaneous equilibria. The closed-loop dynamics are shown to be stable, and the tracking errors converge exponentially to a neighborhood of the origin. Simulation results on a triple-inverted-pendulum cart system demonstrate the effectiveness of the proposed design.
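In illustrative notation (an assumption, not the paper's exact model), the cascade can be written as a chain in which each subsystem is driven by the next coordinate, treated as a virtual input, with the true input entering only the final subsystem:

```latex
\dot{x}_1 = f_1(x_1, x_2), \qquad
\dot{x}_2 = f_2(x_1, x_2, x_3), \qquad \dots, \qquad
\dot{x}_n = f_n(x_1, \dots, x_n, u)
```

Here $x_{i+1}$ serves as the virtual control for the $i$-th subsystem; the virtual inputs are designed sequentially, and the last one is realized by the actual actuator input $u$.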
Gaussian Process-Enhanced, External and Internal Convertible (EIC) Form-Based Control of Underactuated Balance Robots
Han, Feng, Yi, Jingang
External and internal convertible (EIC) form-based motion control (i.e., EIC-based control) is one of the effective approaches for underactuated balance robots. Through a sequential controller design, trajectory tracking of the actuated subsystem and balancing of the unactuated subsystem can be achieved simultaneously. However, under certain conditions, uncontrolled robot motion arises under the EIC-based control. We first identify these conditions and then propose an enhanced EIC-based control that incorporates a Gaussian process data-driven robot dynamics model. Under the enhanced EIC-based control, the stability and performance of the closed-loop system are guaranteed. We demonstrate the GP-enhanced EIC-based control experimentally on two examples of underactuated balance robots.
What do LLMs Know about Financial Markets? A Case Study on Reddit Market Sentiment Analysis
Deng, Xiang, Bashlovkina, Vasilisa, Han, Feng, Baumgartner, Simon, Bendersky, Michael
Market sentiment analysis on social media content requires knowledge of both financial markets and social media jargon, which makes it a challenging task for human raters. The resulting lack of high-quality labeled data stands in the way of conventional supervised learning methods. Instead, we approach this problem using semi-supervised learning with a large language model (LLM). Our pipeline generates weak financial sentiment labels for Reddit posts with an LLM and then uses that data to train a small model that can be served in production. We find that prompting the LLM to produce Chain-of-Thought summaries and forcing it through several reasoning paths helps generate more stable and accurate labels, while using a regression loss further improves distillation quality. With only a handful of prompts, the final model performs on par with existing supervised models. Though production applications of our model are limited by ethical considerations, the model's competitive performance points to the great potential of using LLMs for tasks that otherwise require skill-intensive annotation.
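A minimal sketch of the weak-labeling step with multiple reasoning paths; `llm` is a stand-in callable, and the prompt, score range, and aggregation are assumptions rather than the paper's exact setup:

```python
import statistics

def weak_sentiment_label(post: str, llm, n_paths: int = 5) -> float:
    """Sample several chain-of-thought reasoning paths from an LLM and
    average the resulting scores into one soft label in [-1, 1]."""
    scores = []
    for _ in range(n_paths):
        _reasoning, score = llm(
            "Summarize the post's view of the stock, then rate the financial "
            "sentiment from -1 (bearish) to 1 (bullish).\n\nPost: " + post)
        scores.append(score)
    return statistics.mean(scores)

# The small production model is then trained to regress onto these soft
# labels (e.g., with an MSE loss) rather than on hard class labels.
```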
C-Face: Using Compare Face on Face Hallucination for Low-Resolution Face Recognition
Han, Feng, Wang, Xudong, Shen, Furao, Zhao, Jian
Face hallucination is the task of generating high-resolution (HR) face images from low-resolution (LR) inputs, a subfield of general image super-resolution. However, most previous methods consider only the visual effect, ignoring how to maintain the identity of the face. In this work, we propose a novel face hallucination model, the C-Face network, which generates HR images with high visual quality while preserving identity information. A face recognition network is used to extract identity features during training. To make the reconstructed face images retain identity information as much as possible, a novel metric, the C-Face loss, is proposed. We also propose a new training algorithm to deal with the convergence problem. Moreover, since our work mainly focuses on the recognition accuracy of the output, we integrate face recognition into the face hallucination process, which ensures that the model can be used in real scenarios. Extensive experiments on two large-scale face datasets demonstrate that our C-Face network achieves the best performance compared with other state-of-the-art methods.
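One plausible form of an identity-preserving objective of this kind, combining pixel reconstruction with a recognition-embedding distance; the exact C-Face loss in the paper may be defined differently, and `recog_net` and `alpha` are illustrative:

```python
import torch
import torch.nn.functional as F

def identity_preserving_loss(sr_face, hr_face, recog_net, alpha=0.1):
    """Pixel L1 loss plus cosine distance between recognition embeddings of
    the super-resolved face and the ground-truth HR face."""
    pixel_loss = F.l1_loss(sr_face, hr_face)
    id_sr = F.normalize(recog_net(sr_face), dim=-1)
    id_hr = F.normalize(recog_net(hr_face), dim=-1)
    identity_loss = 1.0 - (id_sr * id_hr).sum(dim=-1).mean()
    return pixel_loss + alpha * identity_loss
```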