AITopics | Yeh, Mi-Yen

Collaborating Authors

Yeh, Mi-Yen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

STAR: Spectral Truncation and Rescale for Model Merging

Lee, Yu-Ang, Ko, Ching-Yun, Pedapati, Tejaswini, Chung, I-Hsin, Yeh, Mi-Yen, Chen, Pin-Yu

arXiv.org Artificial IntelligenceFeb-14-2025

Model merging is an efficient way of obtaining a multi-task model from several pretrained models without further fine-tuning, and it has gained attention in various domains, including natural language processing (NLP). Despite the efficiency, a key challenge in model merging is the seemingly inevitable decrease in task performance as the number of models increases. In this paper, we propose $\mathbf{S}$pectral $\mathbf{T}$runcation $\mathbf{A}$nd $\mathbf{R}$escale (STAR) that aims at mitigating ``merging conflicts'' by truncating small components in the respective spectral spaces, which is followed by an automatic parameter rescaling scheme to retain the nuclear norm of the original matrix. STAR requires no additional inference on original training data and is robust to hyperparamater choice. We demonstrate the effectiveness of STAR through extensive model merging cases on diverse NLP tasks. Specifically, STAR works robustly across varying model sizes, and can outperform baselines by 4.2$\%$ when merging 12 models on Flan-T5. Our code is publicly available at https://github.com/IBM/STAR.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2502.10339

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Accelerating Continuous Normalizing Flow with Trajectory Polynomial Regularization

Huang, Han-Hsien, Yeh, Mi-Yen

arXiv.org Machine LearningDec-8-2020

In this paper, we propose an approach to effectively accelerating the computation of continuous normalizing flow (CNF), which has been proven to be a powerful tool for the tasks such as variational inference and density estimation. The training time cost of CNF can be extremely high because the required number of function evaluations (NFE) for solving corresponding ordinary differential equations (ODE) is very large. We think that the high NFE results from large truncation errors of solving ODEs. To address the problem, we propose to add a regularization. The regularization penalizes the difference between the trajectory of the ODE and its fitted polynomial regression. The trajectory of ODE will approximate a polynomial function, and thus the truncation error will be smaller. Furthermore, we provide two proofs and claim that the additional regularization does not harm training quality. Experimental results show that our proposed method can result in 42.3% to 71.3% reduction of NFE on the task of density estimation, and 19.3% to 32.1% reduction of NFE on variational auto-encoder, while the testing losses are not affected at all.

artificial intelligence, equation, neural network, (19 more...)

arXiv.org Machine Learning

2012.04228

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Attribute-aware Collaborative Filtering: Survey and Classification

Chen, Wen-Hao, Hsu, Chin-Chi, Lai, Yi-An, Liu, Vincent, Yeh, Mi-Yen, Lin, Shou-De

arXiv.org Machine LearningOct-20-2018

Attribute-aware CF models aims at rating prediction given not only the historical rating from users to items, but also the information associated with users (e.g. age), items (e.g. price), or even ratings (e.g. rating time). This paper surveys works in the past decade developing attribute-aware CF systems, and discovered that mathematically they can be classified into four different categories. We provide the readers not only the high level mathematical interpretation of the existing works in this area but also the mathematical insight for each category of models. Finally we provide in-depth experiment results comparing the effectiveness of the major works in each category.

matrix factorization, neural network, survey article, (21 more...)

arXiv.org Machine Learning

1810.08765

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (0.88)
Information Technology (0.68)
Media > Film (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(2 more...)

Add feedback

PRUNE: Preserving Proximity and Global Ranking for Network Embedding

Lai, Yi-An, Hsu, Chin-Chi, Chen, Wen Hao, Yeh, Mi-Yen, Lin, Shou-De

Neural Information Processing SystemsDec-31-2017

We investigate an unsupervised generative approach for network embedding. A multi-task Siamese neural network structure is formulated to connect embedding vectors and our objective to preserve the global node ranking and local proximity of nodes. We provide deeper analysis to connect the proposed proximity objective to link prediction and community detection in the network. We show our model can satisfy the following design properties: scalability, asymmetry, unity and simplicity. Experiment results not only verify the above design properties but also demonstrate the superior performance in learning-to-rank, classification, regression, and link prediction tasks.

inductive learning, neural network, proximity, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(4 more...)

Add feedback