Huang, Zhijie
KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model
Dai, Weichen, Chen, Yezeng, Dai, Zijie, Huang, Zhijie, Liu, Yubo, Pan, Yixuan, Song, Baiyang, Zhong, Chengli, Li, Xinhe, Wang, Zeyu, Feng, Zhuoying, Zhou, Yi
In recent years, the rapid development of artificial intelligence (AI) technology has enabled it to achieve, and in some cases surpass, top human performance on a variety of high-intelligence tasks. These include speech [1], facial [2], and image recognition [3]; games such as Go [4], StarCraft [5], and Dota2 [6]; text [7], image [8], and video generation; machine translation [9]; knowledge-based question answering [10]; debate; and solving advanced mathematical problems [11]. Science is one of the most important application domains for AI. As the crown jewel of human civilization and the cornerstone of many industries, science is a core driver of human progress, and its development can significantly accelerate, and even revolutionize, many fields. Historically, scientific research has followed three major paradigms: the first paradigm, experiment, which emerged from Newtonian empiricism; the second paradigm, theory, born of Einstein's rationalism; and the third paradigm, simulation/computation, which arose from the third industrial revolution, the computation and information revolution.
An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning
Chen, Zui, Chen, Yezeng, Han, Jiaqi, Huang, Zhijie, Qi, Ji, Zhou, Yi
Large language models (LLMs) are displaying emergent abilities on math reasoning tasks, and there is growing attention on enhancing the ability of open-source LLMs through supervised fine-tuning (SFT). In this paper, we aim to explore a general strategy for supervised data that helps optimize and expand math reasoning ability. First, we determine the ability boundary of reasoning-path augmentation by identifying the minimal optimal set of these paths. Second, we validate that different abilities of the model can be cumulatively enhanced by mixing the minimal optimal sets of the corresponding data types, and that our resulting models, MMOS, achieve SOTA performance across a series of base models at much lower construction cost. Besides, we point out that GSM-HARD is not really hard and that today's LLMs no longer lack numerical robustness. We also provide an Auto Problem Generator for robustness testing and educational applications. Our code and data are publicly available at https://github.com/cyzhh/MMOS.
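A minimal Python sketch of the deduplication idea behind a "minimal optimal set" of reasoning paths: paths that differ only in surface form or sampled constants collapse to one representative per question. The normalization rule and data layout here are illustrative assumptions, not the MMOS pipeline itself (see the linked repository for the authors' code).

```python
# Hypothetical sketch of reasoning-path deduplication toward a "minimal
# optimal set": keep one representative per normalized solution path.
import re

def normalize(path: str) -> str:
    """Strip whitespace and numeric literals so that paths differing only
    in surface form or constants collapse to the same signature."""
    path = re.sub(r"\d+(\.\d+)?", "<num>", path)
    return re.sub(r"\s+", "", path).lower()

def minimal_set(samples: list[dict]) -> list[dict]:
    """samples: [{'question': ..., 'path': ...}] -> one path per signature."""
    seen: dict[tuple[str, str], dict] = {}
    for s in samples:
        key = (s["question"], normalize(s["path"]))
        seen.setdefault(key, s)  # keep the first representative
    return list(seen.values())

samples = [
    {"question": "q1", "path": "x = 3 + 4 = 7"},
    {"question": "q1", "path": "x = 3+4 = 7"},   # duplicate after normalization
    {"question": "q1", "path": "x = 7 - 0 = 7"}, # distinct path, kept
]
print(len(minimal_set(samples)))  # -> 2
```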
MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation
Wu, Xinda, Huang, Zhijie, Zhang, Kejun, Yu, Jiaxing, Tan, Xu, Zhang, Tieyao, Wang, Zihao, Sun, Lingyun
Pre-trained language models have achieved impressive results in various music understanding and generation tasks. However, existing pre-training methods for symbolic melody generation struggle to capture the multi-scale, multi-dimensional structural information in note sequences, owing to the domain-knowledge gap between text and music. Moreover, the scarcity of large-scale symbolic melody datasets limits further pre-training gains. In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure. We design melodic n-gram and long-span sampling strategies that create local and global blank-infilling tasks for modeling the local and global structures in melodies. Specifically, we incorporate pitch n-grams, rhythm n-grams, and their combined n-grams into the melodic n-gram blank-infilling tasks to model the multi-dimensional structures in melodies. To support large-scale pre-training and domain-specific n-gram lexicon construction, we have built MelodyNet, a large-scale symbolic melody dataset containing more than 0.4 million melody pieces. Both subjective and objective evaluations demonstrate that MelodyGLM surpasses standard and previous pre-training methods. In particular, subjective evaluations show that, on the melody continuation task, MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in consistency, rhythmicity, structure, and overall quality, respectively. Notably, MelodyGLM nearly matches the quality of human-composed melodies on the melody inpainting task.
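As a rough illustration of the n-gram blank-infilling idea described above, the Python sketch below masks a contiguous note n-gram drawn from a mined lexicon and returns it as the regeneration target. The note-token names and the lexicon are hypothetical placeholders, not MelodyGLM's actual vocabulary or implementation.

```python
# Toy sketch of melodic n-gram blank infilling: mask a contiguous n-gram
# of note tokens and ask the model to regenerate it. Token names and the
# lexicon are hypothetical placeholders.
import random

def ngram_blank_infill(tokens: list[str], lexicon: set[tuple[str, ...]],
                       n: int = 3, seed: int = 0) -> tuple[list[str], list[str]]:
    """Pick an n-gram from the domain lexicon if present, else a random span;
    return (corrupted sequence, masked target)."""
    rng = random.Random(seed)
    spans = [i for i in range(len(tokens) - n + 1)
             if tuple(tokens[i:i + n]) in lexicon]
    start = rng.choice(spans) if spans else rng.randrange(len(tokens) - n + 1)
    target = tokens[start:start + n]
    corrupted = tokens[:start] + ["[MASK]"] + tokens[start + n:]
    return corrupted, target

melody = ["C4", "E4", "G4", "E4", "C4", "D4"]
lexicon = {("E4", "G4", "E4")}     # a "pitch n-gram" mined from data
src, tgt = ngram_blank_infill(melody, lexicon)
print(src, "->", tgt)  # ['C4', '[MASK]', 'C4', 'D4'] -> ['E4', 'G4', 'E4']
```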
Efficient Multi-Scale Attention Module with Cross-Spatial Learning
Ouyang, Daliang, He, Su, Zhang, Guozhong, Luo, Mingzhu, Guo, Huaiyong, Zhan, Jian, Huang, Zhijie
The remarkable effectiveness of channel and spatial attention mechanisms in producing more discernible feature representations has been demonstrated across various computer vision tasks. However, modeling cross-channel relationships with channel dimensionality reduction may have side effects on the extraction of deep visual representations. In this paper, we propose a novel efficient multi-scale attention (EMA) module. To retain per-channel information while reducing computational overhead, we reshape some of the channels into the batch dimension and group the channel dimension into multiple sub-features, so that spatial semantic features are well distributed within each feature group. Specifically, besides encoding global information to recalibrate the channel-wise weights in each parallel branch, the output features of the two parallel branches are further aggregated through a cross-dimension interaction that captures pixel-level pairwise relationships. We conduct extensive ablation studies and experiments on image classification and object detection tasks with popular benchmarks (e.g., CIFAR-100, ImageNet-1k, MS COCO, and VisDrone2019) to evaluate its performance.
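A condensed PyTorch sketch of the grouping trick described above: channel groups are folded into the batch dimension so attention is computed per sub-feature without channel dimensionality reduction, and two parallel branches interact through pooled descriptors. The branch design here is deliberately simplified and should not be read as the exact EMA module; consult the paper for the full architecture.

```python
# Simplified sketch of EMA's channel-grouping idea, not the exact module.
import torch
import torch.nn as nn

class EMASketch(nn.Module):
    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        assert channels % groups == 0
        self.g = groups
        c = channels // groups
        self.conv1 = nn.Conv2d(c, c, kernel_size=1)              # 1x1 branch
        self.conv3 = nn.Conv2d(c, c, kernel_size=3, padding=1)   # 3x3 branch
        self.pool = nn.AdaptiveAvgPool2d(1)                      # global context

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        xg = x.reshape(b * self.g, c // self.g, h, w)  # fold groups into batch
        # Branch 1: 1x1 conv recalibrated by its own global descriptor.
        a1 = self.conv1(xg) * torch.sigmoid(self.pool(xg))
        # Branch 2: 3x3 conv captures a larger spatial scale.
        a2 = self.conv3(xg)
        # Cross-branch interaction: each branch's pooled descriptor weights
        # the other branch's spatial map (pixel-level pairwise relationship).
        w12 = torch.sigmoid((self.pool(a1) * a2).sum(1, keepdim=True))
        w21 = torch.sigmoid((self.pool(a2) * a1).sum(1, keepdim=True))
        out = xg * (w12 + w21)
        return out.reshape(b, c, h, w)

x = torch.randn(2, 64, 32, 32)
print(EMASketch(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```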
WuYun: Exploring hierarchical skeleton-guided melody generation using knowledge-enhanced deep learning
Zhang, Kejun, Wu, Xinda, Zhang, Tieyao, Huang, Zhijie, Tan, Xu, Liang, Qihao, Wu, Songruoyao, Sun, Lingyun
Although deep learning has revolutionized music generation, existing methods for structured melody generation follow an end-to-end, left-to-right, note-by-note generative paradigm and treat every note equally. Here, we present WuYun, a knowledge-enhanced deep learning architecture for improving the structure of generated melodies: it first generates the most structurally important notes to construct a melodic skeleton and then dynamically infills the skeleton with decorative notes to form a full-fledged melody. Specifically, we use music domain knowledge to extract melodic skeletons and employ sequence learning to reconstruct them, and these skeletons serve as additional knowledge providing auxiliary guidance for the melody generation process. We demonstrate that WuYun can generate melodies with better long-term structure and musicality, outperforming other state-of-the-art methods by an average of 0.51 across all subjective evaluation metrics. Our study provides a multidisciplinary lens for designing melodic hierarchical structures and bridges the gap between data-driven and knowledge-based approaches for numerous music generation tasks.
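A toy two-stage Python sketch of the skeleton-then-infill idea: structurally salient notes are selected first, then the gaps between them are decorated. Both the salience heuristic (longest note per bar) and the passing-tone infill are hypothetical stand-ins for WuYun's music-theoretic skeleton extraction and learned infilling model.

```python
# Toy two-stage sketch: extract a melodic skeleton, then infill it.
# Heuristics below are illustrative stand-ins, not WuYun's method.
from dataclasses import dataclass

@dataclass
class Note:
    pitch: int    # MIDI pitch number
    start: float  # onset in beats
    dur: float    # duration in beats

def extract_skeleton(notes: list[Note], beats_per_bar: int = 4) -> list[Note]:
    """Keep the longest note in each bar as the structural anchor."""
    bars: dict[int, Note] = {}
    for n in notes:
        bar = int(n.start // beats_per_bar)
        if bar not in bars or n.dur > bars[bar].dur:
            bars[bar] = n
    return [bars[b] for b in sorted(bars)]

def infill(skeleton: list[Note]) -> list[Note]:
    """Insert a passing tone halfway between consecutive skeleton notes
    (a trivial stand-in for the learned infilling model)."""
    out: list[Note] = []
    for a, b in zip(skeleton, skeleton[1:]):
        out.append(a)
        out.append(Note((a.pitch + b.pitch) // 2, (a.start + b.start) / 2, 0.5))
    out.append(skeleton[-1])
    return out

melody = [Note(60, 0, 2), Note(62, 2, 1), Note(64, 4, 2), Note(65, 6, 1)]
print([n.pitch for n in infill(extract_skeleton(melody))])  # -> [60, 62, 64]
```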