AITopics | Yang, Jinlong

Collaborating Authors

Yang, Jinlong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing

Wang, Yuanhao, Zhang, Cheng, Frazão, Gonçalo, Yang, Jinlong, Ichim, Alexandru-Eugen, Beeler, Thabo, De la Torre, Fernando

arXiv.org Artificial IntelligenceMar-11-2025

We introduce GarmentCrafter, a new approach that enables non-professional users to create and modify 3D garments from a single-view image. While recent advances in image generation have facilitated 2D garment design, creating and editing 3D garments remains challenging for non-professional users. Existing methods for single-view 3D reconstruction often rely on pre-trained generative models to synthesize novel views conditioning on the reference image and camera pose, yet they lack cross-view consistency, failing to capture the internal relationships across different views. In this paper, we tackle this challenge through progressive depth prediction and image warping to approximate novel views. Subsequently, we train a multi-view diffusion model to complete occluded and unknown clothing regions, informed by the evolving camera pose. By jointly inferring RGB and depth, GarmentCrafter enforces inter-view coherence and reconstructs precise geometries and fine details. Extensive experiments demonstrate that our method achieves superior visual fidelity and inter-view coherence compared to state-of-the-art single-view 3D garment reconstruction methods.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.08678

Country:

Asia (0.28)
Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Natural Language (0.88)

Add feedback

Doc-Guided Sent2Sent++: A Sent2Sent++ Agent with Doc-Guided memory for Document-level Machine Translation

Guo, Jiaxin, Luo, Yuanchang, Wei, Daimeng, Zhang, Ling, Li, Zongyao, Shang, Hengchao, Rao, Zhiqiang, Li, Shaojun, Yang, Jinlong, Wu, Zhanglin, Yang, Hao

arXiv.org Artificial IntelligenceJan-14-2025

The field of artificial intelligence has witnessed significant advancements in natural language processing, largely attributed to the capabilities of Large Language Models (LLMs). These models form the backbone of Agents designed to address long-context dependencies, particularly in Document-level Machine Translation (DocMT). DocMT presents unique challenges, with quality, consistency, and fluency being the key metrics for evaluation. Existing approaches, such as Doc2Doc and Doc2Sent, either omit sentences or compromise fluency. This paper introduces Doc-Guided Sent2Sent++, an Agent that employs an incremental sentence-level forced decoding strategy \textbf{to ensure every sentence is translated while enhancing the fluency of adjacent sentences.} Our Agent leverages a Doc-Guided Memory, focusing solely on the summary and its translation, which we find to be an efficient approach to maintaining consistency. Through extensive testing across multiple languages and domains, we demonstrate that Sent2Sent++ outperforms other methods in terms of quality, consistency, and fluency. The results indicate that, our approach has achieved significant improvements in metrics such as s-COMET, d-COMET, LTCR-$1_f$, and document-level perplexity (d-ppl). The contributions of this paper include a detailed analysis of current DocMT research, the introduction of the Sent2Sent++ decoding method, the Doc-Guided Memory mechanism, and validation of its effectiveness across languages and domains.

large language model, machine learning, translation, (19 more...)

arXiv.org Artificial Intelligence

2501.08523

Country:

Europe (0.93)
Asia > China (0.46)
North America > United States > Louisiana (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

M-Ped: Multi-Prompt Ensemble Decoding for Large Language Models

Guo, Jiaxin, Wei, Daimeng, Luo, Yuanchang, Tao, Shimin, Shang, Hengchao, Li, Zongyao, Li, Shaojun, Yang, Jinlong, Wu, Zhanglin, Rao, Zhiqiang, Yang, Hao

arXiv.org Artificial IntelligenceDec-24-2024

With the widespread application of Large Language Models (LLMs) in the field of Natural Language Processing (NLP), enhancing their performance has become a research hotspot. This paper presents a novel multi-prompt ensemble decoding approach designed to bolster the generation quality of LLMs by leveraging the aggregation of outcomes from multiple prompts. Given a unique input $X$, we submit $n$ variations of prompts with $X$ to LLMs in batch mode to decode and derive probability distributions. For each token prediction, we calculate the ensemble probability by averaging the $n$ probability distributions within the batch, utilizing this aggregated probability to generate the token. This technique is dubbed Inner-Batch Ensemble. To facilitate efficient batch inference, we implement a Left-Padding strategy to maintain uniform input lengths across the n prompts. Through extensive experimentation on diverse NLP tasks, including machine translation, code generation, and text simplification, we demonstrate the efficacy of our method in enhancing LLM performance. The results show substantial improvements in BLEU scores, pass@$k$ rates, and LENS metrics over conventional methods.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2412.18299

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images

Zhang, Cheng, Wang, Yuanhao, Carrasco, Francisco Vicente, Wu, Chenglei, Yang, Jinlong, Beeler, Thabo, De la Torre, Fernando

arXiv.org Artificial IntelligenceOct-2-2024

We introduce FabricDiffusion, a method for transferring fabric textures from a single clothing image to 3D garments of arbitrary shapes. Existing approaches typically synthesize textures on the garment surface through 2D-to-3D texture mapping or depth-aware inpainting via generative models. Unfortunately, these methods often struggle to capture and preserve texture details, particularly due to challenging occlusions, distortions, or poses in the input image. Inspired by the observation that in the fashion industry, most garments are constructed by stitching sewing patterns with flat, repeatable textures, we cast the task of clothing texture transfer as extracting distortion-free, tileable texture materials that are subsequently mapped onto the UV space of the garment. Building upon this insight, we train a denoising diffusion model with a large-scale synthetic dataset to rectify distortions in the input texture image. This process yields a flat texture map that enables a tight coupling with existing Physically-Based Rendering (PBR) material generation pipelines, allowing for realistic relighting of the garment under various lighting conditions. We show that FabricDiffusion can transfer various features from a single clothing image including texture patterns, material properties, and detailed prints and logos. Extensive experiments demonstrate that our model significantly outperforms state-to-the-art methods on both synthetic data and real-world, in-the-wild clothing images while generalizing to unseen textures and garment shapes.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.01801

Country: North America > United States (0.47)

Genre: Research Report (0.82)

Industry: Textiles, Apparel & Luxury Goods (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.89)

Add feedback

Context-aware and Style-related Incremental Decoding framework for Discourse-Level Literary Translation

Luo, Yuanchang, Guo, Jiaxin, Wei, Daimeng, Shang, Hengchao, Li, Zongyao, Wu, Zhanglin, Rao, Zhiqiang, Li, Shaojun, Yang, Jinlong, Yang, Hao

arXiv.org Artificial IntelligenceSep-29-2024

This report outlines our approach for the WMT24 Discourse-Level Literary Translation Task, focusing on the Chinese-English language pair in the Constrained Track. Translating literary texts poses significant challenges due to the nuanced meanings, idiomatic expressions, and intricate narrative structures inherent in such works. To address these challenges, we leveraged the Chinese-Llama2 model, specifically enhanced for this task through a combination of Continual Pre-training (CPT) and Supervised Fine-Tuning (SFT). Our methodology includes a novel Incremental Decoding framework, which ensures that each sentence is translated with consideration of its broader context, maintaining coherence and consistency throughout the text. This approach allows the model to capture long-range dependencies and stylistic elements, producing translations that faithfully preserve the original literary quality. Our experiments demonstrate significant improvements in both sentence-level and document-level BLEU scores, underscoring the effectiveness of our proposed framework in addressing the complexities of document-level literary translation.

machine learning, natural language, translation, (16 more...)

arXiv.org Artificial Intelligence

2409.16539

Country:

North America > United States (0.14)
Asia > China (0.14)
North America > Canada (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Multilingual Transfer and Domain Adaptation for Low-Resource Languages of Spain

Luo, Yuanchang, Wu, Zhanglin, Wei, Daimeng, Shang, Hengchao, Li, Zongyao, Guo, Jiaxin, Rao, Zhiqiang, Li, Shaojun, Yang, Jinlong, Xie, Yuhao, Wei, Jiawei Zheng Bin, Yang, Hao

arXiv.org Artificial IntelligenceSep-29-2024

This article introduces the submission status of the Translation into Low-Resource Languages of Spain task at (WMT 2024) by Huawei Translation Service Center (HW-TSC). We participated in three translation tasks: spanish to aragonese (es-arg), spanish to aranese (es-arn), and spanish to asturian (es-ast). For these three translation tasks, we use training strategies such as multilingual transfer, regularized dropout, forward translation and back translation, labse denoising, transduction ensemble learning and other strategies to neural machine translation (NMT) model based on training deep transformer-big architecture. By using these enhancement strategies, our submission achieved a competitive result in the final evaluation.

machine learning, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2409.15924

Country: Europe > Spain (0.62)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Exploring the traditional NMT model and Large Language Model for chat translation

Yang, Jinlong, Shang, Hengchao, Wei, Daimeng, Guo, Jiaxin, Li, Zongyao, Wu, Zhanglin, Rao, Zhiqiang, Li, Shaojun, Xie, Yuhao, Luo, Yuanchang, Zheng, Jiawei, Wei, Bin, Yang, Hao

arXiv.org Artificial IntelligenceSep-24-2024

This paper describes the submissions of Huawei Translation Services Center(HW-TSC) to WMT24 chat translation shared task on English$\leftrightarrow$Germany (en-de) bidirection. The experiments involved fine-tuning models using chat data and exploring various strategies, including Minimum Bayesian Risk (MBR) decoding and self-training. The results show significant performance improvements in certain directions, with the MBR self-training method achieving the best results. The Large Language Model also discusses the challenges and potential avenues for further research in the field of chat translation.

large language model, machine learning, translation, (16 more...)

arXiv.org Artificial Intelligence

2409.16331

Country:

Asia > Middle East > UAE (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

Machine Translation Advancements of Low-Resource Indian Languages by Transfer Learning

Wei, Bin, Zhen, Jiawei, Li, Zongyao, Wu, Zhanglin, Wei, Daimeng, Guo, Jiaxin, Rao, Zhiqiang, Li, Shaojun, Luo, Yuanchang, Shang, Hengchao, Yang, Jinlong, Xie, Yuhao, Yang, Hao

arXiv.org Artificial IntelligenceSep-24-2024

This paper introduces the submission by Huawei Translation Center (HW-TSC) to the WMT24 Indian Languages Machine Translation (MT) Shared Task. To develop a reliable machine translation system for low-resource Indian languages, we employed two distinct knowledge transfer strategies, taking into account the characteristics of the language scripts and the support available from existing open-source models for Indian languages. For Assamese(as) and Manipuri(mn), we fine-tuned the existing IndicTrans2 open-source model to enable bidirectional translation between English and these languages. For Khasi (kh) and Mizo (mz), We trained a multilingual model as a baseline using bilingual data from these four language pairs, along with an additional about 8kw English-Bengali bilingual data, all of which share certain linguistic features. This was followed by fine-tuning to achieve bidirectional translation between English and Khasi, as well as English and Mizo. Our transfer learning experiments produced impressive results: 23.5 BLEU for en-as, 31.8 BLEU for en-mn, 36.2 BLEU for as-en, and 47.9 BLEU for mn-en on their respective test sets. Similarly, the multilingual model transfer learning experiments yielded impressive outcomes, achieving 19.7 BLEU for en-kh, 32.8 BLEU for en-mz, 16.1 BLEU for kh-en, and 33.9 BLEU for mz-en on their respective test sets. These results not only highlight the effectiveness of transfer learning techniques for low-resource languages but also contribute to advancing machine translation capabilities for low-resource Indian languages.

machine learning, natural language, translation, (18 more...)

arXiv.org Artificial Intelligence

2409.15879

Country:

North America > Canada (0.14)
Europe > Czechia (0.14)
Asia > China (0.14)
Africa > Nigeria (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning Disentangled Avatars with Hybrid 3D Representations

Feng, Yao, Liu, Weiyang, Bolkart, Timo, Yang, Jinlong, Pollefeys, Marc, Black, Michael J.

arXiv.org Artificial IntelligenceSep-12-2023

Tremendous efforts have been made to learn animatable and photorealistic human avatars. Towards this end, both explicit and implicit 3D representations are heavily studied for a holistic modeling and capture of the whole human (e.g., body, clothing, face and hair), but neither representation is an optimal choice in terms of representation efficacy since different parts of the human avatar have different modeling desiderata. For example, meshes are generally not suitable for modeling clothing and hair. Motivated by this, we present Disentangled Avatars~(DELTA), which models humans with hybrid explicit-implicit 3D representations. DELTA takes a monocular RGB video as input, and produces a human avatar with separate body and clothing/hair layers. Specifically, we demonstrate two important applications for DELTA. For the first one, we consider the disentanglement of the human body and clothing and in the second, we disentangle the face and hair. To do so, DELTA represents the body or face with an explicit mesh-based parametric 3D model and the clothing or hair with an implicit neural radiance field. To make this possible, we design an end-to-end differentiable renderer that integrates meshes into volumetric rendering, enabling DELTA to learn directly from monocular videos without any 3D supervision. Finally, we show that how these two applications can be easily combined to model full-body avatars, such that the hair, face, body and clothing can be fully disentangled yet jointly rendered. Such a disentanglement enables hair and clothing transfer to arbitrary body shapes. We empirically validate the effectiveness of DELTA's disentanglement by demonstrating its promising performance on disentangled reconstruction, virtual clothing try-on and hairstyle transfer. To facilitate future research, we also release an open-sourced pipeline for the study of hybrid human avatar modeling.

artificial intelligence, machine learning, representation, (16 more...)

arXiv.org Artificial Intelligence

2309.06441

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

Sanyal, Soubhik, Ghosh, Partha, Yang, Jinlong, Black, Michael J., Thies, Justus, Bolkart, Timo

arXiv.org Artificial IntelligenceAug-21-2023

We present SCULPT, a novel 3D generative model for clothed and textured 3D meshes of humans. Specifically, we devise a deep neural network that learns to represent the geometry and appearance distribution of clothed human bodies. Training such a model is challenging, as datasets of textured 3D meshes for humans are limited in size and accessibility. Our key observation is that there exist medium-sized 3D scan datasets like CAPE, as well as large-scale 2D image datasets of clothed humans and multiple appearances can be mapped to a single geometry. To effectively learn from the two data modalities, we propose an unpaired learning procedure for pose-dependent clothed and textured human meshes. Specifically, we learn a pose-dependent geometry space from 3D scan data. We represent this as per vertex displacements w.r.t. the SMPL model. Next, we train a geometry conditioned texture generator in an unsupervised way using the 2D image data. We use intermediate activations of the learned geometry model to condition our texture generator. To alleviate entanglement between pose and clothing type, and pose and clothing appearance, we condition both the texture and geometry generators with attribute labels such as clothing types for the geometry, and clothing colors for the texture generator. We automatically generated these conditioning labels for the 2D images based on the visual question answering model BLIP and CLIP. We validate our method on the SCULPT dataset, and compare to state-of-the-art 3D generative models for clothed human bodies. We will release the codebase for research purposes.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2308.10638

Country: Asia (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.54)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback