AITopics | Yu, Qian

Collaborating Authors

Yu, Qian

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation

Xing, Ximing, Yu, Qian, Wang, Chuang, Zhou, Haitao, Zhang, Jing, Xu, Dong

arXiv.org Artificial IntelligenceDec-13-2024

Recently, text-guided scalable vector graphics (SVG) synthesis has demonstrated significant potential in domains such as iconography and sketching. However, SVGs generated from existing Text-to-SVG methods often lack editability and exhibit deficiencies in visual quality and diversity. In this paper, we propose a novel text-guided vector graphics synthesis method to address these limitations. To enhance the editability of output SVGs, we introduce a Hierarchical Image VEctorization (HIVE) framework that operates at the semantic object level and supervises the optimization of components within the vector object. This approach facilitates the decoupling of vector graphics into distinct objects and component levels. Our proposed HIVE algorithm, informed by image segmentation priors, not only ensures a more precise representation of vector graphics but also enables fine-grained editing capabilities within vector objects. To improve the diversity of output SVGs, we present a Vectorized Particle-based Score Distillation (VPSD) approach. VPSD addresses over-saturation issues in existing methods and enhances sample diversity. A pre-trained reward model is incorporated to re-weight vector particles, improving aesthetic appeal and enabling faster convergence. Additionally, we design a novel adaptive vector primitives control strategy, which allows for the dynamic adjustment of the number of primitives, thereby enhancing the presentation of graphic details. Extensive experiments validate the effectiveness of the proposed method, demonstrating its superiority over baseline methods in terms of editability, visual quality, and diversity. We also show that our new method supports up to six distinct vector styles, capable of generating high-quality vector assets suitable for stylized vector design and poster design. Code and demo will be released at: http://ximinng.github.io/SVGDreamerV2Project/

machine learning, natural language, vector graphic, (19 more...)

arXiv.org Artificial Intelligence

2411.17832

Country: Asia > China (0.28)

Genre: Research Report (0.81)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion

Xing, Ximing, Hu, Juncheng, Zhang, Jing, Xu, Dong, Yu, Qian

arXiv.org Artificial IntelligenceDec-11-2024

The generation of Scalable Vector Graphics (SVG) assets from textual data remains a significant challenge, largely due to the scarcity of high-quality vector datasets and the limitations in scalable vector representations required for modeling intricate graphic distributions. This work introduces SVGFusion, a Text-to-SVG model capable of scaling to real-world SVG data without reliance on a text-based discrete language model or prolonged SDS optimization. The essence of SVGFusion is to learn a continuous latent space for vector graphics with a popular Text-to-Image framework. Specifically, SVGFusion consists of two modules: a Vector-Pixel Fusion Variational Autoencoder (VP-VAE) and a Vector Space Diffusion Transformer (VS-DiT). VP-VAE takes both the SVGs and corresponding rasterizations as inputs and learns a continuous latent space, whereas VS-DiT learns to generate a latent code within this space based on the text prompt. Based on VP-VAE, a novel rendering sequence modeling strategy is proposed to enable the latent space to embed the knowledge of construction logics in SVGs. This empowers the model to achieve human-like design capabilities in vector graphics, while systematically preventing occlusion in complex graphic compositions. Moreover, our SVGFusion's ability can be continuously improved by leveraging the scalability of the VS-DiT by adding more VS-DiT blocks. A large-scale SVG dataset is collected to evaluate the effectiveness of our proposed method. Extensive experimentation has confirmed the superiority of our SVGFusion over existing SVG generation methods, achieving enhanced quality and generalizability, thereby establishing a novel framework for SVG content creation. Code, model, and data will be released at: \href{https://ximinng.github.io/SVGFusionProject/}{https://ximinng.github.io/SVGFusionProject/}

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.10437

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.61)

Add feedback

Deep Learning for Cross-Border Transaction Anomaly Detection in Anti-Money Laundering Systems

Yu, Qian, Xu, Zhen, Ke, Zong

arXiv.org Artificial IntelligenceNov-20-2024

In the context of globalization and the rapid expansion of the digital economy, anti-money laundering (AML) has become a crucial aspect of financial oversight, particularly in cross-border transactions. The rising complexity and scale of international financial flows necessitate more intelligent and adaptive AML systems to combat increasingly sophisticated money laundering techniques. This paper explores the application of unsupervised learning models in cross-border AML systems, focusing on rule optimization through contrastive learning techniques. Five deep learning models, ranging from basic convolutional neural networks (CNNs) to hybrid CNNGRU architectures, were designed and tested to assess their performance in detecting abnormal transactions. The results demonstrate that as model complexity increases, so does the system's detection accuracy and responsiveness. In particular, the self-developed hybrid Convolutional-Recurrent Neural Integration Model (CRNIM) model showed superior performance in terms of accuracy and area under the receiver operating characteristic curve (AUROC). These findings highlight the potential of unsupervised learning models to significantly improve the intelligence, flexibility, and real-time capabilities of AML systems. By optimizing detection rules and enhancing adaptability to emerging money laundering schemes, this research provides both theoretical and practical contributions to the advancement of AML technologies, which are essential for safeguarding the global financial system against illicit activities.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2412.07027

Country: Asia (0.47)

Genre: Research Report > New Finding (0.88)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Contrastive Learning for Knowledge-Based Question Generation in Large Language Models

Zhang, Zhenhong, Chen, Jiajing, Shi, Weiyan, Yi, Lingjie, Wang, Chihang, Yu, Qian

arXiv.org Artificial IntelligenceSep-26-2024

With the rapid development of artificial intelligence technology, especially the increasingly widespread application of question-and-answer systems, high-quality question generation has become a key component in supporting the development of these systems. This article focuses on knowledge-based question generation technology, which aims to enable computers to simulate the human questioning process based on understanding specific texts or knowledge bases. In light of the issues of hallucination and knowledge gaps present in large-scale language models when applied to knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. This method utilizes multiple models to jointly mine domain knowledge and uses contrastive learning to guide the model in reducing noise and hallucinations in generation. Experimental results show that by designing prompts containing contrasting examples, the model's performance in question generation improves considerably, particularly when contrasting instructions and examples are used simultaneously, leading to the highest quality of generated questions and improved accuracy. These results demonstrate that the method proposed in this study, which combines contrasting context and chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation.

large language model, machine learning, question answering, (16 more...)

arXiv.org Artificial Intelligence

2409.13994

Country: North America > United States (0.70)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Graph Neural Network Framework for Sentiment Analysis Using Syntactic Feature

Wu, Linxiao, Luo, Yuanshuai, Zhu, Binrong, Liu, Guiran, Wang, Rui, Yu, Qian

arXiv.org Artificial IntelligenceSep-20-2024

Amidst the swift evolution of social media platforms and e-commerce ecosystems, the domain of opinion mining has surged as a pivotal area of exploration within natural language processing. A specialized segment within this field focuses on extracting nuanced evaluations tied to particular elements within textual contexts. This research advances a composite framework that amalgamates the positional cues of topical descriptors. The proposed system converts syntactic structures into a matrix format, leveraging convolutions and attention mechanisms within a graph to distill salient characteristics. Incorporating the positional relevance of descriptors relative to lexical items enhances the sequential integrity of the input. Trials have substantiated that this integrated graph-centric scheme markedly elevates the efficacy of evaluative categorization, showcasing preeminence.

machine learning, natural language, sentiment analysis, (14 more...)

arXiv.org Artificial Intelligence

2409.14

Country: North America > United States (0.30)

Genre: Research Report (0.82)

Industry: Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Add feedback

Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity

Yu, Qian, Wang, Yining, Huang, Baihe, Lei, Qi, Lee, Jason D.

arXiv.org Artificial IntelligenceJun-27-2024

Optimization of convex functions under stochastic zeroth-order feedback has been a major and challenging question in online learning. In this work, we consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm is only accessible to noisy evaluations of the objective function it queries. We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds. We propose an algorithm that features a combination of a bootstrapping stage and a mirror-descent stage. Our main technical innovation consists of a sharp characterization for the spherical-sampling gradient estimator under higher-order smoothness conditions, which allows the algorithm to optimally balance the bias-variance tradeoff, and a new iterative method for the bootstrapping stage, which maintains the performance for unbounded Hessian.

artificial intelligence, convexity and lipschitz hessian, stochastic zeroth-order optimization, (1 more...)

arXiv.org Artificial Intelligence

2406.19617

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.60)

Add feedback

Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Ma, Qinghe, Zhang, Jian, Qi, Lei, Yu, Qian, Shi, Yinghuan, Gao, Yang

arXiv.org Artificial IntelligenceApr-13-2024

Both limited annotation and domain shift are prevalent challenges in medical image segmentation. Traditional semi-supervised segmentation and unsupervised domain adaptation methods address one of these issues separately. However, the coexistence of limited annotation and domain shift is quite common, which motivates us to introduce a novel and challenging scenario: Mixed Domain Semi-supervised medical image Segmentation (MiDSS). In this scenario, we handle data from multiple medical centers, with limited annotations available for a single domain and a large amount of unlabeled data from multiple domains. We found that the key to solving the problem lies in how to generate reliable pseudo labels for the unlabeled data in the presence of domain shift with labeled data. To tackle this issue, we employ Unified Copy-Paste (UCP) between images to construct intermediate domains, facilitating the knowledge transfer from the domain of labeled data to the domains of unlabeled data. To fully utilize the information within the intermediate domain, we propose a symmetric Guidance training strategy (SymGD), which additionally offers direct guidance to unlabeled data by merging pseudo labels from intermediate samples. Subsequently, we introduce a Training Process aware Random Amplitude MixUp (TP-RAM) to progressively incorporate style-transition components into intermediate samples. Compared with existing state-of-the-art approaches, our method achieves a notable 13.57% improvement in Dice score on Prostate dataset, as demonstrated on three public datasets. Our code is available at https://github.com/MQinghe/MiDSS .

artificial intelligence, machine learning, unlabeled data, (17 more...)

arXiv.org Artificial Intelligence

2404.08951

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

Xing, Ximing, Wang, Chuang, Zhou, Haitao, Zhang, Jing, Yu, Qian, Xu, Dong

arXiv.org Artificial IntelligenceJan-15-2024

Even though trained mainly on images, we discover that pretrained diffusion models show impressive power in guiding sketch synthesis. In this paper, we present DiffSketcher, an innovative algorithm that creates \textit{vectorized} free-hand sketches using natural language input. DiffSketcher is developed based on a pre-trained text-to-image diffusion model. It performs the task by directly optimizing a set of B\'ezier curves with an extended version of the score distillation sampling (SDS) loss, which allows us to use a raster-level diffusion model as a prior for optimizing a parametric vectorized sketch generator. Furthermore, we explore attention maps embedded in the diffusion model for effective stroke initialization to speed up the generation process. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and essential visual details of the subject drawn. Our experiments show that DiffSketcher achieves greater quality than prior work. The code and demo of DiffSketcher can be found at https://ximinng.github.io/DiffSketcher-project/.

artificial intelligence, machine learning, sketch, (19 more...)

arXiv.org Artificial Intelligence

2306.14685

Country:

Asia (0.46)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Media > Photography (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

SVGDreamer: Text Guided SVG Generation with Diffusion Model

Xing, Ximing, Zhou, Haitao, Wang, Chuang, Zhang, Jing, Xu, Dong, Yu, Qian

arXiv.org Artificial IntelligenceJan-3-2024

Recently, text-guided scalable vector graphics (SVGs) synthesis has shown promise in domains such as iconography and sketch. However, existing text-to-SVG generation methods lack editability and struggle with visual quality and result diversity. To address these limitations, we propose a novel text-guided vector graphics synthesis method called SVGDreamer. SVGDreamer incorporates a semantic-driven image vectorization (SIVE) process that enables the decomposition of synthesis into foreground objects and background, thereby enhancing editability. Specifically, the SIVE process introduce attention-based primitive control and an attention-mask loss function for effective control and manipulation of individual elements. Additionally, we propose a Vectorized Particle-based Score Distillation (VPSD) approach to tackle the challenges of color over-saturation, vector primitives over-smoothing, and limited result diversity in existing text-to-SVG generation methods. Furthermore, on the basis of VPSD, we introduce Reward Feedback Learning (ReFL) to accelerate VPSD convergence and improve aesthetic appeal. Extensive experiments have been conducted to validate the effectiveness of SVGDreamer, demonstrating its superiority over baseline methods in terms of editability, visual quality, and diversity. The code and demo of SVGDreamer can be found at \href{https://ximinng.github.io/SVGDreamer-project/}{https://ximinng.github.io/SVGDreamer-project/}.

machine learning, reinforcement learning, vector graphic, (21 more...)

arXiv.org Artificial Intelligence

2312.16476

Country: Asia (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Data Contamination Issues in Brain-to-Text Decoding

Yin, Congchi, Yu, Qian, Fang, Zhiwei, He, Jie, Peng, Changping, Lin, Zhangang, Shao, Jingping, Li, Piji

arXiv.org Artificial IntelligenceDec-26-2023

Decoding non-invasive cognitive signals to natural language has long been the goal of building practical brain-computer interfaces (BCIs). Recent major milestones have successfully decoded cognitive signals like functional Magnetic Resonance Imaging (fMRI) and electroencephalogram (EEG) into text under open vocabulary setting. However, how to split the datasets for training, validating, and testing in cognitive signal decoding task still remains controversial. In this paper, we conduct systematic analysis on current dataset splitting methods and find the existence of data contamination largely exaggerates model performance. Specifically, first we find the leakage of test subjects' cognitive signals corrupts the training of a robust encoder. Second, we prove the leakage of text stimuli causes the auto-regressive decoder to memorize information in test set. The decoder generates highly accurate text not because it truly understands cognitive signals. To eliminate the influence of data contamination and fairly evaluate different models' generalization ability, we propose a new splitting method for different types of cognitive datasets (e.g. fMRI, EEG). We also test the performance of SOTA Brain-to-Text decoding models under the proposed dataset splitting paradigm as baselines for further research.

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2312.10987

Country:

Asia > China (0.28)
North America > United States > Hawaii (0.14)
North America > Canada > Quebec (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.69)
Health & Medicine > Diagnostic Medicine > Imaging (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.34)

Add feedback