AITopics | Wang, Zhiyong

Plotting

Wang, Zhiyong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The FruitShell French synthesis system at the Blizzard 2023 Challenge

Qi, Xin, Wang, Xiaopeng, Wang, Zhiyong, Liu, Wang, Ding, Mingming, Shi, Shuchen

arXiv.org Artificial IntelligenceAug-31-2023

This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phonemes and eliminated symbols that had no pronunciation or zero duration. Additionally, we added word boundary and start/end symbols to the text, which we have found to improve speech quality based on our previous experience. For the Spoke task, we performed data augmentation according to the competition rules. We used an open-source G2P model to transcribe the French texts into phonemes. As the G2P model uses the International Phonetic Alphabet (IPA), we applied the same transcription process to the provided competition data for standardization. However, due to compiler limitations in recognizing special symbols from the IPA chart, we followed the rules to convert all phonemes into the phonetic scheme used in the competition data. Finally, we resampled all competition audio to a uniform sampling rate of 16 kHz. We employed a VITS-based acoustic model with the hifigan vocoder. For the Spoke task, we trained a multi-speaker model and incorporated speaker information into the duration predictor, vocoder, and flow layers of the model. The evaluation results of our system showed a quality MOS score of 3.6 for the Hub task and 3.4 for the Spoke task, placing our system at an average level among all participating teams.

artificial intelligence, fruitshell french synthesis system, speech synthesis, (2 more...)

arXiv.org Artificial Intelligence

2309.00223

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.53)

Add feedback

Terrain Diffusion Network: Climatic-Aware Terrain Generation with Geological Sketch Guidance

Hu, Zexin, Hu, Kun, Mo, Clinton, Pan, Lei, Wang, Zhiyong

arXiv.org Artificial IntelligenceAug-31-2023

Sketch-based terrain generation seeks to create realistic landscapes for virtual environments in various applications such as computer games, animation and virtual reality. Recently, deep learning based terrain generation has emerged, notably the ones based on generative adversarial networks (GAN). However, these methods often struggle to fulfill the requirements of flexible user control and maintain generative diversity for realistic terrain. Therefore, we propose a novel diffusion-based method, namely terrain diffusion network (TDN), which actively incorporates user guidance for enhanced controllability, taking into account terrain features like rivers, ridges, basins, and peaks. Instead of adhering to a conventional monolithic denoising process, which often compromises the fidelity of terrain details or the alignment with user control, a multi-level denoising scheme is proposed to generate more realistic terrains by taking into account fine-grained details, particularly those related to climatic patterns influenced by erosion and tectonic activities. Specifically, three terrain synthesisers are designed for structural, intermediate, and fine-grained level denoising purposes, which allow each synthesiser concentrate on a distinct terrain aspect. Moreover, to maximise the efficiency of our TDN, we further introduce terrain and sketch latent spaces for the synthesizers with pre-trained terrain autoencoders. Comprehensive experiments on a new dataset constructed from NASA Topology Images clearly demonstrate the effectiveness of our proposed method, achieving the state-of-the-art performance. Our code and dataset will be publicly available.

artificial intelligence, climatic-aware terrain generation, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2308.16725

Country: North America > United States > California (0.24)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Add feedback

Robust Audio Anti-Spoofing with Fusion-Reconstruction Learning on Multi-Order Spectrograms

Wen, Penghui, Hu, Kun, Yue, Wenxi, Zhang, Sen, Zhou, Wanlei, Wang, Zhiyong

arXiv.org Artificial IntelligenceAug-18-2023

Robust audio anti-spoofing has been increasingly challenging due to the recent advancements on deepfake techniques. While spectrograms have demonstrated their capability for anti-spoofing, complementary information presented in multi-order spectral patterns have not been well explored, which limits their effectiveness for varying spoofing attacks. Therefore, we propose a novel deep learning method with a spectral fusion-reconstruction strategy, namely S2pecNet, to utilise multi-order spectral patterns for robust audio anti-spoofing representations. Specifically, spectral patterns up to second-order are fused in a coarse-to-fine manner and two branches are designed for the fine-level fusion from the spectral and temporal contexts. A reconstruction from the fused representation to the input spectrograms further reduces the potential fused information loss. Our method achieved the state-of-the-art performance with an EER of 0.77% on a widely used dataset: ASVspoof2019 LA Challenge.

artificial intelligence, machine learning, spectrogram, (17 more...)

arXiv.org Artificial Intelligence

2308.09302

Country: Asia (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient and Interpretable Compressive Text Summarisation with Unsupervised Dual-Agent Reinforcement Learning

Tang, Peggy, Gao, Junbin, Zhang, Lei, Wang, Zhiyong

arXiv.org Artificial IntelligenceJun-6-2023

Recently, compressive text summarisation offers a balance between the conciseness issue of extractive summarisation and the factual hallucination issue of abstractive summarisation. However, most existing compressive summarisation methods are supervised, relying on the expensive effort of creating a new training dataset with corresponding compressive summaries. In this paper, we propose an efficient and interpretable compressive summarisation method that utilises unsupervised dual-agent reinforcement learning to optimise a summary's semantic coverage and fluency by simulating human judgment on summarisation quality. Our model consists of an extractor agent and a compressor agent, and both agents have a multi-head attentional pointer-based structure. The extractor agent first chooses salient sentences from a document, and then the compressor agent compresses these extracted sentences by selecting salient words to form a summary without using reference summaries to compute the summary reward. To our best knowledge, this is the first work on unsupervised compressive summarisation. Experimental results on three widely used datasets (e.g., Newsroom, CNN/DM, and XSum) show that our model achieves promising performance and a significant improvement on Newsroom in terms of the ROUGE metric, as well as interpretability of semantic coverage of summarisation results.

artificial intelligence, computational linguistic, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2306.03415

Country:

Europe (1.00)
Asia (0.68)
North America > United States > California (0.46)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Industry: Media > News (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adversarial Attacks on Online Learning to Rank with Click Feedback

Zuo, Jinhang, Zhang, Zhiyao, Wang, Zhiyong, Li, Shuai, Hajiesmaili, Mohammad, Wierman, Adam

arXiv.org Artificial IntelligenceMay-26-2023

Online learning to rank (OLTR) is a sequential decision-making problem where a learning agent selects an ordered list of items and receives feedback through user clicks. Although potential attacks against OLTR algorithms may cause serious losses in real-world applications, little is known about adversarial attacks on OLTR. This paper studies attack strategies against multiple variants of OLTR. Our first result provides an attack strategy against the UCB algorithm on classical stochastic bandits with binary feedback, which solves the key issues caused by bounded and discrete feedback that previous works can not handle. Building on this result, we design attack algorithms against UCB-based OLTR algorithms in position-based and cascade models. Finally, we propose a general attack strategy against any algorithm under the general click model. Each attack algorithm manipulates the learning agent into choosing the target attack item $T-o(T)$ times, incurring a cumulative cost of $o(T)$. Experiments on synthetic and real data further validate the effectiveness of our proposed attack algorithms.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2305.17071

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.84)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (1.00)
Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

DSMNet: Deep High-precision 3D Surface Modeling from Sparse Point Cloud Frames

Qiu, Changjie, Wang, Zhiyong, Lin, Xiuhong, Zang, Yu, Wang, Cheng, Liu, Weiquan

arXiv.org Artificial IntelligenceApr-9-2023

Existing point cloud modeling datasets primarily express the modeling precision by pose or trajectory precision rather than the point cloud modeling effect itself. Under this demand, we first independently construct a set of LiDAR system with an optical stage, and then we build a HPMB dataset based on the constructed LiDAR system, a High-Precision, Multi-Beam, real-world dataset. Second, we propose an modeling evaluation method based on HPMB for object-level modeling to overcome this limitation. In addition, the existing point cloud modeling methods tend to generate continuous skeletons of the global environment, hence lacking attention to the shape of complex objects. To tackle this challenge, we propose a novel learning-based joint framework, DSMNet, for high-precision 3D surface modeling from sparse point cloud frames. DSMNet comprises density-aware Point Cloud Registration (PCR) and geometry-aware Point Cloud Sampling (PCS) to effectively learn the implicit structure feature of sparse point clouds. Extensive experiments demonstrate that DSMNet outperforms the state-of-the-art methods in PCS and PCR on Multi-View Partial Point Cloud (MVP) database. Furthermore, the experiments on the open source KITTI and our proposed HPMB datasets show that DSMNet can be generalized as a post-processing of Simultaneous Localization And Mapping (SLAM), thereby improving modeling precision in environments with sparse point clouds.

artificial intelligence, machine learning, point cloud, (18 more...)

arXiv.org Artificial Intelligence

2304.042

Country: Europe (0.46)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards Efficient Visual Simplification of Computational Graphs in Deep Neural Networks

Pan, Rusheng, Wang, Zhiyong, Wei, Yating, Gao, Han, Ou, Gongchang, Cao, Caleb Chen, Xu, Jingli, Xu, Tong, Chen, Wei

arXiv.org Artificial IntelligenceDec-21-2022

A computational graph in a deep neural network (DNN) denotes a specific data flow diagram (DFD) composed of many tensors and operators. Existing toolkits for visualizing computational graphs are not applicable when the structure is highly complicated and large-scale (e.g., BERT [1]). To address this problem, we propose leveraging a suite of visual simplification techniques, including a cycle-removing method, a module-based edge-pruning algorithm, and an isomorphic subgraph stacking strategy. We design and implement an interactive visualization system that is suitable for computational graphs with up to 10 thousand elements. Experimental results and usage scenarios demonstrate that our tool reduces 60% elements on average and hence enhances the performance for recognizing and diagnosing DNN models. Our contributions are integrated into an open-source DNN visualization toolkit, namely, MindInsight [2].

artificial intelligence, computational graph, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TVCG.2022.3230832

2212.10774

Country: Asia > China (0.68)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Protein Contact Prediction by Integrating Joint Evolutionary Coupling Analysis and Supervised Learning

Ma, Jianzhu, Wang, Sheng, Wang, Zhiyong, Xu, Jinbo

arXiv.org Machine LearningApr-8-2015

Protein contacts contain important information for protein structure and functional study, but contact prediction from sequence remains very challenging. Both evolutionary coupling (EC) analysis and supervised machine learning methods are developed to predict contacts, making use of different types of information, respectively. This paper presents a group graphical lasso (GGL) method for contact prediction that integrates joint multi-family EC analysis and supervised learning. Different from existing single-family EC analysis that uses residue co-evolution information in only the target protein family, our joint EC analysis uses residue co-evolution in both the target family and its related families, which may have divergent sequences but similar folds. To implement joint EC analysis, we model a set of related protein families using Gaussian graphical models (GGM) and then co-estimate their precision matrices by maximum-likelihood, subject to the constraint that the precision matrices shall share similar residue co-evolution patterns. To further improve the accuracy of the estimated precision matrices, we employ a supervised learning method to predict contact probability from a variety of evolutionary and non-evolutionary information and then incorporate the predicted probability as prior into our GGL framework. Experiments show that our method can predict contacts much more accurately than existing methods, and that our method performs better on both conserved and family-specific contacts.

inductive learning, optimization problem, prediction, (20 more...)

arXiv.org Machine Learning

1312.2988

Country:

North America > United States (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Predicting protein contact map using evolutionary and physical constraints by integer programming (extended version)

Wang, Zhiyong, Xu, Jinbo

arXiv.org Machine LearningAug-19-2013

Motivation. Protein contact map describes the pairwise spatial and functional relationship of residues in a protein and contains key information for protein 3D structure prediction. Although studied extensively, it remains very challenging to predict contact map using only sequence information. Most existing methods predict the contact map matrix element-by-element, ignoring correlation among contacts and physical feasibility of the whole contact map. A couple of recent methods predict contact map based upon residue co-evolution, taking into consideration contact correlation and enforcing a sparsity restraint, but these methods require a very large number of sequence homologs for the protein under consideration and the resultant contact map may be still physically unfavorable. Results. This paper presents a novel method PhyCMAP for contact map prediction, integrating both evolutionary and physical restraints by machine learning and integer linear programming (ILP). The evolutionary restraints include sequence profile, residue co-evolution and context-specific statistical potential. The physical restraints specify more concrete relationship among contacts than the sparsity restraint. As such, our method greatly reduces the solution space of the contact map matrix and thus, significantly improves prediction accuracy. Experimental results confirm that PhyCMAP outperforms currently popular methods no matter how many sequence homologs are available for the protein under consideration. PhyCMAP can predict contacts within minutes after PSIBLAST search for sequence homologs is done, much faster than the two recent methods PSICOV and EvFold. See http://raptorx.uchicago.edu for the web server.

constraint, health & medicine, optimization problem, (20 more...)

arXiv.org Machine Learning

doi: 10.1093/bioinformatics/btt211

1308.1975

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

A Multi-Party Negotiation Game for Improving Crisis Management Decision Making

Rens, Thomas (Delft University of Technology) | Jonker, Catholijn M. (Delft University of Technology) | Riemsdijk, M. Birna van (Delft University of Technology) | Wang, Zhiyong (Delft University of Technology)

AAAI ConferencesNov-1-2011

This paper presents a training game intended to train crisis management teams to negotiate collaboratively in order to reach the group goal in the best way possible. The importance of the group goal in comparison to their individual goals is touched upon as well, as are various conflicts that can occur during such a negotiation. The game, which is implemented in the Blocks World 4 Teams environment, gives a team a specific scenario and allows them to negotiate a plan of action. This plan of action is then performed by agents, after which the team members will be debriefed on their performance. An experiment, containing multiple rounds to test the effect the game has on participants, is planned in the near future.

artificial intelligence, individual goal, negotiation, (16 more...)

AAAI Conferences

2011 AAAI Fall Symposium Series

Country: Europe > Netherlands (0.29)

Industry: Leisure & Entertainment > Games (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)

Add feedback