A New Algorithm based on Extent Bit-array for Computing Formal Concepts
Zhou, Jianqin, Yang, Sichun, Wang, Xifeng, Liu, Wanquan
The emergence of Formal Concept Analysis (FCA) as a data analysis technique has increased the need for algorithms that can compute formal concepts quickly. The current efficient algorithms for FCA are variants of the Close-By-One (CbO) algorithm, such as In-Close2, In-Close3 and In-Close4, which are all based on horizontal storage of contexts. In this paper, a new algorithm called In-Close5 is proposed. It builds on In-Close4 but uses vertical storage of contexts, which can significantly reduce both the time complexity and space complexity of In-Close4. Technically, the new algorithm stores both the context and the extent of a concept as vertical bit-arrays, whereas In-Close4 stores the context only as a horizontal bit-array, which makes finding the intersection of two extent sets very slow. Experimental results demonstrate that the proposed algorithm is much more effective than In-Close4, and that it has broader applicability: it can solve formal concept computation problems that In-Close4 cannot.
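The idea of vertical storage can be illustrated with a minimal sketch. It is not the In-Close5 implementation; it only assumes that each attribute column is packed into one integer used as a bit-array, so that intersecting extents becomes a bitwise AND. The toy context and the names `column_bits`, `extent_of` and `unpack` are hypothetical.

```python
# Toy formal context (hypothetical data): attribute -> objects that have it.
OBJECTS = ["o0", "o1", "o2", "o3"]
CONTEXT = {
    "a": {"o0", "o1", "o3"},
    "b": {"o1", "o2", "o3"},
    "c": {"o0", "o3"},
}


def column_bits(members, objects=OBJECTS):
    """Pack one attribute column into an int: bit i is set iff object i has it."""
    bits = 0
    for i, obj in enumerate(objects):
        if obj in members:
            bits |= 1 << i
    return bits


# Vertical storage: one bit-array per attribute.
COLUMNS = {attr: column_bits(members) for attr, members in CONTEXT.items()}


def extent_of(attrs):
    """Extent of an attribute set: bitwise AND of its columns.

    An empty attribute set yields the extent containing every object.
    """
    extent = (1 << len(OBJECTS)) - 1
    for attr in attrs:
        extent &= COLUMNS[attr]
    return extent


def unpack(extent, objects=OBJECTS):
    """Turn an extent bit-array back into a list of object names."""
    return [obj for i, obj in enumerate(objects) if (extent >> i) & 1]


print(unpack(extent_of(["a", "b"])))
```

With this layout, the objects shared by several attributes are found with one AND per attribute, instead of scanning every object row.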
Diagnosis of COVID-19 Using Machine Learning and Deep Learning: A review
Mondal, M. Rubaiyat Hossain, Bharati, Subrato, Podder, Prajoy
Background: This paper provides a systematic review of the application of Artificial Intelligence (AI) in the form of Machine Learning (ML) and Deep Learning (DL) techniques in fighting against the effects of novel coronavirus disease (COVID-19). Objective & Methods: The objective is to perform a scoping review on AI for COVID-19 using preferred reporting items of systematic reviews and meta-analysis (PRISMA) guidelines. A literature search was performed for relevant studies published from 1 January 2020 to 27 March 2021. Out of 4050 research papers available from reputed publishers, a full-text review of 440 articles was done based on the keywords of AI, COVID-19, ML, forecasting, DL, X-ray, and Computed Tomography (CT). Finally, 52 articles were included in the result synthesis of this paper. As part of the review, different ML regression methods were reviewed first in predicting the number of confirmed and death cases. Secondly, a comprehensive survey was carried out on the use of ML in classifying COVID-19 patients. Thirdly, different datasets on medical imaging were compared in terms of the number of images, number of positive samples and number of classes in the datasets. The different stages of the diagnosis, including preprocessing, segmentation and feature extraction, were also reviewed. Fourthly, the performance results of different research papers were compared to evaluate the effectiveness of DL methods on different datasets. Results: Results show that residual neural network (ResNet-18) and densely connected convolutional network (DenseNet 169) exhibit excellent classification accuracy for X-ray images, while DenseNet-201 has the maximum accuracy in classifying CT scan images. This indicates that ML and DL are useful tools in assisting researchers and medical professionals in predicting, screening and detecting COVID-19.
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Choi, Hyeong-Seok, Lee, Juheon, Kim, Wansoo, Lee, Jie Hwan, Heo, Hoon, Lee, Kyogu
We present a neural analysis and synthesis (NANSY) framework that can manipulate voice, pitch, and speed of an arbitrary speech signal. Most of the previous works have focused on using information bottleneck to disentangle analysis features for controllable synthesis, which usually results in poor reconstruction quality. We address this issue by proposing a novel training strategy based on information perturbation. The idea is to perturb information in the original input signal (e.g., formant, pitch, and frequency response), thereby letting synthesis networks selectively take essential attributes to reconstruct the input signal. Because NANSY does not need any bottleneck structures, it enjoys both high reconstruction quality and controllability. Furthermore, NANSY does not require any labels associated with speech data such as text and speaker information, but rather uses a new set of analysis features, i.e., wav2vec feature and newly proposed pitch feature, Yingram, which allows for fully self-supervised training. Taking advantage of fully self-supervised training, NANSY can be easily extended to a multilingual setting by simply training it with a multilingual dataset. The experiments show that NANSY can achieve significant improvement in performance in several applications such as zero-shot voice conversion, pitch shift, and time-scale modification.
3 space science questions that computing is helping to answer
Scientists have since charted these observations and scrambled to learn all they can about these elusive forces. They've detected dozens more gravitational-wave signals, and advances in computing are helping them to keep up. As a postdoc, Huerta searched for gravitational waves by tediously trying to match data collected by detectors to a catalogue of potential waveforms. He wanted to find a better way. Earlier this year Huerta, who is now a computational scientist at Argonne National Laboratory near Chicago, created an AI ensemble that's capable of processing a month's worth of LIGO data in just seven minutes.
Active clustering for labeling training data
Lutz, Quentin, de Panafieu, Élie, Scott, Alex, Stein, Maya
Gathering training data is a key step of any supervised learning task, and it is both critical and expensive. Critical, because the quantity and quality of the training data have a high impact on the performance of the learned function. Expensive, because most practical cases rely on humans-in-the-loop to label the data. The process of determining the correct labels is much more expensive than comparing two items to see whether they belong to the same class. Thus motivated, we propose a setting for training data gathering where the human experts perform the comparatively cheap task of answering pairwise queries, and the computer groups the items into classes (which can be labeled cheaply at the very end of the process). Given the items, we consider two random models for the classes: one where the set partition they form is drawn uniformly, the other one where each item chooses its class independently following a fixed distribution. In the first model, we characterize the algorithms that minimize the average number of queries required to cluster the items and analyze their complexity. In the second model, we analyze a specific algorithm family, propose as a conjecture that they reach the minimum average number of queries and compare their performance to a random approach. We also propose solutions to handle errors or inconsistencies in the experts' answers.
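The pairwise-query setting can be sketched with a simple baseline. This is not one of the algorithms analyzed in the paper; it only assumes an error-free expert, and it compares each new item against one representative of every existing cluster. The function `cluster_with_queries`, the `expert` oracle and the labels are hypothetical.

```python
def cluster_with_queries(items, same_class):
    """Group items using only pairwise 'same class?' queries.

    Each new item is compared with one representative per existing cluster
    until a match is found. Returns (clusters, number_of_queries).
    """
    clusters = []
    queries = 0
    for item in items:
        for cluster in clusters:
            queries += 1
            if same_class(cluster[0], item):
                cluster.append(item)
                break
        else:
            # No existing cluster matched, so the item starts a new one.
            clusters.append([item])
    return clusters, queries


# Hypothetical ground truth standing in for the human expert.
TRUE_LABEL = {
    "cat1": "cat",
    "dog1": "dog",
    "cat2": "cat",
    "bird1": "bird",
    "dog2": "dog",
}


def expert(a, b):
    """Answer a pairwise query; assumed to be always correct in this sketch."""
    return TRUE_LABEL[a] == TRUE_LABEL[b]


clusters, n_queries = cluster_with_queries(list(TRUE_LABEL), expert)
print(clusters, n_queries)
```

The number of queries depends on the order in which items and clusters are visited, which is the quantity the abstract proposes to minimize on average.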
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Wang, Shijun, Kostadinov, Dimche, Borth, Damian
Voice Conversion (VC) for unseen speakers, also known as zero-shot VC, is an attractive topic due to its usefulness in real use-case scenarios. Recent work in this area made progress with disentanglement methods that separate utterance content and speaker characteristics. Although crucial, extracting disentangled prosody characteristics for unseen speakers remains an open issue. In this paper, we propose a novel self-supervised approach to effectively learn the prosody characteristics. Then, we use the learned prosodic representations to train our VC model for zero-shot conversion. Our evaluation demonstrates that we can efficiently extract disentangled prosody representation. Moreover, we show improved performance compared to the state-of-the-art zero-shot VC models.
We Are Not Users
On August 27, 2020, Amazon introduced its Amazon Halo: a technology comprised of AI software and a wristband that monitors body indicators, including voice, to detect problems and suggests behavioral changes or other actions to potentially improve our health. One day later, Elon Musk and his team presented their Neuralink technology--AI software and a skull chip implant that receives and sends signals to our brain to compensate for brain malfunctioning, aiming to solve various brain-related health problems. These announcements seem like great news amid the health crisis that engulfs many of us, with technology coming to our rescue to confront some of the most critical diseases of humankind. Yet risks remain, and once the genie is out of the bottle, they are often difficult to manage and contain--they range from unintended consequences and side effects to threats to privacy and loss or misdirection of control. Endless devices surrounding us include processors that compute and monitor our abundant but wasteful lifestyle, with generations of products getting faster, cheaper, and "better."
Solving a verbal reasoning test with word embeddings (Analogies)
I tried two different approaches, neither of which is a standard functionality to calculate analogies according to the word embeddings model. But they both use functionality from the libraries to calculate the arithmetic of words and the cosine similarity. As this method is often used, I implemented it as a function to keep me in check and debug the code throughout the experiment. However, this approach is really returning the words that best fit the analogy, not the score. This method is really the one that I finally used to calculate scores for analogies.
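The two ideas mentioned here, returning the best-fitting words and computing a score for a given analogy, can be sketched as follows. This is not the author's code: it uses hand-written toy vectors instead of an embeddings library, and the names `analogy_vector`, `best_words` and `analogy_score` are hypothetical.

```python
import math

# Toy 3-dimensional embeddings (hypothetical values, not from a trained model).
EMB = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def analogy_vector(a, b, c):
    """Target vector for 'a is to b as c is to ?', computed as b - a + c."""
    return [y - x + z for x, y, z in zip(EMB[a], EMB[b], EMB[c])]


def best_words(a, b, c, topn=1):
    """Words closest to the target vector, excluding the three query words."""
    target = analogy_vector(a, b, c)
    candidates = [w for w in EMB if w not in (a, b, c)]
    ranked = sorted(candidates, key=lambda w: cosine(target, EMB[w]), reverse=True)
    return ranked[:topn]


def analogy_score(a, b, c, d):
    """Score a full analogy 'a : b :: c : d' by cosine similarity to the target."""
    return cosine(analogy_vector(a, b, c), EMB[d])


print(best_words("man", "woman", "king"))
print(analogy_score("man", "woman", "king", "queen"))
```

For a multiple-choice reasoning test, the score function is the more useful of the two, since each answer option can be scored and the highest one picked.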
WWE releases 2022 pay-per-view schedule
WWE is ready for 2022. The pro wrestling company released its pay-per-view schedule for the next year with two more shows left on the docket for the year, Survivor Series in November and TLC: Tables Ladders & Chairs in December.
Mining frequency-based sequential trajectory co-clusters
Santos, Yuri, Tyska, Jônata, Bogorny, Vania
Co-clustering is a specific type of clustering that addresses the problem of finding groups of objects without necessarily considering all attributes. This technique has been shown to give more consistent results on high-dimensional sparse data than traditional clustering. In trajectory co-clustering, the methods found in the literature have two main limitations: first, the space and time dimensions have to be constrained by user-defined thresholds; second, elements (trajectory points) are clustered ignoring the trajectory sequence, assuming that the points are independent of one another. To address these limitations, we propose a new trajectory co-clustering method for mining semantic trajectory co-clusters. It simultaneously clusters the trajectories and their elements, taking into account the order in which they appear. This new method uses the element frequency to identify candidate co-clusters. In addition, it uses an objective cost function that automatically drives the co-clustering process, avoiding the need to constrain dimensions. We evaluate the proposed approach using a real-world, publicly available dataset. The experimental results show that our proposal finds frequent and meaningful contiguous sequences that reveal mobility patterns and, thereby, the most relevant elements.
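The frequency-based candidate step can be illustrated with a rough sketch. The abstract does not describe the method in detail, so this only assumes that a candidate is a contiguous sequence of elements that occurs in at least `min_support` trajectories; the cost function is not modeled. The function names, thresholds and trajectories are hypothetical.

```python
from collections import Counter


def contiguous_subsequences(trajectory, min_len=2):
    """All distinct contiguous subsequences of length >= min_len, as tuples."""
    n = len(trajectory)
    return {
        tuple(trajectory[i:j])
        for i in range(n)
        for j in range(i + min_len, n + 1)
    }


def frequent_sequences(trajectories, min_support=2, min_len=2):
    """Map each contiguous sequence to the number of trajectories containing it,
    keeping only sequences found in at least min_support trajectories."""
    support = Counter()
    for trajectory in trajectories:
        # A set is used so each trajectory counts a sequence at most once.
        support.update(contiguous_subsequences(trajectory, min_len))
    return {seq: count for seq, count in support.items() if count >= min_support}


# Hypothetical semantic trajectories (sequences of visited place types).
TRAJECTORIES = [
    ["home", "cafe", "work", "gym"],
    ["home", "cafe", "work", "home"],
    ["cafe", "work", "gym"],
]

candidates = frequent_sequences(TRAJECTORIES)
for seq, count in sorted(candidates.items(), key=lambda kv: -kv[1]):
    print(count, " -> ".join(seq))
```

Each surviving sequence, together with the trajectories that contain it, forms one candidate grouping of trajectories and ordered elements.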