The challenges of using machine learning to identify gender in images

#artificialintelligence

In recent years, computer-driven image recognition systems that automatically recognize and classify human subjects have become increasingly widespread. These algorithmic systems are applied in many settings – from helping social media sites tell whether a user is a cat owner or dog owner to identifying individual people in crowded public spaces. A form of machine intelligence called deep learning is the basis of these image recognition systems, as well as many other artificial intelligence efforts. This essay on the lessons we learned about deep learning systems and gender recognition is one part of a three-part examination of issues relating to machine vision technology.


Variance Reduced Stochastic Proximal Algorithm for AUC Maximization

arXiv.org Machine Learning

Stochastic gradient descent has been widely studied with classification accuracy as the performance measure. However, these stochastic algorithms cannot be used directly with non-decomposable pairwise performance measures such as the area under the ROC curve (AUC), a common performance metric when the classes are imbalanced. Several algorithms have been proposed for optimizing AUC, a recent one being a stochastic proximal gradient algorithm (SPAM). The downside of stochastic methods is that they suffer from high variance, which leads to slower convergence. To combat this issue, several variance-reduced methods have been proposed with faster convergence guarantees than vanilla stochastic gradient descent. Again, these variance-reduced methods are not directly applicable when non-decomposable performance measures are used. In this paper, we develop a Variance Reduced Stochastic Proximal algorithm for AUC Maximization (VRSPAM) and perform both theoretical and empirical analyses to show that our algorithm converges faster than SPAM, the previous state of the art for the AUC maximization problem.
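To illustrate why AUC is called a non-decomposable pairwise measure, the sketch below computes it directly as the fraction of (positive, negative) pairs that a scorer ranks correctly. This is only a minimal illustration of the metric, not the SPAM or VRSPAM algorithm; the function name `pairwise_auc` and the choice to count ties as half-correct are assumptions made for the example.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Hypothetical helper for illustration. Ties are counted as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    # The sum runs over pairs of examples, not over single examples,
    # which is why a per-example stochastic gradient does not apply directly.
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Three of the four (positive, negative) pairs are ordered correctly.
example_auc = pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Because each term depends on two examples at once, the objective cannot be written as a plain average of per-example losses, which is the obstacle the abstract describes.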


Advances in Machine Learning for the Behavioral Sciences

arXiv.org Machine Learning

This is most apparent when auto-encoders are trained, where a network is trained to map the input data onto itself but is forced to project them into a lower-dimensional embedding space on the way (Vincent et al., 2010). In addition to the conventional fully connected layers, there are various special types of network connections. For example, in computer vision, convolutional layers are commonly used, which train multiple sliding windows that move over the image data and process just a part of the image at a time, thereby learning to recognize local features. These layers are subsequently abstracted into more and more complex visual patterns (Krizhevsky et al., 2017). For temporal data, one can use recurrent neural networks, which do not make predictions for individual input vectors, but for a sequence of input vectors. To do so, they allow feeding abstracted information from previous data points forward to the next layers.
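The sliding-window idea behind convolutional layers can be sketched in a few lines of plain Python. The function `conv2d_valid` below is a hypothetical helper written for this illustration; it uses a fixed, hand-chosen kernel, whereas a real convolutional layer would learn its kernel weights during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) without padding.

    As in most deep learning libraries, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    output = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Each output value depends only on a small local patch.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# A dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge_kernel = [[-1, 1]]  # responds where brightness increases left to right
feature_map = conv2d_valid(image, edge_kernel)
```

The resulting feature map is non-zero only at the position of the edge, which is one simple sense in which such a window recognizes a local feature.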


Deep geometric knowledge distillation with graphs

arXiv.org Machine Learning

In most cases, deep learning architectures are trained without regard to the number of operations or the energy consumption. However, some applications, such as embedded systems, can be resource-constrained during inference. A popular approach to reducing the size of a deep learning architecture consists of distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically, we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distill knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives.
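One way to picture dimension-agnostic transfer is to compare similarity graphs built over the same batch of inputs in each latent space. The sketch below is a simplified assumption, not the paper's exact formulation: it uses cosine similarity as the edge weight and a mean squared difference between the two adjacency matrices as the loss, and the names `cosine_graph` and `graph_distillation_loss` are hypothetical.

```python
import math


def cosine_graph(latents):
    """Adjacency matrix of pairwise cosine similarities (non-zero vectors assumed)."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in latents]
    n = len(latents)
    return [
        [
            sum(a * b for a, b in zip(latents[i], latents[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def graph_distillation_loss(teacher_latents, student_latents):
    """Mean squared difference between teacher and student similarity graphs.

    Both graphs are n x n for a batch of n inputs, so the two latent
    spaces may have different dimensions.
    """
    g_teacher = cosine_graph(teacher_latents)
    g_student = cosine_graph(student_latents)
    n = len(g_teacher)
    total = sum(
        (g_teacher[i][j] - g_student[i][j]) ** 2 for i in range(n) for j in range(n)
    )
    return total / (n * n)


teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # 3-dimensional latent space
good_student = [[2.0, 0.0], [0.0, 3.0]]        # 2-dimensional, same geometry
bad_student = [[1.0, 0.0], [1.0, 0.0]]         # collapses both inputs together

loss_good = graph_distillation_loss(teacher, good_student)
loss_bad = graph_distillation_loss(teacher, bad_student)
```

The first student preserves the teacher's relative geometry despite having fewer dimensions and gets a zero loss, while the second is penalized for collapsing two inputs that the teacher keeps apart.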


Global Cognitive Informatics Market by Technology, Solution, Sector, Industry Vertical, and Region 2019-2024 - ResearchAndMarkets.com

#artificialintelligence

DUBLIN--(BUSINESS WIRE)--The "Cognitive Informatics Market by Technology, Solution (Smart Data, Self-Adaptive Software, Self-Correcting Infrastructure, Cognitive Analytics), Sector (Consumer, Enterprise, Industrial, Government), Industry Vertical, and Region 2019-2024" report has been added to ResearchAndMarkets.com's offering. This report assesses the cognitive informatics market including technologies, companies, strategies, and solutions. It includes analysis by industry sector and major industry verticals. It also evaluates the impact of 5G, edge computing, and IoT on the cognitive informatics market. All forecasts provide a market outlook from 2019 through 2024.


An Ophthalmologist's Guide to Deciphering Studies in Artificial Intelligence

#artificialintelligence

Deep learning, a recently described AI machine learning technique, when applied to image analysis, allows the algorithm to analyze data using multiple processing layers to extract different image features (LeCun, Y., Bengio, Y., and Hinton, G., "Deep learning"). In ophthalmology, many groups have reported exceptional diagnostic performance using deep learning algorithms to detect various ocular conditions based on anterior segment topography (e.g., keratoconus) (Hwang, E.S., Perez-Straziota, C.E., Kim, S.W. et al., "Distinguishing highly asymmetric keratoconus eyes using combined Scheimpflug and spectral-domain OCT analysis"). Further cited studies include "Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes"; "Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning"; and "Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs".


Change your singer: a transfer learning generative adversarial framework for song to song conversion

arXiv.org Machine Learning

Have you ever wondered how a song might sound if performed by a different artist? In this work, we propose SCM-GAN, an end-to-end non-parallel song conversion system powered by generative adversarial learning and transfer learning that allows users to listen to a selected target singer singing any song. SCM-GAN first separates songs into vocals and instrumental music using a U-Net network, then converts the vocal segments to the target singer using advanced CycleGAN-VC, before merging the converted vocals with their corresponding background music. SCM-GAN is first initialized with feature representations learned from a state-of-the-art voice-to-voice conversion model and then trained on a dataset of non-parallel songs. Furthermore, SCM-GAN is evaluated against a set of metrics including global variance (GV) and modulation spectra (MS) on the 24 Mel-cepstral coefficients (MCEPs). Transfer learning improves the GV by 35% and the MS by 13% on average. A subjective comparison is conducted to test user satisfaction with the quality and naturalness of the conversion. Results show above par similarity between SCM-GAN's output and the target (70% on average) as well as great naturalness of the converted songs.


Graph Neural News Recommendation with Long-term and Short-term Interest Modeling

arXiv.org Machine Learning

With the information explosion of news articles, personalized news recommendation has become important for users to quickly find news that they are interested in. Existing methods for news recommendation mainly include collaborative filtering methods, which rely on direct user-item interactions, and content-based methods, which characterize the content of a user's reading history. Although these methods have achieved good performance, they still suffer from the data sparsity problem, since most of them fail to extensively exploit high-order structure information (similar users tend to read similar news articles) in news recommendation systems. In this paper, we propose to build a heterogeneous graph to explicitly model the interactions among users, news and latent topics. The incorporated topic information helps indicate a user's interest and alleviates the sparsity of user-item interactions. Then we take advantage of graph neural networks to learn user and news representations that encode high-order structure information by propagating embeddings over the graph. The learned user embeddings with complete historic user clicks capture the users' long-term interests. We also consider a user's short-term interest using the recent reading history with an attention-based LSTM model. Experimental results on real-world datasets show that our proposed model significantly outperforms state-of-the-art methods on news recommendation.
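The phrase "propagating embeddings over the graph" can be made concrete with a toy example. The sketch below is an assumption-laden simplification, not the paper's model: it uses plain mean aggregation with no learned weights, nonlinearity, topic nodes, or attention, and the function name `propagate` and the node names are hypothetical.

```python
def propagate(embeddings, neighbors):
    """One propagation round: average each node's embedding with its neighbors'."""
    updated = {}
    for node, vec in embeddings.items():
        group = [vec] + [embeddings[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(v[d] for v in group) / len(group) for d in range(len(vec))
        ]
    return updated


# Two users who never interact directly but clicked the same news item.
embeddings = {
    "user_a": [1.0, 0.0],
    "user_b": [0.0, 1.0],
    "news_1": [0.0, 0.0],
}
neighbors = {
    "user_a": ["news_1"],
    "user_b": ["news_1"],
    "news_1": ["user_a", "user_b"],
}

after_one = propagate(embeddings, neighbors)
after_two = propagate(after_one, neighbors)
```

After one round `user_a` only reflects the news item it clicked; after two rounds its embedding also picks up a component from `user_b` through the shared item, which is the kind of high-order structure the abstract refers to.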


Can Neural Networks Learn Symbolic Rewriting?

arXiv.org Artificial Intelligence

This work investigates whether current neural architectures are adequate for learning symbolic rewriting. Two kinds of data sets are proposed for this research -- one based on automated proofs and the other being a synthetic set of polynomial terms. Experiments using current neural machine translation models are performed and their results are discussed. Ideas for extending this line of research are proposed and its relevance is motivated.


NVIDIA Research Takes NeurIPS Attendees on AI Road Trip | NVIDIA Blog

#artificialintelligence

Take a joyride through a 3D urban neighborhood that looks like Tokyo, or New York, or maybe Rio de Janeiro -- all imagined by AI. We've introduced at this week's NeurIPS conference AI research that allows developers to render fully synthetic, interactive 3D worlds. While still early stage, this work shows promise for a variety of applications, including VR, autonomous vehicle development and architecture. The tech is among several NVIDIA projects on display here in Montreal. Attendees huddled around a green and black racing chair in our booth have been wowed by the demo, which lets drivers navigate around an eight-block world rendered by the neural network.