South America
Infinity Learning: Learning Markov Chains from Aggregate Steady-State Observations
Gao, Jianfei, Zahran, Mohamed A., Sheoran, Amit, Fahmy, Sonia, Ribeiro, Bruno
We consider the task of learning a parametric Continuous Time Markov Chain (CTMC) sequence model without examples of sequences, where the training data consists entirely of aggregate steady-state statistics. Making the problem harder, we assume that the states we wish to predict are unobserved in the training data. Specifically, given a parametric model over the transition rates of a CTMC and some known transition rates, we wish to extrapolate its steady-state distribution to states that are unobserved. A technical roadblock to learning a CTMC from its steady state has been that the chain rule for computing gradients does not work over the arbitrarily long sequences necessary to reach steady state, from which the aggregate statistics are sampled. To overcome this optimization challenge, we propose $\infty$-SGD, a principled stochastic gradient descent method that uses randomly-stopped estimators to avoid the infinite sums required by the steady-state computation, while learning even when only a subset of the CTMC states can be observed. We apply $\infty$-SGD to a real-world testbed and to synthetic experiments, showcasing its accuracy, its ability to extrapolate the steady-state distribution to unobserved states under unobserved conditions (heavy loads, when training under light loads), and its success in difficult scenarios where even a tailor-made extension of existing methods fails.
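To make the steady-state object in this abstract concrete, here is a minimal sketch of how a CTMC's steady-state distribution can be approximated by repeatedly applying a uniformized transition matrix. This is standard uniformization with power iteration, not the $\infty$-SGD estimator from the paper; the function name, the 3-state queue, and the rates `lam` and `mu` are assumptions chosen for illustration.

```python
def steady_state_by_uniformization(Q, tol=1e-12, max_steps=100_000):
    """Approximate pi with pi @ Q = 0 and sum(pi) = 1 for a generator matrix Q.

    Q is a list of rows; off-diagonal entries are transition rates and each
    row sums to zero. Assumes the chain is irreducible.
    """
    n = len(Q)
    # Uniformization rate strictly above the largest exit rate, so that
    # P = I + Q / rate is an aperiodic discrete-time transition matrix.
    rate = 1.1 * max(-Q[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n  # start from the uniform distribution
    for step in range(1, max_steps + 1):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt, step
        pi = nxt
    return pi, max_steps


# Hypothetical 3-state queue: arrival rate lam, service rate mu.
lam, mu = 1.0, 2.0
Q = [[-lam, lam, 0.0],
     [mu, -(lam + mu), lam],
     [0.0, mu, -mu]]
pi, steps = steady_state_by_uniformization(Q)
```

The loop applies the same matrix many times before the distribution settles, which hints at why backpropagating through every step toward steady state becomes impractical as the number of steps grows.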
Community Detection on Mixture Multi-layer Networks via Regularized Tensor Decomposition
Jing, Bing-Yi, Li, Ting, Lyu, Zhongyuan, Xia, Dong
We study the problem of community detection in multi-layer networks, where pairs of nodes can be related in multiple modalities. We introduce a general framework, the mixture multi-layer stochastic block model (MMSBM), which includes many earlier models as special cases. We propose a tensor-based algorithm (TWIST) to reveal both the global and local memberships of nodes and the memberships of layers. We show that the TWIST procedure can accurately detect the communities with small misclassification error as the number of nodes and/or the number of layers increases. Numerical studies confirm our theoretical findings. To the best of our knowledge, this is the first systematic study of mixture multi-layer networks using tensor decomposition. The method is applied to two real datasets, worldwide trading networks and malaria parasite gene networks, yielding new and interesting findings.
Optimization of Retrieval Algorithms on Large Scale Knowledge Graphs
Dörpinghaus, Jens, Stefan, Andreas
Knowledge graphs have been shown to play an important role in recent knowledge mining and discovery, for example in the life sciences and bioinformatics. Although a lot of research has been done in the fields of query optimization, query transformation, and the storage and retrieval of large scale knowledge graphs, algorithmic optimization is still a major challenge and a vital factor in using graph databases. Few researchers have addressed the problem of optimizing algorithms on large scale labeled property graphs. Here, we present two optimization approaches and compare them with a naive approach of directly querying the graph database. The aim of our work is to determine the limiting factors of graph databases like Neo4j, and we describe a novel solution to tackle these challenges. For this, we suggest a classification schema to differentiate between the complexities of problems on a graph database. We evaluate our optimization approaches on a test system containing a knowledge graph derived from biomedical publication data enriched with text mining data. This dense graph has more than 71M nodes and 850M relationships. The results are very encouraging and, depending on the problem, we were able to show a speedup by a factor of between 44 and 3839.
Accenture Opens Innovation Hub in Hyderabad - Express Computer
Accenture today opened a new Innovation Hub in Hyderabad, where clients can co-innovate with Accenture by ideating, rapidly prototyping and then scaling disruptive products and services for the digital economy. The latest addition to Accenture's global innovation network, the Hyderabad Innovation Hub is spread over 300,000 square feet where clients can co-innovate and co-create solutions with more than 2,000 Accenture professionals with expertise across multiple industries and advanced technologies such as artificial intelligence, security, extended reality, automation and blockchain. "Our research shows that organizations are struggling to achieve their innovation goals, due to the lack of an enterprise-wide strategy for technology investments and adoption," said Bhaskar Ghosh, group chief executive, Accenture Technology Services. "Through our leading advanced technology capabilities, we help clients scale their technology investments and bridge the innovation achievement gap. Our Innovation Hub in Hyderabad has the pieces our clients require to accelerate value creation through enterprise-wide, game-changing innovation."
Deep Learning for Financial Applications: A Survey
Ozbayoglu, Ahmet Murat, Gudelek, Mehmet Ugur, Sezer, Omer Berat
Computational intelligence in finance has been a very popular topic for both academia and the financial industry in the last few decades. Numerous studies have been published, resulting in various models. Meanwhile, within the Machine Learning (ML) field, Deep Learning (DL) has recently started getting a lot of attention, mostly because it outperforms the classical models. Many different implementations of DL exist today, and the broad interest is continuing. Finance is one particular area where DL models have started gaining traction; however, the playing field is wide open and many research opportunities still exist. In this paper, we tried to provide a state-of-the-art snapshot of the DL models developed for financial applications as of today. We not only categorized the works according to their intended subfield in finance but also analyzed them based on their DL models. In addition, we aimed at identifying possible future implementations and highlighted the pathway for ongoing research within the field.
MOGPTK: The Multi-Output Gaussian Process Toolkit
de Wolff, Taco, Cuevas, Alejandro, Tobar, Felipe
GPs are designed through parametrizing a covariance kernel, meaning that constructing expressive kernels allows for an improved representation of complex signals. Recent advances extend the GP concept to multiple series (or channels), where both auto-correlations and cross-correlations among channels are designed jointly; we refer to these models as multi-output GP (MOGP) models. A key attribute of MOGPs is that appropriate cross-correlations allow for improved data-imputation and prediction tasks when the channels have missing data. Popular MOGP models include: i) the Linear Model of Coregionalization (LMC) [2], ii) the Cross-Spectral Mixture (CSM) [3], iii) the Convolutional Model (CONV) [4], and iv) the Multi-Output Spectral Mixture (MOSM) [5]. Training MOGPs is challenging due to the large number of parameters required to model all the cross-correlations, and the fact that most MOGP models are parametrized in the spectral domain, thus being prone to local minima. Therefore, a unified framework that implements these MOGPs is required both by the GP research community and by those interested in practical applications for multi-channel data.
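As an illustration of how auto-correlations and cross-correlations can be designed jointly, the sketch below evaluates a textbook Linear Model of Coregionalization covariance, in which each channel is a weighted sum of shared latent processes. It does not use or reproduce the MOGPTK API; the function names, the rank-one mixing weights, and the RBF latent kernels are all assumptions made for this example.

```python
import math


def rbf(x, y, lengthscale):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)


def lmc_cov(i, x, j, y, mixing, lengthscales):
    """Covariance between channel i at input x and channel j at input y.

    mixing[q][i] is the (hypothetical) weight of latent process q in
    channel i, so each coregionalization matrix is the rank-one outer
    product of mixing[q] with itself.
    """
    return sum(a[i] * a[j] * rbf(x, y, ell)
               for a, ell in zip(mixing, lengthscales))


# Two latent processes shared by two channels (illustrative values only).
mixing = [[1.0, 0.5],   # latent process 0 feeds both channels
          [0.0, 2.0]]   # latent process 1 feeds channel 1 only
lengthscales = [1.0, 0.2]

auto_1 = lmc_cov(1, 0.3, 1, 0.3, mixing, lengthscales)   # auto-covariance
cross = lmc_cov(0, 0.3, 1, 0.3, mixing, lengthscales)    # cross-covariance
cross_swapped = lmc_cov(1, 0.3, 0, 0.3, mixing, lengthscales)
```

The cross-covariance is nonzero only through latent process 0, which both channels share; this shared structure is what lets observations in one channel inform missing values in another.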
How NLP Will Unlock the Value of Data for Businesses
Natural Language Processing (NLP) has advanced to the point where it can allow humans to interact with machines in non-technical terms. This opens the door to deriving value very quickly from the most complex enterprise data management systems. Natural language processing (NLP) is a branch of AI focused on enabling humans to communicate with computers in human languages vs. computer languages. When humans speak to each other, the conversations are flexible and fluid. One typo in code syntax, and the entire function totally breaks down.
How to Build a Morally Ethical Self-Driving Car
What happens when the road gets bumpier? Not too long ago, tech enthusiasts were telling us that by 2020, we'd see self-driving cars hit the mainstream, with some 10 million on the roads. That turned out to be a wild overestimation. The actual number of vehicles in testing is thousands of times smaller, and they're still driving mostly in controlled conditions. Companies have also scaled back their ambitions, aiming more for driver support than full autonomy, just as sober-minded transport experts told us to expect. But slower development is probably just as well, as it should help improve vehicle safety and give engineers time to prepare for other threats, such as hackers turning cars into destructive weapons.
Can Artificial Intelligence Prevent Coronavirus from Spreading?
The combination of human experts and artificial intelligence can efficiently curb the spread of coronavirus. FREMONT, CA: Coronaviruses are a large family of viruses that are common among several species of animals, such as cattle, bats, camels, and cats. Animal coronaviruses rarely infect people, but when they do, they can spread among the public, as with severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and the recently identified novel coronavirus (nCoV). The first human case of the virus was recognized last year in December in the district of Wuhan, China. The virus is believed to have originated from seafood. The symptoms of the infection are cough, fever, shortness of breath, and sore throat.