Global Financial Services Corporation Chooses Finn AI to Optimize Cust

#artificialintelligence

Finn AI, the world's leading AI-powered conversational banking technology provider, today announced that one of the world's largest financial services corporations has chosen Finn AI to help improve customer service and to enhance customer acquisition workflows. Under the terms of the engagement, Finn AI has developed a virtual assistant to pre-qualify prospects for the company's personal and small business banking products, ensuring human sales agents are engaged only when an inquiry is sales-based, thus reducing the cost to acquire. The virtual assistant also interacts with customers outside of regular call center hours, allowing the financial institution to extend their support hours to a 24/7 model. Additionally, the virtual assistant is used to expedite the product application process for new and existing customers, providing information about banking products and in-the-moment guidance, through to applying for a product. "This customer engagement is an excellent example of how Finn AI can help financial institutions achieve very specific, and often complex, business objectives," said Jake Tyler, CEO at Finn AI. "This particular use case leverages a number of our pre-built Customer Acquisition features including Smart Routing, Product Comparison, and Product Recommender. Used in combination, the customer is able to reduce distractions to sales agents while improving the consumer experience."


Supply of AI workers failing to meet demand - Government News

#artificialintelligence

The government must take strategic action to ensure the nation's AI workforce will meet future demands because current supply is falling short, a new report warns. The report from CSIRO's data sciences arm Data61 focuses on how the nation can capture the full potential of artificial intelligence technology, which is already being used in a wide range of fields. The report, "Artificial Intelligence: Solving problems, growing the economy and improving our quality of life", found that Australia currently has 6,600 AI specialist workers, up from 650 in 2014, and that this number is predicted to grow. However, that is well short of the up to 160,000 workers that may be required in the next ten years. "We estimate that by 2030 Australian industry will require a workforce of between 32,000 to 161,000 employees in computer vision, robotics, human language technologies, data science and other areas of AI expertise," the report says.


Singapore wants to be a 'living lab' for global AI solutions: Vivian Balakrishnan

#artificialintelligence

BARCELONA: Singapore hopes to be a "living laboratory" for developing artificial intelligence (AI) solutions globally - an ambition that plays to its strengths, according to Minister-in-charge of the Smart Nation initiative Vivian Balakrishnan. Dr Balakrishnan, who is also Foreign Affairs Minister, was speaking at the opening session of the Smart City Expo World Congress in Barcelona on Tuesday (Nov 19), where he made a pitch for Singapore's attractiveness as an AI hub. For example, testing solutions in Singapore would be facilitated by agile regulations, as the country has just "a single layer of government". "We understand science, technology, engineering. We get it and we're able to make decisions quickly, pivot instantly and seek opportunities that new technology will provide," said Dr Balakrishnan.


Deep-seismic-prior-based reconstruction of seismic data using convolutional neural networks

arXiv.org Machine Learning

Reconstruction of seismic data with missing traces is a long-standing issue in seismic data processing. In recent years, rank-reduction operations have been commonly used to address this problem, but they require the rank of the seismic data to be known a priori. However, the rank of field data is unknown; it usually takes considerable time to adjust the rank manually, and the result is only an approximate rank. Methods based on deep learning require very large datasets for training; however, acquiring large datasets is difficult in practice owing to physical or financial constraints. Therefore, in this work, we developed a novel method based on unsupervised learning that uses the intrinsic properties of a convolutional neural network known as U-net, without training datasets. Only a single undersampled seismic dataset is needed, and the deep seismic prior of the input data is exploited by the network itself, which makes the reconstruction convenient. Furthermore, this method can handle both irregular and regular seismic data. Synthetic and field data were tested to assess the performance of the proposed algorithm (DSPRecon algorithm); the advantages of using our method were evaluated by comparing it with the singular spectrum analysis (SSA) method for irregular data reconstruction and the de-aliased Cadzow method for regular data reconstruction. Experimental results showed that our method provided better reconstruction performance than the SSA or Cadzow methods. The recovered signal-to-noise ratios (SNRs) were 32.68 dB and 19.11 dB for the DSPRecon and SSA algorithms, respectively. Those for the DSPRecon and Cadzow methods were 35.91 dB and 15.32 dB, respectively.
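The abstract reports recovered SNRs but does not state the formula. A common convention in seismic reconstruction is SNR = 10 log10(||x||^2 / ||x - x_hat||^2), where x is the complete reference data and x_hat the reconstruction; whether the paper uses exactly this definition is an assumption. A minimal sketch, with hypothetical function and argument names and the data flattened to a plain list of samples:

```python
import math

def recovered_snr_db(reference, reconstructed):
    """Recovered SNR in dB, assuming 10*log10(||x||^2 / ||x - x_hat||^2).

    `reference` and `reconstructed` are equal-length sequences of samples
    (a 2-D gather would be flattened first). Names are illustrative only.
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    signal_energy = sum(x * x for x in reference)
    error_energy = sum((x - y) ** 2 for x, y in zip(reference, reconstructed))
    if error_energy == 0:
        return math.inf       # perfect reconstruction
    if signal_energy == 0:
        return -math.inf      # no signal, only error
    return 10.0 * math.log10(signal_energy / error_energy)

# Toy example: a 10% amplitude error on every non-zero sample.
snr = recovered_snr_db([1.0, 0.0, -1.0, 0.0], [0.9, 0.0, -0.9, 0.0])
```

Under this definition, every 10 dB gained corresponds to a tenfold reduction in residual energy relative to the signal.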


Shapelets for earthquake detection

arXiv.org Machine Learning

This paper introduces EQShapelets (EarthQuake Shapelets), a time-series shape-based approach embedded in machine learning to autonomously detect earthquakes. It promises to overcome the challenges in the field of seismology related to automated detection and cataloging of earthquakes. EQShapelets are amplitude- and phase-independent, i.e., their detection sensitivity is irrespective of the magnitude of the earthquake and the time of occurrence. They are also robust to noise and other spurious signals. The detection capability of EQShapelets is tested on one week of continuous seismic data provided by the Northern California Seismic Network (NCSN), obtained from a station in central California near the Calaveras Fault. EQShapelets, combined with a Random Forest classifier, detected all of the cataloged earthquakes and 281 uncataloged events with a lower false detection rate, thus offering better performance than the autocorrelation and FAST algorithms. The primary advantage of EQShapelets over competing methods is the interpretability and insight they offer. Shape-based approaches are intuitive and visually meaningful, and they offer immediate insight into the problem domain that goes beyond their use in accurate detection. EQShapelets, if implemented at a large scale, can significantly reduce catalog completeness magnitudes and can serve as an effective tool for near real-time earthquake monitoring and cataloging.
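To illustrate what amplitude- and phase-independence can mean for a shapelet, the sketch below uses the standard shapelet distance from the time-series literature: slide the shapelet over the series, z-normalize each window, and keep the smallest Euclidean distance. This is an assumption about the general technique, not the EQShapelets implementation, and all names are hypothetical.

```python
import math

def z_normalize(values):
    """Rescale to zero mean and unit variance (a flat window maps to zeros)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in values]

def shapelet_distance(series, shapelet):
    """Minimum z-normalized Euclidean distance over all sliding windows."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than series")
    target = z_normalize(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        window = z_normalize(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, target)))
        best = min(best, dist)
    return best

shapelet = [0.0, 1.0, 0.0, -1.0]
# The same shape, ten times larger and at two different onset times.
early = [5.0, 5.0, 5.0, 0.0, 10.0, 0.0, -10.0, 5.0, 5.0]
late = [5.0, 5.0, 5.0, 5.0, 5.0, 0.0, 10.0, 0.0, -10.0]
d_early = shapelet_distance(early, shapelet)
d_late = shapelet_distance(late, shapelet)
```

Both distances are essentially zero: z-normalization removes the amplitude difference, and taking the minimum over windows removes the dependence on onset time. The resulting distances would then serve as features for a classifier such as a Random Forest.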


DPM: A deep learning PDE augmentation method (with application to large-eddy simulation)

arXiv.org Machine Learning

Jonathan B. Freund (Mechanical Science & Engineering and Aerospace Engineering, University of Illinois at Urbana-Champaign, jbfreund@illinois.edu), Jonathan F. MacArt, and Justin Sirignano. November 22, 2019.
Abstract: Machine learning for scientific applications faces the challenge of limited data. We propose a framework that leverages a priori known physics to reduce overfitting when training on relatively small datasets. A deep neural network is embedded in a partial differential equation (PDE) that expresses the known physics and learns to describe the corresponding unknown or unrepresented physics from the data. Crafted as such, the neural network can also provide corrections for erroneously represented physics, such as discretization errors associated with the PDE's numerical solution. Once trained, the deep learning PDE model (DPM) can make out-of-sample predictions for new physical parameters, geometries, and boundary conditions. Estimating the embedded neural network requires optimizing over the entire PDE, which itself is a function of the neural network. Adjoint partial differential equations are used to efficiently calculate the high-dimensional gradient of the objective function with respect to the neural network parameters. A stochastic adjoint method (SAM), similar in spirit to stochastic gradient descent, further accelerates training. The approach is demonstrated and evaluated for turbulence predictions using large-eddy simulation (LES), a filtered version of the Navier-Stokes equation containing unclosed sub-filter-scale terms. High-fidelity direct numerical simulations (DNS) of decaying isotropic turbulence provide the training and testing data. The DPM outperforms the widely-used constant-coefficient and dynamic Smagorinsky models, even for filter sizes so large that these established models become qualitatively incorrect. It also significantly outperforms a priori trained models, which do not account for the full PDE. For comparable accuracy, the overall cost is reduced. Simulations of the DPM are accelerated by efficient GPU implementations of network evaluations. Measures of discretization errors, which are well-known to be consequential in LES, suggest that the ability of the training formulation to correct for these errors [...]


Gradient-based Optimization for Bayesian Preference Elicitation

arXiv.org Artificial Intelligence

Effective techniques for eliciting user preferences have taken on added importance as recommender systems (RSs) become increasingly interactive and conversational. A common and conceptually appealing Bayesian criterion for selecting queries is expected value of information (EVOI). Unfortunately, it is computationally prohibitive to construct queries with maximum EVOI in RSs with large item spaces. We tackle this issue by introducing a continuous formulation of EVOI as a differentiable network that can be optimized using gradient methods available in modern machine learning (ML) computational frameworks (e.g., TensorFlow, PyTorch). We exploit this to develop a novel Monte Carlo method for EVOI optimization that scales better to large item spaces than methods requiring explicit enumeration of items. While we emphasize the use of this approach for pairwise (or k-wise) comparisons of items, we also demonstrate how our method can be adapted to queries involving subsets of item attributes or "partial items," which are often more cognitively manageable for users. Experiments show that our gradient-based EVOI technique achieves state-of-the-art performance across several domains while scaling to large item spaces.
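For orientation, the quantity being optimized can be written down directly for a single pairwise query. The sketch below estimates EVOI from Monte Carlo samples of a user's utility vector, assuming linear utilities (utility = w · x) and a noiseless response model in which the user always picks the item they prefer. These modeling choices and all names are illustrative assumptions; the paper's continuous, gradient-based formulation is not reproduced here, and this version enumerates every item, which is exactly the cost that approach is meant to avoid.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def best_expected_utility(samples, items):
    """max over items of E[w . x]; by linearity this equals (mean w) . x."""
    mean_w = mean_vector(samples)
    return max(dot(mean_w, x) for x in items)

def pairwise_evoi(samples, items, a, b):
    """EVOI of asking 'do you prefer a or b?' under a noiseless response model.

    `samples` are equally weighted draws of the utility vector w from the
    current belief; ties are counted as preferring a.
    """
    prefers_a = [w for w in samples if dot(w, a) >= dot(w, b)]
    prefers_b = [w for w in samples if dot(w, a) < dot(w, b)]
    posterior_value = 0.0
    for subset in (prefers_a, prefers_b):
        if subset:  # weight each response by its estimated probability
            weight = len(subset) / len(samples)
            posterior_value += weight * best_expected_utility(subset, items)
    return posterior_value - best_expected_utility(samples, items)

# Two equally likely user types, each caring about one attribute only.
samples = [[1.0, 0.0], [0.0, 1.0]]
items = [[1.0, 0.0], [0.0, 1.0]]
informative = pairwise_evoi(samples, items, items[0], items[1])
uninformative = pairwise_evoi(samples, items, items[0], items[0])
```

In this toy case, comparing the two items fully identifies the user type, so the query is worth 0.5 units of expected utility, while comparing an item with itself is worth nothing.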


Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

arXiv.org Machine Learning

Overparameterized neural networks can be highly accurate on average on an i.i.d. test set yet consistently fail on atypical groups of the data (e.g., by learning spurious correlations that hold on average but not in such groups). Distributionally robust optimization (DRO) allows us to learn models that instead minimize the worst-case training loss over a set of pre-defined groups. However, we find that naively applying group DRO to overparameterized neural networks fails: these models can perfectly fit the training data, and any model with vanishing average training loss also already has vanishing worst-case training loss. Instead, their poor worst-case performance arises from poor generalization on some groups. By coupling group DRO models with increased regularization (stronger-than-typical $\ell_2$ regularization or early stopping), we achieve substantially higher worst-group accuracies, with 10-40 percentage point improvements on a natural language inference task and two image tasks, while maintaining high average accuracies. Our results suggest that regularization is critical for worst-group generalization in the overparameterized regime, even if it is not needed for average generalization. Finally, we introduce and give convergence guarantees for a stochastic optimizer for the group DRO setting, underpinning the empirical study above.
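The difference between the average objective and the group DRO objective is easy to see numerically. The sketch below computes both from per-example losses and group labels; the function names and the toy numbers are hypothetical, and the stochastic optimizer and regularization described in the abstract are not shown.

```python
from collections import defaultdict

def group_losses(losses, groups):
    """Mean loss within each pre-defined group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for loss, group in zip(losses, groups):
        totals[group] += loss
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

def worst_group_loss(losses, groups):
    """Group DRO objective: the largest per-group mean loss."""
    return max(group_losses(losses, groups).values())

def average_loss(losses):
    """Standard empirical-risk objective: the mean over all examples."""
    return sum(losses) / len(losses)

# Toy data: a large group that is fit well and a small group that is not.
losses = [0.1] * 8 + [2.0] * 2
groups = ["majority"] * 8 + ["minority"] * 2
avg = average_loss(losses)               # dominated by the majority group
worst = worst_group_loss(losses, groups)  # determined by the minority group
```

Here the average loss is about 0.48 while the worst-group loss is 2.0, so a model can look acceptable on average while failing badly on the small group. The same reading explains the abstract's observation: once every training loss is driven to zero, both objectives vanish together and the distinction only reappears at test time.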


A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity

arXiv.org Machine Learning

There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we provide sufficient conditions for the identification of both the value function $E[Y_{\mathcal{D}(L)}]$ for a given regime $\mathcal{D}$ and the optimal regime $\arg \max_{\mathcal{D}} E[Y_{\mathcal{D}(L)}]$ with the aid of a binary instrumental variable, when the no-unmeasured-confounding condition fails to hold. We establish consistency of the proposed weighted estimators. We also extend the proposed method to identify and estimate the optimal treatment regime among those who would comply with the assigned treatment under monotonicity. In this latter case, we establish the somewhat surprising result that the complier optimal regime can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Furthermore, we propose novel semiparametric locally efficient and multiply robust estimators. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation.


Deep Reinforcement Learning with Explicitly Represented Knowledge and Variable State and Action Spaces

arXiv.org Artificial Intelligence

We focus on a class of real-world domains, where gathering hierarchical knowledge is required to accomplish a task. Many problems can be represented in this manner, such as network penetration testing, targeted advertising or medical diagnosis. In our formalization, the task is to sequentially request pieces of information about a sample to build the knowledge hierarchy and terminate when suitable. Any of the learned pieces of information can be further analyzed, resulting in a complex and variable action space. We present a combination of techniques in which the knowledge hierarchy is explicitly represented and given to a deep reinforcement learning algorithm as its input. To process the hierarchical input, we employ Hierarchical Multiple-Instance Learning and to cope with the complex action space, we factor it with hierarchical softmax. Our end-to-end differentiable model is trained with A2C, a standard deep reinforcement learning algorithm. We demonstrate the method in a set of seven classification domains, where the task is to achieve the best accuracy with a set budget on the amount of information retrieved. Compared to baseline algorithms, our method achieves not only better results, but also better generalization.