Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications

arXiv.org Machine Learning

In this paper we examine the impact on credit scoring models of alternative data originating from an app-based marketplace, in contrast to traditional bureau data. These alternative data sources have proven highly effective at predicting borrower behavior in segments traditionally underserved by banks and financial institutions. Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals, who are also the most likely to engage with alternative lenders. Furthermore, using the TreeSHAP method to interpret Stochastic Gradient Boosting models, our results also reveal interesting non-linear trends in the variables originating from the app, which would not normally be available to traditional banks. Our results represent an opportunity for technology companies to disrupt traditional banking by correctly identifying alternative data sources and handling this new information properly. At the same time, alternative data must be carefully validated to overcome regulatory hurdles across diverse jurisdictions.


Coronavirus Fragments 15: Medical Precrime and the Hackable Brain

#artificialintelligence

Horrifying Glimpse Into How DARPA Will "Save" You From COVID-19 and Venezuela Coup Tied Back To Trump (7 May 2020). In my last two posts, The New World Emperor and Wake Up, You're Next, I stated that the main worry in the nCov pandemic is not just the virus - its origins, seriousness, the number of strains, and their forthcoming spread - but how the pandemic will be controlled. I argued that mass vaccines and tracking will involve the transition from computer-based to human-based operating systems. A series of pandemic outbreaks now and in coming years will be followed by successive vaccines, which will implant weaponized AI and nanotechnology on a mass scale, in order to establish brain-machine interfaces around the globe, paired with a cryptocurrency as a reward or punishment system. If you accept this technology into your body, the control of the few over the many will be complete, and the Internet of Thoughts will be born. To understand injectable technologies, see (above) The Last American Vagabond's 7 May 2020 interview with independent journalist, Whitney Webb.


Exactech and KenSci Publish Research on the Impact of Artificial Intelligence to Predict Clinical Outcomes after Shoulder Arthroplasty

#artificialintelligence

GAINESVILLE, Fla.--(BUSINESS WIRE)--Exactech, a developer and producer of innovative implants, instrumentation and computer-assisted technologies for joint replacement surgery, and KenSci, a healthcare artificial intelligence (AI) platform company, announced today that a collaborative, foundational study on using machine learning (ML) to predict outcomes after shoulder arthroplasty has been published in Clinical Orthopaedics and Related Research, one of the premier scientific journals in orthopaedics. The research analyzes the potential of ML to use preoperative data to anticipate patients' post-operative results after anatomic total shoulder arthroplasty (aTSA) or reverse total shoulder arthroplasty (rTSA). These results can help surgeons preoperatively identify whether a patient will achieve certain clinical improvement thresholds, so that patients can be appropriately risk-stratified for these elective procedures. Specifically, this research explores the efficacy of ML to predict the American Shoulder and Elbow Surgeons (ASES), Constant, global shoulder function and VAS pain scores, as well as to predict a patient's active range of motion in abduction, forward flexion and external rotation. This research also studies the ability of ML to identify whether a patient may achieve clinical improvement that exceeds the minimal clinically important difference threshold as well as the substantial clinical benefit threshold for each outcome measure.


A Graph Gaussian Embedding Method for Predicting Alzheimer's Disease Progression with MEG Brain Networks

arXiv.org Machine Learning

Characterizing the subtle changes of functional brain networks associated with the pathological cascade of Alzheimer's disease (AD) is important for early diagnosis and prediction of disease progression prior to clinical symptoms. We developed a new deep learning method, termed multiple graph Gaussian embedding model (MG2G), which can learn highly informative network features by mapping high-dimensional resting-state brain networks into a low-dimensional latent space. These latent distribution-based embeddings enable a quantitative characterization of subtle and heterogeneous brain connectivity patterns at different regions and can be used as input to traditional classifiers for various downstream graph analytic tasks, such as AD early stage prediction, and statistical evaluation of between-group significant alterations across brain regions. We used MG2G to detect the intrinsic latent dimensionality of MEG brain networks, predict the progression of patients with mild cognitive impairment (MCI) to AD, and identify brain regions with network alterations related to MCI.


An AI Assist for Spotting COVID-19 in X Rays

#artificialintelligence

Chest x rays offer a quick screening method for lung problems: A collapsed lung, the buildup of excess fluid, or swollen tissue are all recognizable by radiologists in these black and white images. Doctors also use the images to help quickly diagnose diseases, such as pneumonia. Making x-ray screening of COVID-19 similarly speedy would benefit overrun hospitals, and the developers of artificial intelligence (AI) algorithms are hoping that they can help. But getting the data they need from hospitals and ensuring accuracy are challenges that they must first overcome. "In places like New York, where there was this huge explosion of [COVID-19] patients, they're already taking chest x rays alongside viral testing. So why not have a greater immediate impact by building AI [software] to help screen through all those images?" says Alexander Wong, who works on medical image processing problems using AI at the University of Waterloo, Canada.


COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios

arXiv.org Machine Learning

COVID-19 can cause severe pneumonia and is estimated to have a high impact on the healthcare system. The standard image diagnosis tests for pneumonia are chest X-ray (CXR) and computed tomography (CT) scan. CXR is useful because it is cheaper, faster and more widespread than CT. This study aims to distinguish pneumonia caused by COVID-19 from other types of pneumonia and from healthy lungs using only CXR images. To achieve these objectives, we propose a classification schema considering the multi-class and hierarchical perspectives, since pneumonia can be structured as a hierarchy. Given the natural data imbalance in this domain, we also propose the use of resampling algorithms to re-balance the class distribution. Our classification schema extracts features using some well-known texture descriptors and also using a pre-trained CNN model. We also explore early and late fusion techniques in order to leverage the strength of multiple texture descriptors and base classifiers at once. To evaluate the approach, we composed a database, named RYDLS-20, containing CXR images of pneumonia caused by different pathogens as well as CXR images of healthy lungs. The class distribution follows a real-world scenario in which some pathogens are more common than others. The proposed approach achieved a macro-avg F1-Score of 0.65 using a multi-class approach and an F1-Score of 0.89 for the COVID-19 identification in the hierarchical classification scenario. As far as we know, we achieved the best nominal rate obtained for COVID-19 identification in an unbalanced environment with more than three classes. We must also highlight the novel proposed hierarchical classification approach for this task, which considers the types of pneumonia caused by the different pathogens and led us to the best COVID-19 recognition rate obtained here.
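The macro-averaged F1-Score reported above weights every class equally, which is why it is a common choice for imbalanced data. The following is a minimal sketch of how that metric is computed in general; the function name `macro_f1` and the class labels in the usage note are illustrative assumptions, not taken from the paper or from RYDLS-20.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores.

    Assumes y_true and y_pred are non-empty sequences of equal length.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, `macro_f1(["covid", "covid", "normal"], ["covid", "normal", "normal"])` averages the F1 of the two hypothetical classes, so a rare class that is poorly recognized lowers the score as much as a common one would.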


A Proposal for Intelligent Agents with Episodic Memory

arXiv.org Artificial Intelligence

In the future we can expect that artificially intelligent agents, once deployed, will be required to learn continually from their experience during their operational lifetime. Such agents will also need to communicate with humans and other agents regarding the content of their experience, in the context of passing along their learnings, for the purpose of explaining their actions in specific circumstances or simply to relate more naturally to humans concerning experiences the agent acquires that are not necessarily related to their assigned tasks. We argue that to support these goals, an agent would benefit from an episodic memory; that is, a memory that encodes the agent's experience in such a way that the agent can relive the experience, communicate about it and use its past experience, inclusive of the agent's own past actions, to learn more effective models and policies. In this short paper, we propose one potential approach to provide an AI agent with such capabilities. We draw upon the ever-growing body of work examining the function and operation of the Medial Temporal Lobe (MTL) in mammals to guide us in adding an episodic memory capability to an AI agent composed of artificial neural networks (ANNs). Based on that, we highlight important aspects to be considered in the memory organization and we propose an architecture combining ANNs and standard Computer Science techniques for supporting storage and retrieval of episodic memories. Despite being initial work, we hope this short paper can spark discussions around the creation of intelligent agents with memory or, at least, provide a different point of view on the subject.


The More the Merrier?! Evaluating the Effect of Landmark Extraction Algorithms on Landmark-Based Goal Recognition

arXiv.org Artificial Intelligence

Recent approaches to goal and plan recognition using classical planning domains have achieved state-of-the-art results in terms of both recognition time and accuracy by using heuristics based on planning landmarks. To achieve such fast recognition time, these approaches use efficient, but incomplete, algorithms to extract only a subset of landmarks for planning domains and problems, at the cost of some accuracy. In this paper, we investigate the effect of using various landmark extraction algorithms capable of extracting a larger proportion of the landmarks for each given planning problem, up to exhaustive landmark extraction. We perform an extensive empirical evaluation of various landmark-based heuristics when using different percentages of the full set of landmarks. Results show that having more landmarks does not necessarily mean achieving higher accuracy and lower spread, as the additional extracted landmarks may not necessarily be helpful for the goal recognition task.
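To make the idea of a landmark-based heuristic concrete, the sketch below ranks candidate goals by the fraction of their landmarks that have already been observed. This is a deliberately simplified ratio written for illustration; it is an assumption, not necessarily one of the heuristics evaluated in the paper, and the names `rank_goals`, `landmarks_by_goal` and `observed_facts` are hypothetical.

```python
def rank_goals(landmarks_by_goal, observed_facts):
    """Rank candidate goals by the share of their landmarks already achieved.

    landmarks_by_goal: dict mapping a goal name to a set of landmark facts.
    observed_facts: set of facts achieved by the observed actions so far.
    Returns (goal, score) pairs sorted from most to least likely.
    """
    scores = {}
    for goal, landmarks in landmarks_by_goal.items():
        achieved = landmarks & observed_facts
        # A goal with no extracted landmarks gets a neutral score of 0.
        scores[goal] = len(achieved) / len(landmarks) if landmarks else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under this simplified scoring, extracting more landmarks changes the denominator for every goal, so extra landmarks that are shared across goals or rarely observed can dilute the ratio rather than sharpen it.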


Building A User-Centric and Content-Driven Socialbot

arXiv.org Artificial Intelligence

To build Sounding Board, we develop a system architecture that is capable of accommodating dialog strategies that we designed for socialbot conversations. The architecture consists of a multi-dimensional language understanding module for analyzing user utterances, a hierarchical dialog management framework for dialog context tracking and complex dialog control, and a language generation process that realizes the response plan and makes adjustments for speech synthesis. Additionally, we construct a new knowledge base to power the socialbot by collecting social chat content from a variety of sources. An important contribution of the system is the synergy between the knowledge base and the dialog management, i.e., the use of a graph structure to organize the knowledge base that makes dialog control very efficient in bringing related content to the discussion. Using the data collected from Sounding Board during the competition, we carry out in-depth analyses of socialbot conversations and user ratings, which provide valuable insights into evaluation methods for socialbots. We additionally investigate a new approach for system evaluation and diagnosis that allows scoring individual dialog segments in the conversation. Finally, observing that socialbots suffer from the issue of shallow conversations about topics associated with unstructured data, we study the problem of enabling extended socialbot conversations grounded on a document. To bring together machine reading and dialog control techniques, a graph-based document representation is proposed, together with methods for automatically constructing the graph. Using the graph-based representation, dialog control can be carried out by retrieving nodes or moving along edges in the graph. To illustrate the usage, a mixed-initiative dialog strategy is designed for socialbot conversations on news articles.
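The notion of dialog control as "retrieving nodes or moving along edges" can be sketched with a small undirected graph. Everything below is a hypothetical illustration: the `KnowledgeGraph` class, the `next_content` helper and the node naming scheme are assumptions for this sketch and do not describe Sounding Board's actual data model.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Undirected graph linking topics and content items (hypothetical schema)."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b):
        """Connect two nodes, e.g. a topic and a piece of content about it."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, node, used=()):
        """Return neighbours of `node` not yet used in the conversation."""
        neighbours = self._edges.get(node, set())
        return sorted(n for n in neighbours if n not in used)


def next_content(graph, current, used):
    """Move along one edge from the current node to fresh related content.

    Returns None when every neighbour has already been discussed.
    """
    candidates = graph.related(current, used)
    return candidates[0] if candidates else None
```

In this sketch, choosing the next thing to say is a constant number of set lookups around the current node, which is one way a graph-organized knowledge base can keep dialog control cheap.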


Exploratory Analysis of Covid-19 Tweets using Topic Modeling, UMAP, and DiGraphs

arXiv.org Artificial Intelligence

This paper illustrates five different techniques to assess the distinctiveness of topics, key terms and features, speed of information dissemination, and network behaviors for Covid19 tweets. First, we use pattern matching and second, topic modeling through Latent Dirichlet Allocation (LDA) to generate twenty different topics that discuss case spread, healthcare workers, and personal protective equipment (PPE). One topic specific to U.S. cases would start to uptick immediately after live White House Coronavirus Task Force briefings, implying that many Twitter users are paying attention to government announcements. We contribute machine learning methods not previously reported in the Covid19 Twitter literature. This includes our third method, Uniform Manifold Approximation and Projection (UMAP), which identifies unique clustering behavior of distinct topics to improve our understanding of important themes in the corpus and help assess the quality of generated topics. Fourth, we calculated retweeting times to understand how fast information about Covid19 propagates on Twitter. Our analysis indicates that the median retweeting time of Covid19 for a sample corpus in March 2020 was 2.87 hours, approximately 50 minutes faster than repostings from Chinese social media about H7N9 in March 2013. Lastly, we sought to understand retweet cascades by visualizing the connections of users over time from fast to slow retweeting. As the time to retweet increases, the density of connections also increases; in our sample, we found distinct users dominating the attention of Covid19 retweeters. One of the simplest highlights of this analysis is that early-stage descriptive methods like regular expressions can successfully identify high-level themes, which were consistently verified as important through every subsequent analysis.
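A median retweeting time such as the one quoted above can be computed from pairs of timestamps. The sketch below assumes each retweet is available as an (original post time, retweet time) pair of `datetime` objects; that input format and the function name `median_retweet_hours` are assumptions for illustration, not the authors' pipeline.

```python
from datetime import datetime
from statistics import median


def median_retweet_hours(pairs):
    """Return the median delay, in hours, between a tweet and its retweet.

    pairs: iterable of (original_time, retweet_time) datetime tuples.
    """
    delays = [
        (retweeted - original).total_seconds() / 3600
        for original, retweeted in pairs
    ]
    return median(delays)
```

The median is used rather than the mean because retweet delays are typically heavily skewed: a few very late retweets would otherwise dominate the estimate.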