Fairness in Criminal Justice Risk Assessments: The State of the Art

Berk, Richard, Heidari, Hoda, Jabbari, Shahin, Kearns, Michael, Roth, Aaron

arXiv.org Machine Learning

Objectives: Discussions of fairness in criminal justice risk assessments typically lack conceptual precision. Rhetoric too often substitutes for careful analysis. In this paper, we seek to clarify the tradeoffs between different kinds of fairness and between fairness and accuracy. Methods: We draw on the existing literatures in criminology, computer science and statistics to provide an integrated examination of fairness and accuracy in criminal justice risk assessments. We also provide an empirical illustration using data from arraignments. Results: We show that there are at least six kinds of fairness, some of which are incompatible with one another and with accuracy. Conclusions: Except in trivial cases, it is impossible to maximize accuracy and fairness at the same time, and impossible simultaneously to satisfy all kinds of fairness. In practice, a major complication is different base rates across different legally protected groups. There is a need to consider challenging tradeoffs.
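The base-rate complication the authors highlight can be made concrete with a little arithmetic. The sketch below is illustrative only (not the paper's code): it uses the identity PPV = TPR·p / (TPR·p + FPR·(1−p)), where p is a group's base rate, to show that if two groups are held to the same positive predictive value and the same true positive rate but have different base rates, their false positive rates must differ.

```python
# Illustrative sketch, not from the paper: equal calibration (PPV) plus
# equal true positive rates forces unequal false positive rates whenever
# base rates differ. Derived from PPV = TPR*p / (TPR*p + FPR*(1-p)).

def implied_fpr(tpr: float, ppv: float, base_rate: float) -> float:
    """False positive rate forced by a given TPR, PPV, and base rate."""
    return tpr * base_rate * (1 - ppv) / (ppv * (1 - base_rate))

# Two hypothetical protected groups with different base rates.
fpr_a = implied_fpr(tpr=0.7, ppv=0.8, base_rate=0.50)  # 0.175
fpr_b = implied_fpr(tpr=0.7, ppv=0.8, base_rate=0.25)  # ~0.058

print(fpr_a, fpr_b)  # unequal FPRs despite equal TPR and PPV
```

The numbers (TPR 0.7, PPV 0.8, base rates 0.50 vs. 0.25) are arbitrary; any pair of distinct base rates produces the same qualitative conflict.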


Targeted Learning with Daily EHR Data

Sofrygin, Oleg, Zhu, Zheng, Schmittdiel, Julie A, Adams, Alyce S., Grant, Richard W., van der Laan, Mark J., Neugebauer, Romain

arXiv.org Machine Learning

Electronic health records (EHR) data provide a cost- and time-effective opportunity to conduct cohort studies of the effects of multiple time-point interventions in the diverse patient population found in real-world clinical settings. Because the computational cost of analyzing EHR data at a daily (or more granular) scale can be quite high, a pragmatic approach has been to partition the follow-up into coarser intervals of pre-specified length. Current guidelines suggest employing a 'small' interval, but the feasibility and practical impact of this recommendation have not been evaluated, and no formal methodology to inform this choice has been developed. We begin to fill these gaps by leveraging large-scale EHR data from a diabetes study to develop and illustrate a fast and scalable targeted learning approach that allows one to follow the current recommendation and study its practical impact on inference. More specifically, we map daily EHR data into four analytic datasets using 90-, 30-, 15-, and 5-day intervals. We apply a semi-parametric and doubly robust estimation approach, the longitudinal TMLE, to estimate the causal effects of four dynamic treatment rules with each dataset, and we compare the resulting inferences. To overcome the computational challenges presented by the size of these data, we propose a novel TMLE implementation, the 'long-format TMLE', and rely on the latest advances in scalable, data-adaptive machine-learning software, xgboost and h2o, for estimation of the TMLE nuisance parameters.
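The core discretization step, mapping daily records into coarser fixed-width intervals, can be sketched in a few lines. This is a minimal toy under assumed data shapes (a `(day_index, value)` stream and a last-observation convention), not the study's actual pipeline:

```python
# Minimal sketch, assumed data shape: map daily EHR measurements onto
# fixed-width intervals, keeping the last observed value per interval
# (one common discretization convention; others aggregate differently).

def coarsen(daily, width):
    """daily: iterable of (day_index, value); returns {interval_index: value}."""
    intervals = {}
    for day, value in daily:
        intervals[day // width] = value  # later days overwrite earlier ones
    return intervals

daily = [(d, 100 + d) for d in range(10)]  # 10 days of toy measurements
print(coarsen(daily, 5))                   # {0: 104, 1: 109}
```

Re-running the same mapping with widths 90, 30, 15, and 5 yields the four analytic datasets the abstract describes, each trading granularity against computational cost.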


Approximation Vector Machines for Large-scale Online Learning

Le, Trung, Nguyen, Tu Dinh, Nguyen, Vu, Phung, Dinh

arXiv.org Machine Learning

One of the most challenging problems in kernel online learning is to bound the model size and to promote model sparsity. Sparse models not only improve computation and memory usage but also enhance generalization capacity, a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade performance. In this paper, we propose the Approximation Vector Machine (AVM), a model that can simultaneously encourage sparsity and safeguard against the risk of compromised performance. When an incoming instance arrives, we approximate it by one of its neighbors whose distance to it is less than a predefined threshold. Our key intuition is that since the newly seen instance is expressed by its nearby neighbor, the optimal performance can be analytically formulated and maintained. We develop theoretical foundations to support this intuition and further establish an analysis characterizing the gap between the approximate and optimal solutions. This gap crucially depends on the frequency of approximation and the predefined threshold. We perform a convergence analysis for a wide spectrum of loss functions, including Hinge, smooth Hinge, and Logistic for classification, and $l_1$, $l_2$, and $\epsilon$-insensitive for regression. We conducted extensive experiments on classification tasks in batch and online modes and on regression tasks in online mode over several benchmark datasets. The results show that the proposed AVM achieves predictive performance comparable to current state-of-the-art methods while delivering a significant computational speed-up thanks to its ability to bound the model size.
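The approximation step at the heart of this description is easy to sketch. The following is one reading of the abstract, not the authors' implementation: each incoming instance is either absorbed by an existing "core" point within a threshold `delta` or added as a new one, so the model size stays bounded on clustered data.

```python
# Sketch of the core approximation step (a reading of the abstract, not
# the authors' code): an incoming instance reuses an existing core point
# within distance delta, so the model size stays bounded.

import math

def assign(core, x, delta):
    """Return the index of the core point representing x, adding x if needed."""
    for i, c in enumerate(core):
        if math.dist(c, x) <= delta:
            return i            # reuse the nearby neighbor
    core.append(tuple(x))       # no neighbor close enough: grow the core set
    return len(core) - 1

core = []
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.05, 0.05)]
for x in stream:
    assign(core, x, delta=0.5)
print(len(core))  # 2: five instances collapse onto two core points
```

The tradeoff the abstract analyzes falls out directly: a larger `delta` approximates more often (smaller model, larger gap to the optimal solution), a smaller `delta` approximates less often (larger model, smaller gap).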


Incorporating Prior Information in Compressive Online Robust Principal Component Analysis

Van Luong, Huynh, Deligiannis, Nikos, Seiler, Jurgen, Forchhammer, Soren, Kaup, Andre

arXiv.org Artificial Intelligence

We consider an online version of robust Principal Component Analysis (PCA), which arises naturally in time-varying source separations such as video foreground-background separation. This paper proposes a compressive online robust PCA with prior information for recursively separating a sequence of frames into sparse and low-rank components from a small set of measurements. In contrast to conventional batch-based PCA, which processes all the frames directly, the proposed method processes measurements taken from each frame. Moreover, this method can efficiently incorporate multiple sources of prior information, namely previously reconstructed frames, to improve the separation and, thereafter, update the prior information for the next frame. We utilize multiple sources of prior information by solving $n\text{-}\ell_{1}$ minimization for incorporating the previous sparse components and using incremental singular value decomposition ($\mathrm{SVD}$) for exploiting the previous low-rank components. We also establish theoretical bounds on the number of measurements required to guarantee successful separation under assumptions of static or slowly-changing low-rank components. Using numerical experiments, we evaluate our bounds and the performance of the proposed algorithm. In addition, we apply the proposed algorithm to online video foreground and background separation from compressive measurements. Experimental results show that the proposed method outperforms the existing methods.
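The flavor of "prior information helps the sparse component" can be illustrated with a heavily simplified toy; this is not the paper's $n\text{-}\ell_{1}$ algorithm, just the standard soft-thresholding operator with the previous frame's support used to relax the threshold where foreground was already active:

```python
# Toy sketch of prior-informed sparse recovery (a simplification of the
# n-l1 idea, not the paper's algorithm): soft-threshold a foreground
# estimate, lowering the threshold on pixels that were foreground in
# the previous frame.

def soft_threshold(v, lam):
    """Standard l1 proximal operator: shrink v toward zero by lam."""
    if abs(v) <= lam:
        return 0.0
    return v - lam if v > 0 else v + lam

def recover_sparse(estimate, prior_support, lam, relax=0.5):
    """Apply a smaller threshold (lam*relax) on previously active pixels."""
    return [soft_threshold(v, lam * relax if i in prior_support else lam)
            for i, v in enumerate(estimate)]

estimate = [0.05, 0.8, 0.3, 0.02]   # noisy foreground estimate, one frame
prior = {2}                          # pixel 2 was foreground last frame
print(recover_sparse(estimate, prior, lam=0.4))
```

Here pixel 2 survives thresholding only because the prior support lowered its threshold, which is the qualitative benefit the measurement bounds in the paper quantify.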


STRIPS Planning in Infinite Domains

Garrett, Caelan Reed, Lozano-Pérez, Tomás, Kaelbling, Leslie Pack

arXiv.org Artificial Intelligence

Many robotic planning applications involve continuous actions with highly non-linear constraints, which cannot be modeled using modern planners that construct a propositional representation. We introduce STRIPStream: an extension of the STRIPS language which can model these domains by supporting the specification of blackbox generators to handle complex constraints. The outputs of these generators interact with actions through possibly infinite streams of objects and static predicates. We provide two algorithms which both reduce STRIPStream problems to a sequence of finite-domain planning problems. The representation and algorithms are entirely domain independent. We demonstrate our framework on simple illustrative domains, and then on a high-dimensional, continuous robotic task and motion planning domain.
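The "possibly infinite streams of objects" idea has a natural analogue in plain Python generators. The sketch below is illustrative only (the generator name and the string objects are hypothetical, and this is not the STRIPStream language itself): a blackbox generator lazily yields candidate objects, and each finite-domain planning round consumes only a prefix.

```python
# Illustrative analogue of a STRIPStream-style stream (not the actual
# language): a blackbox generator yields a possibly infinite stream of
# candidate objects; the planner samples only a finite prefix per round.

from itertools import count, islice

def pose_stream():
    """Hypothetical generator of candidate grasp-pose identifiers."""
    for k in count():
        yield f"pose{k}"   # a blackbox sampler would stand in here

# One finite-domain planning round consumes a prefix of the stream.
candidates = list(islice(pose_stream(), 3))
print(candidates)  # ['pose0', 'pose1', 'pose2']
```

If the resulting finite-domain problem has no solution, a subsequent round can draw a longer prefix, which mirrors how the paper's algorithms reduce a STRIPStream problem to a sequence of finite-domain planning problems.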


Indirect Causes in Dynamic Bayesian Networks Revisited

Motzek, Alexander, Möller, Ralf

Journal of Artificial Intelligence Research

Modeling causal dependencies often demands cycles at a coarse-grained temporal scale. If Bayesian networks are to be used for modeling uncertainties, cycles are eliminated with dynamic Bayesian networks, spreading indirect dependencies over time and enforcing an infinitesimal resolution of time. Without a "causal design," i.e., without anticipating indirect influences appropriately in time, we argue that such networks return spurious results. By identifying activator random variables, we propose activator dynamic Bayesian networks (ADBNs) which are able to rapidly adapt to contexts under a causal use of time, anticipating indirect influences on a solid mathematical basis using familiar Bayesian network semantics. ADBNs are well-defined dynamic probabilistic graphical models allowing one to model cyclic dependencies from local and causal perspectives while preserving a classical, familiar calculus and classically known algorithms, without introducing any overhead in modeling or inference.
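A minimal way to picture an activator variable, under my reading of the abstract rather than the ADBN formalism itself, is as a switch that gates whether a temporal influence applies: when the activator is on, $X_t$ depends on $Y_{t-1}$; when it is off, $X_t$ falls back to a context-free prior. The probabilities below are arbitrary placeholders.

```python
# Minimal sketch of an activator gating a temporal dependency (one
# reading of the abstract, not the ADBN formalism; numbers arbitrary).

def p_x_given(y_prev, activator, p_influence=0.9, p_prior=0.5):
    """P(X_t = 1 | Y_{t-1} = y_prev, activator)."""
    if not activator:
        return p_prior                       # influence switched off
    return p_influence if y_prev else 1 - p_influence

print(p_x_given(y_prev=1, activator=True))   # 0.9: influence active
print(p_x_given(y_prev=1, activator=False))  # 0.5: falls back to prior
```

Gating the dependency per time step is what lets a cyclic dependency between two variables be modeled without both directions being active simultaneously.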


Why Soul Machines made an AI baby

#artificialintelligence

At Soul Machines, a company that uses artificial intelligence to create lifelike avatars that respond to human emotion, a fair amount of the work could be considered unsettling to the average person who fears the coming takeover by our AI-robot overlords. It's a company that pretty much lives in the uncanny valley, that space between fake and real that can creep people out, but that's not usually what happens when people meet BabyX, said Soul Machines founder Mark Sagar. Instead, he says, when the baby begins to whimper or cry, some respond in human ways, demonstrating what appears to be sympathy similar to the kind they may lavish on a human baby. "I'll probably get about 10 or 15 percent of people respond with 'that's creepy,' and others it doesn't bother them at all. Ultimately it's about creating an emotional connection and then people jump right into that," he said. To see which of these two camps you fall into, watch the video below.


Deep learning algorithms demand nearly limitless supplies of data

#artificialintelligence

In any deep learning project, it's almost impossible to imagine an upper limit on the amount of data needed for training models and conducting analyses. "We need to get more data," said Patrick Lucey, director of data science at sports consulting company STATS LLC in Chicago. "We want to reconstruct that story, [and] tell better stories, and we're limited because we can't get all the data we want." Deep learning, defined by the use of multiple machine learning algorithms such as neural networks strung together, isn't necessarily a new concept. However, it started to gain more widespread traction last year, as researchers and enterprises realized that analytical models could be turned loose on the massive troves of data businesses had accumulated since the dawn of the big data era. Deep learning algorithms require experience to sharpen their recommendations, and big data provides them with exactly the fuel they need. But this raises the question: when is enough data enough? Some of the most prominent deep learning examples used hundreds of thousands, even millions, of records during the model training process. At STATS, Lucey has access to ample data, but said he still feels models could function better with more. The company maintains databases of game data going back to its beginnings in 1981. Its deepest data sets go back to 2010 with the NBA, and come from its SportVU system, a network of cameras installed at sports arenas that captures player movement data. This wealth of data has enabled Lucey and his team to do some interesting things with deep learning. For example, he and his team developed a model that looks at video data from NBA games and analyzes players' body positions to better define what an open shot looks like. Another STATS project applied deep learning algorithms to English Premier League soccer.
STATS analyzed data beyond traditional statistics, like shots and goals, to understand the factors that led to longshot Leicester City Football Club taking home the title in the league's 2015-2016 season, which ended last May. The data science team at STATS primarily builds models in open source tools, such as the Google-created TensorFlow and scikit-learn, a library of machine learning models built in Python. These projects have been successful, according to Lucey. However, he added that he's already looking to sharpen analyses, and he thinks more data will help. In addition to larger data volumes, more detailed information will be necessary, he noted. Deep learning algorithms thrive on detailed data as much as on large amounts of data, and that will play an important role as these models continue to improve and describe the world more accurately. "That's the key -- finding that context," Lucey said. "You can get a good prediction, but if it's washed over by context, it's not as valuable."


Baidu Whiz Must Outsmart Google At Artificial Intelligence - BI News - Business Intelligence

#artificialintelligence

For much of the past two decades, Qi Lu, a search-technology whiz, waged losing battles against Google, first at Yahoo Inc., then at Microsoft Corp.'s Bing.


Top 10 Artificial Intelligence Blogs

#artificialintelligence

- AI Trends -- great news for those who like attending conferences: they hold the annual AI World Conference and Expo. https://aitrends.com/
- Machine Learning Mastery -- his vast experience in AI and in applying machine learning to weather forecasting and defence industries allows him to write useful articles and books on the topic. http://machinelearningmastery.com/blog/
- Algorithmia -- the future is closer than you think, and Algorithmia will prove it. http://blog.algorithmia.com/
- AITopics -- by the way, it is an application of AI itself, so check out what a machine has prepared for you to read. https://aitopics.org/search