AITopics

2207.10479

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Austria > Styria > Graz (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(10 more...)

Genre:

Overview (0.48)
Research Report (0.40)

Industry:

Law (0.93)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)

arXiv.org Artificial IntelligenceJul-21-2022

Causal Machine Learning: A Survey and Open Problems

Kaddour, Jean, Lynch, Aengus, Liu, Qi, Kusner, Matt J., Silva, Ricardo

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This perspective enables us to reason about the effects of changes to this process (interventions) and what would have happened in hindsight (counterfactuals). We categorize work in CausalML into five groups according to the problems they address: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, and (5) causal reinforcement learning. We systematically compare the methods in each category and point out open problems. Further, we review data-modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.

neural information processing system 2016, neural information processing system 2019, neural information processing system 32, (14 more...)

2206.15475

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(27 more...)

Genre:

Summary/Review (1.00)
Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Media (1.00)
Leisure & Entertainment > Games (1.00)
Law (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(6 more...)

Deep Learning to Estimate Permeability using Geophysical Data

Mudunuru, M. K., Cromwell, E. L. D., Wang, H., Chen, X.

Time-lapse electrical resistivity tomography (ERT) is a popular geophysical method to estimate three-dimensional (3D) permeability fields from electrical potential difference measurements. Traditional inversion and data assimilation methods are used to ingest this ERT data into hydrogeophysical models to estimate permeability. Due to ill-posedness and the curse of dimensionality, existing inversion strategies provide poor estimates and low resolution of the 3D permeability field. Recent advances in deep learning provide us with powerful algorithms to overcome this challenge. This paper presents a deep learning (DL) framework to estimate the 3D subsurface permeability from time-lapse ERT data. To test the feasibility of the proposed framework, we train DL-enabled inverse models on simulation data. Subsurface process models based on hydrogeophysics are used to generate this synthetic data for deep learning analyses. Results show that proposed weak supervised learning can capture salient spatial features in the 3D permeability field. Quantitatively, the average mean squared error (in terms of the natural log) on the strongly labeled training, validation, and test datasets is less than 0.5. The R2-score (global metric) is greater than 0.75, and the percent error in each cell (local metric) is less than 10%. Finally, an added benefit in terms of computational cost is that the proposed DL-based inverse model is at least O(104) times faster than running a forward model. Note that traditional inversion may require multiple forward model simulations (e.g., in the order of 10 to 1000), which are very expensive. This computational savings (O(105) - O(107)) makes the proposed DL-based inverse model attractive for subsurface imaging and real-time ERT monitoring applications due to fast and yet reasonably accurate estimations of the permeability field.

artificial intelligence, machine learning, survey article, (18 more...)

doi: 10.1016/j.advwatres.2022.104272

2110.10077

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Geothermal (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Shurrab, Saeed, Duwairi, Rehab

Self-supervised learning methods and applications in medical imaging analysis: A survey

The scarcity of high-quality annotated medical imaging datasets is a major problem that collides with machine learning applications in the field of medical imaging analysis and impedes its advancement. Self-supervised learning is a recent training paradigm that enables learning robust representations without the need for human annotation which can be considered an effective solution for the scarcity of annotated medical data. This article reviews the state-of-the-art research directions in self-supervised learning approaches for image data with a concentration on their applications in the field of medical imaging analysis. The article covers a set of the most recent self-supervised learning methods from the computer vision field as they are applicable to the medical imaging analysis and categorize them as predictive, generative, and contrastive approaches. Moreover, the article covers 40 of the most recent research papers in the field of self-supervised learning in medical imaging analysis aiming at shedding the light on the recent innovation in the field. Finally, the article concludes with possible future research directions in the field.

artificial intelligence, machine learning, representation, (19 more...)

doi: 10.7717/peerj-cs.1045

2109.08685

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Task Allocation using a Team of Robots

Aziz, Haris, Pal, Arindam, Pourmiri, Ali, Ramezani, Fahimeh, Sims, Brendan

Task allocation using a team or coalition of robots is one of the most important problems in robotics, computer science, operational research, and artificial intelligence. In recent work, research has focused on handling complex objectives and feasibility constraints amongst other variations of the multi-robot task allocation problem. There are many examples of important research progress in these directions. We present a general formulation of the task allocation problem that generalizes several versions that are well-studied. Our formulation includes the states of robots, tasks, and the surrounding environment in which they operate. We describe how the problem can vary depending on the feasibility constraints, objective functions, and the level of dynamically changing information. In addition, we discuss existing solution approaches for the problem including optimization-based approaches, and market-based approaches.

evolutionary algorithm, machine learning, reinforcement learning, (17 more...)

2207.0965

Country:

North America > United States (0.04)
Europe > Italy > Lazio > Rome (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:

Overview (1.00)
Research Report (0.64)

Industry: Transportation (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(4 more...)

Subramanian, Shivaram, Sun, Wei, Drissi, Youssef, Ettl, Markus

Constrained Prescriptive Trees via Column Generation

With the abundance of available data, many enterprises seek to implement data-driven prescriptive analytics to help them make informed decisions. These prescriptive policies need to satisfy operational constraints, and proactively eliminate rule conflicts, both of which are ubiquitous in practice. It is also desirable for them to be simple and interpretable, so they can be easily verified and implemented. Existing approaches from the literature center around constructing variants of prescriptive decision trees to generate interpretable policies. However, none of the existing methods are able to handle constraints. In this paper, we propose a scalable method that solves the constrained prescriptive policy generation problem. We introduce a novel path-based mixed-integer program (MIP) formulation which identifies a (near) optimal policy efficiently via column generation. The policy generated can be represented as a multiway-split tree which is more interpretable and informative than a binary-split tree due to its shorter rules. We demonstrate the efficacy of our method with extensive experiments on both synthetic and real datasets.

constraint, data mining, machine learning, (21 more...)

2207.10163

Country: North America > United States (0.14)

Genre:

Overview (0.68)
Research Report (0.50)

Industry: Transportation (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Abdolali, Maryam, Gillis, Nicolas

Revisiting data augmentation for subspace clustering

Subspace clustering is the classical problem of clustering a collection of data samples that approximately lie around several low-dimensional subspaces. The current state-of-the-art approaches for this problem are based on the self-expressive model which represents the samples as linear combination of other samples. However, these approaches require sufficiently well-spread samples for accurate representation which might not be necessarily accessible in many applications. In this paper, we shed light on this commonly neglected issue and argue that data distribution within each subspace plays a critical role in the success of self-expressive models. Our proposed solution to tackle this issue is motivated by the central role of data augmentation in the generalization power of deep neural networks. We propose two subspace clustering frameworks for both unsupervised and semi-supervised settings that use augmented samples as an enlarged dictionary to improve the quality of the self-expressive representation. We present an automatic augmentation strategy using a few labeled samples for the semi-supervised problem relying on the fact that the data samples lie in the union of multiple linear subspaces. Experimental results confirm the effectiveness of data augmentation, as it significantly improves the performance of general self-expressive models.

artificial intelligence, machine learning, optimization problem, (21 more...)

doi: 10.1016/j.knosys.2022.109974

2207.09728

Country: Europe > Belgium (0.04)

Genre:

Research Report (1.00)
Overview (0.65)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

A Large-Scale Dataset of Twitter Chatter about Online Learning during the Current COVID-19 Omicron Wave

Thakur, Nirmalya

The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations related to online learning in the form of tweets. Mining such tweets to develop a dataset can serve as a data resource for different applications and use-cases related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale open-access Twitter dataset of conversations about online learning from different parts of the world since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter, as well as with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) principles for scientific data management. The paper also briefly outlines some potential applications in the fields of Big Data, Data Mining, Natural Language Processing, and their related disciplines, with a specific focus on online learning during this Omicron wave that may be studied, explored, and investigated by using this dataset.

dataset, online, student, (10 more...)

2208.0781

Country:

Europe > United Kingdom (0.28)
Asia > Middle East > UAE (0.28)
North America > United States > New York > New York County > New York City (0.15)
(41 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Instructional Material > Online (0.93)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Hosseini, Ryien, Simini, Filippo, Vishwanath, Venkatram

Operation-Level Performance Benchmarking of Graph Neural Networks for Scientific Applications

As Graph Neural Networks (GNNs) increase in popularity for scientific machine learning, their training and inference efficiency is becoming increasingly critical. Additionally, the deep learning field as a whole is trending towards wider and deeper networks, and ever increasing data sizes, to the point where hard hardware bottlenecks are often encountered. Emerging specialty hardware platforms provide an exciting solution to this problem. In this paper, we systematically profile and select low-level operations pertinent to GNNs for scientific computing implemented in the Pytorch Geometric software framework. These are then rigorously benchmarked on NVIDIA A100 GPUs for several various combinations of input values, including tensor sparsity. We then analyze these results for each operation. At a high level, we conclude that on NVIDIA systems: (1) confounding bottlenecks such as memory inefficiency often dominate runtime costs moreso than data sparsity alone, (2) native Pytorch operations are often as or more competitive than their Pytorch Geometric equivalents, especially at low to moderate levels of input data sparsity, and (3) many operations central to state-of-the-art GNN architectures have little to no optimization for sparsity. We hope that these results serve as a baseline for those developing these operations on specialized hardware and that our subsequent analysis helps to facilitate future software and hardware based optimizations of these operations and thus scalable GNN performance as a whole.

opération, sparsity, tensor, (16 more...)

2207.09955

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Illinois > Cook County > Lemont (0.04)
North America > Costa Rica > Heredia Province > Heredia (0.04)
Europe (0.04)

Genre:

Overview (0.93)
Research Report (0.82)

Industry: Information Technology (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Integrating Linguistic Theory and Neural Language Models

Li, Bai

Transformer-based language models have recently achieved remarkable results in many natural language tasks. However, performance on leaderboards is generally achieved by leveraging massive amounts of training data, and rarely by encoding explicit linguistic knowledge into neural models. This has led many to question the relevance of linguistics for modern natural language processing. In this dissertation, I present several case studies to illustrate how theoretical linguistics and neural language models are still relevant to each other. First, language models are useful to linguists by providing an objective tool to measure semantic distance, which is difficult to do using traditional methods. On the other hand, linguistic theory contributes to language modelling research by providing frameworks and sources of data to probe our language models for specific aspects of language understanding. This thesis contributes three studies that explore different aspects of the syntax-semantics interface in language models. In the first part of my thesis, I apply language models to the problem of word class flexibility. Using mBERT as a source of semantic distance measurements, I present evidence in favour of analyzing word class flexibility as a directional process. In the second part of my thesis, I propose a method to measure surprisal at intermediate layers of language models. My experiments show that sentences containing morphosyntactic anomalies trigger surprisals earlier in language models than semantic and commonsense anomalies. Finally, in the third part of my thesis, I adapt several psycholinguistic studies to show that language models contain knowledge of argument structure constructions. In summary, my thesis develops new connections between natural language processing, linguistic theory, and psycholinguistics to provide fresh perspectives for the interpretation of language models.

euclidean distance, linguistic knowledge, psycholinguistic study, (13 more...)

2207.09643

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
(13 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Consumer Products & Services > Restaurants (0.67)
Health & Medicine > Therapeutic Area (0.45)
Education > Educational Setting > Higher Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(2 more...)