AITopics

2309.12444

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)

arXiv.org Artificial IntelligenceSep-12-2023

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Haydarov, Kilichbek, Shen, Xiaoqian, Madasu, Avinash, Salem, Mahmoud, Li, Li-Jia, Elsayed, Gamaleldin, Elhoseiny, Mohamed

We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations. The task involves three skills: (1) Dialog-based Question Answering (2) Dialog-based Emotion Prediction and (3) Affective emotion explanation generation based on the dialog. Our key contribution is the collection of a large-scale dataset, dubbed AffectVisDial, consisting of 50K 10-turn visually grounded dialogs as well as concluding emotion attributions and dialog-informed textual emotion explanations, resulting in a total of 27,180 working hours. We explain our design decisions in collecting the dataset and introduce the questioner and answerer tasks that are associated with the participants in the conversation. We train and demonstrate solid Affective Visual Dialog baselines adapted from state-of-the-art models. Remarkably, the responses generated by our models show promising emotional reasoning abilities in response to visually grounded conversations. Our project page is available at https://affective-visual-dialog.github.io.

large language model, machine learning, natural language, (20 more...)

2308.16349

Country:

North America (0.14)
Europe (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

arXiv.org Artificial IntelligenceSep-7-2023

Non-inferiority of Deep Learning Acute Ischemic Stroke Segmentation on Non-Contrast CT Compared to Expert Neuroradiologists

Ostmeier, Sophie, Axelrod, Brian, Verhaaren, Benjamin F. J., Christensen, Soren, Mahammedi, Abdelkader, Liu, Yongkai, Pulli, Benjamin, Li, Li-Jia, Zaharchuk, Greg, Heit, Jeremy J.

To determine if a convolutional neural network (CNN) deep learning model can accurately segment acute ischemic changes on non-contrast CT compared to neuroradiologists. Non-contrast CT (NCCT) examinations from 232 acute ischemic stroke patients who were enrolled in the DEFUSE 3 trial were included in this study. Three experienced neuroradiologists independently segmented hypodensity that reflected the ischemic core on each scan. The neuroradiologist with the most experience (expert A) served as the ground truth for deep learning model training. Two additional neuroradiologists (experts B and C) segmentations were used for data testing. The 232 studies were randomly split into training and test sets. The training set was further randomly divided into 5 folds with training and validation sets. A 3-dimensional CNN architecture was trained and optimized to predict the segmentations of expert A from NCCT. The performance of the model was assessed using a set of volume, overlap, and distance metrics using non-inferiority thresholds of 20%, 3ml, and 3mm. The optimized model trained on expert A was compared to test experts B and C. We used a one-sided Wilcoxon signed-rank test to test for the non-inferiority of the model-expert compared to the inter-expert agreement. The final model performance for the ischemic core segmentation task reached a performance of 0.46+-0.09 Surface Dice at Tolerance 5mm and 0.47+-0.13 Dice when trained on expert A. Compared to the two test neuroradiologists the model-expert agreement was non-inferior to the inter-expert agreement, p < 0.05. The CNN accurately delineates the hypodense ischemic core on NCCT in acute ischemic stroke patients with an accuracy comparable to neuroradiologists.

artificial intelligence, machine learning, segmentation, (17 more...)

2211.15341

Country:

Europe (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceSep-7-2023

USE-Evaluator: Performance Metrics for Medical Image Segmentation Models with Uncertain, Small or Empty Reference Annotations

Ostmeier, Sophie, Axelrod, Brian, Bertels, Jeroen, Isensee, Fabian, Lansberg, Maarten G., Christensen, Soren, Albers, Gregory W., Li, Li-Jia, Heit, Jeremy J.

Performance metrics for medical image segmentation models are used to measure the agreement between the reference annotation and the predicted segmentation. Usually, overlap metrics, such as the Dice, are used as a metric to evaluate the performance of these models in order for results to be comparable. However, there is a mismatch between the distributions of cases and difficulty level of segmentation tasks in public data sets compared to clinical practice. Common metrics fail to measure the impact of this mismatch, especially for clinical data sets that include low signal pathologies, a difficult segmentation task, and uncertain, small, or empty reference annotations. This limitation may result in ineffective research of machine learning practitioners in designing and optimizing models. Dimensions of evaluating clinical value include consideration of the uncertainty of reference annotations, independence from reference annotation volume size, and evaluation of classification of empty reference annotations. We study how uncertain, small, and empty reference annotations influence the value of metrics for medical image segmentation on an in-house data set regardless of the model. We examine metrics behavior on the predictions of a standard deep learning framework in order to identify metrics with clinical value. We compare to a public benchmark data set (BraTS 2019) with a high-signal pathology and certain, larger, and no empty reference annotations. We may show machine learning practitioners, how uncertain, small, or empty reference annotations require a rethinking of the evaluation and optimizing procedures. The evaluation code was released to encourage further analysis of this topic. https://github.com/SophieOstmeier/UncertainSmallEmpty.git

artificial intelligence, machine learning, reference annotation, (19 more...)

doi: 10.1016/j.media.2023.102927

2209.13008

Country:

Europe (0.67)
North America > United States > California > Santa Clara County > Palo Alto (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceNov-27-2021

YMIR: A Rapid Data-centric Development Platform for Vision Applications

Huang, Phoenix X., Hu, Wenze, Brendel, William, Chandraker, Manmohan, Li, Li-Jia, Wang, Xiaoyu

This paper introduces an open source platform to support the rapid development of computer vision applications at scale. The platform puts the efficient data development at the center of the machine learning development process, integrates active learning methods, data and model version control, and uses concepts such as projects to enable fast iterations of multiple task specific datasets in parallel. This platform abstracts the development process into core states and operations, and integrates third party tools via open APIs as implementations of the operations. This open design reduces the development cost and adoption cost for ML teams with existing tools. At the same time, the platform supports recording project development histories, through which successful projects can be shared to further boost model production efficiency on similar tasks. The platform is open source and is already used internally to meet the increasing demand for different real world computer vision applications.

machine learning, teaching medhods, teaching method, (19 more...)

2111.10046

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningOct-16-2019

Generative Modeling for Small-Data Object Detection

Liu, Lanlan, Muelly, Michael, Deng, Jia, Pfister, Tomas, Li, Li-Jia

This paper explores object detection in the small data regime, where only a limited number of annotated bounding boxes are available due to data rarity and annotation expense. This is a common challenge today with machine learning being applied to many new tasks where obtaining training data is more challenging, e.g. in medical images with rare diseases that doctors sometimes only see once in their life-time. In this work we explore this problem from a generative modeling perspective by learning to generate new images with associated bounding boxes, and using these for training an object detector. We show that simply training previously proposed generative models does not yield satisfactory performance due to them optimizing for image realism rather than object detection accuracy. To this end we develop a new model with a novel unrolling mechanism that jointly optimizes the generative model and a detector such that the generated images improve the performance of the detector. We show this method outperforms the state of the art on two challenging datasets, disease detection and small data pedestrian detection, improving the average precision on NIH Chest X-ray by a relative 20% and localization accuracy by a relative 50%.

artificial intelligence, detector, health & medicine, (18 more...)

1910.07169

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.49)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

arXiv.org Machine LearningAug-12-2019

Feature Partitioning for Efficient Multi-Task Architectures

Newell, Alejandro, Jiang, Lu, Wang, Chong, Li, Li-Jia, Deng, Jia

Multi-task learning holds the promise of less data, parameters, and time than training of separate models. We propose a method to automatically search over multi-task architectures while taking resource constraints into consideration. We propose a search space that compactly represents different parameter sharing strategies. This provides more effective coverage and sampling of the space of multi-task architectures. We also present a method for quick evaluation of different architectures by using feature distillation. Together these contributions allow us to quickly optimize for efficient multi-task models. We benchmark on Visual Decathlon, demonstrating that we can automatically search for and identify multi-task architectures that effectively make trade-offs between task resource requirements while achieving a high level of final performance.

artificial intelligence, arxiv preprint arxiv, neural network, (19 more...)

1908.04339

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)

arXiv.org Machine LearningMar-27-2018

Thoracic Disease Identification and Localization with Limited Supervision

Li, Zhe, Wang, Chong, Han, Mei, Xue, Yuan, Wei, Wei, Li, Li-Jia, Fei-Fei, Li

Accurate identification and localization of abnormalities from radiology images play an integral part in clinical diagnosis and treatment planning. Building a highly accurate prediction model for these tasks usually requires a large number of images manually annotated with labels and finding sites of abnormalities. In reality, however, such annotated data are expensive to acquire, especially the ones with location annotations. We need methods that can work well with only a small amount of location annotations. To address this challenge, we present a unified approach that simultaneously performs disease identification and localization through the same underlying model for all images. We demonstrate that our approach can effectively leverage both class information as well as limited location annotation, and significantly outperforms the comparative reference baseline in both classification and localization tasks.

alzheimer s disease, neurology, unannotated image, (19 more...)

1711.06373

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

arXiv.org Machine LearningMar-23-2018

Progressive Neural Architecture Search

Liu, Chenxi, Zoph, Barret, Neumann, Maxim, Shlens, Jonathon, Hua, Wei, Li, Li-Jia, Fei-Fei, Li, Yuille, Alan, Huang, Jonathan, Murphy, Kevin

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state of the art classification accuracies on CIFAR-10 and ImageNet.

deep learning, neural network, predictor, (20 more...)

1712.00559

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningMar-9-2018

Attention-based Graph Neural Network for Semi-supervised Learning

Thekumparampil, Kiran K., Wang, Chong, Oh, Sewoong, Li, Li-Jia

Recently popularized graph neural networks achieve the state-of-the-art accuracy on a number of standard benchmark datasets for graph-based semi-supervised learning, improving significantly over existing approaches. These architectures alternate between a propagation layer that aggregates the hidden states of the local neighborhood and a fully-connected layer. Perhaps surprisingly, we show that a linear model, that removes all the intermediate fully-connected layers, is still able to achieve a performance comparable to the state-of-the-art models. This significantly reduces the number of parameters, which is critical for semi-supervised learning where number of labeled examples are small. This in turn allows a room for designing more innovative propagation layers. Based on this insight, we propose a novel graph neural network that removes all the intermediate fully-connected layers, and replaces the propagation layers with attention mechanisms that respect the structure of the graph. The attention mechanism allows us to learn a dynamic and adaptive local summary of the neighborhood to achieve more accurate predictions. In a number of experiments on benchmark citation networks datasets, we demonstrate that our approach outperforms competing methods. By examining the attention weights among neighbors, we show that our model provides some interesting insights on how neighbors influence each other.

deep learning, neural network, node, (17 more...)

1803.03735

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)