AITopics

2211.12485

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Nov-7-2022

What Language Model to Train if You Have One Million GPU Hours?

Scao, Teven Le, Wang, Thomas, Hesslow, Daniel, Saulnier, Lucile, Bekman, Stas, Bari, M Saiful, Biderman, Stella, Elsahar, Hady, Muennighoff, Niklas, Phang, Jason, Press, Ofir, Raffel, Colin, Sanh, Victor, Shen, Sheng, Sutawika, Lintang, Tae, Jaesung, Yong, Zheng Xin, Launay, Julien, Beltagy, Iz

The crystallization of modeling methods around the Transformer architecture has been a boon for practitioners. Simple, well-motivated architectural variations can transfer across tasks and scale, increasing the impact of modeling research. However, with the emergence of state-of-the-art 100B+ parameter models, large language models are increasingly expensive to accurately design and train. Notably, it can be difficult to evaluate how modeling decisions may impact emergent capabilities, given that these capabilities arise mainly from sheer scale. In the process of building BLOOM--the Big Science Large Open-science Open-access Multilingual language model--our goal is to identify an architecture and training setup that makes the best use of our 1,000,000 A100-GPU-hours budget. Specifically, we perform an ablation study at the billion-parameter scale comparing different modeling practices and their impact on zero-shot generalization. In addition, we study the impact of various popular pre-training corpora on zero-shot generalization. We also study the performance of a multilingual model and how it compares to the English-only one. Finally, we consider the scaling behaviour of Transformers to choose the target model size, shape, and training setup. All our models and code are open-sourced at https://huggingface.co/bigscience.

artificial intelligence, machine learning, natural language

2210.15424

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

arXiv.org Artificial Intelligence Oct-19-2022

Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions

Parrish, Alicia, Trivedi, Harsh, Nangia, Nikita, Padmakumar, Vishakh, Phang, Jason, Saimbhi, Amanpreet Singh, Bowman, Samuel R.

The use of language-model-based question-answering systems to aid humans in completing difficult tasks is limited, in part, by the unreliability of the text these systems generate. Using hard multiple-choice reading comprehension questions as a testbed, we assess whether presenting humans with arguments for two competing answer options, where one is correct and the other is incorrect, allows human judges to perform more accurately, even when one of the arguments is unreliable and deceptive. If this is helpful, we may be able to increase our justified trust in language-model-based systems by asking them to produce these arguments where needed. Previous research has shown that just a single turn of arguments in this format is not helpful to humans. However, as debate settings are characterized by a back-and-forth dialogue, we follow up on previous results to test whether adding a second round of counter-arguments is helpful to humans. We find that, regardless of whether they have access to arguments or not, humans perform similarly on our task. These findings suggest that, in the case of answering reading comprehension questions, debate is not a helpful format.

argument, natural language, question answering

2210.10860

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Education > Assessment & Standards > Student Performance (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.54)

arXiv.org Artificial Intelligence Aug-8-2022

Investigating Efficiently Extending Transformers for Long Input Summarization

Phang, Jason, Zhao, Yao, Liu, Peter J.

While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs continues to be a significant challenge. One such task is long input summarization, where inputs are longer than the maximum input context of most pretrained models. Through an extensive set of experiments, we investigate what model architectural changes and pretraining paradigms can most efficiently adapt a pretrained Transformer for long input summarization. We find that a staggered, block-local Transformer with global encoder tokens strikes a good balance of performance and efficiency, and that an additional pretraining phase on long sequences meaningfully improves downstream summarization performance. Based on our findings, we introduce PEGASUS-X, an extension of the PEGASUS model with additional long input pretraining to handle inputs of up to 16K tokens. PEGASUS-X achieves strong performance on long input summarization tasks comparable with much larger models while adding few additional parameters and not requiring model parallelism to train.
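The block-local attention pattern with global tokens described above can be illustrated with a small attention mask. This is a minimal sketch, not the PEGASUS-X implementation: the function name `block_local_mask`, the key layout (global tokens first, then input tokens), and the reading of "staggered" as shifting block boundaries by half a block on alternating layers are all assumptions made for illustration.

```python
def block_local_mask(seq_len, block_size, num_global, stagger=False):
    """Return a seq_len x (num_global + seq_len) boolean mask.

    mask[q][k] is True when input token q may attend to key k.
    Keys are laid out as [global tokens..., input tokens...] (assumed layout).
    """
    # Assumption: a staggered layer shifts block boundaries by half a block.
    shift = block_size // 2 if stagger else 0
    block_id = [(i + shift) // block_size for i in range(seq_len)]
    return [
        # Every input token sees all global tokens, plus tokens in its own block.
        [True] * num_global
        + [block_id[q] == block_id[k] for k in range(seq_len)]
        for q in range(seq_len)
    ]


def allowed_pairs(mask):
    """Count attended (query, key) pairs, a rough proxy for attention cost."""
    return sum(sum(row) for row in mask)


plain = block_local_mask(seq_len=8, block_size=4, num_global=2)
shifted = block_local_mask(seq_len=8, block_size=4, num_global=2, stagger=True)
print(allowed_pairs(plain), "attended pairs vs", 8 * 8, "for full self-attention")
```

In the unshifted mask, tokens 3 and 4 sit in different blocks and cannot see each other; in the shifted mask they share a block, which is how alternating the two patterns lets information cross block boundaries.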

artificial intelligence, machine learning, natural language

2208.04347

Country:

North America > United States (0.67)
Europe (0.67)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning Feb-13-2020

An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

Shen, Yiqiu, Wu, Nan, Phang, Jason, Park, Jungkyu, Liu, Kangning, Tyagi, Sudarshini, Heacock, Laura, Kim, S. Gene, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we extend the globally-aware multiple instance classifier, a framework we proposed to address these unique properties of medical images. This model first uses a low-capacity, yet memory-efficient, network on the whole image to identify the most informative regions. It then applies another higher-capacity network to collect details from chosen regions. Finally, it employs a fusion module that aggregates global and local information to make a final prediction. While existing methods often require lesion segmentation during training, our model is trained with only image-level labels and can generate pixel-level saliency maps indicating possible malignant findings. We apply the model to screening mammography interpretation: predicting the presence or absence of benign and malignant lesions. On the NYU Breast Cancer Screening Dataset, consisting of more than one million images, our model achieves an AUC of 0.93 in classifying breasts with malignant findings, outperforming ResNet-34 and Faster R-CNN. Compared to ResNet-34, our model is 4.1x faster for inference while using 78.4% less GPU memory. Furthermore, we demonstrate, in a reader study, that our model surpasses radiologist-level AUC by a margin of 0.11. The proposed model is available online: https://github.com/nyukat/GMIC.
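The first stage described above, using a coarse saliency map to pick the most informative regions, can be sketched as follows. This is an illustrative sketch only and is not taken from the GMIC repository: the function name `top_patches`, the greedy window search, and the choice to zero out a selected window before the next pick are assumptions. It also assumes non-negative saliency values.

```python
def top_patches(saliency, patch, k):
    """Greedily pick k (patch x patch) windows with the largest summed saliency.

    saliency: 2D list of non-negative floats (a coarse map from the global network).
    Returns the top-left (row, col) of each chosen window, in selection order.
    """
    sal = [row[:] for row in saliency]  # work on a copy
    height, width = len(sal), len(sal[0])
    chosen = []
    for _ in range(k):
        best_total, best_pos = -1.0, None
        for r in range(height - patch + 1):
            for c in range(width - patch + 1):
                total = sum(
                    sal[r + i][c + j] for i in range(patch) for j in range(patch)
                )
                if total > best_total:
                    best_total, best_pos = total, (r, c)
        chosen.append(best_pos)
        # Zero out the selected window so later picks favor other regions.
        r, c = best_pos
        for i in range(patch):
            for j in range(patch):
                sal[r + i][c + j] = 0.0
    return chosen


# Hypothetical 4x4 saliency map with two bright spots.
toy_map = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]
print(top_patches(toy_map, patch=2, k=2))
```

In the pipeline the abstract describes, the selected windows would then be cropped from the full-resolution image and passed to the higher-capacity local network, with a fusion module combining both views; those stages are not sketched here.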

deep learning, neural network, saliency map

2002.07613

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Machine Learning Aug-1-2019

Improving localization-based approaches for breast cancer screening exam classification

Févry, Thibault, Phang, Jason, Wu, Nan, Kim, S. Gene, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

We trained and evaluated a localization-based deep CNN for breast cancer screening exam classification on over 200,000 exams (over 1,000,000 images). Our model achieves an AUC of 0.919 in predicting malignancy in patients undergoing breast cancer screening, reducing the error rate of the baseline (Wu et al., 2019a) by 23%. In addition, the model generates bounding boxes for benign and malignant findings, providing interpretable predictions.
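The relationship between the reported AUC and the 23% error reduction can be worked through numerically. This is a back-of-the-envelope sketch under one assumption that the abstract does not state: "error rate" is read here as 1 - AUC. The variable names are illustrative.

```python
# Assumption: "error rate" means 1 - AUC; the abstract does not define it.
new_auc = 0.919
relative_reduction = 0.23

new_error = 1.0 - new_auc
# If the new error is (1 - 0.23) times the baseline error, invert to get the baseline.
baseline_error = new_error / (1.0 - relative_reduction)
baseline_auc = 1.0 - baseline_error

print(f"implied baseline AUC: {baseline_auc:.3f}")  # prints 0.895
```

Under that reading, a 23% relative reduction in error corresponds to a gain of roughly 0.024 in absolute AUC.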

deep learning, localization-based approach, neural network

1908.00615

Country: North America > United States (0.29)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

arXiv.org Machine Learning Jul-30-2019

Screening Mammogram Classification with Prior Exams

Park, Jungkyu, Phang, Jason, Shen, Yiqiu, Wu, Nan, Kim, S. Gene, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

Medical Imaging with Deep Learning (MIDL) 2019, Extended Abstract Track. Affiliations: Center for Data Science, New York University; Department of Radiology, New York University School of Medicine. Screening mammography has been shown to significantly reduce the mortality rate for breast cancer (Kopans, 2002; Duffy et al., 2002a,b), the second leading cause of cancer-related deaths among women in the United States. However, there is a high rate of false positive recalls and biopsies associated with breast cancer screening. Among the 10-15% of women asked for recall, only 10-20% are recommended for biopsy. Among those biopsies, only 20-40% result in a cancer diagnosis (Kopans, 2015). Given the success of deep learning in computer vision, many deep neural network models have been applied to breast cancer screening (Ribli et al., 2018; Lotter et al., 2017; Geras et al., 2017; Wu et al., 2018, 2019a).
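The recall, biopsy, and diagnosis percentages quoted above chain together as conditional fractions, so their product gives a rough share of screened women who end up with a cancer diagnosis. The sketch below is only that arithmetic; the function name `diagnosis_share` is illustrative, and multiplying the low ends together and the high ends together is an assumption that gives loose bounds rather than a statistical estimate.

```python
def diagnosis_share(recall, biopsy_given_recall, cancer_given_biopsy):
    """Share of screened women diagnosed, as a product of conditional fractions."""
    return recall * biopsy_given_recall * cancer_given_biopsy


# Low and high ends of the ranges quoted in the text.
low = diagnosis_share(0.10, 0.10, 0.20)
high = diagnosis_share(0.15, 0.20, 0.40)

print(f"{low:.1%} to {high:.1%} of screened women")  # prints 0.2% to 1.2%
```

Against a recall rate of 10-15%, this makes the scale of false positive recalls concrete: most recalled women are not ultimately diagnosed with cancer.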

deep learning, exam, neural network

1907.13057

Country: North America > United States > New York (0.45)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

arXiv.org Machine Learning Jun-6-2019

Globally-Aware Multiple Instance Classifier for Breast Cancer Screening

Shen, Yiqiu, Wu, Nan, Phang, Jason, Park, Jungkyu, Kim, Gene, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

Deep learning models designed for visual classification tasks on natural images have become prevalent in medical image analysis. However, medical images differ from typical natural images in many ways, such as significantly higher resolutions and smaller regions of interest. Moreover, both the global structure and local details play important roles in medical image analysis tasks. To address these unique properties of medical images, we propose a neural network that is able to classify breast cancer lesions utilizing information from both a global saliency map and multiple local patches. The proposed model outperforms the ResNet-based baseline and achieves radiologist-level performance in the interpretation of screening mammography. Although our model is trained only with image-level labels, it is able to generate pixel-level saliency maps that provide localization of possible malignant findings.

deep learning, neural network, prediction

1906.02846

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Machine Learning Mar-19-2019

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Wu, Nan, Phang, Jason, Park, Jungkyu, Shen, Yiqiu, Huang, Zhe, Zorin, Masha, Jastrzębski, Stanisław, Févry, Thibault, Katsnelson, Joe, Kim, Eric, Wolfson, Stacey, Parikh, Ujas, Gaddam, Sushma, Lin, Leng Leng Young, Ho, Kara, Weinstein, Joshua D., Reig, Beatriu, Gao, Yiming, Toth, Hildegard, Pysarenko, Kristine, Lewin, Alana, Lee, Jiyon, Airola, Krystal, Mema, Eralda, Chung, Stephanie, Hwang, Esther, Samreen, Naziya, Kim, S. Gene, Heacock, Laura, Moy, Linda, Cho, Kyunghyun, Geras, Krzysztof J.

This paper makes several contributions. Among biopsies prompted by screening, only 20-40% yield a diagnosis of cancer (5). In the reader study, we compared the performance of our best model to that of radiologists and found our model to be as accurate as radiologists both in terms of area under ROC curve (AUC) and area under precision-recall curve (PRAUC). We also found that a hybrid model, taking the average of the probabilities of malignancy predicted by a radiologist and by our neural network, yields more accurate predictions than either of the two separately. This suggests that our network and radiologists learned different aspects of the task and that our model could be effective as a tool providing radiologists with a second reader. With this contribution, research groups that are working on improving screening mammography, which may not have access to a large training dataset like ours, will be able to directly use our model in their research or to use our pretrained weights as an initialization to train models with less data. By making our models public, we invite other groups to validate our results and test their robustness to shifts in the data distribution. The dataset includes 229,426 digital screening mammography exams (1,001,093 images) from 141,473 patients. For each breast, we assign two binary labels, one of which is the absence/presence of malignant findings in that breast. With left and right breasts, each exam has a total of four binary labels. Labels are derived from biopsies. We have 5,832 exams with at least one biopsy performed within 120 days of the screening mammogram. Among these, biopsies confirmed malignant findings for 985 (8.4%) breasts and benign findings for 5,556 (47.6%) breasts.
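The hybrid model mentioned above, an average of the radiologist's and the network's predicted probabilities, is simple to express in code. The sketch below is illustrative and uses synthetic data: the scores, the noise model in `noisy_score`, and the sample size are all hypothetical and say nothing about the paper's results. The AUC is computed directly from its pairwise definition so the block needs only the standard library.

```python
import random


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def hybrid(reader_probs, model_probs):
    """Average the two malignancy probabilities for each case."""
    return [(r + m) / 2 for r, m in zip(reader_probs, model_probs)]


# Synthetic, hypothetical data: two scorers with independent noise.
rng = random.Random(0)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(500)]


def noisy_score(label):
    """A made-up score: higher on average for positives, clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.3 + 0.3 * label + rng.gauss(0, 0.25)))


reader = [noisy_score(y) for y in labels]
model = [noisy_score(y) for y in labels]

print("reader AUC:", round(auc(labels, reader), 3))
print("model AUC: ", round(auc(labels, model), 3))
print("hybrid AUC:", round(auc(labels, hybrid(reader, model)), 3))
```

When two scorers make partly independent errors, averaging tends to cancel some of the noise, which is consistent with the abstract's suggestion that the network and radiologists capture different aspects of the task.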

deep learning, neural network, prediction

1903.08297

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)