Lu, Alex X.
ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles
Yin, Kayo, Singh, Chinmay, Minakov, Fyodor O., Milan, Vanessa, Daumé, Hal III, Zhang, Cyril, Lu, Alex X., Bragg, Danielle
Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, technology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages. To help address this, we introduce ASL STEM Wiki: a parallel corpus of 254 Wikipedia articles on STEM topics in English, interpreted into over 300 hours of American Sign Language (ASL). ASL STEM Wiki is the first continuous signing dataset focused on STEM, facilitating the development of AI resources for STEM education in ASL. We identify several use cases of ASL STEM Wiki with human-centered applications. For example, because this dataset highlights the frequent use of fingerspelling for technical concepts, which inhibits DHH students' ability to learn, we develop models to identify fingerspelled words, which can later be used to query for appropriate ASL signs to suggest to interpreters.
Figure 1: One use case of ASL STEM Wiki is automatic sign suggestion. Given an English sentence and a video of its ASL interpretation, the model detects all clips of ASL that contain fingerspelling (FS). Then, given the detected FS clip and the English sentence, the model identifies which English phrase in the sentence is fingerspelled in the clip. The English phrase can then be used to suggest appropriate ASL signs to the interpreter.
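To make the two-stage sign-suggestion use case concrete, here is a minimal sketch of the pipeline structure described above. The function names, clip boundaries, and the example phrase are all hypothetical placeholders, not the released models; a real system would plug trained fingerspelling-detection and phrase-alignment models into these stubs.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Clip:
    start: float  # seconds into the ASL interpretation video
    end: float

def detect_fingerspelling(video_path: str) -> List[Clip]:
    """Stub FS detector; a trained model would score sliding windows of frames."""
    return [Clip(start=12.4, end=13.1)]  # placeholder output

def align_phrase(clip: Clip, sentence: str) -> Tuple[int, int]:
    """Stub aligner; returns the character span of the fingerspelled phrase."""
    phrase = "mitochondria"  # placeholder; a real aligner predicts this span
    i = sentence.lower().find(phrase)
    return (i, i + len(phrase)) if i >= 0 else (0, 0)

def suggest_signs(video_path: str, sentence: str) -> List[str]:
    # Stage 1: find fingerspelled clips; Stage 2: align each clip to English text.
    phrases = []
    for clip in detect_fingerspelling(video_path):
        lo, hi = align_phrase(clip, sentence)
        phrases.append(sentence[lo:hi])
    # The recovered phrases could then be used to query an ASL dictionary
    # for existing signs to suggest to interpreters.
    return phrases

if __name__ == "__main__":
    print(suggest_signs("article_clip.mp4", "The mitochondria produce ATP."))
```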
Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas
Desai, Aashaka, De Meulder, Maartje, Hochgesang, Julie A., Kocab, Annemarie, Lu, Alex X.
Growing research in sign language recognition, generation, and translation AI has been accompanied by calls for ethical development of such technologies. While these works are crucial to helping individual researchers do better, there is a notable lack of discussion of systemic biases or analysis of rhetoric that shape the research questions and methods in the field, especially as it remains dominated by hearing non-signing researchers. Therefore, we conduct a systematic review of 101 recent papers in sign language AI. Our analysis identifies significant biases in the current state of sign language AI research, including an overfocus on addressing perceived communication barriers, a lack of use of representative datasets, use of annotations lacking linguistic foundations, and development of methods that build on flawed models. We take the position that the field lacks meaningful input from Deaf stakeholders, and is instead driven by what decisions are the most convenient or perceived as important to hearing researchers. We end with a call to action: the field must make space for Deaf researchers to lead the conversation in sign language AI.
ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition
Desai, Aashaka, Berger, Lauren, Minakov, Fyodor O., Milan, Vanessa, Singh, Chinmay, Pumphrey, Kriston, Ladner, Richard E., Daumé, Hal III, Lu, Alex X., Caselli, Naomi, Bragg, Danielle
Sign languages are used as a primary language by approximately 70 million D/deaf people world-wide. However, most communication technologies operate in spoken and written languages, creating inequities in access. To help tackle this problem, we release ASL Citizen, the first crowdsourced Isolated Sign Language Recognition (ISLR) dataset, collected with consent and containing 83,399 videos for 2,731 distinct signs filmed by 52 signers in a variety of environments. We propose that this dataset be used for sign language dictionary retrieval for American Sign Language (ASL), where a user demonstrates a sign to their webcam to retrieve matching signs from a dictionary. We show that training supervised machine learning classifiers with our dataset advances the state-of-the-art on metrics relevant for dictionary retrieval, achieving 63% accuracy and a recall-at-10 of 91%, evaluated entirely on videos of users who are not present in the training or validation sets.
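For readers unfamiliar with the retrieval metrics quoted above, the sketch below shows how accuracy and recall-at-10 are typically computed for dictionary retrieval, assuming a classifier that outputs one score per dictionary sign for each query video. The scores and labels here are synthetic stand-ins, not ASL Citizen results.

```python
import numpy as np

rng = np.random.default_rng(0)
num_queries, num_signs = 100, 2731               # 2,731 signs, as in ASL Citizen
scores = rng.normal(size=(num_queries, num_signs))      # placeholder model scores
labels = rng.integers(0, num_signs, size=num_queries)   # true sign per query video

# Rank dictionary signs for each query from highest to lowest score.
ranking = np.argsort(-scores, axis=1)

top1 = ranking[:, 0] == labels                            # accuracy (recall-at-1)
top10 = (ranking[:, :10] == labels[:, None]).any(axis=1)  # recall-at-10

print(f"accuracy     = {top1.mean():.3f}")
print(f"recall-at-10 = {top10.mean():.3f}")
```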
Protein structure generation via folding diffusion
Wu, Kevin E., Yang, Kevin K., Berg, Rianne van den, Zou, James Y., Lu, Alex X., Amini, Ava P.
The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model that designs protein backbone structures via a procedure that mirrors the native folding process. We describe protein backbone structure as a series of consecutive angles capturing the relative orientation of the constituent amino acid residues, and generate new structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins biologically twist into energetically favorable conformations, but the inherent shift and rotational invariance of this representation also crucially alleviates the need for complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally occurring proteins. As a useful resource, we release the first open-source codebase and trained models for protein structure diffusion.
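As an illustration of the angular representation, here is a short sketch, not the released implementation, of how a backbone described by internal angles can be noised on the torus so values stay in [-π, π); a trained denoiser would run this process in reverse, from random "unfolded" angles toward a folded configuration. The number of angles per residue and the noise schedule value are illustrative assumptions.

```python
import numpy as np

def wrap(theta: np.ndarray) -> np.ndarray:
    """Map angles back onto [-pi, pi) after adding noise or model updates."""
    return np.mod(theta + np.pi, 2 * np.pi) - np.pi

rng = np.random.default_rng(0)
num_residues, num_angles = 128, 6       # schematic: dihedral and bond angles per residue
folded = rng.uniform(-np.pi, np.pi, size=(num_residues, num_angles))

# Forward (noising) step of a DDPM-style process on angles: interpolate toward
# Gaussian noise, then wrap back onto the circle. Because the structure is
# encoded as relative angles, global shifts and rotations leave it unchanged.
alpha_bar_t = 0.5                        # placeholder point on the noise schedule
noisy = wrap(np.sqrt(alpha_bar_t) * folded
             + np.sqrt(1 - alpha_bar_t) * rng.normal(size=folded.shape))
print(noisy.shape, noisy.min() >= -np.pi, noisy.max() < np.pi)
```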
CytoImageNet: A large-scale pretraining dataset for bioimage transfer learning
Hua, Stanley Bryan Z., Lu, Alex X., Moses, Alan M.
Motivation: In recent years, image-based biological assays have steadily become high-throughput, sparking a need for fast automated methods to extract biologically-meaningful information from hundreds of thousands of images. Taking inspiration from the success of ImageNet, we curate CytoImageNet, a large-scale dataset of openly-sourced and weakly-labeled microscopy images (890K images, 894 classes). Pretraining on CytoImageNet yields features that are competitive with ImageNet features on downstream microscopy classification tasks. We show evidence that CytoImageNet features capture information not available in ImageNet-trained features. The dataset is made available at https://www.kaggle.com/stanleyhua/cytoimagenet.
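For context, this is a minimal sketch of the transfer-learning protocol evaluated above: treat a network pretrained on CytoImageNet (or ImageNet) as a frozen feature extractor and fit a simple classifier on the downstream microscopy task. The feature extractor, image array, and labels below are placeholders, not the paper's models or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def extract_features(images: np.ndarray) -> np.ndarray:
    """Stub for a frozen pretrained CNN; returns one feature vector per image."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(images), 512))      # placeholder embeddings

images = np.zeros((200, 64, 64, 3), dtype=np.float32)  # placeholder downstream images
labels = np.arange(200) % 10                           # placeholder 10-class labels

# Extract features once, then train a lightweight classifier on top of them.
feats = extract_features(images)
X_tr, X_te, y_tr, y_te = train_test_split(feats, labels, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("downstream accuracy:", clf.score(X_te, y_te))
```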
The Cells Out of Sample (COOS) dataset and benchmarks for measuring out-of-sample generalization of image classifiers
Lu, Alex X., Lu, Amy X., Schormann, Wiebke, Andrews, David W., Moses, Alan M.
Understanding if classifiers generalize to out-of-sample datasets is a central problem in machine learning. Microscopy images provide a standardized way to measure the generalization capacity of image classifiers, as we can image the same classes of objects under increasingly divergent, but controlled factors of variation. We created a public dataset of 132,209 images of mouse cells, COOS-7 (Cells Out Of Sample 7-Class). COOS-7 provides a classification setting where four test datasets have increasing degrees of covariate shift: some images are random subsets of the training data, while others are from experiments reproduced months later and imaged by different instruments. We benchmarked a range of classification models using different representations, including transferred neural network features, end-to-end classification with a supervised deep CNN, and features from a self-supervised CNN. While most classifiers performed well on test datasets similar to the training dataset, all classifiers failed to generalize their performance to datasets with greater covariate shifts. These baselines highlight the challenges of covariate shifts in image data, and establish metrics for improving the generalization capacity of image classifiers.
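The evaluation protocol described above can be sketched in a few lines: train a classifier once, then score the fixed model on test sets with increasing covariate shift. In this toy version the shift is simulated by adding growing measurement noise to the test features, a crude stand-in for instrument and batch effects; the data and the resulting numbers are synthetic, not COOS-7 results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(2000, 64))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate the same trained model on test sets of increasing covariate shift,
# analogous to COOS-7's four test partitions.
for noise in [0.0, 0.5, 1.0, 2.0]:
    X_clean = rng.normal(size=(1000, 64))
    y_test = (X_clean[:, 0] + X_clean[:, 1] > 0).astype(int)
    X_test = X_clean + noise * rng.normal(size=X_clean.shape)  # e.g. a different instrument
    print(f"shift={noise:.1f}  accuracy={clf.score(X_test, y_test):.3f}")
```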