AITopics

2411.15931

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)

arXiv.org Machine LearningNov-5-2023

Log-Concavity of Multinomial Likelihood Functions Under Interval Censoring Constraints on Frequencies or Their Partial Sums

Levin, Bruce, Learned-Miller, Erik

We show that the likelihood function for a multinomial vector observed under arbitrary interval censoring constraints on the frequencies or their partial sums is completely log-concave by proving that the constrained sample spaces comprise M-convex subsets of the discrete simplex.

artificial intelligence, lorentzian polynomial, polynomial, (11 more...)

2311.02763

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.40)

Industry: Law > Civil Rights & Constitutional Law (0.61)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

arXiv.org Artificial IntelligenceOct-7-2023

Machine Learning for Automated Mitral Regurgitation Detection from Cardiac Imaging

Xiao, Ke, Learned-Miller, Erik, Kalogerakis, Evangelos, Priest, James, Fiterau, Madalina

Mitral regurgitation (MR) is a heart valve disease with potentially fatal consequences that can only be forestalled through timely diagnosis and treatment. Traditional diagnosis methods are expensive, labor-intensive and require clinical expertise, posing a barrier to screening for MR. To overcome this impediment, we propose a new semi-supervised model for MR classification called CUSSP. CUSSP operates on cardiac imaging slices of the 4-chamber view of the heart. It uses standard computer vision techniques and contrastive models to learn from large amounts of unlabeled data, in conjunction with specialized classifiers to establish the first ever automated MR classification system. Evaluated on a test set of 179 labeled -- 154 non-MR and 25 MR -- sequences, CUSSP attains an F1 score of 0.69 and a ROC-AUC score of 0.88, setting the first benchmark result for this new task.

artificial intelligence, automated mitral regurgitation detection, machine learning, (1 more...)

doi: 10.1007/978-3-031-43990-2_23

2310.04871

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceSep-15-2023

Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes

Delattre, Fabien, Dirnfeld, David, Nguyen, Phat, Scarano, Stephen, Jones, Michael J., Miraldo, Pedro, Learned-Miller, Erik

We present an approach to estimating camera rotation in crowded, real-world scenes from handheld monocular video. While camera rotation estimation is a well-studied problem, no previous methods exhibit both high accuracy and acceptable speed in this setting. Because the setting is not addressed well by other datasets, we provide a new dataset and benchmark, with high-accuracy, rigorously verified ground truth, on 17 video sequences. Methods developed for wide baseline stereo (e.g., 5-point methods) perform poorly on monocular video. On the other hand, methods used in autonomous driving (e.g., SLAM) leverage specific sensor setups, specific motion models, or local optimization strategies (lagging batch processing) and do not generalize well to handheld video. Finally, for dynamic scenes, commonly used robustification techniques like RANSAC require large numbers of iterations, and become prohibitively slow. We introduce a novel generalization of the Hough transform on SO(3) to efficiently and robustly find the camera rotation most compatible with optical flow. Among comparably fast methods, ours reduces error by almost 50\% over the next best, and is more accurate than any method, irrespective of speed. This represents a strong new performance point for crowded scenes, an important setting for computer vision. The code and the dataset are available at https://fabiendelattre.com/robust-rotation-estimation.

artificial intelligence, optimization problem, rotation, (16 more...)

2309.08588

Genre: Research Report (0.64)

Industry: Automobiles & Trucks (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.34)

arXiv.org Artificial IntelligenceJun-29-2023

An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training

Chen, Zitian, Ding, Mingyu, Shen, Yikang, Zhan, Wei, Tomizuka, Masayoshi, Learned-Miller, Erik, Gan, Chuang

We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently. Despite considerable progress in multi-task learning, most efforts focus on learning from multi-label data: a single image set with multiple task labels. Such multi-label data sets are rare, small, and expensive. We say heterogeneous to refer to image sets with different task labels, or to combinations of single-task datasets. Few have explored training on such heterogeneous datasets. General-purpose vision models are still dominated by single-task pretraining, and it remains unclear how to scale up multi-task models by leveraging mainstream vision datasets designed for different purposes. The challenges lie in managing large intrinsic differences among vision tasks, including data distribution, architectures, task-specific modules, dataset scales, and sampling strategies. To address these challenges, we propose to modify and scale up mixture-of-experts (MoE) vision transformers, so that they can simultaneously learn classification, detection, and segmentation on diverse mainstream vision datasets including ImageNet, COCO, and ADE20K. Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks. Due to its emergent modularity, this general-purpose model decomposes into high-performing components, efficiently adapting to downstream tasks. We can fine-tune it with fewer training parameters, fewer model parameters, and less computation. Additionally, its modularity allows for easy expansion in continual-learning-without-forgetting scenarios. Finally, these functions can be controlled and combined to meet various demands of downstream tasks.

artificial intelligence, machine learning, natural language, (18 more...)

2306.17165

Country: North America > United States (0.67)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMay-15-2023

Event Camera-based Visual Odometry for Dynamic Motion Tracking of a Legged Robot Using Adaptive Time Surface

Zhu, Shifan, Tang, Zhipeng, Yang, Michael, Learned-Miller, Erik, Kim, Donghyun

Our paper proposes a direct sparse visual odometry method that combines event and RGB-D data to estimate the pose of agile-legged robots during dynamic locomotion and acrobatic behaviors. Event cameras offer high temporal resolution and dynamic range, which can eliminate the issue of blurred RGB images during fast movements. This unique strength holds a potential for accurate pose estimation of agile-legged robots, which has been a challenging problem to tackle. Our framework leverages the benefits of both RGB-D and event cameras to achieve robust and accurate pose estimation, even during dynamic maneuvers such as jumping and landing a quadruped robot, the Mini-Cheetah. Our major contributions are threefold: Firstly, we introduce an adaptive time surface (ATS) method that addresses the whiteout and blackout issue in conventional time surfaces by formulating pixel-wise decay rates based on scene complexity and motion speed. Secondly, we develop an effective pixel selection method that directly samples from event data and applies sample filtering through ATS, enabling us to pick pixels on distinct features. Lastly, we propose a nonlinear pose optimization formula that simultaneously performs 3D-2D alignment on both RGB-based and event-based maps and images, allowing the algorithm to fully exploit the benefits of both data streams. We extensively evaluate the performance of our framework on both public datasets and our own quadruped robot dataset, demonstrating its effectiveness in accurately estimating the pose of agile robots during dynamic movements.

algorithm, artificial intelligence, video understanding, (19 more...)

2305.08962

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.88)

arXiv.org Artificial IntelligenceJan-18-2023

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

Baker, Megan M., New, Alexander, Aguilar-Simon, Mario, Al-Halah, Ziad, Arnold, Sébastien M. R., Ben-Iwhiwhu, Ese, Brna, Andrew P., Brooks, Ethan, Brown, Ryan C., Daniels, Zachary, Daram, Anurag, Delattre, Fabien, Dellana, Ryan, Eaton, Eric, Fu, Haotian, Grauman, Kristen, Hostetler, Jesse, Iqbal, Shariq, Kent, Cassandra, Ketz, Nicholas, Kolouri, Soheil, Konidaris, George, Kudithipudi, Dhireesha, Learned-Miller, Erik, Lee, Seungwon, Littman, Michael L., Madireddy, Sandeep, Mendez, Jorge A., Nguyen, Eric Q., Piatko, Christine D., Pilly, Praveen K., Raghavan, Aswin, Rahman, Abrar, Ramakrishnan, Santhosh Kumar, Ratzlaff, Neale, Soltoggio, Andrea, Stone, Peter, Sur, Indranil, Tang, Zhipeng, Tiwari, Saket, Vedder, Kyle, Wang, Felix, Xu, Zifan, Yanguas-Gil, Angel, Yedidsion, Harel, Yu, Shangqun, Vallabha, Gautam K.

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

doi: 10.1016/j.neunet.2023.01.007

2301.07799

Country: North America > United States > California (0.92)

Genre:

Instructional Material (1.00)
Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Education > Educational Setting > Continuing Education (1.00)
Government > Regional Government > North America Government > United States Government (0.46)
Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.92)

arXiv.org Artificial IntelligenceDec-15-2022

Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners

Chen, Zitian, Shen, Yikang, Ding, Mingyu, Chen, Zhenfang, Zhao, Hengshuang, Learned-Miller, Erik, Gan, Chuang

Optimization in multi-task learning (MTL) is more challenging than single-task learning (STL), as the gradient from different tasks can be contradictory. When tasks are related, it can be beneficial to share some parameters among them (cooperation). However, some tasks require additional parameters with expertise in a specific type of data or discrimination (specialization). To address the MTL challenge, we propose Mod-Squad, a new model that is Modularized into groups of experts (a 'Squad'). This structure allows us to formalize cooperation and specialization as the process of matching experts and tasks. We optimize this matching process during the training of a single model. Specifically, we incorporate mixture of experts (MoE) layers into a transformer model, with a new loss that incorporates the mutual dependence between tasks and experts. As a result, only a small set of experts are activated for each task. This prevents the sharing of the entire backbone model between all tasks, which strengthens the model, especially when the training set size and the number of tasks scale up. More interestingly, for each task, we can extract the small set of experts as a standalone model that maintains the same performance as the large model. Extensive experiments on the Taskonomy dataset with 13 vision tasks and the PASCAL-Context dataset with 5 vision tasks show the superiority of our approach.

artificial intelligence, machine learning, mod-squad, (15 more...)

2212.08066

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine LearningFeb-1-2019

Nonparametric Curve Alignment

Mattar, Marwan, Ross, Michael, Learned-Miller, Erik

Congealing is a flexible nonparametric data-driven framework for the joint alignment of data. It has been successfully applied to the joint alignment of binary images of digits, binary images of object silhouettes, grayscale MRI images, color images of cars and faces, and 3D brain volumes. This research enhances congealing to practically and effectively apply it to curve data. We develop a parameterized set of nonlinear transformations that allow us to apply congealing to this type of data. We present positive results on aligning synthetic and real curve data sets and conclude with a discussion on extending this work to simultaneous alignment and clustering.

alignment, artificial intelligence, machine learning, (19 more...)

1902.00626

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

arXiv.org Machine LearningJan-6-2018

Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples

Chang, Haw-Shiuan, Learned-Miller, Erik, McCallum, Andrew

Self-paced learning and hard example mining re-weight training instances to improve learning accuracy. This paper presents two improved alternatives based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD): the variance in predicted probability of the correct class across iterations of mini-batch SGD, and the proximity of the correct class probability to the decision threshold. Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques, such as residual learning, momentum, ADAM, batch normalization, dropout, and distillation.

deep learning, epoch, neural network, (17 more...)

1704.07433

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)