Kassab, Lara
Quantile-Based Randomized Kaczmarz for Corrupted Tensor Linear Systems
Castillo, Alejandra, Haddock, Jamie, Hartsock, Iryna, Hoyos, Paulina, Kassab, Lara, Kryshchenko, Alona, Larripa, Kamila, Needell, Deanna, Suryanarayanan, Shambhavi, Djima, Karamatou Yacoubou
The reconstruction of tensor-valued signals from corrupted measurements, known as tensor regression, has become essential in many multi-modal applications such as hyperspectral image reconstruction and medical imaging. In this work, we address the tensor linear system problem $\mathcal{A} \mathcal{X}=\mathcal{B}$, where $\mathcal{A}$ is a measurement operator, $\mathcal{X}$ is the unknown tensor-valued signal, and $\mathcal{B}$ contains the measurements, possibly corrupted by arbitrary errors. Such corruption is common in large-scale tensor data, where transmission, sensor, or storage errors are rare per instance but likely over the entire dataset and may be arbitrarily large in magnitude. We extend the Kaczmarz method, a popular iterative algorithm for solving large linear systems, to develop a Quantile Tensor Randomized Kaczmarz (QTRK) method that is robust to large, sparse corruptions in the observations $\mathcal{B}$. This approach combines the tensor Kaczmarz framework with quantile-based statistics, allowing it to mitigate adversarial corruptions and improve convergence reliability. We also propose and discuss the Masked Quantile Randomized Kaczmarz (mQTRK) variant, which selectively applies partial updates to further limit the effect of corruptions. We present convergence guarantees, discuss the advantages and disadvantages of our approaches, and demonstrate the effectiveness of our methods through experiments, including an application to video deblurring.
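The quantile idea in the abstract above can be illustrated in the simpler matrix-vector setting $Ax=b$. The sketch below is not the paper's QTRK (which operates on tensors); it is a minimal quantile-based randomized Kaczmarz loop under assumed choices: the function name `quantile_rk`, the quantile level `q`, uniform row sampling, and row normalization are all illustrative assumptions. A sampled row is used for a projection step only if its current residual is no larger than the `q`-quantile of all residuals, so rows with very large (likely corrupted) residuals are skipped.

```python
import numpy as np

def quantile_rk(A, b, q=0.7, iters=3000, seed=0):
    """Quantile-based randomized Kaczmarz for A x = b (matrix-vector case).

    Illustrative sketch only: names, defaults, and uniform row sampling
    are assumptions, not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    # Normalize rows so that residual magnitudes are comparable across rows.
    norms = np.linalg.norm(A, axis=1)
    A = A / norms[:, None]
    b = b / norms
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        residual = A @ x - b
        # Residuals above this threshold are treated as suspicious.
        threshold = np.quantile(np.abs(residual), q)
        i = rng.integers(m)
        if abs(residual[i]) <= threshold:
            # Standard Kaczmarz step: project x onto the hyperplane of row i
            # (rows have unit norm after normalization).
            x = x - residual[i] * A[i]
    return x

# Example: a consistent system with a few heavily corrupted measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
corrupted = rng.choice(200, size=10, replace=False)
b[corrupted] += 100.0  # sparse, large corruptions
x_hat = quantile_rk(A, b)
print(np.linalg.norm(x_hat - x_true))
```

Recomputing the full residual at every iteration keeps the sketch short; a practical implementation would likely estimate the quantile from a sample of rows instead.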
Towards a Fairer Non-negative Matrix Factorization
Kassab, Lara, George, Erin, Needell, Deanna, Geng, Haowen, Nia, Nika Jafar, Li, Aoxi
Topic modeling and, more broadly, dimensionality reduction techniques provide powerful tools for uncovering patterns in large datasets and are widely applied across various domains. We investigate how Non-negative Matrix Factorization (NMF) can introduce bias in the representation of data groups, such as those defined by demographics or protected attributes. We present an approach, called Fairer-NMF, that seeks to minimize the maximum reconstruction loss for different groups relative to their size and intrinsic complexity. Further, we present two algorithms for solving this problem. The first is an alternating minimization (AM) scheme, and the second is a multiplicative updates (MU) scheme that requires less computational time than AM while still achieving similar performance. Lastly, we present numerical experiments on synthetic and real datasets to evaluate the overall performance and trade-offs of Fairer-NMF.
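To make the notion of a group-wise reconstruction loss "relative to size and intrinsic complexity" concrete, the sketch below measures how unevenly a standard NMF fit serves different groups. It is not the Fairer-NMF algorithm: it fits plain Lee-Seung multiplicative-update NMF on the stacked data and then reports, for each group, the excess of its reconstruction error over what that group achieves when factorized alone, divided by the group's Frobenius norm. This particular normalization, and the names `nmf_mu` and `relative_group_losses`, are assumptions made for illustration.

```python
import numpy as np

def nmf_mu(X, rank, iters=300, seed=0, eps=1e-10):
    """Plain Lee-Seung multiplicative-update NMF: X ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    H = rng.random((rank, X.shape[1]))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_group_losses(groups, rank):
    """Per-group excess reconstruction error of a joint NMF fit.

    groups: list of nonnegative arrays (samples x features), one per group.
    The normalization used here is an assumed stand-in for "relative to
    size and intrinsic complexity".
    """
    X = np.vstack(groups)
    W, H = nmf_mu(X, rank)  # one shared dictionary H for all groups
    losses = []
    start = 0
    for X_g in groups:
        W_g = W[start:start + X_g.shape[0]]
        start += X_g.shape[0]
        joint_err = np.linalg.norm(X_g - W_g @ H)
        # Baseline: error when the group gets its own rank-`rank` factorization.
        W_solo, H_solo = nmf_mu(X_g, rank)
        solo_err = np.linalg.norm(X_g - W_solo @ H_solo)
        losses.append((joint_err - solo_err) / np.linalg.norm(X_g))
    return np.array(losses)

# Example: a large group and a much smaller one sharing the same features.
rng = np.random.default_rng(0)
large_group = rng.random((60, 8))
small_group = rng.random((10, 8))
losses = relative_group_losses([large_group, small_group], rank=3)
print(losses, losses.max())
```

A min-max formulation such as the one described above would choose the factors to make `losses.max()` as small as possible, rather than minimizing the total error over the stacked data.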
Parameters, Properties, and Process: Conditional Neural Generation of Realistic SEM Imagery Towards ML-assisted Advanced Manufacturing
Howland, Scott, Kassab, Lara, Kappagantula, Keerti, Kvinge, Henry, Emerson, Tegan
The research and development cycle of advanced manufacturing processes traditionally requires a large investment of time and resources. Experiments can be expensive and are hence conducted on relatively small scales. This poses problems for typically data-hungry machine learning tools, which could otherwise expedite the development cycle. We build upon prior work by applying conditional generative adversarial networks (GANs) to scanning electron microscope (SEM) imagery from an emerging manufacturing process, shear assisted processing and extrusion (ShAPE). We generate realistic images conditioned on temper and either experimental parameters or material properties. In doing so, we are able to integrate machine learning into the development cycle by allowing a user to immediately visualize the microstructure that would arise from particular process parameters or properties. This work forms a technical backbone for a fundamentally new approach for understanding manufacturing processes in the absence of first-principle models. By characterizing microstructure from a topological perspective, we are able to evaluate our models' ability to capture the breadth and diversity of experimental SEM samples. Our method is successful in capturing the visual and general microstructural features arising from the considered process, with analysis highlighting directions to further improve the topological realism of our synthetic imagery.