Shi, Ge
LossLens: Diagnostics for Machine Learning through Loss Landscape Visual Analytics
Xie, Tiankai, Chen, Jiaqing, Yang, Yaoqing, Geniesse, Caleb, Shi, Ge, Chaudhari, Ajinkya, Cava, John Kevin, Mahoney, Michael W., Perciano, Talita, Weber, Gunther H., Maciejewski, Ross
Modern machine learning often relies on optimizing a neural network's parameters using a loss function to learn complex features. Beyond training, examining the loss function with respect to a network's parameters (i.e., as a loss landscape) can reveal insights into the architecture and learning process. While the local structure of the loss landscape surrounding an individual solution can be characterized using a variety of approaches, the global structure of a loss landscape, which includes potentially many local minima corresponding to different solutions, remains far more difficult to conceptualize and visualize. To address this difficulty, we introduce LossLens, a visual analytics framework that explores loss landscapes at multiple scales. LossLens integrates metrics from global and local scales into a comprehensive visual representation, enhancing model diagnostics. We demonstrate LossLens through two case studies: visualizing how residual connections influence a ResNet-20, and visualizing how physical parameters influence a physics-informed neural network (PINN) solving a simple convection problem.
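A common way to probe the local structure that such tools visualize is to perturb a trained model's parameters along a random, norm-matched direction and re-evaluate the loss. The following sketch is not LossLens itself; `model`, `loss_fn`, and `data_loader` are placeholder PyTorch objects, and the slice shown is one-dimensional only:

```python
import torch

def loss_landscape_slice(model, loss_fn, data_loader, steps=21, radius=1.0):
    """Evaluate the loss along one random, norm-matched direction in parameter space."""
    base = [p.detach().clone() for p in model.parameters()]
    # Random direction, rescaled so each tensor has the same norm as its parameter.
    direction = [torch.randn_like(p) for p in base]
    direction = [d * (p.norm() / (d.norm() + 1e-10)) for d, p in zip(direction, base)]

    alphas = torch.linspace(-radius, radius, steps)
    losses = []
    with torch.no_grad():
        for alpha in alphas:
            for p, b, d in zip(model.parameters(), base, direction):
                p.copy_(b + alpha * d)
            total, count = 0.0, 0
            for x, y in data_loader:
                total += loss_fn(model(x), y).item() * len(y)
                count += len(y)
            losses.append(total / count)
        # Restore the original parameters.
        for p, b in zip(model.parameters(), base):
            p.copy_(b)
    return alphas.tolist(), losses
```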
SoGraB: A Visual Method for Soft Grasping Benchmarking and Evaluation
Greenland, Benjamin G., Pinskier, Josh, Wang, Xing, Nguyen, Daniel, Shi, Ge, Bandyopadhyay, Tirthankar, Chung, Jen Jen, Howard, David
Recent years have seen soft robotic grippers gain increasing attention due to their ability to robustly grasp soft and fragile objects. However, no commonly available standardised evaluation protocol has yet been developed to assess the performance of different soft robotic gripper designs. This work introduces a novel protocol, the Soft Grasping Benchmarking and Evaluation (SoGraB) method, which evaluates grasp quality by quantifying object deformation using the Density-Aware Chamfer Distance (DCD) between point clouds of soft objects before and after grasping. We validated our protocol in extensive experiments, ranking three Fin-Ray gripper designs on a subset of the EGAD object dataset. The protocol appropriately ranked grippers based on object deformation information, validating the method's ability to select soft grippers for complex grasping tasks and to benchmark them against future designs.
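As a rough illustration of the underlying measurement, the plain (non density-aware) Chamfer distance between pre- and post-grasp point clouds can be computed as below; the file names are hypothetical, and the paper's DCD additionally accounts for local point density:

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer distance between two point clouds of shape (N, 3) and (M, 3).

    This is the plain (non density-aware) variant: for each point in one cloud,
    find its nearest neighbour in the other cloud, and average both directions.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

# Usage: compare a soft object's point cloud before and after grasping.
# before = np.load("object_before.npy"); after = np.load("object_after.npy")
# deformation_score = chamfer_distance(before, after)
```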
DexGrip: Multi-modal Soft Gripper with Dexterous Grasping and In-hand Manipulation Capacity
Wang, Xing, Horrigan, Liam, Pinskier, Josh, Shi, Ge, Viswanathan, Vinoth, Liow, Lois, Bandyopadhyay, Tirthankar, Chung, Jen Jen, Howard, David
The ability of robotic grippers to not only grasp but also re-position and re-orient objects in-hand is crucial for achieving versatile, general-purpose manipulation. While recent advances in soft robotic grasping have greatly improved grasp quality and stability, the manipulation capabilities of soft grippers remain under-explored. This paper presents DexGrip, a multi-modal soft robotic gripper for in-hand grasping, re-orientation, and manipulation. DexGrip features a 3-degree-of-freedom (DoF) active suction palm and three active (rotating) grasping surfaces, enabling soft, stable, and dexterous grasping and manipulation without ever needing to re-grasp an object. Uniquely, these features enable complete 360-degree rotation about all three principal axes. We experimentally demonstrate these capabilities across a diverse set of objects and tasks. DexGrip successfully grasped, re-positioned, and re-oriented objects with widely varying stiffnesses, sizes, weights, and surface textures, and effectively manipulated objects that present significant challenges for existing robotic grippers.
Visualizing Loss Functions as Topological Landscape Profiles
Geniesse, Caleb, Chen, Jiaqing, Xie, Tiankai, Shi, Ge, Yang, Yaoqing, Morozov, Dmitriy, Perciano, Talita, Mahoney, Michael W., Maciejewski, Ross, Weber, Gunther H.
In machine learning, a loss function measures the difference between model predictions and ground-truth (or target) values. For neural network models, visualizing how this loss changes as model parameters are varied can provide insights into the local structure of the so-called loss landscape (e.g., smoothness) as well as global properties of the underlying model (e.g., generalization performance). While various methods for visualizing the loss landscape have been proposed, many approaches limit sampling to just one or two directions, ignoring potentially relevant information in this extremely high-dimensional space. This paper introduces a new representation based on topological data analysis that enables the visualization of higher-dimensional loss landscapes. After describing this new topological landscape profile representation, we show how the shape of loss landscapes can reveal new details about model performance and learning dynamics, highlighting several use cases, including image segmentation (e.g., UNet) and scientific machine learning (e.g., physics-informed neural networks). Through these examples, we provide new insights into how loss landscapes vary across distinct hyperparameter spaces: we find that the topology of the loss landscape is simpler for better-performing models; and we observe greater variation in the shape of loss landscapes near transitions from low to high model performance.
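To give a concrete sense of the kind of topological summary involved, the sketch below computes 0-dimensional sublevel-set persistence pairs for a one-dimensional sequence of sampled loss values: each pair records a local minimum (birth) and the saddle value at which its basin merges into a deeper one (death). This is a simplified 1D stand-in, not the authors' higher-dimensional landscape-profile pipeline:

```python
import numpy as np

def sublevel_persistence_1d(values):
    """0-dimensional sublevel-set persistence of a 1D array of loss samples.

    Returns (birth, death) pairs: each pair is a local minimum (birth) whose
    basin merges into a deeper basin at a saddle value (death). The basin of
    the global minimum never dies and is not reported.
    """
    values = np.asarray(values, dtype=float)
    parent = {}                                   # union-find: index -> parent index

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]         # path compression
            i = parent[i]
        return i

    pairs = []
    for i in np.argsort(values):                  # sweep samples from lowest to highest
        i = int(i)
        parent[i] = i
        roots = sorted({find(j) for j in (i - 1, i + 1) if j in parent},
                       key=lambda r: values[r])
        if not roots:
            continue                              # local minimum: a new basin is born
        parent[i] = roots[0]                      # join the deepest neighbouring basin
        for r in roots[1:]:                       # shallower basins die at this level
            pairs.append((float(values[r]), float(values[i])))
            parent[r] = roots[0]
    return pairs

# Example: two basins -> one finite pair (1.0, 2.0) for the shallower minimum.
print(sublevel_persistence_1d([3.0, 0.0, 2.0, 1.0, 4.0]))
```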
PoliPrompt: A High-Performance Cost-Effective LLM-Based Text Classification Framework for Political Science
Liu, Menglin, Shi, Ge
Recent advancements in large language models (LLMs) have opened new avenues for enhancing text classification efficiency in political science, surpassing traditional machine learning methods that often require extensive feature engineering, human labeling, and task-specific training. However, the effectiveness of LLMs in achieving high classification accuracy remains questionable. This paper introduces a three-stage in-context learning approach that leverages LLMs to improve classification accuracy while minimizing experimental costs. Our method incorporates automatic enhanced prompt generation, adaptive exemplar selection, and a consensus mechanism that resolves discrepancies between two weaker LLMs, refined by an advanced LLM. We validate our approach using datasets of BBC news reports, the Kavanaugh Supreme Court confirmation, and 2018 election campaign ads. The results show significant improvements in classification F1 score (+0.36 for zero-shot classification) with manageable economic costs (-78% compared with human labeling), demonstrating that our method effectively addresses the limitations of traditional machine learning while offering a scalable and reliable solution for text analysis in political science.
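A minimal sketch of the consensus stage might look like the following, where `query_llm` is a hypothetical helper that returns one label string and the model names are placeholders rather than the models used in the paper:

```python
def classify_with_consensus(text, labels, query_llm):
    """Consensus-stage sketch: two cheaper models label the text; an advanced
    model arbitrates only when they disagree."""
    prompt = f"Classify the following text into one of {labels}.\nText: {text}\nLabel:"
    a = query_llm("weak-model-a", prompt)
    b = query_llm("weak-model-b", prompt)
    if a == b:
        return a                                   # agreement: no extra cost
    arbitration = (
        f"Two annotators disagreed on the label for this text ({a} vs. {b}).\n"
        f"{prompt}"
    )
    return query_llm("advanced-model", arbitration)  # escalate only on disagreement
```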
ChaosMining: A Benchmark to Evaluate Post-Hoc Local Attribution Methods in Low SNR Environments
Shi, Ge, Kan, Ziwen, Smucny, Jason, Davidson, Ian
In this study, we examine the efficacy of post-hoc local attribution methods at separating features with predictive power from irrelevant ones in domains characterized by a low signal-to-noise ratio (SNR), a common scenario in real-world machine learning applications. We developed synthetic datasets encompassing symbolic functional, image, and audio data, and built a benchmark over the (Model × Attribution × Noise Condition) triplet. By rigorously testing various classic models trained from scratch, we gained valuable insights into the performance of these attribution methods under multiple conditions. Based on these findings, we introduce a novel extension of the well-known recursive feature elimination (RFE) algorithm, enhancing its applicability to neural networks. Our experiments highlight its strengths in prediction and feature selection, alongside limitations in scalability. Further details and additional minor findings are included in the appendix, with extensive discussions. The code and resources are available at https://github.com/geshijoker/ChaosMining/.
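As an illustration of how an attribution-driven feature-elimination loop could work, here is a hedged sketch that uses a simple input-gradient saliency as the importance score; it is not the authors' exact RFE extension:

```python
import numpy as np
import torch
import torch.nn as nn

def attribution_rfe(X, y, n_keep, step=1, epochs=50):
    """Recursive feature elimination driven by a simple input-gradient saliency.

    Each round retrains a small MLP on the surviving features, scores each
    feature by the mean absolute gradient of the loss w.r.t. that input, and
    drops the least important ones.
    """
    keep = list(range(X.shape[1]))
    n_classes = int(y.max()) + 1
    while len(keep) > n_keep:
        Xt = torch.tensor(X[:, keep], dtype=torch.float32)
        yt = torch.tensor(y, dtype=torch.long)
        model = nn.Sequential(nn.Linear(len(keep), 32), nn.ReLU(), nn.Linear(32, n_classes))
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        for _ in range(epochs):                        # short retraining round
            opt.zero_grad()
            nn.functional.cross_entropy(model(Xt), yt).backward()
            opt.step()
        Xt.requires_grad_(True)                        # attribute the loss to the inputs
        nn.functional.cross_entropy(model(Xt), yt).backward()
        importance = Xt.grad.abs().mean(dim=0).numpy()
        n_drop = min(step, len(keep) - n_keep)
        drop = set(int(i) for i in np.argsort(importance)[:n_drop])
        keep = [f for i, f in enumerate(keep) if i not in drop]
    return keep
```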
RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts
Li, Hongzheng, Wang, Ruojin, Shi, Ge, Lv, Xing, Lei, Lei, Feng, Chong, Liu, Fang, Lin, Jinkun, Mei, Yangguang, Xu, Lingnan
Move structures have been studied in English for Specific Purposes (ESP) and English for Academic Purposes (EAP) for decades. However, there are few move annotation corpora for Research Article (RA) abstracts. In this paper, we introduce RAAMove, a comprehensive multi-domain corpus dedicated to the annotation of move structures in RA abstracts. The primary objective of RAAMove is to facilitate move analysis and automatic move identification. This paper provides a thorough discussion of the corpus construction process, including the annotation scheme, data collection, annotation guidelines, and annotation procedures. The corpus is constructed in two stages: first, expert annotators manually annotate high-quality data; then, based on the human-annotated data, a BERT-based model is employed for automatic annotation, with experts revising its output. The result is a large-scale, high-quality corpus comprising 33,988 annotated instances. We also conduct preliminary move identification experiments using the BERT-based model to verify the effectiveness of the proposed corpus and model. The annotated corpus is available for academic research purposes and can serve as an essential resource for move analysis, English language teaching and writing, and move/discourse-related tasks in Natural Language Processing (NLP).
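Automatic move identification of this kind is essentially sentence-level classification. A minimal sketch with Hugging Face Transformers is shown below; the move labels are illustrative placeholders rather than the RAAMove scheme, and the classification head would need fine-tuning on the annotated corpus before the predictions are meaningful:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical move labels; the actual RAAMove scheme defines its own inventory.
MOVES = ["Background", "Purpose", "Method", "Result", "Conclusion"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(MOVES)
)

def predict_move(sentence: str) -> str:
    """Assign a rhetorical move label to one abstract sentence."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return MOVES[int(logits.argmax(dim=-1))]

print(predict_move("We propose a BERT-based model for automatic move identification."))
```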
Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation
Shi, Ge, Yang, Zhili
Dynamic scene understanding is one of the most prominent areas of interest in the computer vision community. To enhance dynamic scene understanding, pixel-wise segmentation with neural networks is widely adopted. The latest research on pixel-wise segmentation combines semantic and motion information and has produced strong performance. In this work, we propose a state-of-the-art neural network architecture to accurately and efficiently generate moving object proposals (MOP). We first train an unsupervised convolutional neural network (UnFlow) to estimate optical flow. We then feed the optical flow network's output into a fully convolutional SegNet model. The main contributions of our work are (1) fine-tuning the pretrained optical flow model on the DAVIS dataset and (2) leveraging fully convolutional neural networks with an encoder-decoder architecture to segment objects. We developed the code in TensorFlow and ran training and evaluation on an AWS EC2 instance.
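A minimal sketch of the flow-to-segmentation idea is shown below (written in PyTorch for brevity, whereas the original work used TensorFlow); the tiny encoder-decoder here is a stand-in for SegNet, and the 2-channel flow input would come from a pretrained UnFlow model:

```python
import torch
import torch.nn as nn

class FlowSegNet(nn.Module):
    """Minimal encoder-decoder that maps a 2-channel optical-flow field to a
    per-pixel moving-object mask."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, flow):                      # flow: (B, 2, H, W)
        return torch.sigmoid(self.decoder(self.encoder(flow)))

mask = FlowSegNet()(torch.randn(1, 2, 64, 64))    # (1, 1, 64, 64) object-proposal mask
```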
Rethink Baseline of Integrated Gradients from the Perspective of Shapley Value
Liu, Shuyang, Chen, Zixuan, Shi, Ge, Wang, Ji, Fan, Changjie, Xiong, Yu, Wu, Runze, Hu, Yujing, Ji, Ze, Gao, Yang
Numerous approaches have attempted to interpret deep neural networks (DNNs) by attributing a DNN's predictions to its input features. One of the most widely studied attribution methods is Integrated Gradients (IG). Specifically, the choice of baselines for IG is a critical consideration for generating meaningful and unbiased explanations of model predictions across different scenarios. However, the current practice of using a single baseline fails to fulfill this ambition, thus demanding multiple baselines. Fortunately, the inherent connection between IG and the Aumann-Shapley Value provides a unique perspective from which to rethink the design of baselines. Under certain hypotheses, we show theoretically that a set of baselines aligns with the coalitions in the Shapley Value. Thus, we propose a novel baseline construction method called Shapley Integrated Gradients (SIG), which searches for a set of baselines by proportional sampling to partly simulate the computation path of the Shapley Value. Simulations on GridWorld show that SIG approximates the proportions of Shapley Values. Furthermore, experiments on various image tasks demonstrate that, compared to IG with other baseline methods, SIG provides an improved estimation of each feature's contribution, offers more consistent explanations across diverse applications, and generalizes to distinct data types and instances with negligible computational overhead.
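For reference, standard Integrated Gradients can be approximated with a Riemann sum along the straight-line path from a baseline to the input, and extended naively to a set of baselines by averaging. The sketch below does exactly that; it is a simplified stand-in for SIG's proportional-sampling construction, and `model` is assumed to map a batch of inputs to one scalar score per example (e.g. the target-class logit):

```python
import torch

def integrated_gradients(model, x, baseline, steps=50):
    """Riemann approximation of Integrated Gradients for one input and one baseline."""
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = (baseline + alphas * (x - baseline)).detach().requires_grad_(True)
    model(path).sum().backward()                  # gradients at every point on the path
    return (x - baseline) * path.grad.mean(dim=0)

def multi_baseline_ig(model, x, baselines, steps=50):
    """Average IG attributions over a set of baselines."""
    return torch.stack(
        [integrated_gradients(model, x, b, steps) for b in baselines]
    ).mean(dim=0)
```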
Boosting Event Extraction with Denoised Structure-to-Text Augmentation
Wang, Bo, Huang, Heyan, Wei, Xiaochi, Shi, Ge, Liu, Xiao, Feng, Chong, Zhou, Tong, Wang, Shuaiqiang, Yin, Dawei
Event extraction aims to recognize pre-defined event triggers and arguments from texts, a task that suffers from a lack of high-quality annotations. In most NLP applications, incorporating large-scale synthetic training data is a practical and effective way to alleviate data scarcity. However, when applied to event extraction, recent data augmentation methods often neglect grammatical incorrectness, structural misalignment, and semantic drift, leading to unsatisfactory performance. To address these problems, we propose DAEE, a denoised structure-to-text augmentation framework for event extraction, which generates additional training data with a knowledge-based structure-to-text generation model and iteratively selects an effective subset of the generated data with a deep reinforcement learning agent. Experimental results on several datasets demonstrate that the proposed method generates more diverse text representations for event extraction and achieves results comparable to the state of the art.
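A greedy stand-in for the iterative selection loop is sketched below; `train_and_eval` is a hypothetical helper that trains an event extraction model on a dataset and returns its dev-set F1, and the paper replaces this simple accept/reject rule with a learned deep reinforcement learning agent:

```python
import random

def iterative_data_selection(base_train, generated, train_and_eval, rounds=5, batch=100):
    """Greedy selection of synthetic examples: keep a candidate batch only if
    retraining with it improves dev-set F1."""
    selected, best_f1 = [], train_and_eval(base_train)
    for _ in range(rounds):
        candidate = random.sample(generated, min(batch, len(generated)))
        f1 = train_and_eval(base_train + selected + candidate)
        if f1 > best_f1:                               # accept the batch
            best_f1 = f1
            selected += candidate
            generated = [g for g in generated if g not in candidate]
    return selected, best_f1
```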