Anand, Abhishek
Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations
Anand, Abhishek, Mokhberian, Negar, Kumar, Prathyusha Naresh, Saha, Anweasha, He, Zihao, Rao, Ashwin, Morstatter, Fred, Lerman, Kristina
Researchers have raised awareness about the harms of aggregating labels, especially in subjective tasks that naturally contain disagreement among human annotators. In this work, we show that models trained only on aggregated labels exhibit low confidence on high-disagreement data instances. While previous studies treat such instances as mislabeled, we argue that high-disagreement text instances are hard to learn because conventional aggregation-based models fail to extract useful signal from subjective tasks. Inspired by recent studies demonstrating the effectiveness of learning from raw annotations, we investigate classification with Multiple Ground Truth (Multi-GT) approaches. Our experiments show improved confidence on high-disagreement instances.
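To make the Multi-GT idea concrete, the sketch below keeps one classification head per annotator on top of a shared text encoder and trains on the raw (non-aggregated) annotations. The encoder interface, head layout, and masking convention are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of a Multi-GT classifier: instead of training on a single
# aggregated (e.g., majority-vote) label, keep one classification head per
# annotator and learn from the raw annotations.
# Assumption: a shared encoder produces a fixed-size text embedding;
# missing annotations are marked with -1.
import torch
import torch.nn as nn

class MultiGTClassifier(nn.Module):
    def __init__(self, encoder_dim: int, num_classes: int, num_annotators: int):
        super().__init__()
        # One linear head per annotator on top of the shared text encoder.
        self.heads = nn.ModuleList(
            [nn.Linear(encoder_dim, num_classes) for _ in range(num_annotators)]
        )

    def forward(self, text_embedding: torch.Tensor) -> torch.Tensor:
        # Returns logits of shape (batch, num_annotators, num_classes).
        return torch.stack([head(text_embedding) for head in self.heads], dim=1)

def multi_gt_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # labels: (batch, num_annotators), with -1 where an annotator gave no label.
    mask = labels >= 0
    flat_logits = logits[mask]   # (num_observed, num_classes)
    flat_labels = labels[mask]   # (num_observed,)
    return nn.functional.cross_entropy(flat_logits, flat_labels)
```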
C3S Micro-architectural Enhancement: Spike Encoder Block and Relaxing Gamma Clock (Asynchronous)
Anand, Alok, Khokhlov, Ivan, Anand, Abhishek
The field of neuromorphic computing is rapidly evolving. As both biological accuracy and practical implementations are explored, existing architectures are modified and improved for both purposes. The Temporal Neural Network (TNN) style of architecture is a good basis for approximating biological neurons because it encodes data as timed pulses and uses a voltage-threshold-like firing mechanism. Using the TNN cortical column C3S architecture as a basis, this project augments the network's design with two enhancements aimed at improving existing cortical column architectures. The first addresses the need for an encoder that converts between common digital formats and timed neuronal spikes, since biologically accurate networks are temporal in nature; to this end, this project presents an encoder that translates binary-encoded values into timed spikes for the neural network to process. The second addresses processing time wasted on idleness caused by lengthy Gamma cycle processing bursts; to this end, this project presents a relaxation of Gamma cycles that allows them to end arbitrarily early once the network has determined an output response. With the goal of advancing neuromorphic computer architecture, designs for both a binary-to-spike encoder and a Gamma cycle controller are presented and evaluated for optimal design parameters, overall system gain, and performance.
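As a rough illustration of the encoder's role, the sketch below maps an unsigned binary value to a spike time within one Gamma cycle using latency coding (larger values spike earlier). The cycle length and the earlier-means-stronger convention are assumptions for illustration, not the actual C3S encoder design.

```python
# Illustrative binary-to-spike (latency) encoding: a larger binary value
# produces an earlier spike within the Gamma cycle. This is a software sketch
# of the general idea, not the hardware encoder block itself.
from typing import Optional

def binary_to_spike_time(value: int, bit_width: int, gamma_cycle: int) -> Optional[int]:
    """Map an unsigned binary value to a spike time within one Gamma cycle.

    Returns None for value == 0 (no spike), otherwise an integer time step in
    [0, gamma_cycle), where larger values spike earlier.
    """
    if value == 0:
        return None  # no spike: the neuron stays silent this cycle
    max_value = (1 << bit_width) - 1
    # Scale so that max_value -> time 0 and value 1 -> the latest slot.
    return round((max_value - value) * (gamma_cycle - 1) / max(max_value - 1, 1))

# Example: 4-bit values encoded into a 16-step Gamma cycle.
for v in (1, 7, 15):
    print(v, "->", binary_to_spike_time(v, bit_width=4, gamma_cycle=16))
```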
Contextually Guided Semantic Labeling and Search for 3D Point Clouds
Anand, Abhishek, Koppula, Hema Swetha, Joachims, Thorsten, Saxena, Ashutosh
RGB-D cameras, which give an RGB image together with depth, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the 3D point clouds of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important, and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views), we obtain labeling performance of 84.06% for office scenes and 73.38% for home scenes, with 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labeling of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.
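As a rough sketch of how such a model scores a candidate labeling, the snippet below combines per-segment node potentials (appearance and shape features) with pairwise edge potentials over neighboring segments. The feature and weight shapes are illustrative assumptions rather than the paper's exact parameterization.

```python
# Minimal sketch of scoring a labeling of scene segments with node potentials
# (local appearance/shape) and pairwise edge potentials (contextual relations),
# the kind of objective a max-margin-trained graphical model optimizes.
import numpy as np

def labeling_score(node_feats, edge_feats, edges, labels, w_node, w_edge):
    """Score a candidate labeling of scene segments.

    node_feats: (num_segments, d_node) per-segment features
    edge_feats: dict {(i, j): (d_edge,)} features for neighboring segment pairs
    edges:      list of (i, j) neighbor pairs
    labels:     (num_segments,) candidate class index for each segment
    w_node:     (num_classes, d_node) node weights
    w_edge:     (num_classes, num_classes, d_edge) edge weights
    """
    score = sum(w_node[labels[i]] @ node_feats[i] for i in range(len(labels)))
    for (i, j) in edges:
        score += w_edge[labels[i], labels[j]] @ edge_feats[(i, j)]
    return score
```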
Semantic Labeling of 3D Point Clouds for Indoor Scenes
Koppula, Hema S., Anand, Abhishek, Joachims, Thorsten, Saxena, Ashutosh
Inexpensive RGB-D cameras that give an RGB image together with depth data have become widely available. In this paper, we use this data to build 3D point clouds of full indoor scenes, such as an office, and address the task of semantic labeling of these 3D point clouds. We propose a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important, and we address that by using multiple types of edge potentials. The model admits efficient approximate inference, and we train it using a maximum-margin learning approach. In our experiments over a total of 52 3D scenes of homes and offices (composed from about 550 views, with 2,495 segments labeled with 27 object classes), we obtain a labeling performance of 84.06% for 17 object classes in office scenes and 73.38% for 17 object classes in home scenes. Finally, we applied these algorithms successfully on a mobile robot for the task of finding objects in large cluttered rooms.
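For intuition about approximate inference in such a model, the sketch below runs a simple iterated-conditional-modes loop over precomputed node scores and pairwise class co-occurrence scores. It is a generic stand-in for illustration, not the paper's actual inference procedure.

```python
# Sketch of approximate inference via iterated conditional modes (ICM):
# repeatedly re-assign each segment the class that maximizes its local score
# given the current labels of its neighboring segments.
# Assumption: node and pairwise scores have already been computed.
import numpy as np

def icm_inference(node_scores, pair_scores, edges, iters=10):
    """node_scores: (num_segments, num_classes) per-segment class scores.
    pair_scores:  (num_classes, num_classes) scores for neighboring class pairs.
    edges:        list of (i, j) neighboring segment pairs.
    """
    num_segments, num_classes = node_scores.shape
    labels = node_scores.argmax(axis=1)          # start from the best local guess
    neighbors = {i: [] for i in range(num_segments)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(iters):
        for i in range(num_segments):
            # Score every class for segment i given its neighbors' current labels.
            scores = node_scores[i].copy()
            for j in neighbors[i]:
                scores += pair_scores[:, labels[j]]
            labels[i] = scores.argmax()
    return labels
```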