Goto

Collaborating Authors

 icn



Introspective Classification with Convolutional Nets

Long Jin, Justin Lazarow, Zhuowen Tu

Neural Information Processing Systems

We propose introspective convolutional networks (ICN) that emphasize the importance of having convolutional neural networks empowered with generative capabilities. We employ a reclassification-by-synthesis algorithm to perform training using a formulation stemmed from the Bayes theory. Our ICN tries to iteratively: (1) synthesize pseudo-negative samples; and (2) enhance itself by improving the classification. The single CNN classifier learned is at the same time generative -- being able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets including MNIST, CIFAR-10, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.


Correlation and Navigation in the Vocabulary Key Representation Space of Language Models

Peng, Letian, An, Chenyang, Shang, Jingbo

arXiv.org Artificial Intelligence

Language model (LM) decoding is based on the next-token prediction (NTP) probability distribution. For neural LMs (e.g., Transformer-based), NTP distribution is essentially a softmax-regularized dot product between an encoded input context (query) and fixed vocabulary representations (keys). In this paper, we study the effect of the key distribution on the NTP distribution, with a focus on whether the similarity between keys will trigger spurious correlations in NTP. Through knowledge-probing tasks, we show that in the NTP distribution, the few top-ranked tokens are typically accurate. However, the middle-ranked prediction is highly biased towards the tokens that are distributionally (not necessarily semantically) similar to these top ones. For instance, if "P" is predicted as the top-1 token, "A"-"Z" will all be ranked high in NTP, no matter whether they can lead to correct decoding results. This hurts the sampling diversity and makes the sampling of correct, long-tail results hopeless and noisy. We attempt to alleviate this issue via a novel in-context method that iteratively pushes the query representation away from explored regions. Specifically, we include the explored decoding results in the context and prompt the LM to generate something else, which encourages the LM to produce a query representation that has small dot products with explored keys. Experiments on knowledge-probing tasks show that our method leads to efficient navigation away from explored keys to correct new keys. We further extend our method to open-ended and chain-of-thought (for reasoning) generation. Experiment results show that ICN contributes to better generation diversity and improved self-consistency voting performance. Finally, we discuss potential training issues caused by the fixed key space together with the challenges and possible ways to address them in future research.


Multiscale Neuroimaging Features for the Identification of Medication Class and Non-Responders in Mood Disorder Treatment

Baker, Bradley T., Salman, Mustafa S., Fu, Zening, Iraji, Armin, Osuch, Elizabeth, Bockholt, Jeremy, Calhoun, Vince D.

arXiv.org Artificial Intelligence

In the clinical treatment of mood disorders, the complex behavioral symptoms presented by patients and variability of patient response to particular medication classes can create difficulties in providing fast and reliable treatment when standard diagnostic and prescription methods are used. Increasingly, the incorporation of physiological information such as neuroimaging scans and derivatives into the clinical process promises to alleviate some of the uncertainty surrounding this process. Particularly, if neural features can help to identify patients who may not respond to standard courses of anti-depressants or mood stabilizers, clinicians may elect to avoid lengthy and side-effect-laden treatments and seek out a different, more effective course that might otherwise not have been under consideration. Previously, approaches for the derivation of relevant neuroimaging features work at only one scale in the data, potentially limiting the depth of information available for clinical decision support. In this work, we show that the utilization of multi spatial scale neuroimaging features - particularly resting state functional networks and functional network connectivity measures - provide a rich and robust basis for the identification of relevant medication class and non-responders in the treatment of mood disorders. We demonstrate that the generated features, along with a novel approach for fast and automated feature selection, can support high accuracy rates in the identification of medication class and non-responders as well as the identification of novel, multi-scale biomarkers.


A Novel Representation to Improve Team Problem Solving in Real-Time

Doboli, Alex

arXiv.org Artificial Intelligence

This paper proposes a novel representation to support computing metrics that help understanding and improving in real-time a team's behavior during problem solving in real-life. Even though teams are important in modern activities, there is little computing aid to improve their activity. The representation captures the different mental images developed, enhanced, and utilized during solving. A case study illustrates the representation.


ICNS

#artificialintelligence

This video was recorded with a drone flying in Telluride CO. The raw events from the sensor are shown on an exponentially decaying time surface (bottom left), with ON events in yellow and OFF events in blue. An event-based Self Organising Map of feature detectors was trained on the data using our FEAST algorithm. The top left shows the feature detectors after training. When a new event is received from the camera, it is added to the time-surface and a local region (11x11 pixels) of the time-surface around the event is sent to the feature detectors.


Introspective Classification with Convolutional Nets

Jin, Long, Lazarow, Justin, Tu, Zhuowen

Neural Information Processing Systems

We propose introspective convolutional networks (ICN) that emphasize the importance of having convolutional neural networks empowered with generative capabilities. We employ a reclassification-by-synthesis algorithm to perform training using a formulation stemmed from the Bayes theory. Our ICN tries to iteratively: (1) synthesize pseudo-negative samples; and (2) enhance itself by improving the classification. The single CNN classifier learned is at the same time generative --- being able to directly synthesize new samples within its own discriminative model. We conduct experiments on benchmark datasets including MNIST, CIFAR-10, and SVHN using state-of-the-art CNN architectures, and observe improved classification results.