
Collaborating Authors: Whitehill, Jacob


Automated Evaluation of Classroom Instructional Support with LLMs and BoWs: Connecting Global Predictions to Specific Feedback

arXiv.org Artificial Intelligence

With the aim of providing teachers with more specific, frequent, and actionable feedback about their teaching, we explore how Large Language Models (LLMs) can be used to estimate ``Instructional Support'' domain scores of the CLassroom Assessment Scoring System (CLASS), a widely used observation protocol. We design a machine learning architecture that uses zero-shot prompting of Meta's Llama2, a classic Bag of Words (BoW) model, or both, to classify individual utterances of teachers' speech (transcribed automatically using OpenAI's Whisper) for the presence of Instructional Support. These utterance-level judgments are then aggregated over an entire 15-minute observation session to estimate a global CLASS score. Experiments on two CLASS-coded datasets of toddler and pre-kindergarten classrooms indicate that (1) automatic CLASS Instructional Support estimation accuracy using the proposed method (Pearson $R$ up to $0.47$) approaches human inter-rater reliability (up to $R=0.55$); (2) LLMs yield slightly greater accuracy than BoW for this task, though the best models often combine features extracted from both; and (3) for classifying individual utterances, automated methods still leave room for improvement relative to human judgments. Finally, (4) we illustrate how the model's outputs can be visualized at the utterance level to provide teachers with explainable feedback on which utterances were most positively or negatively correlated with specific CLASS dimensions.
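
As a minimal sketch of the utterance-then-aggregate idea, the snippet below trains a toy Bag-of-Words classifier and pools its utterance-level probabilities into a session-level estimate. The tiny training set, the logistic-regression classifier, and the mean pooling are illustrative assumptions, not the paper's exact features, prompts, or aggregation scheme.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy utterance-level BoW classifier standing in for the paper's BoW branch;
# an LLM branch would instead score each utterance via zero-shot prompting.
train_utts = ["why do you think the tower fell down",
              "please sit down and be quiet"]
train_labels = [1, 0]  # 1 = utterance exhibits Instructional Support

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(train_utts), train_labels)

def session_score(utterances):
    """Mean utterance-level probability over one 15-minute observation."""
    probs = clf.predict_proba(vec.transform(utterances))[:, 1]
    return float(probs.mean())

session = ["what do you think will happen next", "line up at the door"]
print(session_score(session))
# Session-level estimates would then be correlated (Pearson R) against
# human-coded CLASS Instructional Support scores for evaluation.
```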


Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification

arXiv.org Artificial Intelligence

In compositional clustering, the goal is not just to partition the data into distinct and coherent groups, but also to infer the compositional relationships among the groups. This scenario arises in speaker diarization (i.e., inferring who is speaking when from an audio waveform) in the presence of simultaneous speech from multiple speakers [6, 36], which occurs frequently in real-world speech settings: the audio at each time $t$ is generated as a composition of the voices of all the people speaking at time $t$, and the goal is to cluster the audio samples, over all timesteps, into sets of speakers. Hence, if there are 2 people who sometimes speak by themselves and sometimes speak simultaneously, then the clusters would correspond to the speaker sets {1}, {2}, and {1, 2}: the third cluster is not a third independent speaker, but rather the composition of the first two speakers. An analogous scenario arises in open-world (i.e., test classes are disjoint from training classes) multi-label object recognition when clustering images such that each image may contain multiple objects from a fixed set (e.g., the shapes in Figure 1). In some scenarios, the composition function that specifies how examples are generated from other examples might be as simple as superposition by element-wise maximum or addition. However, a more powerful form of composition, and the main motivation for our work, is enabled by compositional embedding models, which are a new technique for few-shot learning.
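
The superposition-by-maximum case admits a compact illustration: given centroids for the singleton clusters, centroids for composed clusters can be synthesized by element-wise maximum, and examples assigned to whichever candidate (singleton or composition) is nearest. The function below is a hypothetical toy, not the paper's algorithm or its compositional embedding models.

```python
import numpy as np
from itertools import combinations

def compositional_assign(X, singleton_centroids):
    """Assign each row of X to the nearest centroid among the singleton
    clusters and their pairwise compositions, where a composed centroid
    is the element-wise maximum of its members' centroids."""
    k = len(singleton_centroids)
    sets = [frozenset([i]) for i in range(k)]
    cents = list(singleton_centroids)
    for i, j in combinations(range(k), 2):
        sets.append(frozenset([i, j]))
        cents.append(np.maximum(singleton_centroids[i], singleton_centroids[j]))
    cents = np.stack(cents)                            # (num candidates, d)
    d2 = ((X[:, None, :] - cents[None]) ** 2).sum(-1)  # squared distances
    return [sets[i] for i in d2.argmin(1)]             # speaker set per row

# Two "speakers" with distinct feature patterns; the last sample mixes both.
c = np.array([[1.0, 0.0], [0.0, 1.0]])
X = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 1.0]])
print(compositional_assign(X, c))  # -> [{0}, {1}, {0, 1}]
```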


Harnessing Geometric Constraints from Auxiliary Labels to Improve Embedding Functions for One-Shot Learning

arXiv.org Artificial Intelligence

We explore the utility of harnessing auxiliary labels (e.g., facial expression) to impose geometric structure when training embedding models for one-shot learning (e.g., for face verification). We introduce novel geometric constraints on the embedding space learned by a deep model using either manually annotated or automatically detected auxiliary labels. We compare their performance (AUC) on four face datasets (CK+, VGGFace-2, Tufts Face, and PubFig). Due to the additional structure encoded in the embedding space, our methods provide higher verification accuracy (99.7, 86.2, 99.4, and 79.3% with our proposed TL+PDP+FBV loss, versus 97.5, 72.6, 93.1, and 70.5% using a standard Triplet Loss on the four datasets, respectively). Our method is implemented purely in terms of the loss function and requires no changes to the backbone of the embedding function.
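
Since the method lives entirely in the loss, it can be sketched as a standard triplet term plus an auxiliary geometric term. The auxiliary term below (pulling embeddings that share an expression label toward their centroid) is an illustrative assumption; the paper's actual PDP and FBV constraints are not reproduced here.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, pos, neg, margin=0.2):
    """Standard triplet loss on embedding batches of shape (B, d)."""
    d_ap = (anchor - pos).pow(2).sum(1)
    d_an = (anchor - neg).pow(2).sum(1)
    return F.relu(d_ap - d_an + margin).mean()

def auxiliary_geometry_loss(emb, expr_labels):
    """Illustrative auxiliary constraint: pull embeddings sharing an
    auxiliary label (e.g., facial expression) toward that label's centroid."""
    labels = expr_labels.unique()
    loss = emb.new_zeros(())
    for lbl in labels:
        group = emb[expr_labels == lbl]
        loss = loss + (group - group.mean(0)).pow(2).sum(1).mean()
    return loss / len(labels)

# Combined objective: everything is expressed in the loss, so the
# embedding network's backbone is untouched, as the abstract notes.
a, p, n = (torch.randn(4, 128) for _ in range(3))
emb, expr = torch.randn(8, 128), torch.randint(0, 3, (8,))
total = triplet_loss(a, p, n) + 0.1 * auxiliary_geometry_loss(emb, expr)
```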


Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth

arXiv.org Machine Learning

Automatic detectors of facial expression, gesture, affect, etc., can serve as scientific instruments to measure many behavioral and social phenomena (e.g., emotion, empathy, stress, engagement), and this has great potential to advance basic science. However, when a detector $d$ is trained to approximate an existing measurement tool (e.g., observation protocol, questionnaire), then care must be taken when interpreting measurements collected using $d$, since they are one step further removed from the underlying construct. We examine how the accuracy of $d$, as quantified by the correlation $q$ of $d$'s outputs with the ground-truth construct $U$, impacts the estimated correlation between $U$ (e.g., stress) and some other phenomenon $V$ (e.g., academic performance). In particular: (1) We show that if the true correlation between $U$ and $V$ is $r$, then the expected sample correlation, over all vectors in the set $\mathcal{T}^n$ of detector outputs whose correlation with $U$ is $q$, is $qr$. (2) We derive a formula to compute the probability that the sample correlation (over $n$ subjects) using $d$ is positive, given that the true correlation between $U$ and $V$ is negative (and vice versa). We show that this probability is non-negligible (around $10-15\%$) for values of $n$ and $q$ that have been used in recent affective computing studies. (3) With the goal of reducing the variance of correlations estimated by an automatic detector, we show empirically that training multiple neural networks $d^{(1)},\ldots,d^{(m)}$ using different training configurations (e.g., architectures, hyperparameters) for the same detection task provides only limited ``coverage'' of $\mathcal{T}^n$.
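
Claims (1) and (2) are easy to check numerically. The Monte Carlo sketch below draws detector outputs whose population (not exact sample) correlation with $U$ is $q$, a simplifying assumption relative to the paper's setup, and recovers both the $qr$ attenuation and a sign-flip probability in the reported range.

```python
import numpy as np

rng = np.random.default_rng(0)

def correlated(base, rho, rng):
    """Vector whose population correlation with `base` is rho."""
    return rho * base + np.sqrt(1 - rho ** 2) * rng.standard_normal(base.shape)

n, q, r, trials = 30, 0.6, -0.3, 20000
samples, flips = [], 0
for _ in range(trials):
    U = rng.standard_normal(n)
    V = correlated(U, r, rng)   # true construct-phenomenon correlation r < 0
    d = correlated(U, q, rng)   # detector outputs correlate q with U
    s = np.corrcoef(d, V)[0, 1]
    samples.append(s)
    flips += int(s > 0)         # sample correlation has the wrong sign

print(np.mean(samples))  # near q * r = -0.18, matching claim (1)
print(flips / trials)    # non-negligible sign-flip rate, as in claim (2)
```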


Exploiting an Oracle That Reports AUC Scores in Machine Learning Contests

AAAI Conferences

In machine learning contests such as the ImageNet Large Scale Visual Recognition Challenge and the KDD Cup, contestants can submit candidate solutions and receive from an oracle (typically the organizers of the competition) the accuracy of their guesses compared to the ground-truth labels. One of the most commonly used accuracy metrics for binary classification tasks is the Area Under the Receiver Operating Characteristic Curve (AUC). In this paper we provide proofs-of-concept of how knowledge of the AUC of a set of guesses can be used, in two different kinds of attacks, to improve the accuracy of those guesses. On the other hand, we also demonstrate the intractability of one kind of AUC exploit by proving that the number of possible binary labelings of $n$ examples for which a candidate solution obtains an AUC score of $c$ grows exponentially in $n$, for every $c \in (0,1)$.
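
The counting result can be probed by brute force for small $n$: fix a guess vector, enumerate all binary labelings, and count how many yield a given AUC. This is an illustrative check of the growth claim, not either of the paper's attacks.

```python
import numpy as np
from itertools import product
from sklearn.metrics import roc_auc_score

def count_labelings_with_auc(scores, c, tol=1e-9):
    """Count binary labelings (with both classes present) for which the
    fixed guess vector `scores` attains an AUC of exactly c."""
    n = len(scores)
    count = 0
    for labels in product([0, 1], repeat=n):
        if 0 < sum(labels) < n:  # AUC is undefined with a single class
            if abs(roc_auc_score(labels, scores) - c) < tol:
                count += 1
    return count

guesses = np.linspace(0.0, 1.0, 12)  # a candidate solution's scores
for n_examples in (8, 10, 12):
    print(n_examples, count_labelings_with_auc(guesses[:n_examples], 0.5))
# The counts grow rapidly with n, consistent with the exponential bound.
```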


Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise

Neural Information Processing Systems

Modern machine learning-based approaches to computer vision require very large databases of labeled images. Some contemporary vision systems already require on the order of millions of images for training (e.g., the Omron face detector). While the collection of these large databases is becoming a bottleneck, new Internet-based services that allow labelers from around the world to be easily hired and managed provide a promising solution. However, using these services to label large databases brings with it new theoretical and practical challenges: (1) the labelers may have wide-ranging levels of expertise which are unknown a priori, and in some cases may be adversarial; (2) images may vary in their level of difficulty; and (3) multiple labels for the same image must be combined to provide an estimate of the actual label of the image. Probabilistic approaches provide a principled way to approach these problems. In this paper we present a probabilistic model and use it to simultaneously infer the label of each image, the expertise of each labeler, and the difficulty of each image. On both simulated and real data, we demonstrate that the model outperforms the commonly used ``Majority Vote'' heuristic for inferring image labels, and is robust to both adversarial and noisy labelers.
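
A stripped-down variant of the idea can be written as an EM loop that alternates between estimating each image's label posterior and each labeler's accuracy; the paper's full model additionally infers per-image difficulty, and this sketch is not its exact formulation.

```python
import numpy as np

def em_labels(L, iters=50):
    """EM over a binary labels matrix L (n_images x n_labelers):
    jointly estimate each image's posterior of being class 1 and each
    labeler's accuracy. Per-image difficulty is omitted for brevity."""
    n, m = L.shape
    p = L.mean(axis=1)  # initialize posteriors with the majority vote
    for _ in range(iters):
        # M-step: labeler accuracy = expected fraction of correct labels.
        acc = np.array([(p * (L[:, j] == 1) + (1 - p) * (L[:, j] == 0)).mean()
                        for j in range(m)]).clip(1e-3, 1 - 1e-3)
        # E-step: posterior of the true label given all labelers' votes.
        log1 = (np.log(acc) * (L == 1) + np.log(1 - acc) * (L == 0)).sum(1)
        log0 = (np.log(acc) * (L == 0) + np.log(1 - acc) * (L == 1)).sum(1)
        p = 1.0 / (1.0 + np.exp(log0 - log1))
    return p, acc

# Three mostly reliable labelers plus one adversary who flips every label.
L = np.array([[1, 1, 1, 0],
              [0, 0, 0, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 1, 1, 0]])
p, acc = em_labels(L)
print(p.round(2), acc.round(2))  # the adversary's votes get downweighted
```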