
Collaborating Authors

 Chow, Trevor


Smoothie: Label Free Language Model Routing

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly used in applications where LLM inputs may span many different tasks. Recent work has found that the choice of LLM is consequential, and different LLMs may be good for different input samples. Prior approaches have thus explored how engineers might select an LLM to use for each sample (i.e., routing). While existing routing methods mostly require training auxiliary models on human-annotated data, our work explores whether it is possible to perform unsupervised routing. We propose Smoothie, a weak supervision-inspired routing approach that requires no labeled data. Given a set of outputs from different LLMs, Smoothie constructs a latent variable graphical model over embedding representations of observable LLM outputs and unknown "true" outputs. Using this graphical model, we estimate sample-dependent quality scores for each LLM, and route each sample to the LLM with the highest corresponding score. We find that Smoothie's LLM quality scores correlate with ground-truth model quality (correctly identifying the optimal model on 9/14 tasks), and that Smoothie outperforms routing baselines by up to 10 accuracy points.
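The core routing step can be illustrated with a simplified, hypothetical sketch: instead of the paper's full graphical-model estimator, score each LLM's output for a sample by its mean embedding similarity to the other LLMs' outputs (outputs that agree with the consensus are assumed closer to the unknown "true" output), then route to the highest-scoring model. The embedding function and score definition here are assumptions for illustration, not Smoothie's exact estimator.

```python
import math

def cosine(u, v):
    # cosine similarity between two embedding vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def route(embedded_outputs):
    """embedded_outputs: one embedding vector per LLM, for a single
    input sample. Score each LLM by its mean similarity to the other
    LLMs' outputs, and return the index of the highest-scoring LLM."""
    n = len(embedded_outputs)
    scores = []
    for i in range(n):
        sims = [cosine(embedded_outputs[i], embedded_outputs[j])
                for j in range(n) if j != i]
        scores.append(sum(sims) / len(sims))
    return max(range(n), key=scores.__getitem__)
```

For example, with three toy 2-d "embeddings" `[[1, 0], [1, 0.1], [0, 1]]`, the first two outputs agree closely, so the router picks one of them rather than the outlier.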


Incidental Polysemanticity

arXiv.org Artificial Intelligence

Polysemantic neurons (neurons that activate for a set of unrelated features) have been seen as a significant obstacle towards interpretability of task-optimized deep networks, with implications for AI safety. The classic origin story of polysemanticity is that the data contains more "features" than neurons, such that learning to perform a task forces the network to co-allocate multiple unrelated features to the same neuron, endangering our ability to understand the network's internal processing. In this work, we present a second and non-mutually exclusive origin story of polysemanticity. We show that polysemanticity can arise incidentally, even when there are ample neurons to represent all features in the data, using a combination of theory and experiments. This second type of polysemanticity occurs because random initialization can, by chance alone, initially assign multiple features to the same neuron, and the training dynamics then strengthen such overlap. Due to its origin, we term this "incidental polysemanticity".
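The chance-collision intuition is essentially a birthday problem, which a toy simulation can illustrate. The sketch below (an illustrative assumption, not the paper's experimental setup) assigns each feature to the neuron with the largest random initial weight and counts neurons that "win" more than one feature; even with as many neurons as features, some collisions are nearly certain.

```python
import random

def collision_count(n_features, n_neurons, seed=0):
    """Toy model of random initialization: each feature is claimed by
    its argmax-weight neuron. Returns the number of neurons that end
    up with more than one feature (incidental polysemanticity)."""
    rng = random.Random(seed)
    assigned = {}
    for f in range(n_features):
        # draw random initial weights from feature f to every neuron
        weights = [rng.gauss(0.0, 1.0) for _ in range(n_neurons)]
        winner = max(range(n_neurons), key=weights.__getitem__)
        assigned.setdefault(winner, []).append(f)
    return sum(1 for feats in assigned.values() if len(feats) > 1)
```

With 64 features and 64 neurons, the probability that every feature lands on a distinct neuron is 64!/64^64, which is vanishingly small, so collisions occur at initialization almost surely; the abstract's point is that training can then reinforce, rather than resolve, this initial overlap.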


Stanford MLab at SemEval-2023 Task 10: Exploring GloVe- and Transformer-Based Methods for the Explainable Detection of Online Sexism

arXiv.org Artificial Intelligence

Online sexism has the potential to inflict significant harm on women (Ortiz, 2023), and it is a serious issue that must be addressed. With the increasing prevalence of social media, it has become easy for groups of people to spread sexist ideas and threaten the safety of others, with online social networks becoming increasingly inundated by sexist comments (Founta et al., 2018). As such, given the increasing importance of explainable detection in machine learning models, we propose and compare several natural language processing methods for doing so. We used GloVe- and transformer-based models, as well as various data cleaning and augmentation techniques, applying them on Reddit and Gab textual data to detect online sexism.