The Knowledge Microscope: Features as Better Analytical Lenses than Neurons

Chen, Yuheng, Cao, Pengfei, Liu, Kang, Zhao, Jun

arXiv.org Artificial Intelligence

Previous studies primarily utilize MLP neurons as units of analysis for understanding the mechanisms of factual knowledge in Language Models (LMs); however, neurons suffer from polysemanticity, leading to limited knowledge expression and poor interpretability. In this paper, we first conduct preliminary experiments to validate that Sparse Autoencoders (SAEs) can effectively decompose neurons into features, which serve as alternative analytical units. With this established, our core findings reveal three key advantages of features over neurons: (1) Features exhibit stronger influence on knowledge expression and superior interpretability. (2) Features demonstrate enhanced monosemanticity, showing distinct activation patterns between related and unrelated facts. (3) Features achieve better privacy protection than neurons, demonstrated through our proposed FeatureEdit method, which significantly outperforms existing neuron-based approaches in erasing privacy-sensitive information from LMs. Code and dataset will be made available.
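The decomposition the abstract describes can be sketched as a standard sparse autoencoder forward pass: an MLP activation vector is encoded into an overcomplete, non-negative feature vector, and the features linearly reconstruct the original activation. A minimal NumPy sketch follows; the dimensions, random weights, and function name are illustrative assumptions, not the paper's actual architecture or training procedure (which would also involve a reconstruction-plus-sparsity training loss).

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_feat = 8, 32  # hidden size; overcomplete feature dictionary
W_enc = rng.normal(0.0, 0.1, (d_feat, d_model))
b_enc = np.zeros(d_feat)
W_dec = rng.normal(0.0, 0.1, (d_model, d_feat))

def sae_decompose(x):
    """Decompose one MLP activation vector into sparse features.

    ReLU keeps the feature vector non-negative (and, after training
    with an L1 penalty, sparse); the decoder reconstructs x from it.
    """
    f = np.maximum(W_enc @ x + b_enc, 0.0)  # feature activations
    x_hat = W_dec @ f                        # linear reconstruction
    return f, x_hat

x = rng.normal(size=d_model)     # a stand-in neuron activation vector
f, x_hat = sae_decompose(x)
```

Each coordinate of `f` is one candidate monosemantic "feature"; analyses like those in the paper then ask which features fire on a given fact, rather than which raw neurons do.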


Data Agnostic RoBERTa-based Natural Language to SQL Query Generation

Pal, Debaditya, Sharma, Harsh, Chaudhari, Kaustubh

arXiv.org Artificial Intelligence

Relational databases are among the most widely used architectures for storing massive amounts of data in the modern world. However, there is a barrier between these databases and the average user, who often lacks knowledge of a query language such as SQL required to interact with them. The NL2SQL task aims to solve this problem with deep learning approaches that convert natural language questions into valid SQL queries. Given the sensitive nature of some databases and the growing need for data privacy, we present an approach with data privacy at its core. We pass RoBERTa embeddings and data-agnostic knowledge vectors into LSTM-based submodels to predict the final query. Although we have not achieved state-of-the-art results, we have eliminated the need for the table data, starting from the training of the model, and have achieved a test set execution accuracy of 76.7%. By eliminating the table-data dependency during training, we have created a model capable of zero-shot learning from the natural language question and table schema alone.
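The key design point above is that the model sees only the question and the table schema, never the table contents. As a rough, hypothetical illustration of what one schema-only submodel decides, the toy function below ranks schema columns for the SELECT clause by token overlap with the question; the real system uses RoBERTa embeddings and LSTM submodels, not this heuristic, and the function name is invented for the sketch.

```python
def pick_select_column(question, schema):
    """Toy stand-in for a column-scoring submodel.

    Scores each schema column by how many of its name tokens appear
    in the question; note no table *data* is consulted, only the
    schema, mirroring the data-agnostic setup described above.
    """
    q_tokens = set(question.lower().split())
    scores = {
        col: len(q_tokens & set(col.lower().replace("_", " ").split()))
        for col in schema
    }
    return max(scores, key=scores.get)

col = pick_select_column(
    "what is the population of france",
    ["country", "population", "area_km2"],
)
```

Because the decision depends only on the question and schema, the same untrained-on table can be queried at test time, which is what enables the zero-shot behavior the abstract claims.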


Sobolev Independence Criterion

Mroueh, Youssef, Sercu, Tom, Rigotti, Mattia, Padhi, Inkit, Santos, Cicero Dos

arXiv.org Machine Learning

We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high-dimensional random variable X and a response variable Y. SIC decomposes into a sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient-regularized Integral Probability Metric (IPM) between the joint distribution of the two random variables and the product of their marginals. We use sparsity-inducing gradient penalties to promote input sparsity of the critic of the IPM. In the kernel version we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables that play an important role in feature selection, as they are normalized feature importance scores. We then present a neural version of SIC where the critic is parameterized as a homogeneous neural network, improving its representation power as well as its interpretability. We conduct experiments validating SIC for feature selection in synthetic and real-world settings. We show that SIC enables reliable and interpretable discoveries when used in conjunction with the holdout randomization test and knockoffs to control the False Discovery Rate. Code is available at http://github.com/ibm/sic.
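The feature importance scores mentioned above come from the critic's input gradients: a feature matters to the extent that the critic's output varies along it, roughly eta_j proportional to E[(df/dx_j)^2], normalized to sum to one. The sketch below computes such scores by finite differences for a hand-written critic; the critic, the finite-difference estimator, and the function name are illustrative assumptions standing in for the paper's learned kernel or neural critic.

```python
import numpy as np

rng = np.random.default_rng(1)

def critic(X):
    """Hypothetical learned critic; depends only on features 0 and 2."""
    return 3.0 * X[..., 0] + np.sin(X[..., 2])

def sobolev_importance(f, X, eps=1e-5):
    """Normalized per-feature scores eta_j ~ E[(df/dx_j)^2].

    Uses central finite differences in place of autodiff; this is the
    quantity SIC's gradient penalty regularizes and reads off as
    feature importances.
    """
    n, d = X.shape
    grads = np.zeros((n, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        grads[:, j] = (f(X + e) - f(X - e)) / (2.0 * eps)
    eta = (grads ** 2).mean(axis=0)
    return eta / eta.sum()  # normalized importance scores

X = rng.normal(size=(500, 5))
eta = sobolev_importance(critic, X)
```

With this critic, feature 0 (linear coefficient 3, squared gradient 9) dominates feature 2 (squared gradient E[cos^2] below 1), and the irrelevant features score exactly zero; sparsity-inducing penalties on these gradients are what drive the critic to ignore irrelevant inputs.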


Christmas Spectacular in New York will feature 100 Intel Shooting Star Mini drones

#artificialintelligence

The Christmas Spectacular starring the Radio City Rockettes -- an annual holiday stage show presented at Radio City Music Hall in New York City -- began one year after the Music Hall's opening night in 1933, which featured the Missouri Rockets dance troupe out of St. Louis. What was originally a two-week, 30-minute performance featuring an overture, a ballet, and a handful of vignettes expanded into a 90-minute extravaganza complete with real-life animals, a 36-person cast, 1,100 costumes, and 11 digital projectors that's been viewed by more than 75 million people. And this year will mark the addition of something new to the mix: more than 100 specially designed Intel drones choreographed over the stage. Intel says it's the first time its Shooting Star Mini drones have been incorporated into a theatrical indoor performance, and it claims it'll be the world's largest interior drone show. "We are constantly exploring new venues for Intel's drone light shows. It is an honor to partner with Radio City Music Hall to integrate Intel's innovative technology into the iconic Christmas Spectacular," said Natalie Cheung, general manager of Intel's drone light shows.