 Supervised Learning


Smoothed Embeddings for Certified Few-Shot Learning

arXiv.org Artificial Intelligence

Randomized smoothing is considered the state-of-the-art provable defense against adversarial perturbations. However, it relies heavily on the fact that classifiers map input objects to class probabilities, and it does not address models that learn a metric space in which classification is performed by computing distances to embeddings of class prototypes. In this work, we extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings. We analyze the Lipschitz continuity of such models and derive a robustness certificate against $\ell_2$-bounded perturbations that may be useful in few-shot learning scenarios. Our theoretical results are confirmed by experiments on different datasets.
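As a rough illustration of the idea, the sketch below averages a normalized embedding over Gaussian input noise and classifies by distance to class prototypes. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy linear model `f`, the noise level `sigma`, the sample count, and the helper names are all hypothetical, and the robustness certificate itself is not computed.

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit l2 norm (assumes v is non-zero)."""
    return v / np.linalg.norm(v)

def smoothed_embedding(f, x, sigma=0.5, n_samples=1000, seed=0):
    """Monte Carlo estimate of g(x) = E[f(x + eps)] with eps ~ N(0, sigma^2 I)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    embeddings = np.stack([f(x + eps) for eps in noise])
    return embeddings.mean(axis=0)

def nearest_prototype(embedding, prototypes):
    """Return the index of the class prototype closest in l2 distance."""
    distances = np.linalg.norm(prototypes - embedding, axis=1)
    return int(np.argmin(distances))

# Hypothetical toy embedding model: a fixed linear map followed by normalization.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

def f(x):
    return normalize(W @ x)

# One prototype per class, here simply the embedding of a single support point.
prototypes = np.stack([f(np.array([1.0, 0.0])), f(np.array([0.0, 1.0]))])

x = np.array([0.9, 0.1])
g_x = smoothed_embedding(f, x, sigma=0.1)
predicted_class = nearest_prototype(g_x, prototypes)
```

Because `g_x` is an average of unit vectors, its norm is at most 1; the base embeddings are normalized, but the smoothed one need not be.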


Probability estimation and structured output prediction for learning preferences in last mile delivery

arXiv.org Artificial Intelligence

We study the problem of learning the preferences of drivers and planners in the context of last mile delivery. Given a data set containing historical decisions and delivery locations, the goal is to capture the implicit preferences of the decision-makers. We consider two ways to use the historical data: one is through a probability estimation method that learns transition probabilities between stops (or zones). This is a fast and accurate method, recently studied in a VRP setting. Furthermore, we explore the use of machine learning to infer how to best balance multiple objectives such as distance, probability and penalties. Specifically, we cast the learning problem as a structured output prediction problem, where training is done by repeatedly calling the TSP solver. Another important aspect we consider is that for last-mile delivery, every address is a potential client and hence the data is very sparse. Hence, we propose a two-stage approach that first learns preferences at the zone level in order to compute a zone routing; after which a penalty-based TSP computes the stop routing. Results show that the zone transition probability estimation performs well, and that the structured output prediction learning can improve the results further. We hence showcase a successful combination of both probability estimation and machine learning, all the while using standard TSP solvers, both during learning and to compute the final solution; this means the methodology is applicable to other, real-life, TSP variants, or proprietary solvers.
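The probability estimation step can be pictured as counting zone-to-zone transitions in historical routes and normalizing the counts. The following is a minimal sketch under that assumption; the function name, the optional additive `smoothing` parameter, and the toy routes are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def estimate_transition_probabilities(routes, smoothing=0.0):
    """Estimate P(next zone | current zone) from historical zone sequences."""
    counts = defaultdict(lambda: defaultdict(float))
    zones = set()
    for route in routes:
        zones.update(route)
        # Count each consecutive pair (current zone, next zone).
        for src, dst in zip(route, route[1:]):
            counts[src][dst] += 1.0

    probs = {}
    for src in zones:
        total = sum(counts[src].values()) + smoothing * len(zones)
        if total == 0:
            continue  # zone never had an observed successor
        probs[src] = {
            dst: (counts[src][dst] + smoothing) / total for dst in zones
        }
    return probs

# Hypothetical historical routes, each a sequence of zone identifiers.
routes = [["A", "B", "C"], ["A", "B", "D"], ["A", "C", "B"]]
probs = estimate_transition_probabilities(routes)
```

With these toy routes, zone `A` is followed by `B` in two of three cases, so `probs["A"]["B"]` is 2/3. Zone `D` never has a successor and is therefore absent from `probs` when `smoothing` is 0.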


Theoretical analysis and computation of the sample Frechet mean for sets of large graphs based on spectral information

arXiv.org Machine Learning

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Frechet mean. In this work, we equip a set of graphs with the pseudometric defined by the norm between the eigenvalues of their respective adjacency matrices. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Frechet mean of a set of undirected unweighted graphs with a fixed size using this pseudometric.
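A spectral pseudometric of this kind can be sketched in a few lines. The example below assumes the $\ell_2$ norm of the difference between sorted adjacency eigenvalues; the abstract as reproduced here does not name the norm, so that choice, the function name, and the toy graphs are assumptions. It does not compute the Frechet mean itself.

```python
import numpy as np

def spectral_distance(adj_a, adj_b):
    """l2 distance between the sorted adjacency spectra of two graphs.

    Both inputs are symmetric 0/1 adjacency matrices of the same size.
    """
    eig_a = np.linalg.eigvalsh(adj_a)  # eigenvalues in ascending order
    eig_b = np.linalg.eigvalsh(adj_b)
    return float(np.linalg.norm(eig_a - eig_b))

# Path graph 0-1-2.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)

# The same path with the labels permuted (node 0 is now the middle node).
relabeled_path = np.array([[0, 1, 1],
                           [1, 0, 0],
                           [1, 0, 0]], dtype=float)

# Triangle on three nodes.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]], dtype=float)

d_same = spectral_distance(path, relabeled_path)
d_diff = spectral_distance(path, triangle)
```

The relabeled path has distance (numerically) zero to the original even though the matrices differ, which is why this is a pseudometric rather than a metric: distinct adjacency matrices with the same spectrum are not separated.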


Generalized Shape Metrics on Neural Representations

arXiv.org Machine Learning

Understanding the operation of biological and artificial networks remains a difficult and important challenge. To identify general principles, researchers are increasingly interested in surveying large collections of networks that are trained on, or biologically adapted to, similar tasks. A standardized set of analysis tools is now needed to identify how network-level covariates -- such as architecture, anatomical brain region, and model organism -- impact neural representations (hidden layer activations). Here, we provide a rigorous foundation for these analyses by defining a broad family of metric spaces that quantify representational dissimilarity. Using this framework we modify existing representational similarity measures based on canonical correlation analysis to satisfy the triangle inequality, formulate a novel metric that respects the inductive biases in convolutional layers, and identify approximate Euclidean embeddings that enable network representations to be incorporated into essentially any off-the-shelf machine learning method. We demonstrate these methods on large-scale datasets from biology (Allen Institute Brain Observatory) and deep learning (NAS-Bench-101). In doing so, we identify relationships between neural representations that are interpretable in terms of anatomical features and model performance.


DeHIN: A Decentralized Framework for Embedding Large-scale Heterogeneous Information Networks

arXiv.org Artificial Intelligence

Modeling heterogeneity by extraction and exploitation of high-order information from heterogeneous information networks (HINs) has been attracting immense research attention in recent times. Such heterogeneous network embedding (HNE) methods effectively harness the heterogeneity of small-scale HINs. However, in the real world, the size of HINs grows exponentially with the continuous introduction of new nodes and different types of links, turning them into billion-scale networks. Learning node embeddings on such HINs creates a performance bottleneck for existing HNE methods that are commonly centralized, i.e., complete data and the model are both on a single machine. To address large-scale HNE tasks with strong efficiency and effectiveness guarantees, we present the Decentralized Embedding Framework for Heterogeneous Information Network (DeHIN) in this paper. In DeHIN, we generate a distributed parallel pipeline that utilizes hypergraphs in order to infuse parallelization into the HNE task. DeHIN presents a context preserving partition mechanism that innovatively formulates a large HIN as a hypergraph, whose hyperedges connect semantically similar nodes. Our framework then adopts a decentralized strategy to efficiently partition HINs using a tree-like pipeline. Then, each resulting subnetwork is assigned to a distributed worker, which employs the deep information maximization theorem to locally learn node embeddings from the partition it receives. We further devise a novel embedding alignment scheme to precisely project independently learned node embeddings from all subnetworks onto a common vector space, thus allowing for downstream tasks like link prediction and node classification.


6 months after Biden touted 'independence' from COVID-19, cases set records

FOX News

It's been six months since President Biden said the U.S. was close to declaring "independence from COVID-19," and yet the pandemic still shows no signs of slowing after the country set a global record for the number of cases Monday due to the spread of the highly transmissible omicron variant. The U.S. reported more than 1 million new coronavirus infections on Monday, setting a global record and almost doubling the previous record set last week. Hospitalizations have also skyrocketed across the country, but deaths have held relatively steady in recent weeks. Biden gave a speech Tuesday maintaining his position that "this continues to be a pandemic of the unvaccinated," even though breakthrough cases of COVID-19 among people who are fully vaccinated continue to rise across the country as new variants emerge.


#008 Shallow Neural Network - Master Data Science

#artificialintelligence

In this post we will see how to vectorize across multiple training examples. The outcome will be similar to what we saw in Logistic Regression. These equations tell us how, when given an input feature vector \(x\), we can generate predictions. If we have \(m\) training examples, we need to repeat this process \(m\) times. The notation \(a^{[2](i)}\) means that we are talking about the activation in the second layer that comes from the \(i^{th}\) training example.
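To make the idea concrete, here is a minimal sketch comparing a per-example loop with a vectorized forward pass in which the \(m\) examples are stacked as columns of a matrix \(X\). The layer sizes, the random weights, and the choice of sigmoid activations in both layers are assumptions for illustration, not details taken from the post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5            # input size, hidden units, number of examples

X = rng.normal(size=(n_x, m))    # column i is training example x^(i)
W1 = rng.normal(size=(n_h, n_x))
b1 = np.zeros((n_h, 1))
W2 = rng.normal(size=(1, n_h))
b2 = np.zeros((1, 1))

# Non-vectorized: repeat the forward pass once per training example.
A2_loop = np.zeros((1, m))
for i in range(m):
    x_i = X[:, i:i + 1]                         # shape (n_x, 1)
    a1_i = sigmoid(W1 @ x_i + b1)               # a^[1](i)
    A2_loop[:, i:i + 1] = sigmoid(W2 @ a1_i + b2)  # a^[2](i)

# Vectorized: process all m examples at once; b1 and b2 broadcast over columns.
A1 = sigmoid(W1 @ X + b1)        # shape (n_h, m)
A2 = sigmoid(W2 @ A1 + b2)       # shape (1, m); column i equals a^[2](i)
```

Both computations give the same activations; the vectorized form replaces the explicit loop over \(i\) with two matrix products.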


CORE: A Knowledge Graph Entity Type Prediction Method via Complex Space Regression and Embedding

arXiv.org Artificial Intelligence

Research on knowledge graph (KG) construction, completion, inference, and applications has grown rapidly in recent years since it offers a powerful tool for modeling human knowledge in graph form. Nodes in KGs denote entities and links represent relations between entities. The basic building blocks of a KG are entity-relation triples in the form of (subject, predicate, object) introduced by the Resource Description Framework (RDF). Learning representations for entities and relations in low-dimensional vector spaces is one of the most active research topics in the field. Entity type offers a valuable piece of information to KG learning tasks. Better results in KG-related tasks have been achieved with the help of entity type. For example, TKRL [1] uses a hierarchical type encoder for KG completion by incorporating entity type information. AutoETER [2] adopts a similar approach but encodes the type information with projection matrices. Based on DistMult [3] and ComplEx [4] embedding, [5] propose an improved factorization model without explicit type supervision.


Improving scripts with a memory of natural feedback

arXiv.org Artificial Intelligence

How can an end-user provide feedback if a deployed structured prediction model generates incorrect output? Our goal is to allow users to correct errors directly through interaction, without retraining, by giving feedback on the model's output. We create a dynamic memory architecture with a growing memory of feedback about errors in the output. Given a new, unseen input, our model can use feedback from a similar, past erroneous state. On a script generation task, we show empirically that the model learns to apply feedback effectively (up to 30 points improvement), while avoiding similar past mistakes after deployment (up to 10 points improvement on an unseen set). This is a first step towards strengthening deployed models, potentially broadening their utility.


Distance and Hop-wise Structures Encoding Enhanced Graph Attention Networks

arXiv.org Artificial Intelligence

Many works have shown that existing neighbor-averaging Graph Neural Networks (GNNs) cannot efficiently capture structure information; in some cases such GNNs cannot even capture degree features. The reason is intuitive: because neighbor-averaging GNNs can only combine neighbors' feature vectors for every node, if those feature vectors contain no structure information, hop-wise neighbor-averaging GNNs can capture degree information at best ([1]; [2]; [3]). So, as an intuitive idea, injecting structure information into feature vectors may improve the performance of GNNs. Numerous works have shown that injecting structure, distance, position, or spatial information can significantly improve the performance of neighbor-averaging GNNs ([4]; [5]; [6]; [7]; [8]; [9]; [10]). However, existing works have their problems. Some have very high computational complexity and cannot be applied to large-scale graphs (MotifNet [4]). Some simply concatenate structure information with the intrinsic feature vector (ID-GNN [6]; P-GNN [8]; DE-GNN [9]), which may confuse the signals of different features. For example, in the ogbn-arxiv dataset, the intrinsic feature is a semantic embedding of the headline or abstract, which provides a totally different signal from structure information. Some are graph-level-task oriented and only deal with small graphs (Graphormer [7]; SubGNN [10]).