 Gomel Region




Ironing the Graphs: Toward a Correct Geometric Analysis of Large-Scale Graphs

arXiv.org Artificial Intelligence

Graph embedding approaches attempt to project graphs into geometric entities, i.e., manifolds. The idea is that the geometric properties of the projected manifolds help in inferring graph properties. However, an incorrect choice of embedding manifold can lead to incorrect geometric inference. In this paper, we argue that classical embedding techniques cannot lead to a correct geometric interpretation because they ignore the curvature at each point of the manifold. We advocate that, for a correct geometric interpretation, the graph should be embedded onto regular constant curvature manifolds. To this end, we present an embedding approach, the discrete Ricci flow graph embedding (dRfge), based on the discrete Ricci flow, which adapts the distances between nodes in a graph so that the graph can be embedded onto a constant curvature manifold that is homogeneous and isotropic, i.e., all directions are equivalent and distances are comparable, resulting in correct geometric interpretations. A major contribution of this paper is that, for the first time, we prove the convergence of the discrete Ricci flow to a constant curvature and stable distance metrics over the edges. A drawback of the discrete Ricci flow is its high computational complexity, which has prevented its use in large-scale graph analysis. Another contribution of this paper is a new algorithmic solution that makes it feasible to calculate the Ricci flow for graphs of up to 50k nodes and beyond. The intuitions behind the discrete Ricci flow make it possible to obtain new insights into the structure of large-scale graphs. We demonstrate this through a case study analyzing the internet connectivity structure between countries at the BGP level.
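As a rough illustration of how a curvature-driven flow adapts edge lengths, the sketch below runs one flow step on a small weighted graph. It is not the dRfge algorithm from the abstract: it assumes Forman-Ricci curvature with unit node weights as a cheap stand-in for whichever discrete curvature the paper uses, and the function names, the step size, and the rescaling that keeps the total edge weight fixed are all illustrative choices.

```python
import math


def forman_curvature(weights):
    """Forman-Ricci curvature of every edge, assuming unit node weights.

    `weights` maps an edge, given as a (u, v) tuple, to a positive length.
    """
    # Index the edges incident to each node.
    incident = {}
    for edge in weights:
        for node in edge:
            incident.setdefault(node, []).append(edge)

    curvature = {}
    for edge, w in weights.items():
        total = 0.0
        for node in edge:
            total += 1.0 / w  # contribution of the endpoint itself
            for other in incident[node]:
                if other != edge:  # edges sharing this endpoint
                    total -= 1.0 / math.sqrt(w * weights[other])
        curvature[edge] = w * total
    return curvature


def ricci_flow_step(weights, step=0.1):
    """One explicit flow step: shrink positively curved edges, stretch negative ones.

    Assumes `step` is small enough that all updated weights stay positive.
    """
    curvature = forman_curvature(weights)
    updated = {e: w - step * curvature[e] * w for e, w in weights.items()}
    # Rescale so the total edge length is preserved (an assumed normalization).
    scale = sum(weights.values()) / sum(updated.values())
    return {e: w * scale for e, w in updated.items()}


# Two triangles joined by a bridge edge ("c", "d").
barbell = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
flowed = barbell
for _ in range(5):
    flowed = ricci_flow_step(flowed)
```

In this toy graph the bridge has the most negative curvature, so it is the edge that grows relative to the others, which matches the intuition that the flow stretches bottleneck links and contracts edges inside dense clusters.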


Learning from Models and Data for Visual Grounding

arXiv.org Artificial Intelligence

We introduce SynGround, a novel framework that combines data-driven learning and knowledge transfer from various large-scale pretrained models to enhance the visual grounding capabilities of a pretrained vision-and-language model. The knowledge transfer from the models initiates the generation of image descriptions through an image description generator. These descriptions serve dual purposes: they act as prompts for synthesizing images through a text-to-image generator, and as queries for synthesizing text, from which phrases are extracted using a large language model. Finally, we leverage an open-vocabulary object detector to generate synthetic bounding boxes for the synthetic images and texts. We finetune a pretrained vision-and-language model on this dataset by optimizing a mask-attention consistency objective that aligns region annotations with gradient-based model explanations. The resulting model improves the grounding capabilities of an off-the-shelf vision-and-language model. In particular, SynGround improves the pointing game accuracy of ALBEF on the Flickr30k dataset from 79.38% to 87.26%, on RefCOCO+ Test A from 69.35% to 79.06%, and on RefCOCO+ Test B from 53.77% to 63.67%.
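The pointing game accuracy quoted above is commonly computed by checking whether the peak of an explanation heatmap falls inside the ground-truth box. The sketch below shows that computation under stated assumptions: heatmaps are plain 2D lists, boxes are inclusive pixel coordinates `(x_min, y_min, x_max, y_max)`, and the function names are hypothetical rather than taken from the paper's code.

```python
def pointing_game_hit(heatmap, box):
    """Return True if the heatmap's maximum lies inside the box.

    `heatmap` is a list of rows; `box` is (x_min, y_min, x_max, y_max),
    inclusive, in pixel coordinates (an assumed convention).
    """
    best_value = float("-inf")
    best_xy = (0, 0)
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    x, y = best_xy
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def pointing_game_accuracy(samples):
    """Fraction of (heatmap, box) pairs whose heatmap peak hits the box."""
    hits = sum(pointing_game_hit(heatmap, box) for heatmap, box in samples)
    return hits / len(samples)


samples = [
    # Peak at (x=2, y=1), box covers it: a hit.
    ([[0.0, 0.1, 0.2], [0.1, 0.3, 0.9]], (1, 0, 2, 1)),
    # Peak at (x=0, y=0), box is elsewhere: a miss.
    ([[0.8, 0.1, 0.2], [0.1, 0.3, 0.4]], (1, 1, 2, 1)),
]
accuracy = pointing_game_accuracy(samples)
```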


Multi-View Symbolic Regression

arXiv.org Artificial Intelligence

Symbolic regression (SR) searches for analytical expressions representing the relationship between a set of explanatory and response variables. Current SR methods assume a single dataset extracted from a single experiment. Frequently, however, the researcher is confronted with multiple sets of results obtained from experiments conducted with different setups. Traditional SR methods may fail to find the underlying expression since the parameters of each experiment can be different. In this work we present Multi-View Symbolic Regression (MvSR), which takes into account multiple datasets simultaneously, mimicking experimental environments, and outputs a general parametric solution. This approach fits the evaluated expression to each independent dataset and returns a parametric family of functions f(x; \theta) simultaneously capable of accurately fitting all datasets. We demonstrate the effectiveness of MvSR using data generated from known expressions, as well as real-world data from astronomy, chemistry and economics, for which an a priori analytical expression is not available. Results show that MvSR obtains the correct expression more frequently and is robust to hyperparameter changes. On real-world data, it is able to grasp the group behaviour, recovering known expressions from the literature as well as promising alternatives, thus enabling the use of SR in a wide range of experimental scenarios.
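The core idea, one functional form fitted separately to each dataset, can be sketched for a single candidate expression. The example below is a minimal illustration, not MvSR itself: it assumes the candidate is f(x; a, b) = a*x + b, fits it to each view by ordinary least squares, and scores the candidate by its worst per-view error. The function names and the choice of the maximum as the aggregator are assumptions made for this sketch.

```python
def fit_linear(xs, ys):
    """Least-squares fit of f(x; a, b) = a*x + b; returns (a, b, mse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    mse = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return a, b, mse


def multi_view_score(views):
    """Fit the same form to every view with its own parameters.

    Returns the per-view (a, b) pairs and the worst per-view MSE, used
    here as an assumed aggregate score for the candidate expression.
    """
    fits = [fit_linear(xs, ys) for xs, ys in views]
    params = [(a, b) for a, b, _ in fits]
    worst_mse = max(mse for _, _, mse in fits)
    return params, worst_mse


# Two "experiments" sharing the form a*x + b but with different parameters.
views = [
    ([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]),    # generated from 2x + 1
    ([0.0, 1.0, 2.0, 3.0], [4.0, 1.0, -2.0, -5.0]),  # generated from -3x + 4
]
params, worst_mse = multi_view_score(views)
```

A single-dataset search would treat the two views as contradictory, whereas scoring the shared form across views lets the parameters differ while the expression stays the same.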


Improved Visual Grounding through Self-Consistent Explanations

arXiv.org Artificial Intelligence

Vision-and-language models trained to match images with text can be combined with visual explanation methods to point to the locations of specific objects in an image. Our work shows that the localization ("grounding") abilities of these models can be further improved by finetuning for self-consistent visual explanations. We propose a strategy for augmenting existing text-image datasets with paraphrases using a large language model, and SelfEQ, a weakly-supervised strategy on visual explanation maps for paraphrases that encourages self-consistency. Specifically, for an input textual phrase, we attempt to generate a paraphrase and finetune the model so that the phrase and paraphrase map to the same region in the image. We posit that this both expands the vocabulary that the model is able to handle and improves the quality of the object locations highlighted by gradient-based visual explanation methods (e.g., GradCAM). We demonstrate that SelfEQ improves performance on Flickr30k, ReferIt, and RefCOCO+ over a strong baseline method and several prior works. In particular, compared to other methods that do not use any type of box annotations, we obtain 84.07% on Flickr30k (an absolute improvement of 4.69%), 67.40% on ReferIt (an absolute improvement of 7.68%), and 75.10% and 55.49% on RefCOCO+ test sets A and B, respectively (an absolute improvement of 3.74% on average).
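The requirement that a phrase and its paraphrase map to the same image region can be written as a penalty on the difference between their explanation maps. The sketch below uses a plain mean squared difference between two heatmaps as that penalty. This is an assumed, simplified stand-in and not the SelfEQ objective itself; the function name and the example maps are hypothetical.

```python
def consistency_loss(map_a, map_b):
    """Mean squared difference between two explanation heatmaps.

    Both maps are lists of rows with identical shapes. A value of 0.0
    means the phrase and paraphrase highlight exactly the same region.
    """
    if len(map_a) != len(map_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(map_a, map_b)
    ):
        raise ValueError("heatmaps must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


# Hypothetical maps for a phrase and its paraphrase.
phrase_map = [[0.0, 1.0], [0.0, 0.0]]
paraphrase_map = [[0.0, 1.0], [0.0, 0.0]]
shifted_map = [[1.0, 0.0], [0.0, 0.0]]

same_region = consistency_loss(phrase_map, paraphrase_map)
different_region = consistency_loss(phrase_map, shifted_map)
```

During finetuning, a term of this kind would be minimized alongside the model's usual image-text matching loss, so that the two maps are pushed toward agreement.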


An Exploration of How Training Set Composition Bias in Machine Learning Affects Identifying Rare Objects

arXiv.org Artificial Intelligence

This is due to the rapid expansion of computing resources and sensor technology in the last four decades that has driven equally rapid expansions in the quantity of data to analyze. Astronomy, in particular, has seen a proliferation of large-scale imaging and spectroscopic surveys that have billions of sources in them, surveys like: the Sloan Digital Sky Survey (SDSS, York et al., 2000), the 2-Micron All Sky Survey (2MASS, Skrutskie et al., 2006), the Wide-field Infrared Survey Explorer (WISE, Wright et al., 2010), the Gaia satellite's survey (Gaia Collaboration et al., 2016), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) surveys (Chambers et al., 2016), the Dark Energy Spectroscopic Instrument (DESI) surveys (Dey et al., 2019), the UKIRT Infrared Deep Sky Surveys (UKIDSS, Lawrence et al., 2007), and the Galaxy Evolution Explorer (GALEX) surveys (Martin et al., 2005).

... (Cutri et al., 2013), had many technical challenges and required intensive astronomy expertise, experience, and labor to overcome (Eisenhardt et al., 2012, for example). A necessary first step in that process, though, is to classify the sources so that we can prioritize which sources might be interesting, and which are examples of already known sources. Because these sources are rare it is usually easier to use a supervised machine learning algorithm, one that is tuned using sources with known classifications, than it is to use an unsupervised one. The reason should be obvious: subgroups of the common known source types are likely to outnumber the rare new ones, meaning a naive unsupervised machine learning algorithm could need a lot of complexity before it actually finds the rare class. Supervised learning also has drawbacks when used for ...