AITopics | Banff

Collaborating Authors

Banff

Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums

arXiv.org Artificial IntelligenceSep-1-2022

We present results on the number of linear regions of the functions that can be represented by artificial feedforward neural networks with maxout units. A rank-k maxout unit is a function computing the maximum of $k$ linear functions. For networks with a single layer of maxout units, the linear regions correspond to the upper vertices of a Minkowski sum of polytopes. We obtain face counting formulas in terms of the intersection posets of tropical hypersurfaces or the number of upper faces of partial Minkowski sums, along with explicit sharp upper bounds for the number of regions for any input dimension, any number of units, and any ranks, in the cases with and without biases. Based on these results we also obtain asymptotically sharp upper bounds for networks with multiple layers.

arrangement, linear region, polytope, (16 more...)

arXiv.org Artificial Intelligence

2104.08135

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(9 more...)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild

Hegde, Sindhu B, Prajwal, K R, Mukhopadhyay, Rudrabha, Namboodiri, Vinay P, Jawahar, C. V.

arXiv.org Artificial IntelligenceSep-1-2022

In this work, we address the problem of generating speech from silent lip videos for any speaker in the wild. In stark contrast to previous works, our method (i) is not restricted to a fixed number of speakers, (ii) does not explicitly impose constraints on the domain or the vocabulary and (iii) deals with videos that are recorded in the wild as opposed to within laboratory settings. The task presents a host of challenges, with the key one being that many features of the desired target speech, like voice, pitch and linguistic content, cannot be entirely inferred from the silent face video. In order to handle these stochastic variations, we propose a new VAE-GAN architecture that learns to associate the lip and speech sequences amidst the variations. With the help of multiple powerful discriminators that guide the training process, our generator learns to synthesize speech sequences in any voice for the lip movements of any person. Extensive experiments on multiple datasets show that we outperform all baselines by a large margin. Further, our network can be fine-tuned on videos of specific identities to achieve a performance comparable to single-speaker models that are trained on $4\times$ more data. We conduct numerous ablation studies to analyze the effect of different modules of our architecture. We also provide a demo video that demonstrates several qualitative results along with the code and trained models on our website: \url{http://cvit.iiit.ac.in/research/projects/cvit-projects/lip-to-speech-synthesis}}

speech, synthesis, video, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3503161.3548081

2209.00642

Country:

Europe > Portugal > Lisbon > Lisbon (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > India > Telangana > Hyderabad (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Large-Scale Auto-Regressive Modeling Of Street Networks

Birsak, Michael, Kelly, Tom, Para, Wamiq, Wonka, Peter

arXiv.org Artificial IntelligenceSep-1-2022

We present a novel generative method for the creation of city-scale road layouts. While the output of recent methods is limited in both size of the covered area and diversity, our framework produces large traversable graphs of high quality consisting of vertices and edges representing complete street networks covering 400 square kilometers or more. While our framework can process general 2D embedded graphs, we focus on street networks due to the wide availability of training data. Our generative framework consists of a transformer decoder that is used in a sliding window manner to predict a field of indices, with each index encoding a representation of the local neighborhood. The semantics of each index is determined by a dictionary of context vectors. The index field is then input to a decoder to compute the street graph. Using data from OpenStreetMap, we train our system on whole cities and even across large countries such as the US, and finally compare it to the state of the art.

graph, representation, street network, (11 more...)

arXiv.org Artificial Intelligence

2209.00281

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(12 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

A Deep Neural Networks ensemble workflow from hyperparameter search to inference leveraging GPU clusters

Pochelu, Pierrick, Petiton, Serge G., Conche, Bruno

arXiv.org Artificial IntelligenceAug-30-2022

Automated Machine Learning with ensembling (or AutoML with ensembling) seeks to automatically build ensembles of Deep Neural Networks (DNNs) to achieve qualitative predictions. Ensemble of DNNs are well known to avoid over-fitting but they are memory and time consuming approaches. Therefore, an ideal AutoML would produce in one single run time different ensembles regarding accuracy and inference speed. While previous works on AutoML focus to search for the best model to maximize its generalization ability, we rather propose a new AutoML to build a larger library of accurate and diverse individual models to then construct ensembles. First, our extensive benchmarks show asynchronous Hyperband is an efficient and robust way to build a large number of diverse models to combine them. Then, a new ensemble selection method based on a multi-objective greedy algorithm is proposed to generate accurate ensembles by controlling their computing cost. Finally, we propose a novel algorithm to optimize the inference of the DNNs ensemble in a GPU cluster based on allocation optimization. The produced AutoML with ensemble method shows robust results on two datasets using efficiently GPU clusters during both the training phase and the inference phase. Deep Neural networks (DNNs) are notoriously difficult to tune, train, and ensemble to achieve state-of-the-art results. Automatic machine learning with ensembling or "AutoML+ensembling" tools provide a simple interface to train and evaluate many ensembles of DNNs to achieve high accuracy by reducing overfitting. Nowadays, multiple researchers and practitioners have well understood the benefit of ensembling DNNs. Further, several winners and top performers on challenges routinely use ensembles to improve accuracy. However, ensembles of DNNs suffer from three main limitations to be widely deployed in research and industrial applications.

algorithm, ensemble, international conference, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3492805.3492819

2208.14046

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > France > Nouvelle-Aquitaine > Pyrénées-Atlantiques > Pau (0.04)
Asia > Middle East > Jordan (0.04)
(13 more...)

Genre: Workflow (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Robust 3D Vision for Autonomous Robots

Aghili, Farhad

arXiv.org Artificial IntelligenceAug-29-2022

This paper presents a fault-tolerant 3D vision system for autonomous robotic operation. In particular, pose estimation of space objects is achieved using 3D vision data in an integrated Kalman filter (KF) and an Iterative Closest Point (ICP) algorithm in a closed-loop configuration. The initial guess for the internal ICP iteration is provided by the state estimate propagation of the Kalman filer. The Kalman filter is capable of not only estimating the target's states but also its inertial parameters. This allows the motion of the target to be predictable as soon as the filter converges. Consequently, the ICP can maintain pose tracking over a wider range of velocity due to the increased precision of ICP initialization. Furthermore, incorporation of the target's dynamics model in the estimation process allows the estimator continuously provide pose estimation even when the sensor temporally loses its signal namely due to obstruction. The capabilities of the pose estimation methodology is demonstrated by a ground testbed for Automated Rendezvous & Docking. In this experiment, Neptec's Laser Camera System (LCS) is used for real-time scanning of a satellite model attached to a manipulator arm, which is driven by a simulator according to orbital and attitude dynamics. The results showed that robust tracking of the free-floating tumbling satellite can be achieved only if the Kalman filter and ICP are in a closed-loop configuration.

estimation, pose estimation, satellite, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/JSEN.2010.2056365

2208.12925

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Quebec > Montreal (0.04)
(14 more...)

Genre: Research Report (0.90)

Industry: Energy (0.55)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Latent Heterogeneous Graph Network for Incomplete Multi-View Learning

Zhu, Pengfei, Yao, Xinjie, Wang, Yu, Cao, Meng, Hui, Binyuan, Zhao, Shuai, Hu, Qinghua

arXiv.org Artificial IntelligenceAug-29-2022

Multi-view learning has progressed rapidly in recent years. Although many previous studies assume that each instance appears in all views, it is common in real-world applications for instances to be missing from some views, resulting in incomplete multi-view data. To tackle this problem, we propose a novel Latent Heterogeneous Graph Network (LHGN) for incomplete multi-view learning, which aims to use multiple incomplete views as fully as possible in a flexible manner. By learning a unified latent representation, a trade-off between consistency and complementarity among different views is implicitly realized. To explore the complex relationship between samples and latent representations, a neighborhood constraint and a view-existence constraint are proposed, for the first time, to construct a heterogeneous graph. Finally, to avoid any inconsistencies between training and test phase, a transductive learning technique is applied based on graph learning for classification tasks. Extensive experimental results on real-world datasets demonstrate the effectiveness of our model over existing state-of-the-art approaches.

dataset, proceedings, representation, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TMM.2022.3154592

2208.13669

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
Asia > China > Tianjin Province > Tianjin (0.05)
North America > United States > New York > New York County > New York City (0.04)
(25 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

A Review of Knowledge Graph Completion

Zamini, Mohamad, Reza, Hassan, Rabiei, Minou

arXiv.org Artificial IntelligenceAug-24-2022

Information extraction methods proved to be effective at triple extraction from structured or unstructured data. The organization of such triples in the form of (head entity, relation, tail entity) is called the construction of Knowledge Graphs (KGs). Most of the current knowledge graphs are incomplete. In order to use KGs in downstream tasks, it is desirable to predict missing links in KGs. Different approaches have been recently proposed for representation learning of KGs by embedding both entities and relations into a low-dimensional vector space aiming to predict unknown triples based on previously visited triples. According to how the triples will be treated independently or dependently, we divided the task of knowledge graph completion into conventional and graph neural network representation learning and we discuss them in more detail. In conventional approaches, each triple will be processed independently and in GNN-based approaches, triples also consider their local neighborhood. View Full-Text

graph, knowledge graph, relation, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/info13080396

2208.11652

Country:

North America > United States > North Dakota > Grand Forks County > Grand Forks (0.14)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
North America > United States > New York > New York County > New York City (0.04)
(20 more...)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

High-quality Task Division for Large-scale Entity Alignment

Liu, Bing, Hua, Wen, Zuccon, Guido, Zhao, Genghong, Zhang, Xia

arXiv.org Artificial IntelligenceAug-22-2022

Entity Alignment (EA) aims to match equivalent entities that refer to the same real-world objects and is a key step for Knowledge Graph (KG) fusion. Most neural EA models cannot be applied to large-scale real-life KGs due to their excessive consumption of GPU memory and time. One promising solution is to divide a large EA task into several subtasks such that each subtask only needs to match two small subgraphs of the original KGs. However, it is challenging to divide the EA task without losing effectiveness. Existing methods display low coverage of potential mappings, insufficient evidence in context graphs, and largely differing subtask sizes. In this work, we design the DivEA framework for large-scale EA with high-quality task division. To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models. Unique to our counterpart discovery method is the explicit modelling of the chance of a potential mapping. We also introduce an evidence passing mechanism to quantify the informativeness of context entities and find the most informative context graphs with flexible control of the subtask size. Extensive experiments show that DivEA achieves higher EA performance than alternative state-of-the-art solutions.

machine learning, mapping, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3511808.3557352

2208.10366

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Oceania > Australia > Queensland (0.04)
North America > Dominican Republic (0.04)
(13 more...)

Genre:

Research Report > Promising Solution (0.54)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Comparison-based Conversational Recommender System with Relative Bandit Feedback

Xie, Zhihui, Yu, Tong, Zhao, Canzhe, Li, Shuai

arXiv.org Artificial IntelligenceAug-21-2022

With the recent advances of conversational recommendations, the recommender system is able to actively and dynamically elicit user preference via conversational interactions. To achieve this, the system periodically queries users' preference on attributes and collects their feedback. However, most existing conversational recommender systems only enable the user to provide absolute feedback to the attributes. In practice, the absolute feedback is usually limited, as the users tend to provide biased feedback when expressing the preference. Instead, the user is often more inclined to express comparative preferences, since user preferences are inherently relative. To enable users to provide comparative preferences during conversational interactions, we propose a novel comparison-based conversational recommender system. The relative feedback, though more practical, is not easy to be incorporated since its feedback scale is always mismatched with users' absolute preferences. With effectively collecting and understanding the relative feedback from an interactive manner, we further propose a new bandit algorithm, which we call RelativeConUCB. The experiments on both synthetic and real-world datasets validate the advantage of our proposed method, compared to the existing bandit algorithms in the conversational recommender systems.

recommendation, recommender system, relative feedback, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3404835.3462920

2208.09837

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > China > Shanghai > Shanghai (0.05)
North America > United States > New York > New York County > New York City (0.04)
(16 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.46)
Media > Film (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Add feedback

A Generic Self-Supervised Framework of Learning Invariant Discriminative Features

Ntelemis, Foivos, Jin, Yaochu, Thomas, Spencer A.

arXiv.org Artificial IntelligenceAug-21-2022

Self-supervised learning (SSL) has become a popular method for generating invariant representations without the need for human annotations. Nonetheless, the desired invariant representation is achieved by utilising prior online transformation functions on the input data. As a result, each SSL framework is customised for a particular data type, e.g., visual data, and further modifications are required if it is used for other dataset types. On the other hand, autoencoder (AE), which is a generic and widely applicable framework, mainly focuses on dimension reduction and is not suited for learning invariant representation. This paper proposes a generic SSL framework based on a constrained self-labelling assignment process that prevents degenerate solutions. Specifically, the prior transformation functions are replaced with a self-transformation mechanism, derived through an unsupervised training process of adversarial training, for imposing invariant representations. Via the self-transformation mechanism, pairs of augmented instances can be generated from the same input data. Finally, a training objective based on contrastive learning is designed by leveraging both the self-labelling assignment and the self-transformation mechanism. Despite the fact that the self-transformation process is very generic, the proposed training strategy outperforms a majority of state-of-the-art representation learning methods based on AE structures. To validate the performance of our method, we conduct experiments on four types of data, namely visual, audio, text, and mass spectrometry data, and compare them in terms of four quantitative metrics. Our comparison results indicate that the proposed method demonstrate robustness and successfully identify patterns within the datasets.

dataset, representation, target distribution, (16 more...)

arXiv.org Artificial Intelligence

2202.06914

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback