
 Tabor, Jacek


Deep processing of structured data

arXiv.org Artificial Intelligence

We construct a general unified framework for learning representations of structured data, i.e. data that cannot be represented as fixed-length vectors (e.g. sets, graphs, texts or images of varying sizes). The key role is played by an intermediate network called SAN (Set Aggregating Network), which maps a structured object to a fixed-length vector in a high-dimensional latent space. Our main theoretical result shows that for a sufficiently large dimension of the latent space, SAN is capable of learning a unique representation for every input example. Experiments demonstrate that replacing the pooling operation by SAN in convolutional networks leads to better results in classifying images of different sizes. Moreover, its direct application to text and graph data yields results close to SOTA, using simpler networks with fewer parameters than competitive models.
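Below is a minimal sketch of the set-aggregation idea described above: a shared projection applied to every element of a variable-size set, followed by a simple pooling into a fixed-length latent vector. The class name, layer sizes and mean pooling are illustrative assumptions, not the paper's exact SAN architecture.

```python
# Illustrative sketch only: a SAN-like block that maps a variable-size set of
# d-dimensional elements to a single fixed-length vector.
import torch
import torch.nn as nn

class SetAggregator(nn.Module):  # hypothetical name
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, latent_dim)

    def forward(self, elements):                    # elements: (set_size, in_dim)
        scores = torch.relu(self.proj(elements))    # per-element responses
        return scores.mean(dim=0)                   # fixed-length summary

# Usage: sets of different sizes map to vectors of the same dimension.
san = SetAggregator(in_dim=3, latent_dim=128)
small_set = torch.randn(5, 3)
large_set = torch.randn(42, 3)
print(san(small_set).shape, san(large_set).shape)   # both torch.Size([128])
```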


Dynamical Isometry is Achieved in Residual Networks in a Universal Way for any Activation Function

arXiv.org Machine Learning

We demonstrate that in residual neural networks (ResNets) dynamical isometry is achievable irrespective of the activation function used. We do so by deriving, with the help of Free Probability and Random Matrix Theories, a universal formula for the spectral density of the input-output Jacobian at initialization, in the limit of large network width and depth. The resulting singular value spectrum depends on a single parameter, which we calculate for a variety of popular activation functions by analyzing the signal propagation in the artificial neural network. We corroborate our results with numerical simulations of both random matrices and ResNets applied to the CIFAR-10 classification problem. Moreover, we study the consequences of this universal behavior for the initial and late phases of the learning process. We conclude by drawing attention to the simple fact that initialization acts as a confounding factor between the choice of activation function and the rate of learning. We propose that in ResNets this can be resolved, based on our results, by ensuring the same level of dynamical isometry at initialization.
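As a rough numerical illustration of the claim (my own toy random-matrix experiment, not the paper's Free Probability derivation), one can compare the singular values of a deep residual Jacobian with those of a plain product of random layers; the residual spectrum stays far better conditioned, i.e. much closer to an isometry. Width, depth and branch scale below are arbitrary choices.

```python
# Toy experiment: singular values of a deep residual Jacobian vs. a plain
# product of random Gaussian layers.
import numpy as np

rng = np.random.default_rng(0)
width, depth, scale = 200, 50, 0.1

res_jac = np.eye(width)
plain_jac = np.eye(width)
for _ in range(depth):
    w = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))
    res_jac = (np.eye(width) + scale * w) @ res_jac   # residual layer Jacobian
    plain_jac = w @ plain_jac                         # non-residual stack

for name, jac in [("residual", res_jac), ("plain", plain_jac)]:
    s = np.linalg.svd(jac, compute_uv=False)
    print(f"{name:8s} min/max singular value: {s.min():.3e} / {s.max():.3e}")
```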


Cramer-Wold AutoEncoder

arXiv.org Artificial Intelligence

We propose a new generative model, the Cramer-Wold Autoencoder (CWAE). Following WAE, we directly encourage normality of the latent space. Our paper also uses the recent idea from the Sliced WAE (SWAE) model, which uses one-dimensional projections as a method of verifying the closeness of two distributions. The crucial new ingredient is the introduction of a new (Cramer-Wold) metric in the space of densities, which replaces the Wasserstein metric used in SWAE. We show that the Cramer-Wold metric between Gaussian mixtures is given by a simple analytic formula, which removes the sampling needed to estimate the cost function in the WAE and SWAE models. As a consequence, while drastically simplifying the optimization procedure, CWAE produces samples of perceptual quality matching other SOTA models.
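The one-dimensional projection idea mentioned above (inherited from SWAE) can be illustrated numerically as below. Note that this sketch does not reproduce the paper's analytic Cramer-Wold formula, only the projection-based comparison it replaces; the function name and scoring rule are my own.

```python
# Monte Carlo illustration of comparing a latent sample to N(0, I) through
# random one-dimensional projections (the SWAE-style idea the abstract builds on).
import numpy as np
from statistics import NormalDist

def sliced_normality_score(z, n_directions=256, seed=0):
    """Mean squared quantile gap between projected codes and N(0, 1)."""
    rng = np.random.default_rng(seed)
    n, d = z.shape
    probs = (np.arange(1, n + 1) - 0.5) / n
    target = np.array([NormalDist().inv_cdf(p) for p in probs])  # N(0,1) quantiles
    total = 0.0
    for _ in range(n_directions):
        v = rng.normal(size=d)
        v /= np.linalg.norm(v)
        total += np.mean((np.sort(z @ v) - target) ** 2)
    return total / n_directions

rng = np.random.default_rng(1)
gaussian_codes = rng.normal(size=(500, 8))
uniform_codes = rng.uniform(-1.0, 1.0, size=(500, 8))
print(sliced_normality_score(gaussian_codes))  # small: projections look normal
print(sliced_normality_score(uniform_codes))   # larger: normality is violated
```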


Processing of missing data by neural networks

arXiv.org Machine Learning

We propose a general, theoretically justified mechanism for processing missing data with neural networks. Our idea is to replace the typical neuron response in the first hidden layer by its expected value. This approach can be applied to various types of networks at minimal cost in their modification. Moreover, in contrast to recent approaches, it does not require complete data for training. Experimental results on different types of architectures show that our method gives better results than typical imputation strategies and other methods dedicated to incomplete data.
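A minimal sketch of the "expected neuron response" idea for a single ReLU neuron follows. Here a missing attribute is modeled as a Gaussian (my simplifying assumption; the paper's density model is richer), so the pre-activation is Gaussian and its expected rectified value has a standard closed form.

```python
# Replace a first-layer ReLU response by its expected value when inputs are missing.
import numpy as np
from statistics import NormalDist

def expected_relu(mean, std):
    """E[max(0, Z)] for Z ~ N(mean, std^2); closed form via the normal pdf/cdf."""
    if std == 0.0:
        return max(0.0, mean)
    ratio = mean / std
    return mean * NormalDist().cdf(ratio) + std * NormalDist().pdf(ratio)

# Example: x = (1.0, missing), the missing attribute modeled as N(0.5, 1.0).
w, b = np.array([2.0, -1.0]), 0.1
pre_mean = w[0] * 1.0 + w[1] * 0.5 + b    # mean of w.x + b
pre_std = abs(w[1]) * 1.0                 # std coming from the missing coordinate
print(expected_relu(pre_mean, pre_std))   # expected neuron response
```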


Efficient mixture model for clustering of sparse high dimensional binary data

arXiv.org Machine Learning

In this paper we propose a mixture model, SparseMix, for clustering sparse high-dimensional binary data, which connects model-based with centroid-based clustering. Every group is described by a representative and a probability distribution modeling dispersion from this representative. In contrast to classical mixture models based on the EM algorithm, SparseMix: (i) is especially designed for processing sparse data, (ii) can be efficiently realized by an on-line Hartigan optimization algorithm, and (iii) is able to automatically reduce unnecessary clusters. We perform extensive experimental studies on various types of data, which confirm that SparseMix builds partitions more compatible with reference groupings than related methods. Moreover, the constructed representatives often better reveal the internal structure of the data.
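For readers unfamiliar with the on-line Hartigan scheme the abstract mentions, here is a generic single pass written for the ordinary squared-error cost (my own choice for brevity; the actual SparseMix cost for sparse binary data is not reproduced here). Each point is moved to the cluster whose total cost it lowers most, using incremental cost updates.

```python
import numpy as np

def hartigan_kmeans_pass(X, labels, k):
    """One on-line Hartigan pass: move each point to the cluster whose SSE it lowers most."""
    counts = np.array([(labels == c).sum() for c in range(k)], dtype=float)
    sums = np.array([X[labels == c].sum(axis=0) for c in range(k)])
    for i, x in enumerate(X):
        c_old = labels[i]
        if counts[c_old] <= 1:
            continue                                   # do not empty a cluster here
        mean_old = sums[c_old] / counts[c_old]
        # standard incremental SSE changes for removing/adding one point
        gain_remove = counts[c_old] / (counts[c_old] - 1) * np.sum((x - mean_old) ** 2)
        best_c, best_add = c_old, gain_remove          # staying put = zero net change
        for c in range(k):
            if c == c_old or counts[c] == 0:
                continue
            mean_c = sums[c] / counts[c]
            add = counts[c] / (counts[c] + 1) * np.sum((x - mean_c) ** 2)
            if add < best_add:
                best_c, best_add = c, add
        if best_c != c_old:
            counts[c_old] -= 1; sums[c_old] -= x
            counts[best_c] += 1; sums[best_c] += x
            labels[i] = best_c
    return labels

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
labels = hartigan_kmeans_pass(X, rng.integers(0, 3, size=200), k=3)
```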


Semi-supervised model-based clustering with controlled clusters leakage

arXiv.org Machine Learning

In this paper, we focus on finding clusters in partially categorized data sets. We propose a semi-supervised version of the Gaussian mixture model, called C3L, which retrieves natural subgroups of given categories. In contrast to other semi-supervised models, C3L is parametrized by a user-defined leakage level, which controls the maximal inconsistency between the initial categorization and the resulting clustering. Our method can be implemented as a module in practical expert systems to detect clusters that combine expert knowledge with the true distribution of the data. Moreover, it can be used to improve the results of less flexible clustering techniques, such as projection pursuit clustering. The paper presents an extensive theoretical analysis of the model and a fast algorithm for its efficient optimization. Experimental results show that C3L finds a high-quality clustering model, which can be applied to discovering meaningful groups in partially classified data.
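To make the notion of "inconsistency between the initial categorization and the resulting clustering" concrete, below is one simple surrogate measure of leakage. It is purely illustrative and my own simplification; it is not the definition used inside C3L.

```python
# Illustrative surrogate: fraction of points whose cluster is dominated by a
# different category than their own.
import numpy as np

def category_leakage(categories, clusters):
    categories, clusters = np.asarray(categories), np.asarray(clusters)
    majority = {}
    for c in np.unique(clusters):
        cats, counts = np.unique(categories[clusters == c], return_counts=True)
        majority[c] = cats[np.argmax(counts)]
    leaked = sum(majority[c] != cat for cat, c in zip(categories, clusters))
    return leaked / len(categories)

# Example: two labeled categories, a 3-cluster result.
cats = [0, 0, 0, 0, 1, 1, 1, 1]
clus = [0, 0, 1, 1, 1, 2, 2, 2]
print(category_leakage(cats, clus))  # 0.125: one category-1 point sits in a category-0 cluster
```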


Generalized RBF kernel for incomplete data

arXiv.org Machine Learning

We construct the $\bf genRBF$ kernel, which generalizes the classical Gaussian RBF kernel to the case of incomplete data. We model the uncertainty contained in missing attributes by making use of the data distribution and associate every point with a conditional probability density function. This allows us to embed incomplete data into a function space and to define a kernel between two missing data points based on the scalar product in $L_2$. Experiments show that the introduced kernel applied to the SVM classifier gives better results than other state-of-the-art methods, especially when a large number of features is missing. Moreover, it is easy to implement and can be used together with any kernel approach without additional modifications.
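The core ingredient can be sketched as follows: a point with missing coordinates is represented by a (conditional) Gaussian density, and two such densities are compared via their $L_2$ scalar product, for which the standard identity $\langle N(m_1,\Sigma_1), N(m_2,\Sigma_2)\rangle_{L_2} = N(m_1; m_2, \Sigma_1+\Sigma_2)$ holds. How the conditional densities and the final kernel normalization are actually built in genRBF is in the paper; the code below is only a worked example of the identity.

```python
import numpy as np

def gaussian_l2_product(m1, S1, m2, S2):
    """L2 scalar product of two Gaussian densities: N(m1; m2, S1 + S2)."""
    d = len(m1)
    S = S1 + S2
    diff = m1 - m2
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(S))
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(S, diff))

# A complete point is a Gaussian with (near-)zero covariance; a point with a
# missing second coordinate gets variance there from the data distribution.
x_complete = (np.array([1.0, 2.0]), np.diag([1e-6, 1e-6]))
x_missing = (np.array([1.0, 0.0]), np.diag([1e-6, 1.5]))
print(gaussian_l2_product(*x_complete, *x_missing))
```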


Introduction to Cross-Entropy Clustering: The R Package CEC

arXiv.org Machine Learning

The R Package CEC performs clustering based on the cross-entropy clustering (CEC) method, which was recently developed with the use of information theory. The main advantage of CEC is that it combines the speed and simplicity of $k$-means with the ability to use various Gaussian mixture models and to reduce unnecessary clusters. In this work we present a practical tutorial on CEC based on the R Package CEC. Functions are provided to encompass the whole process of clustering.
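For orientation only, the cost that cross-entropy clustering minimizes for Gaussian clusters can be written out as below (a Python illustration under my own simplifications; for real work the R package itself should be used). Clusters whose weight falls below a threshold can be removed during optimization, which is the cluster-reduction property mentioned above.

```python
import numpy as np

def cec_cost(X, labels):
    """Cross-entropy clustering cost for Gaussian clusters (illustrative)."""
    n, d = X.shape
    cost = 0.0
    for c in np.unique(labels):
        Xc = X[labels == c]
        p = len(Xc) / n                                      # cluster weight
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(d)    # regularized covariance
        gauss_entropy = 0.5 * d * np.log(2 * np.pi * np.e) + 0.5 * np.log(np.linalg.det(cov))
        cost += p * (-np.log(p) + gauss_entropy)             # weight * (code length + entropy)
    return cost
```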


Cluster based RBF Kernel for Support Vector Machines

arXiv.org Machine Learning

In classical Gaussian SVM classification we use a feature space projection transforming points into normal distributions with fixed covariance matrices (the identity in the standard RBF and the covariance of the whole dataset in the Mahalanobis RBF). In this paper we add information to the Gaussian SVM by considering a local, geometry-dependent feature space projection. We emphasize that our approach is in fact an algorithm for the construction of a new Gaussian-type kernel. We show that better classification results (compared to the standard RBF and the Mahalanobis RBF) are obtained in the simple case when the space is first divided by k-means into two sets and points are represented as normal distributions with covariances calculated according to the dataset partitioning. We call the constructed method C$_k$RBF, where $k$ stands for the number of clusters used in k-means. We show empirically on nine datasets from the UCI repository that C$_2$RBF increases the stability of the grid search (measured as the probability of finding good parameters).
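A minimal sketch of the construction idea: split the data with k-means, attach to each point the covariance of its cluster, and build a Gaussian-type kernel from the resulting normal distributions. The normalization details of the actual C$_k$RBF kernel may differ from this simplified version.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_rbf_kernel(X, k=2, seed=0):
    """Gaussian-type kernel with per-cluster covariances (illustrative)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    covs = [np.cov(X[km.labels_ == c], rowvar=False) + 1e-6 * np.eye(X.shape[1])
            for c in range(k)]
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            S = covs[km.labels_[i]] + covs[km.labels_[j]]
            diff = X[i] - X[j]
            K[i, j] = np.exp(-0.5 * diff @ np.linalg.solve(S, diff))
    return K

X = np.random.default_rng(0).normal(size=(60, 4))
K = cluster_rbf_kernel(X)   # can be passed to sklearn.svm.SVC(kernel="precomputed")
```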


Multithreshold Entropy Linear Classifier

arXiv.org Machine Learning

Linear classifiers separate the data with a hyperplane. In this paper we focus on a novel method for constructing a multithreshold linear classifier, which separates the data with multiple parallel hyperplanes. The proposed model is based on information theory concepts, namely Renyi's quadratic entropy and the Cauchy-Schwarz divergence. We begin with some general properties, including data scale invariance. Then we prove that our method is a multithreshold large margin classifier, which shows an analogy to the SVM while at the same time working with a much broader class of hypotheses. Interestingly, the proposed method is aimed at the maximization of a balanced quality measure (such as the Matthews Correlation Coefficient), as opposed to the very common maximization of accuracy. This feature comes directly from the optimization problem statement and is further confirmed by experiments on the UCI datasets. It appears that our Multithreshold Entropy Linear Classifier (MELC) obtains similar or higher scores than those given by the SVM on both synthetic and real data. We show how the proposed approach can be beneficial for cheminformatics in the task of ligand activity prediction, where, in addition to better classification results, MELC gives some insight into the data structure (classes of underrepresented chemical compounds).
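The objective the abstract describes can be illustrated as follows: project the two classes onto a direction, model each projection with a Gaussian kernel density, and score the direction by the Cauchy-Schwarz divergence between the two densities (which has a closed form for mixtures of Gaussians). The optimization over the direction and the exact regularization used by MELC are not reproduced here; this is a sketch of the scoring step only.

```python
import numpy as np

def _cross_potential(a, b, sigma):
    """Integral of the product of two Gaussian KDEs with bandwidth sigma (closed form)."""
    diff = a[:, None] - b[None, :]
    var = 2.0 * sigma ** 2
    return np.mean(np.exp(-diff ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var))

def cs_divergence_1d(a, b, sigma=0.5):
    """Cauchy-Schwarz divergence between the KDEs of two 1-D samples."""
    cross = _cross_potential(a, b, sigma)
    return -np.log(cross ** 2 / (_cross_potential(a, a, sigma) * _cross_potential(b, b, sigma)))

# Score a candidate direction w on toy two-class data.
rng = np.random.default_rng(0)
X_pos, X_neg = rng.normal(1.0, 1.0, (50, 2)), rng.normal(-1.0, 1.0, (50, 2))
w = np.array([1.0, 0.5]); w /= np.linalg.norm(w)
print(cs_divergence_1d(X_pos @ w, X_neg @ w))   # larger = better separated projections
```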