Collaborating Authors

Kuznetsov, Sergei O.


Formal concept analysis for evaluating intrinsic dimension of a natural language

arXiv.org Artificial Intelligence

Results of a computational experiment for determining the intrinsic dimension of linguistic varieties of the Bengali and Russian languages are presented. Sets of words and sets of bigrams in these languages were considered separately. The method used to solve this problem is based on formal concept analysis algorithms. The intrinsic dimensions of these languages were found to be significantly lower than the dimensions used in popular neural network models for natural language processing.
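
The abstract does not spell out the algorithms, so here is a minimal Python sketch of the derivation (prime) and closure operators at the heart of any FCA-based analysis, on a hypothetical toy context of words described by the bigrams they contain; the paper's actual contexts and dimension estimator are not reproduced here.

    def shared_attributes(objs, context):
        """Attributes common to all objects in objs (the ' operator)."""
        return set.intersection(*(context[o] for o in objs))

    def closure(objs, context):
        """All objects that have every attribute shared by objs (the '' operator)."""
        shared = shared_attributes(objs, context)
        return {o for o, attrs in context.items() if shared <= attrs}

    # Hypothetical toy context: words described by the bigrams they contain.
    context = {
        "cat": {"ca", "at"},
        "car": {"ca", "ar"},
        "bat": {"ba", "at"},
    }
    print(closure({"cat"}, context))         # {'cat'}
    print(closure({"cat", "car"}, context))  # {'cat', 'car'}: both share 'ca'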


Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets

arXiv.org Artificial Intelligence

Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work on numerical pattern mining, and this research area remains under-explored. In this paper, we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliable framework widely used in pattern mining, as well as in subgroup discovery. In Mint we reuse MDL for discovering useful patterns and returning a set of non-redundant overlapping patterns with well-defined boundaries that cover meaningful groups of objects. Mint is not alone in the category of numerical pattern miners based on MDL. In the experiments presented in the paper we show that Mint outperforms its competitors, among which are Slim and RealKrimp.
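
As an illustration of the MDL principle invoked here (not Mint's actual encoding), the sketch below scores a set of 1-D interval patterns by a two-part code length L(model) + L(data | model); the flat 8-bit per-pattern cost is an arbitrary assumption.

    import math

    # A pattern is a 1-D interval (lo, hi); a point inside an interval of
    # width w costs log2(w) bits (uniform code over the covered values),
    # and each pattern costs an assumed flat 8 bits to describe.
    PATTERN_COST = 8.0

    def description_length(patterns, points):
        total = PATTERN_COST * len(patterns)
        for x in points:
            # charge the cheapest (tightest) interval covering the point;
            # every point is assumed to be covered by some pattern
            total += math.log2(min(hi - lo + 1 for lo, hi in patterns
                                   if lo <= x <= hi))
        return total

    points = [1, 2, 3, 50, 51, 52]
    print(description_length([(1, 52)], points))           # one loose pattern
    print(description_length([(1, 3), (50, 52)], points))  # two tight ones win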


Discovering data topology with the closure structure: theoretical and practical aspects

arXiv.org Artificial Intelligence

In this paper, we revisit pattern mining, and especially itemset mining, which allows one to analyze binary datasets in search of interesting and meaningful association rules and respective itemsets in an unsupervised way. Since a summarization of a dataset based on a set of patterns does not provide a general and satisfying view of the dataset, we introduce a concise representation --the closure structure-- based on closed itemsets and their minimum generators, for capturing the intrinsic content of a dataset. The closure structure allows one to understand the topology of the dataset as a whole as well as the inherent complexity of the data. We propose a formalization of the closure structure in terms of Formal Concept Analysis, which is well adapted to the study of this data topology. We present and demonstrate theoretical results, as well as practical results obtained with the GDPM algorithm. GDPM is rather unique in its functionality, as it returns a characterization of the topology of a dataset in terms of complexity levels, highlighting the diversity and the distribution of the itemsets. Finally, a series of experiments shows how GDPM can be used in practice and what can be expected from its output.
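
A brute-force Python sketch (not the GDPM algorithm itself) of the two notions the closure structure is built from, closed itemsets and their minimum generators, on an invented three-transaction dataset; the sizes of the minimum generators are the raw material of the complexity levels mentioned above.

    from itertools import combinations

    # Toy transaction dataset (invented).
    data = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]

    def closure(itemset):
        """Intersection of all transactions containing itemset."""
        covers = [t for t in data if itemset <= t]
        return set.intersection(*covers) if covers else set()

    items = set().union(*data)
    closed = {frozenset(closure(set(c)))
              for r in range(len(items) + 1)
              for c in combinations(sorted(items), r)}

    for c in sorted(closed, key=len):
        # minimum generators: the smallest itemsets whose closure is c
        size, gens = 0, []
        while not gens:
            gens = [set(g) for g in combinations(sorted(c), size)
                    if closure(set(g)) == set(c)]
            size += 1
        print(sorted(c), "<- minimum generators:", gens)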


Ordered Sets for Data Analysis

arXiv.org Artificial Intelligence

This book dwells on mathematical and algorithmic issues of data analysis based on the generality order of descriptions and the respective precision. To speak of these topics correctly, we have to go some way in getting acquainted with the important notions of relation and order theory. On the one hand, data often have a complex structure with a natural order on it. On the other hand, many symbolic methods of data analysis and machine learning allow one to compare the obtained classifiers w.r.t. their generality, which is also an order relation. Efficient algorithms are very important in data analysis, especially when one deals with big data, so scalability is a real issue. That is why we analyze the computational complexity of the algorithms and problems of data analysis. We start from the basic definitions and facts of algorithmic complexity theory and analyze the complexity of the various tools of data analysis we consider. The tools and methods of data analysis, like computing taxonomies, groups of similar objects (concepts and n-clusters), dependencies in data, classification, etc., are illustrated with applications in particular subject domains, from chemoinformatics to text mining and natural language processing.
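
For instance, the generality order on conjunctive descriptions mentioned above is just set inclusion of the required conditions, as in this tiny sketch (the attribute names are invented):

    # A conjunctive description is modeled as the set of attributes it
    # requires; it is at least as general iff it requires no extra conditions.
    def at_least_as_general(d1, d2):
        return d1 <= d2

    rule_a = {"fever"}           # matches more objects: more general
    rule_b = {"fever", "cough"}
    print(at_least_as_general(rule_a, rule_b))  # True
    print(at_least_as_general(rule_b, rule_a))  # False: a partial order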


On interestingness measures of formal concepts

arXiv.org Artificial Intelligence

Formal concepts and closed itemsets have proved to be of great importance for knowledge discovery, both as a tool for the concise representation of association rules and as a tool for clustering and constructing domain taxonomies and ontologies. Since the exponential explosion makes it difficult to consider the whole concept lattice arising from data, one needs to select the most useful and interesting concepts. In this paper, interestingness measures of concepts are considered and compared with respect to various aspects, such as efficiency of computation and applicability to noisy data, as well as by rank correlation between the rankings they induce.
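
One measure usually considered in this line of work is concept stability; the brute-force Python sketch below computes it by its textbook definition (the fraction of subsets of the extent whose derivation yields exactly the intent) on an invented toy context. The computation is exponential in the extent size, which is why real implementations estimate it.

    from itertools import combinations

    def intent(objs, context, all_attrs):
        """Attributes shared by all objects in objs; all attributes if empty."""
        return (set.intersection(*(context[o] for o in objs))
                if objs else set(all_attrs))

    def stability(extent, concept_intent, context, all_attrs):
        hits = sum(1 for r in range(len(extent) + 1)
                   for subset in combinations(sorted(extent), r)
                   if intent(set(subset), context, all_attrs) == concept_intent)
        return hits / 2 ** len(extent)

    # Invented toy context.
    context = {"g1": {"m1", "m2"}, "g2": {"m1", "m2"}, "g3": {"m1", "m3"}}
    attrs = {"m1", "m2", "m3"}
    print(stability({"g1", "g2"}, {"m1", "m2"}, context, attrs))  # 0.75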


Concept Stability for Constructing Taxonomies of Web-site Users

arXiv.org Artificial Intelligence

Owners of a web-site are often interested in the analysis of groups of users of their site. Information on these groups can help optimize the structure and contents of the site. In this paper we use an approach based on formal concepts for constructing taxonomies of user groups. To reduce the huge number of concepts that arise in applications, we employ the stability index of a concept, which describes how much the group given by a concept extent differs from other such groups. We analyze the resulting taxonomies of user groups for three target web-sites.
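
A hypothetical sketch of the pruning step this approach relies on: keep only concepts whose stability clears a threshold, then order the survivors by extent inclusion to read them as a taxonomy. The stability values below are made up; the previous sketch shows how such values can be computed.

    # Invented (extent, intent, stability) triples for user groups.
    concepts = [
        ({"u1", "u2", "u3"}, {"news"},            0.81),
        ({"u1", "u2"},       {"news", "sport"},   0.35),
        ({"u2", "u3"},       {"news", "weather"}, 0.74),
    ]

    stable = [(e, i) for e, i, s in concepts if s >= 0.5]
    # Larger extents first, so the listing reads top-down as a taxonomy.
    stable.sort(key=lambda c: -len(c[0]))
    for extent, intent in stable:
        print(sorted(extent), sorted(intent))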


Concept Relation Discovery and Innovation Enabling Technology (CORDIET)

arXiv.org Artificial Intelligence

Concept Relation Discovery and Innovation Enabling Technology (CORDIET) is a toolbox for gaining new knowledge from unstructured text data. At the core of CORDIET is C-K theory, which captures the essential elements of innovation. The tool uses Formal Concept Analysis (FCA), Emergent Self-Organizing Maps (ESOM), and Hidden Markov Models (HMM) as the main artifacts in the analysis process. The user can define temporal, text mining, and compound attributes. The text mining attributes are used to analyze the unstructured text in documents, while the temporal attributes use the documents' timestamps for analysis. The compound attributes are XML rules based on text mining and temporal attributes. The user can cluster objects with object-cluster rules and can chop the data into pieces with segmentation rules. The artifacts are optimized for efficient data analysis; object labels in the FCA lattice and the ESOM map contain a URL on which the user can click to open the selected document.


Revisiting Numerical Pattern Mining with Formal Concept Analysis

arXiv.org Artificial Intelligence

In this paper, we investigate the problem of mining numerical data in the framework of Formal Concept Analysis. The usual way is to use a scaling procedure --transforming numerical attributes into binary ones-- leading to a loss either of information or of efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to work directly on numerical data in a more precise and efficient way, and we prove it. To that end, the notions of closed patterns, generators, and equivalence classes are revisited in the numerical context. Moreover, two original algorithms are proposed and used in an evaluation involving real-world data, showing the advantages of the present approach.
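
The key object when working on numbers directly is the interval pattern: the description of a set of objects is the component-wise [min, max] box over their values, with no binary scaling step. A minimal sketch on invented data (the paper's two algorithms are not reproduced here):

    # Invented numerical object/attribute table.
    rows = {
        "o1": (5.0, 7.0),
        "o2": (6.0, 8.0),
        "o3": (4.0, 8.5),
    }

    def interval_pattern(objects):
        """Tightest box containing the chosen rows (the pattern 'meet')."""
        cols = list(zip(*(rows[o] for o in objects)))
        return [(min(c), max(c)) for c in cols]

    def extent(pattern):
        """All objects whose values fall inside the box (closure check)."""
        return {o for o, vals in rows.items()
                if all(lo <= v <= hi for v, (lo, hi) in zip(vals, pattern))}

    p = interval_pattern({"o1", "o2"})
    print(p)          # [(5.0, 6.0), (7.0, 8.0)]
    print(extent(p))  # {'o1', 'o2'} -> this interval pattern is closed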


Mining Biclusters of Similar Values with Triadic Concept Analysis

arXiv.org Artificial Intelligence

Biclustering numerical data became a popular data-mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal sub-tables with close values. Only a few methods address a complete, correct, and non-redundant enumeration of such patterns, which is a well-known intractable problem, and no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we show that Triadic Concept Analysis (TCA) provides a nice mathematical framework for biclustering. Interestingly, existing TCA algorithms, which usually apply to binary data, can be used (directly or with slight modifications) after a preprocessing step to extract maximal biclusters of similar values.
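
Concretely, a bicluster of similar values is a sub-table whose entries all lie within a tolerance theta of each other; checking one candidate is easy, and the paper's contribution is the complete enumeration of maximal ones via TCA. A toy verification sketch on invented data:

    # Invented numerical table indexed by (row, column).
    table = {
        ("g1", "a"): 1.0, ("g1", "b"): 1.1, ("g1", "c"): 5.0,
        ("g2", "a"): 0.9, ("g2", "b"): 1.0, ("g2", "c"): 4.8,
    }

    def is_similar_bicluster(rows, cols, theta):
        """True iff all entries of the sub-table lie within theta of each other."""
        vals = [table[(r, c)] for r in rows for c in cols]
        return max(vals) - min(vals) <= theta

    print(is_similar_bicluster({"g1", "g2"}, {"a", "b"}, theta=0.5))  # True
    print(is_similar_bicluster({"g1", "g2"}, {"b", "c"}, theta=0.5))  # False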


Concept-based Recommendations for Internet Advertisement

arXiv.org Artificial Intelligence

The problem of detecting terms that can be interesting to an advertiser is considered. If a company has already bought some advertising terms which describe certain services, it is reasonable to find out the terms bought by competing companies. Some of them can be recommended as future advertising terms to the company. The goal of this work is to propose more interpretable recommendations based on FCA and association rules.
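
A minimal sketch, on invented data, of the simplest recommendation idea described here: terms bought by companies that share enough terms with the target company become candidates. The paper refines this baseline with FCA concepts and association rules.

    # Invented company -> purchased advertising terms data.
    bought = {
        "target": {"flowers", "gifts"},
        "rival1": {"flowers", "gifts", "bouquets"},
        "rival2": {"flowers", "chocolates"},
    }

    def recommend(company, min_overlap=2):
        """Terms of sufficiently similar competitors, minus our own terms."""
        mine = bought[company]
        candidates = set()
        for other, terms in bought.items():
            if other != company and len(terms & mine) >= min_overlap:
                candidates |= terms - mine
        return candidates

    print(recommend("target"))  # {'bouquets'}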