closure system
A note on the connectedness property of union-free generic sets of partial orders
Schollmeyer, Georg, Blocher, Hannah
This short note describes and proves a connectedness property which was introduced in Blocher et al. [2023] in the context of data depth functions for partial orders. The connectedness property gives a structural insight into union-free generic sets. These sets, presented in Blocher et al. [2023], are defined by using a closure operator on the set of all partial orders which naturally appears within the theory of formal concept analysis. In the language of formal concept analysis, the property of connectedness can be proven in an intuitive way. However, since we did not discuss formal concept analysis in Blocher et al. [2023], we moved the proof to this note.
Ordinal Motifs in Lattices
Hirth, Johannes, Horn, Viktoria, Stumme, Gerd, Hanika, Tom
The foundation of any formal analysis of data is the identification of unique and meaningful substructures and properties. The realm of ordinal structures, in particular lattices, is no exception to that. The field of Formal Concept Analysis (FCA), which derives lattices from data tables, called formal contexts, is already very well equipped with tools and notions for identifying and analyzing important substructures. One essential tool of FCA is to provide a user with a lattice diagram of meaningful size, which can be interpreted (or even explained). For obvious reasons, this approach defies any applicability to data sets as they are commonly used today, as the resulting lattices comprise thousands of elements. In addition, the lattice diagram itself, as the primary means of communication, presents a significant hurdle to interpretation for untrained users. Common approaches tackle the first problem by data set reductions within the data tables [10, 14] or within the resulting lattice structure [1, 2, 9, 15].
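As a minimal sketch of how FCA derives a lattice from a data table, the following enumerates all formal concepts (extent/intent pairs) of a tiny formal context by brute force. The context `CONTEXT`, its object and attribute names, and the helper names are hypothetical and chosen only for illustration; the naive enumeration is exponential in the number of objects, which also illustrates why lattices of realistic data sets quickly become too large to inspect.

```python
from itertools import combinations

# Hypothetical toy formal context: object -> set of attributes it has.
CONTEXT = {
    "duck": {"swims", "flies"},
    "carp": {"swims"},
    "eagle": {"flies"},
}


def intent(context, objects, all_attrs):
    """Attributes shared by all given objects (all attributes if none given)."""
    shared = set(all_attrs)
    for g in objects:
        shared &= context[g]
    return frozenset(shared)


def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(g for g, row in context.items() if attrs <= row)


def concepts(context):
    """Brute-force enumeration of all formal concepts (extent, intent)."""
    all_attrs = frozenset().union(*context.values())
    objects = list(context)
    found = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(context, subset, all_attrs)  # derive the common attributes
            a = extent(context, b)                  # close the object set
            found.add((a, b))
    return found


ALL_CONCEPTS = concepts(CONTEXT)
```

Ordering the pairs in `ALL_CONCEPTS` by inclusion of their extents gives the concept lattice of the toy context.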
Quantifying the Conceptual Error in Dimensionality Reduction
Dimension reduction of data sets is a standard problem in the realm of machine learning and knowledge reasoning. Such reductions affect patterns in and dependencies on data dimensions and ultimately influence any decision-making process. Therefore, a wide variety of reduction procedures are in use, each pursuing different objectives. A criterion not considered so far is the conceptual continuity of the reduction mapping, i.e., the preservation of the conceptual structure with respect to the original data set. Based on the notion of scale-measures from formal concept analysis, we present in this work a) the theoretical foundations to detect and quantify conceptual errors in data scalings; b) an experimental investigation of our approach on eleven data sets that were each treated with a variant of non-negative matrix factorization.
Exploring Scale-Measures of Data Sets
An inevitable step of any data-based knowledge discovery process is measurement [24] and the associated (explicit or implicit) scaling of the data [27]. The latter is particularly constrained by the underlying mathematical formulation of the data representation, e.g., real-valued vector spaces or weighted graphs, the requirements of the data procedures, e.g., the presence of a distance function, and, more recently, the need for human understanding of the results. Considering the scaling of data as part of the analysis itself, in particular formalizing it and thus making it controllable, is a salient feature of formal concept analysis (FCA) [7]. This field of research has spawned a variety of specialized scaling methods, such as logical scaling [25], and in the form of scale-measures links the scaling process with the study of continuous mappings between closure systems. Recent results by the authors [13] revealed that the set of all scale-measures for a given data set constitutes a lattice. Furthermore, it was shown that any scale-measure can be expressed in simple propositional terms using disjunction, conjunction and negation. Among other things, the previous results allow a computational transition between different scale-measures, which we may call scale-measure navigation, as well as their interpretability by humans.
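The continuity condition behind scale-measures can be illustrated with a short sketch. It assumes the standard FCA definition, which is not spelled out in the abstract above: a map sigma from the objects of a context to the objects of a scale context is a scale-measure if the preimage of every extent of the scale is an extent of the original context. The two toy contexts, the maps, and all function names below are hypothetical.

```python
from itertools import combinations


def extents(context):
    """All extents (closed object sets) of a context given as object -> attribute set."""
    objects = list(context)
    all_attrs = frozenset().union(*context.values())
    result = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = set(all_attrs)
            for g in subset:
                shared &= context[g]
            result.add(frozenset(g for g in objects if shared <= context[g]))
    return result


def is_scale_measure(sigma, context, scale):
    """Check that preimages of scale extents under sigma are extents of context."""
    context_extents = extents(context)
    for scale_extent in extents(scale):
        preimage = frozenset(g for g in context if sigma[g] in scale_extent)
        if preimage not in context_extents:
            return False
    return True


# Hypothetical data context and scale context.
DATA = {"a": {"x"}, "b": {"x", "y"}, "c": {"y"}}
SCALE = {0: {"p"}, 1: {"p", "q"}}

GOOD_MAP = {"a": 0, "b": 1, "c": 0}  # every preimage is an extent of DATA
BAD_MAP = {"a": 1, "b": 0, "c": 0}   # preimage {"a"} is not an extent of DATA
```

Here `is_scale_measure(GOOD_MAP, DATA, SCALE)` holds while `is_scale_measure(BAD_MAP, DATA, SCALE)` does not, since the single-object set containing only `a` is not closed in `DATA`.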
On the Lattice of Conceptual Measurements
Beyond that, almost every data set is further scaled prior to (data) processing to meet the requirements of the employed data analysis method, such as the introduction of artificial metrics, the numerical representation of nominal features, etc. This scaling is usually accompanied by a grade of detail, which in turn is becoming more and more of a problem for data science tasks as the availability of features increases and their human explainability decreases. Frequently used methods from the field of machine learning for dealing with this problem, such as principal component analysis, enforce particular, possibly inapt, levels of measurement, e.g., food tastes represented by real numbers, and amplify the problem for explainability. Therefore, understanding the set of possible scaling maps, identifying its (algebraic) properties, and deriving to some extent human explainable control over it, is a pressing problem. This is especially important since found patterns and dependencies may be artifacts of some scaling map and may therefore corrupt any subsequent task, e.g., classification tasks.
Maximal Closed Set and Half-Space Separations in Finite Closure Systems
Seiffarth, Florian, Horvath, Tamas, Wrobel, Stefan
We investigate some algorithmic properties of closed set and half-space separation in abstract closure systems. Assuming that the underlying closure system is finite and given by the corresponding closure operator, we show that the half-space separation problem is NP-complete. In contrast, for the relaxed problem of maximal closed set separation we give a greedy algorithm using a linear number of queries (i.e., closure operator calls) and show that this bound is sharp. As a second direction to overcome the negative result above, we consider Kakutani closure systems and prove that they are algorithmically characterized by the greedy algorithm. As one of the major potential application fields, we then focus on Kakutani closure systems over graphs and generalize a fundamental characterization result based on the Pasch axiom to graph-structured partitioning of finite sets. In addition, we give a sufficient condition for Kakutani closure systems over graphs in terms of graph minors. As a second application field, we consider closure systems over finite lattices, present an adaptation of the generic greedy algorithm to this kind of closure system, and consider two potential applications. We show that for the special case of subset lattices over finite ground sets, e.g., for formal concept lattices, its query complexity is only logarithmic in the size of the lattice. The second application is concerned with finite subsumption lattices in inductive logic programming. We show that our method for separating two sets of first-order clauses from each other extends the traditional approach based on least general generalizations of first-order clauses. Though our primary focus is on the generality of the results obtained, we experimentally demonstrate the practical usefulness of the greedy algorithm on binary classification problems in Kakutani and non-Kakutani closure systems.
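The idea of greedy maximal closed set separation with a linear number of closure operator calls can be sketched as follows. This is an illustrative reading of the abstract, not the authors' exact procedure: starting from the closures of the two input sets, each remaining element is offered first to one side and then to the other, and is added only if the enlarged closed set stays disjoint from the opposite side. The toy closure operator `interval_closure` (integer intervals) and all names are assumptions made for the example.

```python
def interval_closure(points):
    """Toy closure operator: all integers between the min and max of the set."""
    if not points:
        return frozenset()
    return frozenset(range(min(points), max(points) + 1))


def greedy_separation(ground_set, closure, a, b):
    """Greedily grow two disjoint closed sets containing a and b.

    Uses at most two closure calls per ground-set element (plus two initial
    calls), so the number of closure operator queries is linear.
    Returns None if the closures of a and b already intersect.
    """
    h1, h2 = closure(a), closure(b)
    if h1 & h2:
        return None
    for e in sorted(ground_set):
        if e in h1 or e in h2:
            continue  # already covered by one of the two closed sets
        grown = closure(h1 | {e})
        if not (grown & h2):
            h1 = grown  # e can be added to the first side
            continue
        grown = closure(h2 | {e})
        if not (grown & h1):
            h2 = grown  # e can be added to the second side
    return h1, h2


# Separate {1, 2} from {7} on the ground set {0, ..., 9}.
H1, H2 = greedy_separation(range(10), interval_closure, {1, 2}, {7})
```

On this toy instance the two returned sets partition the ground set, so they form a half-space separation; in general the greedy result is only guaranteed to be a pair of disjoint closed sets that this procedure cannot extend further.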
Description of sup- and inf-preserving aggregation functions via families of clusters in data tables
Halaš, Radomír, Mesiar, Radko, Pócs, Jozef
The connection between the theory of aggregation functions and formal concept analysis is discussed and studied, thus filling a gap in the literature by building a bridge between these two theories, one of them living in the world of data fusion, the other in the area of data mining. We show how Galois connections can be used to describe an important class of aggregation functions preserving suprema, and, by duality, to describe aggregation functions preserving infima. Our method gives an elegant and complete description of these classes. Possible applications of our results within certain biclustering fuzzy FCA-based methods are also discussed.
Towards Collaborative Conceptual Exploration
In domains with high knowledge distribution a natural objective is to create principled foundations for collaborative interactive learning environments. We present a first mathematical characterization of a collaborative learning group, a consortium, based on closure systems of attribute sets and the well-known attribute exploration algorithm from formal concept analysis. To this end, we introduce (weak) local experts for subdomains of a given knowledge domain. These entities are able to refute and potentially accept a given (implicational) query for some closure system that is a restriction of the whole domain. On this we build up a consortial expert and show first insights about the ability of such an expert to answer queries. Furthermore, we describe techniques for coping with falsely accepted implications and for combining counterexamples. Using notions from combinatorial design theory we further expand those insights as far as providing first results on the decidability problem of whether a given consortium is able to explore some target domain. Applications in conceptual knowledge acquisition as well as in collaborative interactive ontology learning are at hand.
Logic of temporal attribute implications
We study a logic for reasoning with if-then formulas describing dependencies between attributes of objects which are observed in consecutive points in time. We introduce semantic entailment of the formulas, show its fixed-point characterization, investigate closure properties of model classes, present an axiomatization and prove its completeness, and investigate alternative axiomatizations and normalized proofs. We investigate decidability and complexity issues of the logic and prove that the entailment problem is NP-hard and belongs to EXPSPACE. We show that by restricting to predictive formulas, the entailment problem is decidable in pseudo-linear time.