Demand Prediction Using Machine Learning Methods and Stacked Generalization

arXiv.org Artificial Intelligence

Supply and demand are two fundamental concepts linking sellers and customers. Predicting demand accurately is critical for organizations in order to be able to make plans. In this paper, we propose a new approach for demand prediction on an e-commerce web site. The proposed model differs from earlier models in several ways. The e-commerce web site for which the model is implemented operates a marketplace business model, in which many sellers sell the same product at the same time at different prices. Demand prediction for such a model should therefore consider the prices of the same product offered by competing sellers, along with the features of these sellers. In this study, we first applied different regression algorithms to a specific set of products from one department of a company that is one of the most popular online e-commerce companies in Turkey. Then we used stacked generalization, also known as stacking ensemble learning, to predict demand. Finally, all the approaches were evaluated on a real-world data set obtained from the e-commerce company. The experimental results show that some of the machine learning methods produce results almost as good as the stacked generalization method.
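
The abstract does not name the base learners, meta-learner, or features used; the sketch below is only a minimal illustration of stacked generalization for demand regression with scikit-learn, where the feature set (own price, cheapest competing price, seller rating) and the choice of learners are assumptions rather than the authors' configuration.

```python
# Minimal sketch of stacked generalization (stacking) for demand regression.
# Feature names and base learners are illustrative assumptions, not the
# authors' actual setup.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.uniform(10, 100, n),   # own price
    rng.uniform(10, 100, n),   # cheapest competing seller's price
    rng.uniform(1, 5, n),      # seller rating
])
# Synthetic demand: falls with own price, rises when competitors are pricier.
y = 200 - 1.5 * X[:, 0] + 1.0 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Level-0 learners; their out-of-fold predictions feed a level-1 meta-learner.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=Ridge(),
    cv=5,
)
stack.fit(X_tr, y_tr)
print("MAE:", mean_absolute_error(y_te, stack.predict(X_te)))
```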


A Review of Visual Descriptors and Classification Techniques Used in Leaf Species Identification

arXiv.org Artificial Intelligence

Plants are fundamentally important to life. Key research areas in plant science include plant species identification, weed classification using hyperspectral images, monitoring plant health and tracing leaf growth, and the semantic interpretation of leaf information. Botanists easily identify plant species by discriminating between the shape of the leaf, tip, base, leaf margin and leaf veins, as well as the texture of the leaf and the arrangement of leaflets of compound leaves. Because of the increasing demand for experts and calls for biodiversity, there is a need for intelligent systems that recognize and characterize leaves so as to scrutinize a particular species, the diseases that affect it, the pattern of leaf growth, and so on. We review several image processing methods used in the feature extraction of leaves, given that feature extraction is a crucial technique in computer vision. As computers cannot comprehend images directly, images must be converted into features by individually analysing shapes, colours, textures and moments. Images that look the same may deviate in terms of geometric and photometric variations. In our study, we also discuss certain machine learning classifiers for the analysis of different species of leaves.
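
As a hedged illustration of the feature-extraction step discussed above, the sketch below computes a few common shape descriptors (Hu moments, aspect ratio, solidity) with OpenCV and feeds them to an SVM. The synthetic elliptical masks, the particular descriptors and the classifier are assumptions standing in for real leaf images and the full range of descriptors covered by the review.

```python
# Minimal sketch: shape descriptors from a leaf binary mask, then an SVM.
# Real pipelines would also use colour, texture and vein features; the
# synthetic elliptical "leaves" below are placeholders for real images.
import numpy as np
import cv2
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def shape_features(mask):
    """Hu moments + aspect ratio + solidity from a binary leaf mask."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)
    hu = cv2.HuMoments(cv2.moments(c)).flatten()
    hu = -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)   # log-scale for stability
    x, y, w, h = cv2.boundingRect(c)
    aspect = w / h
    solidity = cv2.contourArea(c) / cv2.contourArea(cv2.convexHull(c))
    return np.concatenate([hu, [aspect, solidity]])

def synthetic_leaf(elongation, rng):
    """Draw a filled ellipse as a stand-in for a segmented leaf."""
    img = np.zeros((128, 128), dtype=np.uint8)
    axes = (int(15 + 5 * rng.random()), int(elongation * (15 + 5 * rng.random())))
    cv2.ellipse(img, (64, 64), axes, rng.uniform(0, 180), 0, 360, 255, -1)
    return img

rng = np.random.default_rng(0)
X = [shape_features(synthetic_leaf(e, rng)) for e in [1.0] * 30 + [2.5] * 30]
y = [0] * 30 + [1] * 30   # two mock "species" differing in elongation
print("CV accuracy:", cross_val_score(SVC(), np.array(X), y, cv=5).mean())
```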


Handling of uncertainty in medical data using machine learning and probability theory techniques: A review of 30 years (1991-2020)

arXiv.org Artificial Intelligence

Understanding data and reaching valid conclusions are of paramount importance in the present era of big data. Machine learning and probability theory methods have widespread application for this purpose in different fields. One critically important yet less explored aspect is how data and model uncertainties are captured and analyzed. Proper quantification of uncertainty provides valuable information for optimal decision making. This paper reviews related studies conducted over the last 30 years (from 1991 to 2020) on handling uncertainty in medical data using probability theory and machine learning techniques. Medical data is particularly prone to uncertainty due to the presence of noise, so it is very important to have clean, noise-free medical data to obtain an accurate diagnosis; the sources of noise in the medical data need to be known to address this issue. Based on the medical data obtained by the physician, a diagnosis is made and a treatment plan is prescribed. Hence, uncertainty is growing in healthcare, and there is limited knowledge of how to address these problems; we also know little about optimal treatment methods, as there are many sources of uncertainty in medical science. Our findings indicate that there are a few challenges to be addressed in handling uncertainty in raw medical data and in new models. In this work, we summarize various methods employed to overcome this problem. Nowadays, the application of novel deep learning techniques to deal with such uncertainties has significantly increased.
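
The review itself prescribes no single algorithm; purely as an illustration of one common way to expose model uncertainty, the sketch below trains a small bootstrap ensemble on a synthetic diagnosis task and uses the spread of its predicted probabilities as a rough uncertainty estimate. The dataset, model, and ensemble size are all assumptions.

```python
# Illustrative sketch (not from the review): quantify predictive uncertainty
# on a synthetic diagnosis task with a bootstrap ensemble; the spread of the
# ensemble's predicted probabilities is a simple proxy for model uncertainty.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

X, y = make_classification(n_samples=500, n_features=8, flip_y=0.1, random_state=0)
X_train, y_train, X_new = X[:400], y[:400], X[400:]

probs = []
for seed in range(25):
    Xb, yb = resample(X_train, y_train, random_state=seed)   # bootstrap replicate
    probs.append(LogisticRegression(max_iter=1000).fit(Xb, yb).predict_proba(X_new)[:, 1])
probs = np.array(probs)

mean_p = probs.mean(axis=0)        # point estimate of P(disease)
std_p = probs.std(axis=0)          # disagreement ~ model (epistemic) uncertainty
for p, s in list(zip(mean_p, std_p))[:5]:
    print(f"P(disease) = {p:.2f} +/- {s:.2f}")
```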


Distributed Linguistic Representations in Decision Making: Taxonomy, Key Elements and Applications, and Challenges in Data Science and Explainable Artificial Intelligence

arXiv.org Artificial Intelligence

Distributed linguistic representations are powerful tools for modelling the uncertainty and complexity of preference information in linguistic decision making. To provide a comprehensive perspective on the development of distributed linguistic representations in decision making, we present a taxonomy of existing distributed linguistic representations. Then, we review the key elements of distributed linguistic information processing in decision making, including distance measurement, aggregation methods, distributed linguistic preference relations, and distributed linguistic multiple attribute decision making models. Next, we discuss ongoing challenges and future research directions from the perspective of data science and explainable artificial intelligence.
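
Formulations vary across the surveyed models; one common form represents an assessment as a distribution of belief over an ordered linguistic term set. The sketch below shows weighted aggregation of such distributed assessments and a simple distance between them; the term set, weights, and the Euclidean distance are illustrative assumptions rather than any specific model from the survey.

```python
# Hedged sketch: a distributed linguistic assessment as a distribution over a
# linguistic term set, with weighted aggregation and a simple distance.
# The term set, weights and distance choice are illustrative assumptions.
import numpy as np

TERMS = ["very poor", "poor", "fair", "good", "very good"]   # ordered term set

def aggregate(assessments, weights):
    """Weighted average of distributions over TERMS (each row sums to 1)."""
    assessments = np.asarray(assessments, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ assessments / weights.sum()

def distance(a, b):
    """Euclidean distance between two distributed assessments."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

# Three decision makers assess one alternative over TERMS.
dm = [
    [0.0, 0.1, 0.4, 0.4, 0.1],
    [0.0, 0.0, 0.2, 0.6, 0.2],
    [0.1, 0.2, 0.5, 0.2, 0.0],
]
collective = aggregate(dm, weights=[0.5, 0.3, 0.2])
print("collective:", dict(zip(TERMS, collective.round(2))))
print("distance DM1 vs DM3:", round(distance(dm[0], dm[2]), 3))
```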


Fuzzy OWL-BOOST: Learning Fuzzy Concept Inclusions via Real-Valued Boosting

arXiv.org Artificial Intelligence

OWL ontologies are nowadays a quite popular way to describe structured knowledge in terms of classes, relations among classes and class instances. In this paper, given a target class T of an OWL ontology, we address the problem of learning fuzzy concept inclusion axioms that describe sufficient conditions for being an individual instance of T. To do so, we present Fuzzy OWL-BOOST, which relies on the Real AdaBoost boosting algorithm adapted to the (fuzzy) OWL case. We illustrate its effectiveness by means of an experimental evaluation. An interesting feature is that the learned rules can be represented directly in Fuzzy OWL 2. As a consequence, any Fuzzy OWL 2 reasoner can then be used to automatically determine whether (and to what degree) an individual belongs to the target class T.
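
The sketch below only illustrates the underlying Real AdaBoost loop (real-valued weak hypotheses derived from class-probability estimates, with multiplicative weight updates); the fuzzy concept inclusions, OWL reasoning and the adaptation presented in the paper are not reproduced, and the use of decision stumps and a synthetic dataset is an assumption.

```python
# Sketch of Real AdaBoost (the base algorithm Fuzzy OWL-BOOST adapts), with
# shallow trees as probability-emitting weak learners. The fuzzy-OWL specific
# parts (concept inclusions, fuzzy memberships) are not reproduced here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=400, n_features=6, random_state=0)
y = 2 * y01 - 1                      # labels in {-1, +1}
w = np.full(len(y), 1.0 / len(y))    # instance weights
F = np.zeros(len(y))                 # additive real-valued score
eps = 1e-6

for m in range(20):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    p = np.clip(stump.predict_proba(X)[:, 1], eps, 1 - eps)   # P_w(y=+1 | x)
    f = 0.5 * np.log(p / (1 - p))    # real-valued weak hypothesis
    F += f
    w *= np.exp(-y * f)              # re-weight: mistakes gain weight
    w /= w.sum()

print("training accuracy:", (np.sign(F) == y).mean())
```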


Diagnosis of Coronary Artery Disease Using Artificial Intelligence Based Decision Support System

arXiv.org Artificial Intelligence

This research concerns the development of an evidence-based fuzzy decision support system for the diagnosis of coronary artery disease. The coronary artery disease data sets from the University of California, Irvine (UCI) are used. The knowledge base of the fuzzy decision support system is obtained using a rule extraction method based on Rough Set Theory. The rules are then selected and fuzzified based on information from the discretization of numerical attributes. Fuzzy rule weights are proposed using the support of the extracted rules. UCI heart disease data sets collected from the U.S., Switzerland and Hungary, together with data from Ipoh Specialist Hospital, Malaysia, are used to verify the proposed system. The results show that the system is able to give the percentage of coronary artery blockage better than cardiologists and angiography. The results of the proposed system were verified and validated by three expert cardiologists and are considered to be more efficient and useful.
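
The actual knowledge base in the paper comes from rough-set rule extraction on the UCI and hospital data; as a hedged illustration of how weighted fuzzy rules can be evaluated, the sketch below uses invented triangular memberships, invented rules, and rule weights standing in for rule support to produce an estimated blockage percentage.

```python
# Illustrative weighted fuzzy rule evaluation (NOT the paper's actual rule
# base): triangular memberships, rule weights standing in for rule support,
# and a weighted average giving an estimated blockage percentage.
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def blockage_estimate(age, chest_pain):
    # (membership of age, membership of chest-pain score, rule weight, output %)
    rules = [
        (tri(age, 20, 35, 50), tri(chest_pain, 0, 1, 2), 0.6, 20.0),  # young, mild pain
        (tri(age, 40, 55, 70), tri(chest_pain, 1, 2, 3), 0.8, 50.0),  # middle-aged, moderate
        (tri(age, 55, 70, 90), tri(chest_pain, 2, 3, 4), 0.9, 80.0),  # older, severe
    ]
    num = den = 0.0
    for mu_age, mu_pain, weight, out in rules:
        firing = min(mu_age, mu_pain) * weight   # weighted firing strength
        num += firing * out
        den += firing
    return num / den if den > 0 else float("nan")

print(f"estimated blockage: {blockage_estimate(age=62, chest_pain=3):.1f}%")
```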


Explainable Artificial Intelligence: a Systematic Review

arXiv.org Artificial Intelligence

This has led to the development of a plethora of domain-dependent and context-specific methods for dealing with the interpretation of machine learning (ML) models and the formation of explanations for humans. Unfortunately, this trend is far from being over, with an abundance of knowledge in the field which is scattered and needs organisation. The goal of this article is to systematically review research works in the field of XAI and to try to define some boundaries in the field. From several hundred research articles focused on the concept of explainability, about 350 were considered for review using the following search methodology. In a first phase, Google Scholar was queried to find papers related to "explainable artificial intelligence", "explainable machine learning" and "interpretable machine learning". Subsequently, the bibliographic sections of these articles were thoroughly examined to retrieve further relevant scientific studies. The first noticeable thing, as shown in figure 2 (a), is the distribution of the publication dates of the selected research articles: sporadic in the 70s and 80s, receiving preliminary attention in the 90s, showing rising interest in the 2000s and becoming a recognised body of knowledge after 2010. The first research concerned the development of an explanation-based system and its integration in a computer program designed to help doctors make diagnoses [3]. Some of the more recent papers focus on work devoted to the clustering of methods for explainability, motivating the need for organising the XAI literature [4, 5, 6].


Parallel processor scheduling: formulation as multi-objective linguistic optimization and solution using Perceptual Reasoning based methodology

arXiv.org Artificial Intelligence

In the era of Industry 4.0, the focus is on minimizing the human element and maximizing automation in almost all industrial and manufacturing establishments. These establishments contain numerous processing systems, which can execute a number of tasks in parallel with a minimum number of human beings. This parallel execution of tasks is done in accordance with a scheduling policy. However, the minimization of the human element beyond a certain point is difficult. In fact, the expertise and experience of a group of humans, called the experts, become indispensable for designing a fruitful scheduling policy. The aim of the scheduling policy is to achieve the optimal value of an objective, such as production time, cost, etc. In real-life situations there are, more often than not, multiple objectives in any parallel processing scenario. Furthermore, the experts generally provide their opinions about various scheduling criteria (pertaining to the scheduling policies) in linguistic terms or words, and word semantics are best modeled using fuzzy sets (FSs). All these factors have motivated us to model the parallel processing scenario as a multi-objective linguistic optimization problem (MOLOP) and to use the novel perceptual reasoning (PR) based methodology for solving it. We have also compared the results of the PR based solution methodology with those obtained from the 2-tuple based solution methodology. The PR based solution methodology offers three main advantages: it generates unique recommendations, its linguistic recommendations match a codebook word, and the word model comes before the word. The 2-tuple based solution methodology fails to provide these advantages. Thus, we feel that our work is novel and will provide directions for future research.
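
As a heavily simplified stand-in for the perceptual-reasoning pipeline (which uses interval type-2 fuzzy sets and linguistic weighted averages), the sketch below models word semantics with type-1 triangular fuzzy sets, combines expert ratings of two hypothetical scheduling policies over two objectives by a weighted average of centroids, and maps the result back to the nearest codebook word. Every word model, weight, and policy here is an assumption for illustration only.

```python
# Heavily simplified stand-in for a linguistic multi-objective ranking of
# scheduling policies: type-1 triangular word models (the paper uses interval
# type-2 sets), a weighted average of centroids, and mapping back to the
# nearest codebook word. All values are illustrative.
import numpy as np

CODEBOOK = {          # word -> triangular MF (a, b, c) on an assumed 0-10 scale
    "poor": (0, 0, 3), "fair": (2, 5, 8), "good": (7, 10, 10),
}
centroid = lambda mf: sum(mf) / 3.0      # centroid of a triangular fuzzy set

def policy_score(ratings, objective_weights):
    """Weighted average of word centroids across objectives."""
    c = np.array([centroid(CODEBOOK[w]) for w in ratings])
    w = np.array(objective_weights, dtype=float)
    return float(c @ w / w.sum())

def nearest_word(score):
    return min(CODEBOOK, key=lambda w: abs(centroid(CODEBOOK[w]) - score))

# Objectives: production time, cost (weights are illustrative).
policies = {"policy_A": ["good", "fair"], "policy_B": ["fair", "good"]}
weights = [0.7, 0.3]
for name, ratings in policies.items():
    s = policy_score(ratings, weights)
    print(f"{name}: score {s:.2f} -> '{nearest_word(s)}'")
```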


Perceptual reasoning based solution methodology for linguistic optimization problems

arXiv.org Artificial Intelligence

Decision making in real-life scenarios may often be modeled as an optimization problem. It requires the consideration of various attributes, like human preferences and thinking, which constrain achieving the optimal value of the problem objectives. The values of the objectives may be maximized or minimized, depending on the situation. Often, the values of these problem parameters are in linguistic form, as human beings naturally understand and express themselves using words. These problems are therefore termed linguistic optimization problems (LOPs) and are of two types, namely single objective linguistic optimization problems (SOLOPs) and multi-objective linguistic optimization problems (MOLOPs). In these LOPs, the value of the objective function(s) may not be known at all points of the decision space, and therefore the objective function(s) as well as the problem constraints are linked by if-then rules. The Tsukamoto inference method has been used to solve these LOPs; however, it suffers from drawbacks. The use of linguistic information inevitably calls for the utilization of computing with words (CWW), and therefore 2-tuple linguistic model based solution methodologies were proposed for LOPs. However, we found that 2-tuple linguistic model based solution methodologies represent the semantics of the linguistic information using a combination of type-1 fuzzy sets and ordinal term sets. As the semantics of linguistic information are best modeled using interval type-2 fuzzy sets, we propose, in this paper, solution methodologies for LOPs based on the CWW approach of perceptual computing. The perceptual computing based solution methodologies use a novel design of the CWW engine, called perceptual reasoning (PR). PR in its current form is suitable for solving SOLOPs; hence, we have also extended it to MOLOPs.
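
For reference, the sketch below implements the baseline Tsukamoto inference that the abstract contrasts against, not the proposed perceptual-computing methodology: each if-then rule has a monotonic consequent, the firing strength is inverted through it, and the rule outputs are combined by a weighted average. The rule base and membership functions are invented for illustration.

```python
# Sketch of baseline Tsukamoto inference (the method the abstract says has
# been used for LOPs): monotonic consequents are inverted at the firing
# strength and combined by a weighted average. Rules and MFs are invented.
def tri(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Monotonic consequents for an "objective value" on [0, 100]:
inv_low  = lambda alpha: 100 - 100 * alpha   # decreasing MF: strong firing -> low output
inv_high = lambda alpha: 100 * alpha         # increasing MF: strong firing -> high output

def tsukamoto(x):
    """If x is SMALL then output is LOW; if x is LARGE then output is HIGH."""
    rules = [
        (tri(x, 0, 0, 6), inv_low),
        (tri(x, 4, 10, 10), inv_high),
    ]
    num = den = 0.0
    for alpha, invert in rules:
        num += alpha * invert(alpha)
        den += alpha
    return num / den if den > 0 else float("nan")

for x in (2.0, 5.0, 8.0):
    print(f"x = {x}: inferred objective value = {tsukamoto(x):.1f}")
```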


Augmentation of the Reconstruction Performance of Fuzzy C-Means with an Optimized Fuzzification Factor Vector

arXiv.org Artificial Intelligence

Information granules have been considered to be the fundamental constructs of Granular Computing (GrC). As a useful unsupervised learning technique, Fuzzy C-Means (FCM) is one of the most frequently used methods to construct information granules. The FCM-based granulation-degranulation mechanism plays a pivotal role in GrC. In this paper, to enhance the quality of the degranulation (reconstruction) process, we augment the FCM-based degranulation mechanism by introducing a vector of fuzzification factors (fuzzification factor vector) and setting up an adjustment mechanism to modify the prototypes and the partition matrix. The design is regarded as an optimization problem, which is guided by a reconstruction criterion. In the proposed scheme, the initial partition matrix and prototypes are generated by FCM. Then a fuzzification factor vector is introduced to assign an appropriate fuzzification factor to each cluster, forming an adjustment scheme that modifies the prototypes and the partition matrix. With the supervised learning mode of the granulation-degranulation process, we construct a composite objective function of the fuzzification factor vector, the prototypes and the partition matrix. Subsequently, particle swarm optimization (PSO) is employed to optimize the fuzzification factor vector, refine the prototypes and develop the optimal partition matrix. Finally, the reconstruction performance of the FCM algorithm is enhanced. We offer a thorough analysis of the developed scheme. In particular, we show that the classical FCM algorithm forms a special case of the proposed scheme. Experiments completed on both synthetic and publicly available datasets show that the proposed approach outperforms the generic data reconstruction approach.
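
A bare-bones sketch of the granulation-degranulation idea is given below: standard FCM produces prototypes and a partition matrix, and a per-cluster fuzzification factor vector is then tuned to reduce the reconstruction error. Random search stands in for PSO, and the paper's adjustment of the prototypes and partition matrix is omitted, so this is only an approximation of the proposed scheme.

```python
# Simplified sketch of FCM granulation-degranulation with a per-cluster
# fuzzification factor vector tuned to lower the reconstruction error.
# Random search stands in for PSO; prototype/partition adjustment is omitted.
import numpy as np

def fcm(X, k, m=2.0, iters=100, seed=0):
    """Bare-bones Fuzzy C-Means: returns prototypes V and partition matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]                  # prototypes
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)                         # memberships
    return V, U

def reconstruct(U, V, m_vec):
    """Degranulation with a per-cluster fuzzification factor vector."""
    W = U ** np.asarray(m_vec)[None, :]
    return (W @ V) / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.4, (60, 2)) for c in ((0, 0), (3, 3), (0, 4))])
V, U = fcm(X, k=3)

err = lambda m_vec: np.mean(np.sum((X - reconstruct(U, V, m_vec)) ** 2, axis=1))
print("reconstruction error, uniform m = 2:", round(err([2.0, 2.0, 2.0]), 4))

best = min((tuple(rng.uniform(1.1, 3.0, 3)) for _ in range(200)), key=err)
print("best per-cluster factors:", np.round(best, 2), "error:", round(err(best), 4))
```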