 Expert Systems


Incremental personalized E-mail spam filter using novel TFDCR feature selection with dynamic feature update

arXiv.org Machine Learning

Communication through e-mail remains a highly formalized, conventional, and indispensable method for the exchange of information over the Internet. The ever-increasing ratio and adversarial nature of spam e-mails pose many challenges, such as uneven class distribution, unequal error cost, frequent change of content, and personalized context-sensitive discrimination. In this research, we propose a novel approach to developing an incremental personalized e-mail spam filter. The proposed work makes three significant contributions. First, we apply a novel feature selection function based on term frequency difference and category ratio, TFDCR, to select the most discriminating features irrespective of the number of samples in each class. Second, an incremental learning model is used, which enables the classifier to update the discriminant function dynamically. Third, a heuristic function called selectionRankWeight is introduced to upgrade the existing feature set by identifying new features with strong discriminating ability in an incoming set of e-mails. Three public e-mail datasets with different characteristics are used to evaluate the filter's performance. Experiments are conducted to compare the feature selection efficiency of TFDCR and to observe the filter's performance under both batch and incremental learning modes. The results demonstrate that TFDCR is the most effective feature selection function. The incremental learning model incorporating the dynamic feature update function overcomes the problem of drifting concepts. The proposed filter validates its efficiency and feasibility by substantially improving the classification accuracy and reducing the false positive error of misclassifying legitimate e-mail as spam.


Survey on Automated Machine Learning

arXiv.org Artificial Intelligence

Machine learning has become a vital part of many aspects of our daily life. However, building well-performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to automatically build machine learning applications without extensive knowledge of statistics and machine learning. In this survey, we summarize the recent developments in academia and industry regarding AutoML. First, we introduce a holistic problem formulation. Next, approaches for solving various subproblems of AutoML are presented. Finally, we provide an extensive empirical evaluation of the presented approaches on synthetic and real data.


Sparse Neural Attentive Knowledge-based Models for Grade Prediction

arXiv.org Machine Learning

Grade prediction for future courses not yet taken by students is important, as it can help them and their advisers during course selection, as well as in designing personalized degree plans and modifying them based on the students' performance. One of the successful approaches for accurately predicting a student's grades in future courses is Cumulative Knowledge-based Regression Models (CKRM). CKRM learns shallow linear models that predict a student's grades as the similarity between his/her knowledge state and the target course. A student's knowledge state is built by linearly accumulating the learned provided knowledge components of the courses he/she has taken in the past, weighted by his/her grades in them. However, not all the prior courses contribute equally to the target course. In this paper, we propose a novel Neural Attentive Knowledge-based model (NAK) that learns the importance of each historical course in predicting the grade of a target course. Compared to CKRM and other competing approaches, our experiments on a large real-world dataset consisting of ~1.5 grades show the effectiveness of the proposed NAK model in accurately predicting the students' grades. Moreover, the attention weights learned by the model can help students design better degree plans.
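The accumulation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the course names, the two-dimensional knowledge vectors, the `scores` dictionary, and the use of a plain softmax for the attention weights are all assumptions made for illustration (the title suggests a sparse attention mechanism, which is not reproduced here).

```python
import math

def knowledge_state(history, provided):
    """CKRM-style state: sum of provided-knowledge vectors, weighted by grade."""
    dim = len(next(iter(provided.values())))
    state = [0.0] * dim
    for course, grade in history:
        for k, value in enumerate(provided[course]):
            state[k] += grade * value
    return state

def predict_grade(state, required, target):
    """Predicted grade as the dot product of the state and the target's vector."""
    return sum(s * r for s, r in zip(state, required[target]))

def attentive_state(history, provided, scores):
    """Attention-style variant: each prior course gets a softmax weight.

    `scores` holds one relevance score per prior course (hypothetical; in a
    learned model these would depend on the target course).
    """
    raw = [scores[course] for course, _ in history]
    peak = max(raw)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in raw]
    total = sum(exps)
    dim = len(next(iter(provided.values())))
    state = [0.0] * dim
    for (course, grade), e in zip(history, exps):
        weight = e / total
        for k, value in enumerate(provided[course]):
            state[k] += weight * grade * value
    return state

# Hypothetical toy data: two prior courses, two knowledge components.
provided = {"calc1": [1.0, 0.0], "prog1": [0.0, 1.0]}
required = {"ml": [0.5, 0.5]}
history = [("calc1", 4.0), ("prog1", 3.0)]

plain = knowledge_state(history, provided)
attended = attentive_state(history, provided, {"calc1": 0.0, "prog1": 0.0})
```

With equal scores the attention weights are uniform, so the attended state is simply the plain state scaled by one half; unequal scores would shift the state toward the courses judged more relevant.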


Relation Discovery with Out-of-Relation Knowledge Base as Supervision

arXiv.org Machine Learning

Unsupervised relation discovery aims to discover new relations from a given text corpus without annotated data. However, it does not consider existing human-annotated knowledge bases even when they are relevant to the relations to be discovered. In this paper, we study the problem of how to use out-of-relation knowledge bases to supervise the discovery of unseen relations, where out-of-relation means that the relations to discover from the text corpus and those in the knowledge bases do not overlap. We construct a set of constraints between entity pairs based on the knowledge base embedding and then incorporate these constraints into relation discovery using a variational auto-encoder based algorithm. Experiments show that our new approach can improve the state-of-the-art relation discovery performance by a large margin.


Explainability in Human-Agent Systems

arXiv.org Artificial Intelligence

This paper presents a taxonomy of explainability in Human-Agent Systems. We consider fundamental questions about the Why, Who, What, When and How of explainability. First, we define explainability, and its relationship to the related terms of interpretability, transparency, explicitness, and faithfulness. These definitions allow us to answer why explainability is needed in the system, whom it is geared to and what explanations can be generated to meet this need. We then consider when the user should be presented with this information. Last, we consider how objective and subjective measures can be used to evaluate the entire system. This last question is the most encompassing as it will need to evaluate all other issues regarding explainability.


Usage of Decision Support Systems for Conflicts Modelling during Information Operations Recognition

arXiv.org Artificial Intelligence

The application of decision support systems to conflict modeling in information operations recognition is presented. An information operation is considered as a complex, weakly structured system. A model of conflict between two subjects is proposed based on the second-order rank reflexive model. A method is described for constructing the design pattern for knowledge bases of decision support systems. In the talk, a methodology is proposed for using decision support systems to model conflicts in information operations recognition, based on expert knowledge and content monitoring.


OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference

arXiv.org Machine Learning

In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB). Traditional techniques from universal schema and from schema mapping fall into two extremes: either they perform instance-level inference relying on embeddings for (subject, object) pairs, and thus cannot handle pairs absent from all existing triples; or they perform predicate-level mapping and completely ignore background evidence from individual entities, and thus cannot achieve satisfactory quality. We propose OpenKI to handle the sparsity of OpenIE extractions by performing instance-level inference: for each entity, we encode the rich information in its neighborhood in both KB and OpenIE extractions, and leverage this information in relation inference by exploring different methods of aggregation and attention. In order to handle unseen entities, our model is designed without creating entity-specific parameters. Extensive experiments show that this method not only significantly improves the state of the art for conventional OpenIE extractions like ReVerb, but also boosts the performance on OpenIE from semi-structured data, where new entity pairs are abundant and data are fairly sparse.


Generating Animations from Screenplays

arXiv.org Artificial Intelligence

Automatically generating animation from natural language text finds application in a number of areas, e.g., movie script writing, instructional videos, and public safety. However, translating natural language text into animation is a challenging task. Existing text-to-animation systems can handle only very simple sentences, which limits their applications. In this paper, we develop a text-to-animation system which is capable of handling complex sentences. We achieve this by introducing a text simplification step into the process. Building on an existing animation generation system for screenwriting, we create a robust NLP pipeline to extract information from screenplays and map it to the system's knowledge base. We develop a set of linguistic transformation rules that simplify complex sentences. Information extracted from the simplified sentences is used to generate a rough storyboard and video depicting the text. Our sentence simplification module outperforms existing systems in terms of BLEU and SARI metrics. We further evaluated our system via a user study: 68% of participants believe that our system generates reasonable animation from input screenplays.


Adaptive Learning Expert System for Diagnosis and Management of Viral Hepatitis

arXiv.org Artificial Intelligence

Viral hepatitis is a commonly found health problem throughout the world, alongside other easily transmitted diseases such as tuberculosis, human immunodeficiency virus, malaria, and so on. Among all hepatitis viruses, the highest number of deaths results from long-lasting hepatitis C or long-lasting hepatitis B infection. In order to develop this system, knowledge was acquired using both structured and semi-structured interviews with internists of St. Paul Hospital. Once the knowledge is acquired, it is modeled and represented using rule-based reasoning techniques. Both forward and backward chaining are used to infer the rules and provide appropriate advice in the developed expert system. The SWI-Prolog editor was also used to develop the prototype expert system. The proposed system is able to adapt to dynamic knowledge by generalizing rules and discovering new rules, learning adaptively from newly arrived knowledge from domain experts without any help from the knowledge engineer.
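Forward and backward chaining over a rule base, as mentioned in this abstract, can be sketched in a few lines. This is an illustrative Python sketch, not the SWI-Prolog prototype the abstract refers to; the rules and fact names (`finding_a`, `condition_x`, and so on) are abstract placeholders and carry no medical meaning.

```python
def forward_chain(facts, rules):
    """Data-driven inference: fire rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, _seen=frozenset()):
    """Goal-driven inference: prove `goal` by recursively proving premises."""
    if goal in facts:
        return True
    if goal in _seen:  # guard against cyclic rule sets
        return False
    seen = _seen | {goal}
    return any(
        all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )

# Hypothetical rule base: each rule is (set of premises, conclusion).
RULES = [
    (frozenset({"finding_a", "finding_b"}), "condition_x"),
    (frozenset({"condition_x"}), "advice_y"),
]

derived = forward_chain({"finding_a", "finding_b"}, RULES)
provable = backward_chain("advice_y", {"finding_a", "finding_b"}, RULES)
```

Forward chaining starts from the known facts and derives everything that follows, whereas backward chaining starts from a single goal and only explores the rules that could establish it.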


Decomposing Temperature Time Series with Non-Negative Matrix Factorization

arXiv.org Machine Learning

During the fabrication of casting parts, sensor data is typically recorded automatically and accumulated for process monitoring and defect diagnosis. As casting is a thermal process with many interacting process parameters, root cause analysis tends to be tedious and ineffective. We show how a decomposition based on non-negative matrix factorization (NMF), guided by a knowledge-based initialization strategy, is able to extract physically meaningful sources from temperature time series collected during a thermal manufacturing process. The approach assumes the time series to be generated by a superposition of several simultaneously acting component processes. NMF is able to reverse the superposition and to identify the hidden component processes. The latter can be linked to ongoing physical phenomena and process variables that cannot be monitored directly. Our approach provides new insights into the underlying physics and offers a tool that can assist in diagnosing defect causes. We demonstrate our method by applying it to real-world data collected in a foundry during the series production of casting parts for the automobile industry.
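The superposition model in this abstract, where each observed series is a non-negative mixture of a few component processes, can be sketched as follows. The sketch uses the standard multiplicative update rules for the Frobenius-norm objective, which is an assumption, since the abstract does not name the update scheme. It also starts from a seeded random initialization rather than the knowledge-based initialization the abstract describes, and the toy matrices are hypothetical.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def scale(m, num, den, eps):
    """Element-wise m * num / (den + eps); keeps all entries non-negative."""
    return [[m[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(m[0]))]
            for i in range(len(m))]

def nmf(v, w, h, iterations=200, eps=1e-9):
    """Multiplicative updates for V ~ W H with W, H >= 0."""
    for _ in range(iterations):
        wt = transpose(w)
        h = scale(h, matmul(wt, v), matmul(matmul(wt, w), h), eps)
        ht = transpose(h)
        w = scale(w, matmul(v, ht), matmul(w, matmul(h, ht)), eps)
    return w, h

def frobenius_error(v, w, h):
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5

# Hypothetical data: 3 recorded series (rows), each a mixture of 2 component
# processes sampled at 4 time steps.
V = matmul([[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]],
           [[1.0, 2.0, 3.0, 2.0], [4.0, 3.0, 1.0, 0.0]])

random.seed(0)
W0 = [[random.uniform(0.1, 1.0) for _ in range(2)] for _ in range(3)]
H0 = [[random.uniform(0.1, 1.0) for _ in range(4)] for _ in range(2)]
W, H = nmf(V, W0, H0)
```

In this layout the rows of `H` play the role of the hidden component processes and the rows of `W` give each series' mixing weights. A knowledge-based strategy would replace `W0` and `H0` with physically motivated starting values.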