RES - a Relative Method for Evidential Reasoning
An, Zhi, Bell, David A., Hughes, John G.
In this paper we describe a novel method for evidential reasoning [1]. It models the process of evidential reasoning in three steps: evidence structure construction, evidence accumulation, and decision making. The proposed method, called RES, is novel in that evidence strength is associated with an evidential support relationship (an argument) between a pair of statements, and this strength is conveyed by comparison between arguments. This is in contrast to conventional approaches, where evidence strength is represented numerically and is associated with a statement.
Boolean Equi-propagation for Concise and Efficient SAT Encodings of Combinatorial Problems
Metodi, A., Codish, M., Stuckey, P. J.
We present an approach to propagation-based SAT encoding of combinatorial problems, Boolean equi-propagation, where constraints are modeled as Boolean functions which propagate information about equalities between Boolean literals. This information is then applied to simplify the CNF encoding of the constraints. A key factor is that considering only a small fragment of a constraint model at one time enables us to apply stronger, and even complete, reasoning to detect equivalent literals in that fragment. Once detected, equivalences are applied to simplify the entire constraint model and facilitate further reasoning on other fragments. Equi-propagation in combination with partial evaluation and constraint simplification provides the foundation for a powerful approach to SAT-based finite domain constraint solving. We introduce a tool called BEE (Ben-Gurion Equi-propagation Encoder) based on these ideas and demonstrate for a variety of benchmarks that our approach leads to a considerable reduction in the size of CNF encodings and subsequent speed-ups in SAT solving times.
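As an illustration of the idea of complete reasoning on a small fragment, the following is a minimal sketch that enumerates all satisfying assignments of a small Boolean constraint and reports the literal equalities they imply. It is not BEE's implementation; the function name `equi_propagate`, the fact format, and the example constraint are assumptions made for this sketch, and brute-force enumeration is only feasible for a handful of variables.

```python
from itertools import product


def equi_propagate(variables, constraint):
    """Return equalities implied by `constraint` (hypothetical helper).

    `constraint` takes a dict {variable: bool} and returns True when the
    assignment satisfies the fragment. Facts are tuples such as
    ("x", "=", True), ("x", "=", "y") or ("x", "=", "-y") for x = not y.
    """
    # Enumerate every assignment and keep the satisfying ones (the models).
    models = []
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if constraint(assignment):
            models.append(assignment)

    facts = []
    if not models:
        return facts  # unsatisfiable fragment: no equalities reported here

    # A variable with the same value in all models equals a constant.
    for v in variables:
        values = {m[v] for m in models}
        if len(values) == 1:
            facts.append((v, "=", values.pop()))

    # Remaining variables: look for pairs that always agree or always differ.
    fixed = {fact[0] for fact in facts}
    free = [v for v in variables if v not in fixed]
    for i, a in enumerate(free):
        for b in free[i + 1:]:
            if all(m[a] == m[b] for m in models):
                facts.append((a, "=", b))
            elif all(m[a] != m[b] for m in models):
                facts.append((a, "=", "-" + b))
    return facts


# Example fragment (assumed for illustration): x xor y, and x xor z.
facts = equi_propagate(
    ["x", "y", "z"],
    lambda m: (m["x"] != m["y"]) and (m["x"] != m["z"]),
)
print(facts)
```

In this example every model has y and z equal to the negation of x, so an encoder could replace y and z by a single literal before generating CNF.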
Continuous-time Infinite Dynamic Topic Models
Topic models are probabilistic models for discovering topical themes in collections of documents. In real-world applications, these models provide us with the means of organizing what would otherwise be unstructured collections. They can help us cluster a huge collection into different topics or find a subset of the collection that resembles the topical theme found in an article at hand. The first wave of topic models was able to discover the prevailing topics in a large collection of documents spanning a period of time. It was later realized that these time-invariant models were not capable of modeling 1) the time-varying number of topics they discover and 2) the time-changing structure of these topics. A few models were developed to address these two deficiencies. The online hierarchical Dirichlet process models the documents with a time-varying number of topics. It varies the structure of the topics over time as well. However, it relies on document order, not timestamps, to evolve the model over time. The continuous-time dynamic topic model evolves topic structure in continuous time. However, it uses a fixed number of topics over time. In this dissertation, I present a model, the continuous-time infinite dynamic topic model, that combines the advantages of these two models: 1) the online hierarchical Dirichlet process, and 2) the continuous-time dynamic topic model. More specifically, the model I present is a probabilistic topic model that 1) changes the number of topics over continuous time, and 2) changes the topic structure over continuous time. I compared the model I developed with the two other models under different settings. The results obtained were favorable to my model and showed the need for a model in which both the number of topics and the topic structure vary over continuous time.
Generating Extractive Summaries of Scientific Paradigms
Qazvinian, V., Radev, D. R., Mohammad, S. M., Dorr, B., Zajic, D., Whidby, M., Moon, T.
Researchers and scientists increasingly find themselves in the position of having to quickly understand large amounts of technical material. Our goal is to effectively serve this need by using bibliometric text mining and summarization techniques to generate summaries of scientific literature. We show how we can use citations to produce automatically generated, readily consumable, technical extractive summaries. We first propose C-LexRank, a model for summarizing single scientific articles based on citations, which employs community detection and extracts salient information-rich sentences. Next, we further extend our experiments to summarize a set of papers, which cover the same scientific topic. We generate extractive summaries of a set of Question Answering (QA) and Dependency Parsing (DP) papers, their abstracts, and their citation sentences and show that citations have unique information amenable to creating a summary.
Evaluating Indirect Strategies for Chinese-Spanish Statistical Machine Translation
Costa-jussà, M. R., Henríquez, C. A., Banchs, R. E.
Although Chinese and Spanish are two of the most spoken languages in the world, not much research has been done in machine translation for this language pair. This paper investigates the state of the art of Chinese-to-Spanish statistical machine translation (SMT), which nowadays is one of the most popular approaches to machine translation. For this purpose, we report details of the available parallel corpora, which are the Basic Traveller Expressions Corpus (BTEC), the Holy Bible and the United Nations (UN) corpus. Additionally, we conduct experimental work with the largest of these three corpora to explore alternative SMT strategies that use a pivot language. Three alternatives are considered for pivoting: cascading, pseudo-corpus and triangulation. As the pivot language, we use either English, Arabic or French. Results show that, for a phrase-based SMT system, English is the best pivot language between Chinese and Spanish. We propose a system output combination using the pivot strategies which is capable of outperforming the direct translation strategy. The main objective of this work is to motivate and involve the research community in working on this important pair of languages, given their demographic impact.
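Of the three pivoting alternatives, cascading is the simplest to picture: the output of a source-to-pivot system is fed to a pivot-to-target system. The sketch below shows only that composition. The lookup tables stand in for trained SMT systems and are toy assumptions, as are the names `cascade`, `zh_to_en` and `en_to_es`; nothing here reflects the systems used in the paper.

```python
def cascade(source_to_pivot, pivot_to_target):
    """Compose two translation functions into one cascaded system."""
    def translate(sentence):
        pivot_sentence = source_to_pivot(sentence)
        return pivot_to_target(pivot_sentence)
    return translate


# Toy stand-ins for trained translation systems (assumptions for illustration).
ZH_EN = {"你好": "hello", "谢谢": "thank you"}
EN_ES = {"hello": "hola", "thank you": "gracias"}


def zh_to_en(sentence):
    # Unknown input is passed through unchanged.
    return ZH_EN.get(sentence, sentence)


def en_to_es(sentence):
    return EN_ES.get(sentence, sentence)


zh_to_es = cascade(zh_to_en, en_to_es)
print(zh_to_es("谢谢"))
```

Because the second system only sees the first system's output, any error made in the Chinese-to-English step is carried into the Spanish result, which is one reason the choice of pivot language matters.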
Visualization and clustering by 3D cellular automata: Application to unstructured data
Hamou, Reda Mohamed, Amine, Abdelmalek, Lokbani, Ahmed Chaouki, Simonet, Michel
Given the limited performance of 2D cellular automata, both in terms of space as the number of documents increases and in terms of cluster visualization, our motivation was to experiment with these cellular automata in a higher dimension and observe the impact on the quality of the results. Textual data were represented by a vector model whose components are derived from a global weighting of the corpus, Term Frequency - Inverse Document Frequency (TF-IDF). The WordNet thesaurus was used to address the problem of lemmatizing words, because the representation used in this study is the bag of words. A second, language-independent method, n-grams, was also used to represent the textual records. Several similarity measures were tested. To validate the classification we used two assessment measures based on recall and precision (F-measure and entropy). The results are promising and confirm that increasing the dimension addresses the problem of the spatial layout of the classes. The results obtained in terms of class purity (i.e., the minimum entropy value) show that the larger the number of documents, the better the results are for 3D cellular automata, which was not evident in the 2D case. In terms of spatial navigation, 3D cellular automata provide better visualization performance than 2D cellular automata.
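For readers unfamiliar with the weighting mentioned above, here is a minimal sketch of one common TF-IDF variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency). The abstract does not say which variant the authors used, so the formula, the function name `tf_idf`, and the sample documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / document length) * ln(N / df),
    where N is the number of documents and df the number of documents
    containing the term.
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Two tiny bag-of-words documents (illustrative only).
docs = [["cell", "automata", "cell"], ["automata", "text"]]
vectors = tf_idf(docs)
print(vectors[0])
```

A term that occurs in every document, such as "automata" here, receives weight zero, so it does not contribute to the similarity between documents.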
A Logic and Adaptive Approach for Efficient Diagnosis Systems using CBR
Bitar, Ibrahim El, Belouadha, Fatima-Zahra, Roudies, Ounsa
Case-Based Reasoning (CBR) is a way of reasoning that draws on experience, reusing already solved cases (source cases) to find a solution to a new problem (target case). The retrieval phase consists of identifying source cases that are similar to the target case. This phase may lead to erroneous results if imperfections in the existing knowledge are not taken into account. This work presents a novel solution based on fuzzy logic techniques and adaptation measures which aggregate weighted similarities to improve the retrieval results. To confirm the efficiency of our solution, we applied it to the industrial diagnosis domain. The results obtained are more efficient than those obtained by applying typical measures.
Dynamic Decision Support System Based on Bayesian Networks Application to fight against the Nosocomial Infections
Ltifi, Hela, Trabelsi, Ghada, Ayed, Mounir Ben, Alimi, Adel M.
Improving the quality of medical care is a significant concern for the coming years. The fight against nosocomial infections (NI) in intensive care units (ICUs) is a good example. We focus on a set of observations that reflect the dynamic aspect of the decision resulting from the application of a Medical Decision Support System (MDSS). This system has to make dynamic decisions on temporal data. We use a dynamic Bayesian network (DBN) to model this dynamic process. This is temporal reasoning within a real-time environment; we are interested in Dynamic Decision Support Systems in the healthcare domain (MDDSS).
Detecting and Generating Ironic Comparisons: An Application of Creative Information Retrieval
Veale, Tony (Korean Advanced Institute of Science and Technology)
Ironic utterances promise an expected meaning that never arrives, and deliver instead a meaning that exposes the failure of our expectations. Though they can appear contextually inappropriate, ironic statements succeed when they subvert their context of use, so it is the context rather than the utterance that is shown to be incongruous. Every ironic statement thus poses two related questions: the first, “what is unexpected about my meaning?” helps us answer the second, “what is unexpected about my context of use?”. Like metaphor, irony is not overtly marked, and relies instead on a listener’s understanding of stereotypical norms to unpack its true meaning. In this paper we consider how irony relies upon and subverts our stereotypical knowledge of a domain, and show how this knowledge can be exploited to both recognize and generate ironic similes for a topic.
Automatic Formalization of Clinical Practice Guidelines
Gerber, Matthew (University of Virginia) | Brown, Donald (University of Virginia) | Harrison, James (University of Virginia)
Current efforts aim to incorporate knowledge from clinical practice guidelines (CPGs) into computer systems using sophisticated interchange formats. Due to their complexity, such formats require expensive manual formalization work. This paper presents a preliminary study of using natural language processing (NLP) to automatically formalize CPG recommendations. We developed a CPG representation using concepts from the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED–CT), and manually applied this representation to a sample of CPG recommendations that is representative of multiple medical domains and recommendation types. Using this resource, we trained and evaluated a supervised classification model that formalizes new CPG recommendations according to the SNOMED–CT representation, achieving a precision of 75% and recall of 42% (F1 = 54%). We have identified two important lines of future investigation: (1) feature engineering to address the unique linguistic properties of CPG recommendations, and (2) alternative model formulations that are more robust to processing errors. A third line of investigation – creating additional training data for the NLP model – is shown to be of little utility.
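The reported F1 value can be checked from the stated precision and recall using the standard definition of F1 as their harmonic mean. The short sketch below does that arithmetic; the function name `f1_score` is chosen for this illustration only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.75, 0.42)
print(f"F1 = {f1:.2%}")  # prints "F1 = 53.85%"
```

The result, about 53.85%, rounds to the 54% given in the abstract.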