Collaborating Authors


Neurocognitive Informatics Manifesto Artificial Intelligence

Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper examples of neurocognitive inspirations and promising directions in this area are given.

Towards Metaheuristics "In the Large" Artificial Intelligence

Following decades of sustained improvement, metaheuristics are one of the great success stories of optimization research. However, in order for research in metaheuristics to avoid fragmentation and a lack of reproducibility, there is a pressing need for stronger scientific and computational infrastructure to support the development, analysis and comparison of new approaches. We argue that, via principled choice of infrastructure support, the field can pursue a higher level of scientific enquiry. We describe our vision and report on progress, showing how the adoption of common protocols for all metaheuristics can help liberate the potential of the field, easing the exploration of the design space of metaheuristics.

A decision support framework for prediction of avian influenza


For years, avian influenza has influenced economies and human health around the world. The emergence and spread of avian influenza virus have been uncertain and sudden. The virus is likely to spread through several pathways such as poultry transportation and wild bird migration. The complicated and global spread of avian influenza calls for surveillance tools for timely and reliable prediction of disease events. These tools can increase situational awareness and lead to faster reaction to events. Here, we aimed to design and evaluate a decision support framework that aids decision makers by answering their questions regarding the future risk of events at various geographical scales. Risk patterns were derived from pre-built components and combined in a knowledge base. Subsequently, questions were answered by direct queries on the knowledge base or through a built-in algorithm. The evaluation of the system in detecting events resulted in average sensitivity and specificity of 69.70% and 85.50%, respectively. The framework presented here can support health care authorities by providing them with an opportunity for early control of emergency situations.
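To make the two reported metrics concrete, the following minimal sketch computes sensitivity and specificity from confusion-matrix counts. The function name and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from the evaluation described above.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = tp / (tp + fn)  # share of real events that were detected
    specificity = tn / (tn + fp)  # share of non-events correctly left unflagged
    return sensitivity, specificity


# Hypothetical counts, not from the paper.
sens, spec = sensitivity_specificity(tp=7, fn=3, tn=17, fp=3)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

A system that flags more events raises sensitivity at the cost of specificity, which is why the two figures are reported together.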

SharePoint Syntex to automate content categorization and build a foundation for knowledge curation


Microsoft announced the general availability of Microsoft SharePoint Syntex as of Oct. 1, 2020. This is the first packaged product to come out of the code-named Project Cortex initiative, first announced in November 2019. Project Cortex reflects Microsoft's ongoing investment in intelligent content services and graph APIs to proactively explore and categorize digital assets from Microsoft 365 and other connected sources. Teams need tools to help them collaborate and stay productive while working remotely. SharePoint Syntex will be available to M365 customers with E3 or E5 licenses for a small per-user uplift.

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases Artificial Intelligence

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.

What is an intelligent system? Artificial Intelligence

Mankind has made significant progress through the development of increasingly powerful and sophisticated tools. In the age of the industrial revolution, a large number of tools were built as machines that automated tasks requiring physical effort. In the digital age, computer-based tools are being created to automate tasks that require mental effort. The capabilities of these tools have been progressively increased to perform tasks that require more and more intelligence. This evolution has generated a type of tool that we call an intelligent system. Intelligent systems help us perform specialized tasks in professional domains such as medical diagnosis (e.g., recognizing tumors on x-ray images) or airport management (e.g., generating a new assignment of airport gates in the presence of an incident).

Learn to Talk via Proactive Knowledge Transfer Artificial Intelligence

Knowledge Transfer has been applied in solving a wide variety of problems. For example, knowledge can be transferred between tasks (e.g., learning to handle novel situations by leveraging prior knowledge) or between agents (e.g., learning from others without direct experience). Without loss of generality, we relate knowledge transfer to KL-divergence minimization, i.e., matching the (belief) distributions of learners and teachers. The equivalence gives us a new perspective in understanding variants of the KL-divergence by looking at how learners structure their interaction with teachers in order to acquire knowledge. In this paper, we provide an in-depth analysis of KL-divergence minimization in Forward and Backward orders, which shows that learners are reinforced via on-policy learning in Backward. In contrast, learners are supervised in Forward. Moreover, our analysis is gradient-based, so it can be generalized to arbitrary tasks and help to decide which order to minimize given the property of the task. By replacing Forward with Backward in Knowledge Distillation, we observed +0.7-1.1 BLEU gains on the WMT'17 De-En and IWSLT'15 Th-En machine translation tasks.
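The difference between the two orders can be illustrated on small discrete distributions. The sketch below assumes the usual convention that Forward means KL(teacher || learner), an expectation over teacher samples, and Backward means KL(learner || teacher), an expectation over the learner's own samples. The distributions and the helper name `kl` are hypothetical and do not come from the paper.

```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical belief distributions over three outcomes.
teacher = [0.5, 0.4, 0.1]
learner = [0.8, 0.1, 0.1]

forward = kl(teacher, learner)   # weighted by teacher probabilities (supervised view)
backward = kl(learner, teacher)  # weighted by learner probabilities (on-policy view)
print(f"forward={forward:.4f}, backward={backward:.4f}")
```

Because the KL-divergence is not symmetric, the two orders generally give different values and different gradients for the learner, which is what makes the choice of order task-dependent.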

Efficient Knowledge Graph Validation via Cross-Graph Representation Learning Artificial Intelligence

Recent advances in information extraction have motivated the automatic construction of huge Knowledge Graphs (KGs) by mining large-scale text corpora. However, automatic extraction unavoidably introduces noisy facts into KGs. To validate the correctness of facts (i.e., triplets) inside a KG, one possible approach is to map the triplets into vector representations by capturing the semantic meanings of facts. Although many representation learning approaches have been developed for knowledge graphs, these methods are not effective for validation. They usually assume that facts are correct, and thus may overfit noisy facts and fail to detect such facts. Towards effective KG validation, we propose to leverage an external human-curated KG as an auxiliary information source to help detect the errors in a target KG. The external KG is built upon human-curated knowledge repositories and tends to have high precision. On the other hand, although the target KG built by information extraction from texts has low precision, it can cover new or domain-specific facts that are not in any human-curated repositories. To tackle this challenging task, we propose a cross-graph representation learning framework, i.e., CrossVal, which can leverage an external KG to validate the facts in the target KG efficiently. This is achieved by embedding triplets based on their semantic meanings, drawing cross-KG negative samples and estimating a confidence score for each triplet based on its degree of correctness. We evaluate the proposed framework on datasets across different domains. Experimental results show that the proposed framework achieves the best performance compared with the state-of-the-art methods on large-scale KGs.

AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types Artificial Intelligence

Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.

Classifying histograms of medical data using information geometry of beta distributions Machine Learning

The differential geometric approach to probability theory and statistics has met increasing interest in the past years, from the theoretical point of view as well as in applications. In this approach, probability distributions are seen as elements of a differentiable manifold, on which a metric structure is defined through the choice of a Riemannian metric. Two very important ones are the Wasserstein metric, central in optimal transport, and the Fisher information metric (also called Fisher-Rao metric), essential in information geometry. Unlike optimal transport, information geometry is foremost concerned with parametric families of probability distributions, and defines a Riemannian structure on the parameter space [...] It can be seen as a natural choice of metric as it is the only Riemannian metric that is invariant with respect to transformation by a sufficient statistic, or a diffeomorphic transformation of the support in the nonparametric case (Cencov, 2000; Bauer et al., 2016). Arguably the most famous example of Fisher information geometry of a statistical model is that of the univariate Gaussian model, which is hyperbolic. The geometries of other parametric families such as the multivariate Gaussian model (Atkinson and Mitchell, 1981; Skovgaard, 1984), the family of gamma distributions (Arwini and Dodson, 2008; Rebbah et al., 2019), or more generally location-scale models (Said et al., 2019), among others, have also received a lot of attention.
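The hyperbolic geometry of the univariate Gaussian model gives a closed-form Fisher-Rao distance, which the sketch below evaluates. It assumes the (mean, standard deviation) parameterization, in which the Fisher information metric is ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2, a rescaled Poincaré half-plane metric. The function name is hypothetical, and the sketch covers only the Gaussian example mentioned in the text, not the beta distributions of the title.

```python
import math


def fisher_rao_gaussian(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    # Substituting x = mu / sqrt(2) turns the metric into twice the
    # half-plane metric, hence the factor 1/2 on the squared mean gap.
    half_dmu_sq = (mu1 - mu2) ** 2 / 2.0
    num = half_dmu_sq + (sigma1 - sigma2) ** 2
    den = half_dmu_sq + (sigma1 + sigma2) ** 2
    # Half-plane distance is 2 * atanh(sqrt(num / den)); scale by sqrt(2).
    return 2.0 * math.sqrt(2.0) * math.atanh(math.sqrt(num / den))


# Equal means: the distance reduces to sqrt(2) * |log(sigma2 / sigma1)|.
print(fisher_rao_gaussian(0.0, 1.0, 0.0, math.e))
```

Note that the same shift in the mean counts for less when both standard deviations are large, reflecting that wide Gaussians with nearby means are harder to tell apart.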