
AI For Compliance: What, Why, How


With the steady rise in the use of technology, Artificial Intelligence (AI) has become a valuable companion to compliance. Compliance plays a pivotal role in banking institutions. It aims to identify, mitigate, and manage risks such as insider trading, spoofing attacks, exploitation of the market, front-running, and more by ensuring that banks operate with integrity and adhere to policies, laws, and regulations during the decision-making process. In this post, I will dive into the concept of Artificial Intelligence and compliance, and share some thoughts about why it matters and how to achieve better compliance with AI. In financial services institutions such as banks, the compliance department is the body responsible for ensuring that the institution as a whole remains in accordance with set rules or standards.

Automated Drug-Related Information Extraction from French Clinical Documents: ReLyfe Approach

Structuring medical data in France remains a challenge, mainly because of the lack of medical data due to privacy concerns and the lack of methods and approaches for processing the French language. One of these challenges is structuring drug-related information in French clinical documents. To our knowledge, fewer than five relevant papers over the last decade have studied French prescriptions. This paper proposes a new approach for extracting drug-related information from scanned French clinical documents while preserving patients' privacy. The method combines a rule-based phase with a Deep Learning approach. In addition, we deployed our method in a health data management platform, where it is used to structure drug-related medical data and help patients organize their drug schedules. It can be implemented on any web or mobile platform. This work closes the gap between theoretical and practical work by creating an application adapted to real production problems. Finally, numerical results demonstrate the performance and relevance of the proposed methodology.
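To make the idea of a rule-based phase concrete, here is a minimal sketch in Python. It is not the ReLyfe method: the abstract does not give the actual rules, so the regular expression, the field names, the function `extract_prescription`, and the sample prescription lines are all assumptions chosen for illustration. It also assumes the scanned document has already been converted to text by OCR.

```python
import re

# Hypothetical pattern: drug name, dose, unit, and an optional
# "N fois par jour/semaine" (N times per day/week) frequency.
PATTERN = re.compile(
    r"(?P<drug>[A-Za-zÀ-ÿ-]+)\s+"
    r"(?P<dose>\d+(?:[.,]\d+)?)\s*(?P<unit>mg|g|ml|µg)\b"
    r"(?:.*?(?P<freq>\d+)\s*fois\s+par\s+(?P<period>jour|semaine))?",
    re.IGNORECASE,
)


def extract_prescription(line):
    """Return drug fields found in one line of OCR text, or None."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "drug": m.group("drug"),
        # French decimals use a comma, so normalize before converting.
        "dose": float(m.group("dose").replace(",", ".")),
        "unit": m.group("unit").lower(),
        "times": int(m.group("freq")) if m.group("freq") else None,
        "period": m.group("period").lower() if m.group("period") else None,
    }


# Illustrative input line, not taken from the paper.
result = extract_prescription(
    "DOLIPRANE 500 mg : 1 comprimé 3 fois par jour pendant 5 jours"
)
```

In a hybrid pipeline of the kind the abstract describes, rules like this would plausibly handle the regular patterns, with a learned model covering the lines the rules miss. That division of labor is an assumption here, not a claim about the paper.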

Fine-Grained Entity Typing for Domain Independent Entity Linking

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a domain can be narrowly construed as a particular distribution of entities, as models can even overfit by memorizing properties of specific frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO (Hoffart et al., 2011) dataset and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al., 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.

1 Introduction

Historically, systems for entity linking to Wikipedia relied on heuristics such as anchor text distributions (Cucerzan, 2007; Milne and Witten, 2008), tf-idf (Ratinov et al., 2011), and Wikipedia relatedness of nearby entities (Hoffart et al., 2011). These systems have few parameters, making them relatively flexible across domains. More recent systems have typically been parameter-rich neural network models (Sun et al., 2015; Yamada et al., 2016; Francis-Landau et al., 2016; Eshel et al., 2017).
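The test-time linking step described in the abstract (score each candidate entity against the mention's soft type predictions and pick the most similar) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the text does not specify the similarity measure, so cosine similarity is assumed, and the four-type inventory, the probabilities, and the candidate list are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_mention(type_probs, candidates):
    """Return the candidate whose type vector best matches the soft predictions."""
    return max(candidates, key=lambda name: cosine(type_probs, candidates[name]))


# Hypothetical type inventory (the paper uses tens of thousands of types).
TYPES = ["person", "politician", "city", "sports_team"]

# Hypothetical typing-model output for the mention "Washington"
# in a sentence about a football game.
mention_probs = [0.05, 0.02, 0.30, 0.90]

# Each candidate entity is described by a binary vector over TYPES.
candidates = {
    "George Washington": [1, 1, 0, 0],
    "Washington, D.C.": [0, 0, 1, 0],
    "Washington Commanders": [0, 0, 0, 1],
}

best = link_mention(mention_probs, candidates)
```

Because the linker only compares type vectors, it never needs to have seen a particular entity during training, which is the property the abstract relies on for domain independence.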

Graph integration of structured, semistructured and unstructured data for data journalism

Nowadays, journalism is facilitated by the existence of large numbers of digital data sources, including many Open Data ones. Such data sources are extremely heterogeneous, ranging from highly structured (relational databases) and semi-structured (JSON, XML, HTML) to graphs (e.g., RDF) and text. Journalists (and other classes of users lacking advanced IT expertise, such as most non-governmental organizations or small public administrations) need to be able to make sense of such heterogeneous corpora, even if they lack the ability to define and deploy custom extract-transform-load workflows. These are difficult to set up, not only because the inputs are arbitrary and heterogeneous, but also because users may want to add datasets to (or remove them from) the corpus. We describe a complete approach for integrating dynamic sets of heterogeneous data sources along the lines described above: the challenges we faced to make such graphs useful and allow their integration to scale, and the solutions we proposed for these problems. Our approach is implemented within the ConnectionLens system; we validate it through a set of experiments.
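As a rough illustration of what integrating heterogeneous sources into one graph can mean, the Python sketch below loads a relational row, a JSON document, and a text snippet into a single graph in which identical values share one node. It is not the ConnectionLens data model, which the abstract does not detail: the `Graph` class, the helper functions, the value-merging rule, and the sample data are all assumptions made for this example.

```python
import json


class Graph:
    def __init__(self):
        self.labels = {}       # node id -> label
        self.edges = []        # (source id, edge label, target id)
        self.value_nodes = {}  # normalized value -> node id, shared by all sources

    def add_node(self, label):
        node_id = len(self.labels)
        self.labels[node_id] = label
        return node_id

    def value_node(self, value):
        """Reuse one node per distinct value so that sources become connected."""
        key = str(value).strip().lower()
        if key not in self.value_nodes:
            self.value_nodes[key] = self.add_node(str(value))
        return self.value_nodes[key]


def add_row(graph, table, row):
    """Structured data: one node per tuple, one edge per column."""
    node = graph.add_node(table)
    for column, value in row.items():
        graph.edges.append((node, column, graph.value_node(value)))
    return node


def add_json(graph, obj, parent, label):
    """Semi-structured data: mirror the nesting of the document."""
    if isinstance(obj, dict):
        node = graph.add_node(label)
        graph.edges.append((parent, label, node))
        for key, value in obj.items():
            add_json(graph, value, node, key)
    elif isinstance(obj, list):
        for item in obj:
            add_json(graph, item, parent, label)
    else:
        graph.edges.append((parent, label, graph.value_node(obj)))


def add_text(graph, name, text):
    """Text: naive substring matching stands in for real entity extraction.
    Only values already in the graph are found, so text is loaded last here."""
    node = graph.add_node(name)
    lowered = text.lower()
    for key, value_id in list(graph.value_nodes.items()):
        if key in lowered:
            graph.edges.append((node, "mentions", value_id))
    return node


g = Graph()
add_row(g, "companies", {"name": "Acme Corp", "city": "Lyon"})
doc = json.loads('{"grant": {"recipient": "Acme Corp", "amount": 50000}}')
add_json(g, doc, g.add_node("grants.json"), "root")
add_text(g, "article.txt", "The mayor praised Acme Corp during the visit.")

# All three sources now reach the same "Acme Corp" node.
shared = g.value_nodes["acme corp"]
linked_from = [g.labels[src] for src, _, dst in g.edges if dst == shared]
```

Adding or removing a dataset in this toy model only touches that dataset's own nodes and edges, which hints at why a graph representation suits corpora that change over time.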

Named-Entity Linking Using Deep Learning For Legal Documents: A Transfer Learning Approach

In the legal domain, it is important to distinguish named entities from words in general, and afterwards to link the occurrences of the same entities. The task that addresses these challenges is called Named-Entity Linking (NEL). Current supervised neural networks designed for NEL use publicly available datasets for training and testing. This paper, however, focuses on applying a transfer learning approach, using networks trained for NEL, to legal documents. Experiments show consistent improvement on the legal datasets that were created from European Union law in the scope of this research. Using the transfer learning approach, we reached F1-scores of 98.90% and 98.01% on the small and large legal test datasets, respectively.