
### Report on the Thirty-Fourth International Florida Artificial Intelligence Research Society Conference (FLAIRS-34)

The Thirty-Fourth International Florida Artificial Intelligence Research Society Conference (FLAIRS-34) was held May 17-19, 2021, at the DoubleTree Ocean Point Resort and Spa in North Miami Beach, Florida, USA. Due to the COVID-19 pandemic and travel restrictions, the conference was held in a hybrid format, with both virtual and in-person participation. The conference events included tutorials, invited speakers, special tracks, and presentations of papers, posters, and awards. The conference chair was Keith Brawner from the Army Research Laboratory. The program co-chairs were Roman Barták from Charles University, Prague, and Eric Bell, USA.

### Data-Driven Design-by-Analogy: State of the Art and Future Directions

Design-by-Analogy (DbA) is a design methodology wherein new solutions are generated in a target domain based on inspiration drawn from a source domain through cross-domain analogical reasoning [1, 2, 3]. DbA is an active research area in engineering design, and various methods and tools have been proposed to support the implementation of its process [4, 5, 6, 7, 8]. Studies have shown that DbA can help designers mitigate design fixation [9] and improve design ideation outcomes [10]. Fig. 1 presents an example of a DbA application [11]. This case aims to solve an engineering design problem: how might we rectify the loud sonic boom generated when trains travel at high speeds through tunnels in atmospheric conditions [11, 12]? In search of potential design solutions, engineers explored structures, in design fields other than trains and in nature, that effectively "break" the sonic-boom effect. Looking into nature, engineers discovered that kingfisher birds can slice through the air and dive into the water at extremely high speeds to catch prey while barely making a splash. By analogy, engineers redesigned the train's front-end nose to mimic the geometry of the kingfisher's beak. This analogical design reduced noise and eliminated tunnel booms.

### Discovering the Rationale of Decisions: Experiments on Aligning Learning and Reasoning

In AI and law, systems that are designed for decision support should be explainable when pursuing justice. In order for these systems to be fair and responsible, they should make correct decisions and make them using a sound and transparent rationale. In this paper, we introduce a knowledge-driven method for model-agnostic rationale evaluation using dedicated test cases, similar to unit-testing in professional software development. We apply this new method in a set of machine learning experiments aimed at extracting known knowledge structures from artificial datasets from fictional and non-fictional legal settings. We show that our method allows us to analyze the rationale of black box machine learning systems by assessing which rationale elements are learned or not. Furthermore, we show that the rationale can be adjusted using tailor-made training data based on the results of the rationale evaluation.
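The unit-testing analogy can be made concrete with a small sketch. This is purely illustrative, not the paper's code: a known rationale element from a fictional legal setting (here, a hypothetical "nuisance implies eviction" rule with made-up feature names) is turned into a test that any black-box predictor either passes or fails.

```python
# Hedged sketch of model-agnostic rationale evaluation (illustrative names,
# not the paper's implementation): a known rationale element is expressed as
# a dedicated test case, in the style of a software unit test.

def rationale_test_nuisance(predict):
    """Pass iff flipping only the 'nuisance' feature flips the decision."""
    base = {"nuisance": 0, "rent_arrears": 0, "contract_valid": 1}
    flipped = dict(base, nuisance=1)
    return predict(base) != predict(flipped)

# A toy model that has learned this rationale element...
learned = lambda case: int(case["nuisance"] == 1)
# ...and one that ignores the feature entirely.
ignores = lambda case: 0

results = (rationale_test_nuisance(learned), rationale_test_nuisance(ignores))
```

Because the test only calls `predict`, the same check applies to any classifier, which is what makes the evaluation model-agnostic; failing tests also indicate which tailor-made training data to add.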

### On the Explanation of Similarity for Developing and Deploying CBR Systems

During the early stages of developing Case-Based Reasoning (CBR) systems, the definition of similarity measures is challenging, since this task requires transferring implicit knowledge of domain experts into knowledge representations. While an entire CBR system is very explanatory, the similarity measure determines the ranking but does not necessarily show which features contribute to high (or low) rankings. In this paper we present our work on opening up the knowledge engineering process for similarity modelling. This work is the result of an interdisciplinary research collaboration between AI and public health researchers developing e-Health applications. During this work, explainability and transparency of the development process are crucial to allow in-depth quality assurance by the domain experts.
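A minimal sketch of the underlying idea, assuming a classic weighted global similarity over normalized numeric features (the feature names and weights below are invented for illustration): exposing each feature's weighted contribution lets domain experts see why a case ranked high or low.

```python
# Illustrative weights, as a domain expert might set them (not from the paper).
FEATURE_WEIGHTS = {"age": 0.2, "bmi": 0.3, "activity_level": 0.5}

def local_similarity(a, b):
    # Local similarity for numeric attributes normalized to [0, 1].
    return 1.0 - abs(a - b)

def explain_similarity(query, case):
    """Return the global similarity and each feature's weighted contribution."""
    contributions = {
        f: w * local_similarity(query[f], case[f])
        for f, w in FEATURE_WEIGHTS.items()
    }
    return sum(contributions.values()), contributions

query = {"age": 0.45, "bmi": 0.55, "activity_level": 0.20}
case  = {"age": 0.45, "bmi": 0.50, "activity_level": 0.90}

score, parts = explain_similarity(query, case)
# 'parts' shows which features drive the ranking: here activity_level
# contributes least despite its high weight, because the values differ most.
```

Surfacing `parts` alongside the ranking is one way to make the similarity measure itself inspectable during quality assurance, rather than only its final ordering.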

### Twin Systems for DeepCBR: A Menagerie of Deep Learning and Case-Based Reasoning Pairings for Explanation and Data Augmentation

Recently, it has been proposed that fruitful synergies may exist between Deep Learning (DL) and Case-Based Reasoning (CBR); that there are insights to be gained by applying CBR ideas to problems in DL (what could be called DeepCBR). In this paper, we report on a program of research that applies CBR solutions to the problem of Explainable AI (XAI) in DL. We describe a series of twin-system pairings of opaque DL models with transparent CBR models that allow the latter to explain the former using factual, counterfactual and semi-factual explanation strategies. This twinning shows that functional abstractions of DL (e.g., feature weights, feature importance and decision boundaries) can be used to drive these explanatory solutions. We also raise the prospect that this research applies to the problem of Data Augmentation in DL, underscoring the fecundity of these DeepCBR ideas.
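One way to picture the factual-explanation strategy is the sketch below. It is not the authors' implementation: it assumes feature-importance weights have already been abstracted from the opaque model (e.g., saliency- or LIME-style scores) and uses them to re-weight a k-NN search over the training cases, so the retrieved cases act as explanations-by-example.

```python
import numpy as np

# Stand-in training data for the CBR "twin" (synthetic, for illustration only).
rng = np.random.default_rng(0)
X_train = rng.random((100, 4))
y_train = (X_train[:, 0] > 0.5).astype(int)

# Assume these weights were abstracted from the opaque DL model.
feature_weights = np.array([0.7, 0.1, 0.1, 0.1])

def factual_explanation(query, k=3):
    # Weighted Euclidean distance emphasizes the features the network relied on,
    # so the nearest training cases reflect the model's own notion of similarity.
    d = np.sqrt(((X_train - query) ** 2 * feature_weights).sum(axis=1))
    idx = np.argsort(d)[:k]
    return idx, y_train[idx]          # nearest cases and their labels

idx, labels = factual_explanation(np.array([0.9, 0.2, 0.4, 0.5]))
```

The design point is that the transparent twin never needs the network's internals at explanation time, only the functional abstraction (the weights) and the training cases.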

### Algorithmic Factors Influencing Bias in Machine Learning

It is fair to say that many of the prominent examples of bias in Machine Learning (ML) arise from bias that is there in the training data. In fact, some would argue that supervised ML algorithms cannot be biased; they simply reflect the data on which they are trained. In this paper we demonstrate how ML algorithms can misrepresent the training data through underestimation. We show how irreducible error, regularization, and feature and class imbalance can contribute to this underestimation. The paper concludes with a demonstration of how the careful management of synthetic counterfactuals can ameliorate the impact of this underestimation bias.
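The interaction of irreducible error and class imbalance can be illustrated with a toy simulation (mine, not the paper's): when the classes overlap, even the Bayes-optimal threshold classifier predicts the minority class far less often than it actually occurs.

```python
import numpy as np

# Toy demonstration of underestimation: minority class prevalence is 20%,
# but class overlap (irreducible error) pushes optimal predictions toward
# the majority class, so the predicted minority rate is much lower than 20%.

rng = np.random.default_rng(1)
n = 10_000
y = (rng.random(n) < 0.2).astype(int)            # true minority rate: 20%
x = rng.normal(loc=y.astype(float), scale=1.0)   # overlapping 1-D feature

# Bayes-optimal rule for equal-variance Gaussians (means 0 and 1) with a
# 0.2 prior: predict class 1 when x exceeds 0.5 + ln(0.8/0.2).
threshold = 0.5 + np.log(0.8 / 0.2)              # ~1.886
y_hat = (x > threshold).astype(int)

print(y.mean(), y_hat.mean())  # predicted minority rate is well below 20%
```

Note this underestimation appears even though the classifier is the best possible one for the data; it reflects the loss being minimized, not a coding error, which is why remedies such as synthetic counterfactuals target the training distribution rather than the learner.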

### [P] Entity Embed: Fuzzy and Scalable Entity Resolution Using Approximate Nearest Neighbors

Entity Embed is based on and is a special case of the AutoBlock model described by Amazon. It allows you to transform entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors. Using Entity Embed, you can train a deep learning model to transform records into vectors in an N-dimensional embedding space. Thanks to a contrastive loss, those vectors are organized to keep similar records close and dissimilar records far apart in this embedding space. Embedding records enables scalable ANN search, which means finding thousands of candidate duplicate pairs of records per second per CPU.
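The retrieval step can be sketched conceptually as follows. This is not Entity Embed's API: a brute-force cosine search over unit-normalized vectors stands in for a real ANN index, and the "trained" embeddings are random placeholders.

```python
import numpy as np

# Pretend these are record vectors produced by a trained embedding model;
# they are unit-normalized so cosine similarity reduces to a dot product.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(1000, 64))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

def candidate_pairs(query_vec, k=5):
    # Exact nearest-neighbor search as a stand-in for ANN: in production an
    # ANN index trades a little recall for orders-of-magnitude more speed.
    sims = embeddings @ (query_vec / np.linalg.norm(query_vec))
    return np.argsort(-sims)[:k]

# A record's own vector should retrieve itself as the top hit.
hits = candidate_pairs(embeddings[7])
```

Because the contrastive loss keeps duplicates close in the embedding space, the top-k neighbors of each record form the candidate duplicate pairs that a downstream matcher then verifies.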

### Explainable Artificial Intelligence Reveals Novel Insight into Tumor Microenvironment Conditions Linked with Better Prognosis in Patients with Breast Cancer

We investigated the data-driven relationship between features in the tumor microenvironment (TME) and the overall and 5-year survival in triple-negative breast cancer (TNBC) and non-TNBC (NTNBC) patients by using Explainable Artificial Intelligence (XAI) models. We used clinical information from patients with invasive breast carcinoma from The Cancer Genome Atlas and from two studies from cBioPortal, the PanCanAtlas project and the GDAC Firehose study. In this study, we used a normalized RNA sequencing data-driven cohort of 1,015 breast cancer patients, alive or deceased, from the UCSC Xena data set and performed integrated deconvolution with the EPIC method to estimate the percentage of seven different immune and stromal cell types from RNA sequencing data. Novel insights derived from our XAI model showed that CD4+ T cells and B cells are more critical than other TME features for enhanced prognosis for both TNBC and NTNBC patients. Our XAI model revealed the critical inflection points (i.e., threshold fractions) of CD4+ T cells and B cells above or below which 5-year survival rates improve. Subsequently, we ascertained the conditional probabilities of ≥ 5-year survival in both TNBC and NTNBC patients under specific conditions inferred from the inflection points. In particular, the XAI models revealed that a B-cell fraction exceeding 0.018 in the TME could ensure 100% 5-year survival for NTNBC patients. The findings from this research could lead to more accurate clinical predictions, enhanced immunotherapies, and the design of innovative strategies to reprogram the TME of breast cancer patients.

### Case-based Reasoning for Natural Language Queries over Knowledge Bases

It is often challenging for a system to solve a new complex problem from scratch, but much easier if the system can access other similar problems and descriptions of their solutions -- a paradigm known as case-based reasoning (CBR). We propose a neuro-symbolic CBR approach for question answering over large knowledge bases (CBR-KBQA). While the idea of CBR is tempting, composing a solution from cases is nontrivial when individual cases contain only partial logic of the full solution. To resolve this, CBR-KBQA consists of two modules: a non-parametric memory that stores cases (questions and logical forms) and a parametric model that can generate logical forms by retrieving relevant cases from memory. Through experiments, we show that CBR-KBQA can effectively derive novel combinations of relations, not present in the case memory, that are required to answer compositional questions. On several KBQA datasets that test compositional generalization, CBR-KBQA achieves competitive performance. For example, on the challenging ComplexWebQuestions dataset, CBR-KBQA outperforms the current state of the art by 11% accuracy. Furthermore, we show that CBR-KBQA is capable of using new cases *without* any further training: just by incorporating a few human-labeled examples in the non-parametric case memory, CBR-KBQA is able to successfully generate queries containing unseen KB relations.
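A toy illustration of the non-parametric case memory (not the CBR-KBQA model itself): similar past questions and their logical forms are retrieved so that a generator can compose a new logical form from their parts. Plain token overlap stands in for the learned retriever, and the tiny memory below is invented for the example.

```python
# Each case pairs a natural-language question with its logical form.
CASE_MEMORY = [
    ("who directed Titanic", "(directed_by Titanic ?x)"),
    ("who wrote Titanic", "(written_by Titanic ?x)"),
    ("where was Avatar filmed", "(filming_location Avatar ?x)"),
]

def retrieve(query, k=2):
    # Score cases by word overlap with the query; a real system would use
    # learned dense representations instead.
    q = set(query.lower().split())
    scored = sorted(
        CASE_MEMORY,
        key=lambda case: len(q & set(case[0].lower().split())),
        reverse=True,
    )
    return scored[:k]

cases = retrieve("who directed Avatar")
# The top retrieved logical form supplies the relation (directed_by) and the
# query pattern needed to compose "(directed_by Avatar ?x)" for the new entity.
```

Because the memory is non-parametric, adding a new labeled case makes it immediately retrievable with no retraining, which mirrors the paper's claim about handling unseen KB relations.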