Collaborating Authors

GPT-3 Creative Fiction


What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.
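A minimal sketch of the advice above, assuming the prompt is a plain string assembled before being sent to a model. The function name `build_summary_prompt`, the `primer` parameter, and the sample primer text are hypothetical and do not come from the original article; only the second-grader framing is quoted from it. No model API is called here.

```python
def build_summary_prompt(passage: str, primer: str = "") -> str:
    """Assemble a summarization prompt, optionally ending with the first
    words of the desired output to constrain the completion."""
    # Framing quoted from the passage above.
    framing = "My second grader asked me what this passage means:"
    parts = [framing, f'"{passage}"']
    if primer:
        # Hypothetical primer: the opening words of the target output.
        parts.append(primer)
    return "\n\n".join(parts)


prompt = build_summary_prompt(
    "Document recommendation systems have mostly relied on older methods.",
    primer="In simple words, it means",
)
print(prompt)
```

If the model still drifts into another mode of completion, lengthening `primer` from a few words to a full opening sentence is the next step the passage suggests.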

Large expert-curated database for benchmarking document similarity detection in biomedical literature search


Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, so that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations.

New Polynomial Classes for Logic-Based Abduction

AAAI Conferences

We address the problem of propositional logic-based abduction, i.e., the problem of searching for a best explanation for a given propositional observation according to a given propositional knowledge base. We give a general algorithm, based on the notion of projection; then we study restrictions over the representations of the knowledge base and of the query, and find new polynomial classes of abduction problems.
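To make the problem statement concrete, here is a brute-force sketch of propositional abduction by truth-table enumeration. It is not the projection-based algorithm of the paper, and it runs in exponential time. It assumes that an explanation is a set of hypotheses that is consistent with the knowledge base and, together with it, entails the observation, and that "best" means smallest in cardinality. The toy knowledge base and all names are hypothetical.

```python
from itertools import combinations, product


def models(variables, formula):
    """Yield every truth assignment (as a dict) that satisfies `formula`."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            yield assignment


def explain(variables, kb, hypotheses, observation):
    """Return all smallest hypothesis sets E such that KB and E are
    consistent and KB together with E entails the observation."""
    for size in range(len(hypotheses) + 1):
        found = []
        for candidate in combinations(hypotheses, size):
            def kb_and_candidate(a, candidate=candidate):
                return kb(a) and all(a[h] for h in candidate)

            satisfying = list(models(variables, kb_and_candidate))
            # Consistent (at least one model) and entailing (the
            # observation holds in every model of KB and E).
            if satisfying and all(observation(a) for a in satisfying):
                found.append(set(candidate))
        if found:
            return found
    return []


# Toy knowledge base (assumption): rain -> wet, sprinkler -> wet.
VARIABLES = ["rain", "sprinkler", "wet"]


def kb(a):
    return (not a["rain"] or a["wet"]) and (not a["sprinkler"] or a["wet"])


def observation(a):
    return a["wet"]


result = explain(VARIABLES, kb, ["rain", "sprinkler"], observation)
print(result)
```

The enumeration visits every assignment for every candidate set, which is why restricted classes with polynomial algorithms, as studied in the paper, are of interest.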

Towards automated symptoms assessment in mental health

Machine Learning

Activity and motion analysis has the potential to be used as a diagnostic tool for mental disorders. However, to date, little work has been performed in turning stratification measures of activity into useful symptom markers. The research presented in this thesis has focused on the identification of objective activity and behaviour metrics that could be useful for the analysis of mental health symptoms in the above-mentioned dimensions. Particular attention is given to the analysis of objective differences between disorders, as well as identification of clinical episodes of mania and depression in bipolar patients, and deterioration in borderline personality disorder patients. A principled framework is proposed for mHealth monitoring of psychiatric patients, based on measurable changes in behaviour, represented in physical activity time series, collected via mobile and wearable devices. The framework defines methods for direct computational analysis of symptoms in disorganisation and psychomotor dimensions, as well as measures for indirect assessment of mood, using patterns of physical activity, sleep and circadian rhythms. The approach of computational behaviour analysis proposed in this thesis has the potential for early identification of clinical deterioration in ambulatory patients, and allows for the specification of distinct and measurable behavioural phenotypes, thus enabling better understanding and treatment of mental disorders.

A review of machine learning applications in wildfire science and management

Machine Learning

Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exist opportunities to apply more current ML methods (e.g., deep learning and agent-based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high-quality data for use by practitioners of ML methods.