Lasenby, Joan
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding
Engelmann, Francis, Takmaz, Ayca, Schult, Jonas, Fedele, Elisabetta, Wald, Johanna, Peng, Songyou, Wang, Xi, Litany, Or, Tang, Siyu, Tombari, Federico, Pollefeys, Marc, Guibas, Leonidas, Tian, Hongbo, Wang, Chunjie, Yan, Xiaosheng, Wang, Bingwen, Zhang, Xuanyang, Liu, Xiao, Nguyen, Phuc, Nguyen, Khoi, Tran, Anh, Pham, Cuong, Huang, Zhening, Wu, Xiaoyang, Chen, Xi, Zhao, Hengshuang, Zhu, Lei, Lasenby, Joan
This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023. The goal of this workshop series is to provide a platform for exploration and discussion of open-vocabulary 3D scene understanding tasks, including but not limited to segmentation, detection and mapping. We provide an overview of the challenge hosted at the workshop, present the challenge dataset, the evaluation methodology, and brief descriptions of the winning methods. Additional details are available on the OpenSUN3D workshop website.
Evaluating Self-Supervised Learning for Molecular Graph Embeddings
Wang, Hanchen, Kaddour, Jean, Liu, Shengchao, Tang, Jian, Lasenby, Joan, Liu, Qi
Graph Self-Supervised Learning (GSSL) provides a robust pathway for acquiring embeddings without expert labelling, a capability that carries profound implications for molecular graphs due to the staggering number of potential molecules and the high cost of obtaining labels. However, GSSL methods are designed not for optimisation within a specific domain but rather for transferability across a variety of downstream tasks. This broad applicability complicates their evaluation. Addressing this challenge, we present "Molecular Graph Representation Evaluation" (MOLGRAPHEVAL), generating detailed profiles of molecular graph embeddings with interpretable and diversified attributes. MOLGRAPHEVAL offers a suite of probing tasks grouped into three categories: (i) generic graph, (ii) molecular substructure, and (iii) embedding space properties. By leveraging MOLGRAPHEVAL to benchmark existing GSSL methods against both current downstream datasets and our suite of tasks, we uncover significant inconsistencies between inferences drawn solely from existing datasets and those derived from more nuanced probing. These findings suggest that current evaluation methodologies fail to capture the entirety of the landscape.
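The probing protocol behind this kind of benchmark can be illustrated with a minimal sketch (this is not the MOLGRAPHEVAL code; the embeddings and the probed "node count" property are synthetic): freeze the embeddings, fit a linear probe to predict a generic graph property, and report how well the property is linearly decodable.

```python
import numpy as np

def linear_probe_r2(embeddings, property_values):
    """Fit a linear probe on frozen embeddings and return its R^2.

    A high R^2 means the property is linearly decodable from the
    embedding space -- the core idea behind probing tasks.
    """
    X = np.hstack([embeddings, np.ones((len(embeddings), 1))])  # add bias column
    w, *_ = np.linalg.lstsq(X, property_values, rcond=None)
    pred = X @ w
    ss_res = np.sum((property_values - pred) ** 2)
    ss_tot = np.sum((property_values - property_values.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic demo: embeddings that linearly encode a "node count" property.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 16))               # frozen molecular graph embeddings
node_count = emb @ rng.normal(size=16) + 5.0   # property hidden in the embedding
print(round(linear_probe_r2(emb, node_count), 3))  # ~1.0: perfectly decodable
```

In the full benchmark one would run many such probes (graph-level, substructure-level, and embedding-space properties) and compare their profiles across GSSL methods rather than a single downstream score.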
Sky-image-based solar forecasting using deep learning with multi-location data: training models locally, globally or via transfer learning?
Nie, Yuhao, Paletta, Quentin, Scott, Andea, Pomares, Luis Martin, Arbod, Guillaume, Sgouridis, Sgouris, Lasenby, Joan, Brandt, Adam
Solar forecasting from ground-based sky images has shown great promise in reducing the uncertainty in solar power generation. With more and more sky image datasets open-sourced in recent years, the potential for developing accurate and reliable deep-learning-based solar forecasting methods has grown substantially. In this study, we explore three training strategies for solar forecasting models by leveraging three heterogeneous datasets collected globally under different climate patterns. Specifically, we compare the performance of local models trained individually on single datasets with that of global models trained jointly on the fusion of multiple datasets, and further examine knowledge transfer from pre-trained solar forecasting models to a new dataset of interest. The results suggest that local models work well when deployed locally, but significant errors are observed when they are applied off-site. The global model can adapt well to individual locations, at the cost of a potential increase in training effort. Pre-training models on a large and diversified source dataset and then transferring to a target dataset generally achieves superior performance over the other two strategies: with 80% less training data, it can achieve performance comparable to the local baseline trained on the entire dataset.
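The three strategies can be sketched at a toy level (a minimal illustration, not the paper's models: least-squares linear "models" stand in for forecasting networks, the sites and their climate shifts are synthetic, and fine-tuning is approximated by blending weights):

```python
import numpy as np

def fit(X, y):
    """Least-squares 'model' standing in for a solar forecasting network."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fine_tune(w_pretrained, X, y, blend=0.5):
    """Crude stand-in for fine-tuning: blend pre-trained weights with a local fit."""
    return blend * w_pretrained + (1 - blend) * fit(X, y)

rng = np.random.default_rng(1)
w_true = rng.normal(size=8)

def make_site(n, shift):
    """Each 'site' shares the underlying physics but has a climate-specific shift."""
    X = rng.normal(size=(n, 8)) + shift
    return X, X @ w_true + 0.1 * rng.normal(size=n)

site_a, site_b, target = make_site(500, 0.0), make_site(500, 0.5), make_site(100, 1.0)

w_local = fit(*target)                                # strategy 1: local model
w_global = fit(np.vstack([site_a[0], site_b[0]]),     # strategy 2: global model
               np.concatenate([site_a[1], site_b[1]]))
w_transfer = fine_tune(w_global,                      # strategy 3: pre-train, then
                       target[0][:20], target[1][:20])  # fine-tune on 20% of target
```

The point of the sketch is the data-flow, not the numbers: the transfer strategy touches only a fraction of the target data, mirroring the paper's finding that pre-training plus transfer can match a local baseline with far less local data.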
A Temporally Consistent Image-based Sun Tracking Algorithm for Solar Energy Forecasting Applications
Paletta, Quentin, Lasenby, Joan
Improving irradiance forecasting is critical to further increase the share of solar in the energy mix. On a short time scale, fish-eye cameras on the ground are used to capture the cloud displacements that cause local variability in electricity production. As most of the solar radiation comes directly from the Sun, current forecasting approaches use its position in the image as a reference to interpret cloud cover dynamics. However, existing Sun tracking methods rely on external data and a calibration of the camera, which requires access to the device. To address these limitations, this study introduces an image-based Sun tracking algorithm to localise the Sun in the image when it is visible and interpolate its daily trajectory from past observations. We validate the method on a set of sky images collected over a year at SIRTA's lab. Experimental results show that the proposed method provides robust, smooth Sun trajectories with a mean absolute error below 1% of the image size.
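The two-stage idea (detect when visible, interpolate otherwise) can be sketched as follows. This is a simplified illustration, not the paper's algorithm: the Sun is detected as the brightest pixel, the daily path is fit with a low-order polynomial per image coordinate, and the trajectory data is synthetic.

```python
import numpy as np

def detect_sun(image):
    """Return (row, col) of the brightest pixel -- a crude Sun detector."""
    return np.unravel_index(np.argmax(image), image.shape)

def fit_trajectory(times, positions, degree=2):
    """Fit the daily Sun path with one polynomial per image coordinate."""
    times, positions = np.asarray(times), np.asarray(positions, float)
    return [np.polyfit(times, positions[:, k], degree) for k in range(2)]

def interpolate_position(coeffs, t):
    """Predict the Sun position at time t (e.g. when clouds hide it)."""
    return tuple(np.polyval(c, t) for c in coeffs)

# Synthetic smooth trajectory: detections at some times, query in a gap.
ts = [0, 1, 2, 4, 5]
obs = [(10 + 2 * t, 5 + 3 * t - 0.2 * t**2) for t in ts]
coeffs = fit_trajectory(ts, obs)
row, col = interpolate_position(coeffs, 3.0)
print(round(row, 1), round(col, 1))  # 16.0 12.2 -- the gap at t=3 is filled
```

Because the Sun's apparent daily path in the image is smooth, a low-order fit over past detections is enough to bridge cloudy intervals.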
Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence
Bai, Xiang, Wang, Hanchen, Ma, Liya, Xu, Yongchao, Gan, Jiefeng, Fan, Ziwei, Yang, Fan, Ma, Ke, Yang, Jiehua, Bai, Song, Shu, Chang, Zou, Xinyu, Huang, Renhao, Zhang, Changzheng, Liu, Xiaowu, Tu, Dandan, Xu, Chuou, Zhang, Wenqing, Wang, Xi, Chen, Anguo, Zeng, Yu, Yang, Dehua, Wang, Ming-Wei, Holalkere, Nagaraj, Halin, Neil J., Kamel, Ihab R., Wu, Jia, Peng, Xuehua, Wang, Xiang, Shao, Jianbo, Mongkolwat, Pattanasak, Zhang, Jianjun, Liu, Weiyang, Roberts, Michael, Teng, Zhongzhao, Beer, Lucian, Sanchez, Lorena Escudero, Sala, Evis, Rubin, Daniel, Weller, Adrian, Lasenby, Joan, Zheng, Chuangsheng, Wang, Jianming, Li, Zhen, Schönlieb, Carola-Bibiane, Xia, Tian
An efficient and effective privacy-preserving AI framework is proposed for CT-based COVID-19 diagnosis, based on 9,573 CT scans of 3,336 patients from 23 hospitals in China and the UK. Artificial intelligence (AI) provides a promising substitute for streamlining COVID-19 diagnosis. However, concerns surrounding security and trustworthiness impede the collection of large-scale representative medical data, posing a considerable challenge for training a well-generalised model for clinical practice. To address this, we launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), in which the AI model can be distributedly trained and independently executed at each host institution under a federated learning (FL) framework without data sharing. Here we show that our FL model outperformed all the local models by a large margin (test sensitivity/specificity in China: 0.973/0.951; in the UK: 0.730/0.942). We further evaluated the model on hold-out data (collected from two additional hospitals outside the FL) and heterogeneous data (acquired with contrast materials), provided visual explanations for decisions made by the model, and analysed the trade-offs between model performance and communication costs in the federated training process. Our study is based on 9,573 chest computed tomography (CT) scans from 3,336 patients collected from 23 hospitals in China and the UK. Collectively, our work advances the prospects of utilising federated learning for privacy-preserving AI in digital health. As the gold standard for identifying COVID-19 carriers, reverse transcription-polymerase chain reaction (RT-PCR) is the primary diagnostic modality used to detect viral nucleotides in specimens from cases with suspected infection.
It has been reported that coronavirus carriers present certain radiological features in chest CTs, including ground-glass opacity, interlobular septal thickening, and consolidation, which can be exploited to identify COVID-19 cases.
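The core aggregation step of federated learning can be sketched in a few lines (a FedAvg-style illustration, not the UCADI implementation; the hospitals, parameter vectors, and scan counts are hypothetical): each site trains locally, and a coordinating server averages the resulting parameters weighted by local data size.

```python
def federated_average(site_weights, site_sizes):
    """FedAvg-style aggregation: average model parameters across sites,
    weighted by each site's number of training scans. Only parameters
    leave a hospital -- never the patient data itself."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]

# Three hypothetical hospitals with locally trained parameter vectors.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]  # scans per hospital
print(federated_average(weights, sizes))  # [3.5, 4.5]
```

In a real deployment this exchange would be repeated over many communication rounds, which is exactly the performance-versus-communication-cost trade-off the study analyses.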
ECLIPSE: Envisioning Cloud Induced Perturbations in Solar Energy
Paletta, Quentin, Hu, Anthony, Arbod, Guillaume, Lasenby, Joan
Efficient integration of solar energy into the electricity mix depends on a reliable anticipation of its intermittency. A promising approach to forecasting the temporal variability of solar irradiance resulting from cloud cover dynamics is based on the analysis of sequences of ground-taken sky images. Despite encouraging results, a recurrent limitation of current Deep Learning approaches lies in their ubiquitous tendency to react to past observations rather than actively anticipate future events. This leads to a systematic temporal lag and little ability to predict sudden events. To address this challenge, we introduce ECLIPSE, a spatio-temporal neural network architecture that models cloud motion from sky images to predict both future segmented images and corresponding irradiance levels. We show that ECLIPSE anticipates critical events and considerably reduces temporal delay while generating visually realistic futures.
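The coupling between the two predicted outputs (future cloud segmentation and irradiance) can be illustrated with a toy sketch. This is emphatically not the ECLIPSE network, just a minimal numpy stand-in with assumed motion vectors and irradiance constants: advect a binary cloud mask to predict the next segmented frame, then derive an irradiance proxy from whether the predicted clouds reach the Sun's pixel.

```python
import numpy as np

def predict_next_mask(mask, motion):
    """Advect a binary cloud mask by an integer (dy, dx) motion vector."""
    return np.roll(mask, motion, axis=(0, 1))

def irradiance_proxy(mask, sun_pos, clear=1000.0, cloudy=300.0):
    """Toy irradiance: high when the Sun pixel is cloud-free."""
    return cloudy if mask[sun_pos] else clear

# A small cloud drifting right toward the Sun at column 6.
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 2:4] = True                     # current cloud
sun = (3, 6)
future = predict_next_mask(mask, (0, 3))  # assume 3 px/frame eastward motion
print(irradiance_proxy(mask, sun), irradiance_proxy(future, sun))  # 1000.0 300.0
```

A model that only reacts to the current frame would keep predicting clear-sky irradiance right up to the occlusion; explicitly propagating cloud motion is what lets the forecast anticipate the drop.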
The unreasonable effectiveness of the forget gate
van der Westhuizen, Jos, Lasenby, Joan
Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. Here we show that a forget-gate-only version of the LSTM with chrono-initialized biases not only provides computational savings but also outperforms the standard LSTM on multiple benchmark datasets and competes with some of the best contemporary models. Our proposed network, the JANET, achieves accuracies of 99% and 92.5% on the MNIST and pMNIST datasets, outperforming the standard LSTM, which yields accuracies of 98.5% and 91%.
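The forget-gate-only idea can be sketched as a single recurrent cell (a simplified scalar reading of the design, not the authors' code; the weight initialisation scale and the maximum expected dependency length `T` are assumptions). The update is h_t = f_t · h_{t-1} + (1 − f_t) · tanh(candidate), with the forget bias chrono-initialised as log(U(1, T−1)) so that, at initialisation, memories can persist over timescales up to T.

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class JanetCell:
    """Forget-gate-only LSTM cell (scalar input/hidden for clarity).

    The single gate f decides how much of the previous state to keep
    and, via (1 - f), how much of the new candidate to let in.
    """
    def __init__(self, T, seed=0):
        rng = random.Random(seed)
        self.uf, self.wf = rng.gauss(0, 0.1), rng.gauss(0, 0.1)
        self.uc, self.wc = rng.gauss(0, 0.1), rng.gauss(0, 0.1)
        self.bf = math.log(rng.uniform(1, T - 1))  # chrono initialisation
        self.bc = 0.0

    def step(self, x, h):
        f = sigmoid(self.uf * x + self.wf * h + self.bf)
        cand = math.tanh(self.uc * x + self.wc * h + self.bc)
        return f * h + (1 - f) * cand

cell = JanetCell(T=100)
# At initialisation the forget gate is biased towards remembering:
print(sigmoid(cell.bf) > 0.5)  # True: bf = log(u) > 0 for u in (1, T-1)
```

Dropping the input and output gates roughly halves the gate computation per step, which is where the reported computational savings come from.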
What does an LSTM look for in classifying heartbeats?
van der Westhuizen, Jos, Lasenby, Joan
Long short-term memory (LSTM) recurrent neural networks are renowned for being uninterpretable "black boxes". In the medical domain, where LSTMs have shown promise, this is particularly concerning: in acute clinical situations it is imperative to understand the decisions made by machine learning models. This study employs techniques from the convolutional neural network domain to elucidate which inputs are important when LSTMs classify electrocardiogram signals. Of the various techniques available for determining input feature saliency, learning an occlusion mask was found to be the most effective.
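The occlusion idea can be sketched in its simplest sliding-window form (a toy illustration, not the learned-mask method the study favours; the `max`-based scorer merely stands in for an LSTM classifier): replace a window of the input with a neutral value, re-score, and record the score drop as that position's saliency.

```python
def occlusion_saliency(signal, score_fn, window=3, fill=0.0):
    """Saliency of each position: how much the model's score drops when a
    window centred there is replaced by a neutral fill value."""
    base = score_fn(signal)
    saliency = []
    for i in range(len(signal)):
        lo, hi = max(0, i - window // 2), min(len(signal), i + window // 2 + 1)
        occluded = signal[:lo] + [fill] * (hi - lo) + signal[hi:]
        saliency.append(base - score_fn(occluded))
    return saliency

# Toy 'classifier' standing in for an LSTM: scores the peak amplitude,
# so the R-wave-like spike should dominate the saliency map.
heartbeat = [0.0, 0.1, 0.0, 1.0, 0.2, 0.0, 0.1, 0.0]
sal = occlusion_saliency(heartbeat, score_fn=max)
print(sal.index(max(sal)))  # 3: the spike is what the scorer relies on
```

The learned-mask variant replaces this exhaustive sweep with an optimisation over the mask itself, but the interpretation is the same: inputs whose removal hurts the score most are the ones the model relies on.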