Greenwich
Designing Discontinuities
Ferwana, Ibtihal, Park, Suyoung, Wu, Ting-Yi, Varshney, Lav R.
Discontinuities can be fairly arbitrary but also cause a significant impact on outcomes in larger systems. Indeed, their arbitrariness is why they have been used to infer causal relationships among variables in numerous settings. Regression discontinuity from econometrics assumes the existence of a discontinuous variable that splits the population into distinct partitions to estimate the causal effects of a given phenomenon. Here we consider the design of partitions for a given discontinuous variable to optimize a certain effect previously studied using regression discontinuity. To do so, we propose a quantization-theoretic approach to optimize the effect of interest, first learning the causal effect size of a given discontinuous variable and then applying dynamic programming for optimal quantization design of discontinuities to balance the gain and loss in that effect size. We also develop a computationally-efficient reinforcement learning algorithm for the dynamic programming formulation of optimal quantization. We demonstrate our approach by designing optimal time zone borders for counterfactuals of social capital, social mobility, and health. This is based on regression discontinuity analyses we perform on novel data, which may be of independent empirical interest.
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination & Visual Illusion in Large Vision-Language Models
Guan, Tianrui, Liu, Fuxiao, Wu, Xiyang, Xian, Ruiqi, Li, Zongxia, Liu, Xiaoyu, Wang, Xijun, Chen, Lichang, Huang, Furong, Yacoob, Yaser, Manocha, Dinesh, Zhou, Tianyi
We introduce HallusionBench, a comprehensive benchmark designed for the evaluation of image-context reasoning. This benchmark presents significant challenges to advanced large visual-language models (LVLMs), such as GPT-4V(Vision) and LLaVA-1.5, by emphasizing nuanced understanding and interpretation of visual data. The benchmark comprises 346 images paired with 1129 questions, all meticulously crafted by human experts. We introduce a novel structure for these visual questions designed to establish control groups. This structure enables us to conduct a quantitative analysis of the models' response tendencies, logical consistency, and various failure modes. In our evaluation on HallusionBench, we benchmarked 13 different models, highlighting a 31.42% question-pair accuracy achieved by the state-of-the-art GPT-4V. Notably, all other evaluated models achieve accuracy below 16%. Moreover, our analysis not only highlights the observed failure modes, including language hallucination and visual illusion, but also deepens an understanding of these pitfalls. Our comprehensive case studies within HallusionBench shed light on the challenges of hallucination and illusion in LVLMs. Based on these insights, we suggest potential pathways for their future improvement. The benchmark and codebase can be accessed at https://github.com/tianyi-lab/HallusionBench.
DiffSpectralNet : Unveiling the Potential of Diffusion Models for Hyperspectral Image Classification
Sigger, Neetu, Nguyen, Tuan Thanh, Tozzi, Gianluca, Vien, Quoc-Tuan, Van Nguyen, Sinh
Hyperspectral images (HSI) have become popular for analysing remotely sensed images in multiple domain like agriculture, medical. However, existing models struggle with complex relationships and characteristics of spectral-spatial data due to the multi-band nature and data redundancy of hyperspectral data. To address this limitation, we propose a new network called DiffSpectralNet, which combines diffusion and transformer techniques. Our approach involves a two-step process. First, we use an unsupervised learning framework based on the diffusion model to extract both high-level and low-level spectral-spatial features. The diffusion method is capable of extracting diverse and meaningful spectral-spatial features, leading to improvement in HSI classification. Then, we employ a pretrained denoising U-Net to extract intermediate hierarchical features for classification. Finally, we use a supervised transformer-based classifier to perform the HSI classification. Through comprehensive experiments on HSI datasets, we evaluate the classification performance of DiffSpectralNet. The results demonstrate that our framework significantly outperforms existing approaches, achieving state-of-the-art performance.
Merging satellite and gauge-measured precipitation using LightGBM with an emphasis on extreme quantiles
Tyralis, Hristos, Papacharalampous, Georgia, Doulamis, Nikolaos, Doulamis, Anastasios
Knowing the actual precipitation in space and time is critical in hydrological modelling applications, yet the spatial coverage with rain gauge stations is limited due to economic constraints. Gridded satellite precipitation datasets offer an alternative option for estimating the actual precipitation by covering uniformly large areas, albeit related estimates are not accurate. To improve precipitation estimates, machine learning is applied to merge rain gauge-based measurements and gridded satellite precipitation products. In this context, observed precipitation plays the role of the dependent variable, while satellite data play the role of predictor variables. Random forests is the dominant machine learning algorithm in relevant applications. In those spatial predictions settings, point predictions (mostly the mean or the median of the conditional distribution) of the dependent variable are issued. The aim of the manuscript is to solve the problem of probabilistic prediction of precipitation with an emphasis on extreme quantiles in spatial interpolation settings. Here we propose, issuing probabilistic spatial predictions of precipitation using Light Gradient Boosting Machine (LightGBM). LightGBM is a boosting algorithm, highlighted by prize-winning entries in prediction and forecasting competitions. To assess LightGBM, we contribute a large-scale application that includes merging daily precipitation measurements in contiguous US with PERSIANN and GPM-IMERG satellite precipitation data. We focus on extreme quantiles of the probability distribution of the dependent variable, where LightGBM outperforms quantile regression forests (QRF, a variant of random forests) in terms of quantile score at extreme quantiles. Our study offers understanding of probabilistic predictions in spatial settings using machine learning.
Emotional Expression Detection in Spoken Language Employing Machine Learning Algorithms
Hosain, Mehrab, Arafat, Most. Yeasmin, Islam, Gazi Zahirul, Uddin, Jia, Hossain, Md. Mobarak, Alam, Fatema
There are a variety of features of the human voice that can be classified as pitch, timbre, loudness, and vocal tone. It is observed in numerous incidents that human expresses their feelings using different vocal qualities when they are speaking. The primary objective of this research is to recognize different emotions of human beings such as anger, sadness, fear, neutrality, disgust, pleasant surprise, and happiness by using several MATLAB functions namely, spectral descriptors, periodicity, and harmonicity. To accomplish the work, we analyze the CREMA-D (Crowd-sourced Emotional Multimodal Actors Data) & TESS (Toronto Emotional Speech Set) datasets of human speech. The audio file contains data that have various characteristics (e.g., noisy, speedy, slow) thereby the efficiency of the ML (Machine Learning) models increases significantly. The EMD (Empirical Mode Decomposition) is utilized for the process of signal decomposition. Then, the features are extracted through the use of several techniques such as the MFCC, GTCC, spectral centroid, roll-off point, entropy, spread, flux, harmonic ratio, energy, skewness, flatness, and audio delta. The data is trained using some renowned ML models namely, Support Vector Machine, Neural Network, Ensemble, and KNN. The algorithms show an accuracy of 67.7%, 63.3%, 61.6%, and 59.0% respectively for the test data and 77.7%, 76.1%, 99.1%, and 61.2% for the training data. We have conducted experiments using Matlab and the result shows that our model is very prominent and flexible than existing similar works.
Learning to play an instrument could boost your short-term memory
Playing a rhythm-based game for eight weeks helps non-musicians become better at remembering recently seen faces. This suggests that learning to play an instrument could improve short-term memory for non-musical tasks. There have been several studies showing that musicians tend to have better short-term memory than non-musicians when it comes to music-related tasks, such as remembering musical sequences. It is less clear whether these benefits carry over to non-musical tasks or to non-musicians who are learning to play an instrument, and how these changes might actually be seen in the brain. Theodore Zanto at the University of California, San Francisco, and his colleagues, randomly assigned a group of 47 non-musicians, aged between 60 and 79, to play either a tablet-based musical rhythm training game, which emulates learning to hit a drum in time with a teacher, or a word search game for eight weeks.
AI suggests how to make beer with whatever ingredients you have
AI that mimics aspects of the behaviour of flies seeking food can be used to design new beer recipes. It can also accurately recreate an existing brew using alternative ingredients when supplies are unstable. Mohammad Majid al-Rifaie at the University of Greenwich, UK, and his colleagues say that brewers traditionally follow existing recipes or create new ones using the time-consuming process of trial and error. He says that the AI tool is intended to "flip this whole equation around, so instead of โฆ
A Multi-modal Machine Learning Approach and Toolkit to Automate Recognition of Early Stages of Dementia among British Sign Language Users
Liang, Xing, Angelopoulou, Anastassia, Kapetanios, Epaminondas, Woll, Bencie, Al-batat, Reda, Woolfe, Tyron
The ageing population trend is correlated with an increased prevalence of acquired cognitive impairments such as dementia. Although there is no cure for dementia, a timely diagnosis helps in obtaining necessary support and appropriate medication. Researchers are working urgently to develop effective technological tools that can help doctors undertake early identification of cognitive disorder. In particular, screening for dementia in ageing Deaf signers of British Sign Language (BSL) poses additional challenges as the diagnostic process is bound up with conditions such as quality and availability of interpreters, as well as appropriate questionnaires and cognitive tests. On the other hand, deep learning based approaches for image and video analysis and understanding are promising, particularly the adoption of Convolutional Neural Network (CNN), which require large amounts of training data. In this paper, however, we demonstrate novelty in the following way: a) a multi-modal machine learning based automatic recognition toolkit for early stages of dementia among BSL users in that features from several parts of the body contributing to the sign envelope, e.g., hand-arm movements and facial expressions, are combined, b) universality in that it is possible to apply our technique to users of any sign language, since it is language independent, c) given the trade-off between complexity and accuracy of machine learning (ML) prediction models as well as the limited amount of training and testing data being available, we show that our approach is not over-fitted and has the potential to scale up.
The Age of Female Computers
Today, mathematics and computer science often appear as the province of geniuses working at the very edge of human ability and imagination. Even as American high schools struggle to employ qualified math and science teachers, American popular culture has embraced math, science, and computers as a mystic realm of extraordinary intellectual power, even verging on madness. Movies like A Beautiful Mind, Good Will Hunting, and Pi all present human intelligence in the esoteric symbolism of long, indecipherable, but visually captivating equations. One has to think of such prosaic activities as paying the mortgage and grocery shopping to be reminded of the quiet and non-revelatory quality of rudimentary arithmetic. Which is not to put such labor down.