Pattern Recognition
Fairness in Image Search: A Study of Occupational Stereotyping in Image Retrieval and its Debiasing
Multi-modal search engines have experienced significant growth and widespread use in recent years, making them the second most common internet use. While search engine systems offer a range of services, the image search field has recently become a focal point in the information retrieval community, as the adage goes, "a picture is worth a thousand words". Although popular search engines like Google excel at image search accuracy and agility, there is an ongoing debate over whether their search results can be biased in terms of gender, language, demographics, socio-cultural aspects, and stereotypes. This potential for bias can have a significant impact on individuals' perceptions and influence their perspectives. In this paper, we present our study on bias and fairness in web search, with a focus on keyword-based image search. We first discuss several kinds of biases that exist in search systems and why it is important to mitigate them. We narrow down our study to assessing and mitigating occupational stereotypes in image search, which is a prevalent fairness issue in image retrieval. For the assessment of stereotypes, we take gender as an indicator. We explore various open-source and proprietary APIs for gender identification from images. With these, we examine the extent of gender bias in top-tanked image search results obtained for several occupational keywords. To mitigate the bias, we then propose a fairness-aware re-ranking algorithm that optimizes (a) relevance of the search result with the keyword and (b) fairness w.r.t genders identified. We experiment on 100 top-ranked images obtained for 10 occupational keywords and consider random re-ranking and re-ranking based on relevance as baselines. Our experimental results show that the fairness-aware re-ranking algorithm produces rankings with better fairness scores and competitive relevance scores than the baselines.
On Exact Bayesian Credible Sets for Classification and Pattern Recognition
The current definition of a Bayesian credible set cannot, in general, achieve an arbitrarily preassigned credible level. This drawback is particularly acute for classification problems, where there are only a finite number of achievable credible levels. As a result, there is as of today no general way to construct an exact credible set for classification. In this paper, we introduce a generalized credible set that can achieve any preassigned credible level. The key insight is a simple connection between the Bayesian highest posterior density credible set and the Neyman--Pearson lemma, which, as far as we know, hasn't been noticed before. Using this connection, we introduce a randomized decision rule to fill the gaps among the discrete credible levels. Accompanying this methodology, we also develop the Steering Wheel Plot to represent the credible set, which is useful in visualizing the uncertainty in classification. By developing the exact credible set for discrete parameters, we make the theory of Bayesian inference more complete.
GaitPT: Skeletons Are All You Need For Gait Recognition
Catruna, Andy, Cosma, Adrian, Radoi, Emilian
The analysis of patterns of walking is an important area of research that has numerous applications in security, healthcare, sports and human-computer interaction. Lately, walking patterns have been regarded as a unique fingerprinting method for automatic person identification at a distance. In this work, we propose a novel gait recognition architecture called Gait Pyramid Transformer (GaitPT) that leverages pose estimation skeletons to capture unique walking patterns, without relying on appearance information. GaitPT adopts a hierarchical transformer architecture that effectively extracts both spatial and temporal features of movement in an anatomically consistent manner, guided by the structure of the human skeleton. Our results show that GaitPT achieves state-of-the-art performance compared to other skeleton-based gait recognition works, in both controlled and in-the-wild scenarios. GaitPT obtains 82.6% average accuracy on CASIA-B, surpassing other works by a margin of 6%. Moreover, it obtains 52.16% Rank-1 accuracy on GREW, outperforming both skeleton-based and appearance-based approaches.
ChinaTelecom System Description to VoxCeleb Speaker Recognition Challenge 2023
Du, Mengjie, Fang, Xiang, Li, Jie
This technical report describes ChinaTelecom system for Track 1 (closed) of the VoxCeleb2023 Speaker Recognition Challenge (VoxSRC 2023). Our system consists of several ResNet variants trained only on VoxCeleb2, which were fused for better performance later. Score calibration was also applied for each variant and the fused system. The final submission achieved minDCF of 0.1066 and EER of 1.980%.
BIRP: Bitcoin Information Retrieval Prediction Model Based on Multimodal Pattern Matching
Kim, Minsuk, Kim, Byungchul, Yong, Junyeong, Park, Jeongwoo, Kim, Gyeongmin
Financial time series have historically been assumed to be a martingale process under the Random Walk hypothesis. Instead of making investment decisions using the raw prices alone, various multimodal pattern matching algorithms have been developed to help detect subtly hidden repeatable patterns within the financial market. Many of the chart-based pattern matching tools only retrieve similar past chart (PC) patterns given the current chart (CC) pattern, and leaves the entire interpretive and predictive analysis, thus ultimately the final investment decision, to the investors. In this paper, we propose an approach of ranking similar PC movements given the CC information and show that exploiting this as additional features improves the directional prediction capacity of our model. We apply our ranking and directional prediction modeling methodologies on Bitcoin due to its highly volatile prices that make it challenging to predict its future movements.
pNNCLR: Stochastic Pseudo Neighborhoods for Contrastive Learning based Unsupervised Representation Learning Problems
Biswas, Momojit, Buckchash, Himanshu, Prasad, Dilip K.
Nearest neighbor (NN) sampling provides more semantic variations than pre-defined transformations for self-supervised learning (SSL) based image recognition problems. However, its performance is restricted by the quality of the support set, which holds positive samples for the contrastive loss. In this work, we show that the quality of the support set plays a crucial role in any nearest neighbor based method for SSL. We then provide a refined baseline (pNNCLR) to the nearest neighbor based SSL approach (NNCLR). To this end, we introduce pseudo nearest neighbors (pNN) to control the quality of the support set, wherein, rather than sampling the nearest neighbors, we sample in the vicinity of hard nearest neighbors by varying the magnitude of the resultant vector and employing a stochastic sampling strategy to improve the performance. Additionally, to stabilize the effects of uncertainty in NN-based learning, we employ a smooth-weight-update approach for training the proposed network. Evaluation of the proposed method on multiple public image recognition and medical image recognition datasets shows that it performs up to 8 percent better than the baseline nearest neighbor method, and is comparable to other previously proposed SSL methods.
Evaluating the anticipated outcomes of MRI seizure image from open-source tool- Prototype approach
Vajiram, Jayanthi, Senthil, Aishwarya, Maurya, Utkarsh
Clinical neuroscience studies are based on neuroimaging and psychiatric. These studies are very important for genetic data processing, cognitive evaluations, and follow-up. Brain imaging describes the colorful ways to either directly or laterally structure and function the neuroimaging system. Neuroradiologists do the interpretation of structural and functional imaging and clinical analysis of the Brain. Structural imaging deals with the nervous system structure and records the opinion of intracranial complaints of excrescence and injury.
Automated Learning for Deformable Medical Image Registration by Jointly Optimizing Network Architectures and Objective Functions
Fan, Xin, Li, Zi, Li, Ziyang, Wang, Xiaolin, Liu, Risheng, Luo, Zhongxuan, Huang, Hao
Deformable image registration plays a critical role in various tasks of medical image analysis. A successful registration algorithm, either derived from conventional energy optimization or deep networks requires tremendous efforts from computer experts to well design registration energy or to carefully tune network architectures for the specific type of medical data. To tackle the aforementioned problems, this paper proposes an automated learning registration algorithm (AutoReg) that cooperatively optimizes both architectures and their corresponding training objectives, enabling non-computer experts, e.g., medical/clinical users, to conveniently find off-the-shelf registration algorithms for diverse scenarios. Specifically, we establish a triple-level framework to deduce registration network architectures and objectives with an auto-searching mechanism and cooperating optimization. We conduct image registration experiments on multi-site volume datasets and various registration tasks. Extensive results demonstrate that our AutoReg may automatically learn an optimal deep registration network for given volumes and achieve state-of-the-art performance, also significantly improving computation efficiency than the mainstream UNet architectures (from 0.558 to 0.270 seconds for a 3D image pair on the same configuration).
Optical Script Identification for multi-lingual Indic-script
Poddar, Sidhantha, Gupta, Rohan
Script identification and text recognition are some of the major domains in the application of Artificial Intelligence. In this era of digitalization, the use of digital note-taking has become a common practice. Still, conventional methods of using pen and paper is a prominent way of writing. This leads to the classification of scripts based on the method they are obtained. A survey on the current methodologies and state-of-art methods used for processing and identification would prove beneficial for researchers. The aim of this article is to discuss the advancement in the techniques for script pre-processing and text recognition. In India there are twelve prominent Indic scripts, unlike the English language, these scripts have layers of characteristics. Complex characteristics such as similarity in text shape make them difficult to recognize and analyze, thus this requires advance preprocessing methods for their accurate recognition. A sincere attempt is made in this survey to provide a comparison between all algorithms. We hope that this survey would provide insight to a researcher working not only on Indic scripts but also other languages.
Meta-learning in healthcare: A survey
Rafiei, Alireza, Moore, Ronald, Jahromi, Sina, Hajati, Farshid, Kamaleswaran, Rishikesan
UELED by the surge in the collection of diverse data, coupled with advancements in computational models and models in the healthcare domain, they typically perform well algorithms, artificial intelligence (AI) techniques have been on a single task [16], [17]. Meta-learning models, however, striving to establish a strong foothold in healthcare over the prove beneficial both in multi-task scenarios, where taskagnostic past decade [1]-[3]. This burgeoning trend has fostered a knowledge is garnered from a suite of tasks to enhance growing interest in the deployment of innovative data analysis the learning of new tasks within that suite, and in singletask methods and machine learning (ML) techniques across a scenarios, where a single problem is continually solved range of healthcare applications [4]-[7]. As a specialized area and refined solutions for a single problem over numerous within ML, meta-learning, or learning-to-learn, has recently episodes [10], [18]. This multi-task learning capability can gained significant attention due to its impressive theoretical enable a more comprehensive understanding of the complex and practical advancements, making it a primary choice for interrelations and dependencies between various healthcare numerous applications [8]-[10].