quality index
Large margin classifier with graph-based adaptive regularization
Hanriot, Vítor M., Salis, Turíbio T., Torres, Luiz C. B., Coelho, Frederico, Braga, Antonio P.
This paper introduces the use of per-class regularization hyperparameters in Gabriel graph-based binary classifiers. We demonstrate how the quality index used for regularization behaves both in the margin region and in the presence of outliers, and how incorporating this regularization flexibility can lead to solutions that effectively eliminate outliers while training the classifier. We also show how it can address class imbalance by generating higher and lower thresholds for the majority and minority classes, respectively. Thus, rather than having a single solution based on fixed thresholds, flexible thresholds expand the solution space and can be optimized through hyperparameter tuning algorithms. Friedman test shows that flexible thresholds are capable of improving Gabriel graph-based classifiers.
Exploring Intra and Inter-language Consistency in Embeddings with ICA
Li, Rongzhi, Matsuda, Takeru, Yanaka, Hitomi
Word embeddings represent words as multidimensional real vectors, facilitating data analysis and processing, but are often challenging to interpret. Independent Component Analysis (ICA) creates clearer semantic axes by identifying independent key features. Previous research has shown ICA's potential to reveal universal semantic axes across languages. However, it lacked verification of the consistency of independent components within and across languages. We investigated the consistency of semantic axes in two ways: both within a single language and across multiple languages. We first probed into intra-language consistency, focusing on the reproducibility of axes by performing ICA multiple times and clustering the outcomes. Then, we statistically examined inter-language consistency by verifying those axes' correspondences using statistical tests. We newly applied statistical methods to establish a robust framework that ensures the reliability and universality of semantic axes.
Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization
Jia, Panqi, Mao, Jue, Koyuncu, Esin, Koyuncu, A. Burakhan, Solovyev, Timofey, Karabutov, Alexander, Zhao, Yin, Alshina, Elena, Kaup, Andre
Currently, there is a high demand for neural network-based image compression codecs. These codecs employ non-linear transforms to create compact bit representations and facilitate faster coding speeds on devices compared to the hand-crafted transforms used in classical frameworks. The scientific and industrial communities are highly interested in these properties, leading to the standardization effort of JPEG-AI. The JPEG-AI verification model has been released and is currently under development for standardization. Utilizing neural networks, it can outperform the classic codec VVC intra by over 10% BD-rate operating at base operation point. Researchers attribute this success to the flexible bit distribution in the spatial domain, in contrast to VVC intra's anchor that is generated with a constant quality point. However, our study reveals that VVC intra displays a more adaptable bit distribution structure through the implementation of various block sizes. As a result of our observations, we have proposed a spatial bit allocation method to optimize the JPEG-AI verification model's bit distribution and enhance the visual quality. Furthermore, by applying the VVC bit distribution strategy, the objective performance of JPEG-AI verification mode can be further improved, resulting in a maximum gain of 0.45 dB in PSNR-Y.
Fast kernel half-space depth for data with non-convex supports
Castellanos, Arturo, Mozharovskyi, Pavlo, d'Alché-Buc, Florence, Janati, Hicham
Data depth is a statistical function that generalizes order and quantiles to the multivariate setting and beyond, with applications spanning over descriptive and visual statistics, anomaly detection, testing, etc. The celebrated halfspace depth exploits data geometry via an optimization program to deliver properties of invariances, robustness, and non-parametricity. Nevertheless, it implicitly assumes convex data supports and requires exponential computational cost. To tackle distribution's multimodality, we extend the halfspace depth in a Reproducing Kernel Hilbert Space (RKHS). We show that the obtained depth is intuitive and establish its consistency with provable concentration bounds that allow for homogeneity testing. The proposed depth can be computed using manifold gradient making faster than halfspace depth by several orders of magnitude. The performance of our depth is demonstrated through numerical simulations as well as applications such as anomaly detection on real data and homogeneity testing.
Stochastic Deep Koopman Model for Quality Propagation Analysis in Multistage Manufacturing Systems
Chen, Zhiyi, Maske, Harshal, Shui, Huanyi, Upadhyay, Devesh, Hopka, Michael, Cohen, Joseph, Lai, Xingjian, Huan, Xun, Ni, Jun
The modeling of multistage manufacturing systems (MMSs) has attracted increased attention from both academia and industry. Recent advancements in deep learning methods provide an opportunity to accomplish this task with reduced cost and expertise. This study introduces a stochastic deep Koopman (SDK) framework to model the complex behavior of MMSs. Specifically, we present a novel application of Koopman operators to propagate critical quality information extracted by variational autoencoders. Through this framework, we can effectively capture the general nonlinear evolution of product quality using a transferred linear representation, thus enhancing the interpretability of the data-driven model. To evaluate the performance of the SDK framework, we carried out a comparative study on an open-source dataset. The main findings of this paper are as follows. Our results indicate that SDK surpasses other popular data-driven models in accuracy when predicting stagewise product quality within the MMS. Furthermore, the unique linear propagation property in the stochastic latent space of SDK enables traceability for quality evolution throughout the process, thereby facilitating the design of root cause analysis schemes. Notably, the proposed framework requires minimal knowledge of the underlying physics of production lines. It serves as a virtual metrology tool that can be applied to various MMSs, contributing to the ultimate goal of Zero Defect Manufacturing.
Towards Sustainable Development: A Novel Integrated Machine Learning Model for Holistic Environmental Health Monitoring
Mazumder, Anirudh, Engala, Sarthak, Nallaparaju, Aditya
Urbanization enables economic growth but also harms the environment through degradation. Traditional methods of detecting environmental issues have proven inefficient. Machine learning has emerged as a promising tool for tracking environmental deterioration by identifying key predictive features. Recent research focused on developing a predictive model using pollutant levels and particulate matter as indicators of environmental state in order to outline challenges. Machine learning was employed to identify patterns linking areas with worse conditions. This research aims to assist governments in identifying intervention points, improving planning and conservation efforts, and ultimately contributing to sustainable development.
A Visual Quality Index for Fuzzy C-Means
Oztürk, Aybükë, Lallich, Stéphane, Darmont, Jérôme
Cluster analysis is widely used in the areas of machine learning and data mining. Fuzzy clustering is a particular method that considers that a data point can belong to more than one cluster. Fuzzy clustering helps obtain flexible clusters, as needed in such applications as text categorization. The performance of a clustering algorithm critically depends on the number of clusters, and estimating the optimal number of clusters is a challenging task. Quality indices help estimate the optimal number of clusters. However, there is no quality index that can obtain an accurate number of clusters for different datasets. Thence, in this paper, we propose a new cluster quality index associated with a visual, graph-based solution that helps choose the optimal number of clusters in fuzzy partitions.
A New Algorithm of Speckle Filtering using Stochastic Distances
Torres, Leonardo, Cavalcante, Tamer, Frery, Alejandro C.
This paper presents a new approach for filter design based on stochastic distances and tests between distributions. A window is defined around each pixel, overlapping samples are compared and only those which pass a goodness-of-fit test are used to compute the filtered value. The technique is applied to intensity SAR data with homogeneous regions using the Gamma model. The proposal is compared with the Lee's filter using a protocol based on Monte Carlo. Among the criteria used to quantify the quality of filters, we employ the equivalent number of looks, line and edge preservation. Moreover, we also assessed the filters by the Universal Image Quality Index and the Pearson's correlation on edges regions.