st layer
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
Du, Yuntao, Jiang, Kailin, Gao, Zhi, Shi, Chenrui, Zheng, Zilong, Qi, Siyuan, Li, Qing
Knowledge editing techniques have emerged as essential tools for updating the factual knowledge of large language models (LLMs) and large multimodal models (LMMs), allowing them to correct outdated or inaccurate information without retraining from scratch. However, existing benchmarks for multimodal knowledge editing primarily focus on entity-level knowledge represented as simple triplets, which fail to capture the complexity of real-world multimodal information. To address this issue, we introduce MMKE-Bench, a comprehensive MultiModal Knowledge Editing Benchmark designed to evaluate the ability of LMMs to edit diverse visual knowledge in real-world scenarios. It incorporates three types of editing tasks: visual entity editing, visual semantic editing, and user-specific editing. In addition, MMKE-Bench uses free-form natural language to represent and edit knowledge, offering a more flexible and effective format than triplets. The benchmark consists of 2,940 pieces of knowledge and 8,363 images across 33 broad categories, with evaluation questions automatically generated and human-verified. We assess five state-of-the-art knowledge editing methods on three prominent LMMs, revealing that no method excels across all criteria and that visual and user-specific edits are particularly challenging. MMKE-Bench sets a new standard for evaluating the robustness of multimodal knowledge editing techniques, driving progress in this rapidly evolving field.
Development of Risk-Free COVID-19 Screening Algorithm from Routine Blood Test using Ensemble Machine Learning
Raihan, Md. Mohsin Sarker, Khan, Md. Mohi Uddin, Akter, Laboni, Shams, Abdullah Bin
The Reverse Transcription Polymerase Chain Reaction (RT-PCR) test is the gold-standard diagnostic test for detecting COVID-19 infection. Rapid antigen detection is a screening test that can identify COVID-positive patients in as little as 15 minutes, but it has lower sensitivity than PCR tests. Despite the availability of multiple standardized test kits, many people are infected and either recover or die before being tested, owing to the shortage and cost of kits, the lack of essential specialists and laboratories, and result turnaround times that are slow relative to the size of the population, especially in developing and underdeveloped countries. Motivated by the parametric deviations in the immunological and hematological profile of a COVID-19 patient, this work proposes a risk-free and highly accurate Stacked Ensemble Machine Learning model that identifies COVID-19 patients from commonly available, widespread, inexpensive routine blood tests, achieving a promising accuracy, precision, recall, and F1-score of 100%. Analysis of the R-curve also indicates that the risk-free model is precise enough to be deployed. The proposed method has the potential for large-scale, ubiquitous, low-cost screening. It can add an extra layer of protection by identifying asymptomatic or pre-symptomatic people early, helping to keep the number of infections to a minimum and to control the pandemic.
Improving Deep Image Clustering With Spatial Transformer Layers
Souza, Thiago V. M., Zanchettin, Cleber
Deep image clustering is a recent research area that has already produced exciting work [15]. The approaches use highly diverse architectures, varying the structure of the deep networks, the clustering algorithms, and the way the two parts are combined. Approaches such as the Deep Clustering Network (DCN) [9] use a pretrained autoencoder combined with the k-means algorithm. Methods such as Joint Unsupervised Learning (JULE) [10] combine deep convolutional networks with hierarchical clustering. Deep Embedded Clustering (DEC) [11] also uses a pretrained autoencoder, but then removes the decoder and uses the encoder as a feature extractor that feeds the clustering method. After that, the network is fine-tuned using the cluster assignment hardening loss. Meanwhile, the clusters are iteratively refined by minimizing the KL divergence between the distribution of soft labels and the auxiliary target distribution.
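The cluster assignment hardening step described for DEC can be illustrated with a small, dependency-free sketch. It assumes the formulation commonly given for DEC: soft labels computed with a Student's t kernel between embeddings and centroids, and an auxiliary target obtained by squaring the soft labels and normalizing by cluster frequency. These formulas, the function names, and the toy data are assumptions made for illustration and are not stated in the text above; an actual implementation would backpropagate the loss through the encoder rather than only evaluate it.

```python
import math

def soft_assign(embeddings, centroids, alpha=1.0):
    """Soft labels q[i][j]: Student's t similarity of point i to centroid j (assumed kernel)."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over clusters
    return q

def target_distribution(q):
    """Auxiliary target p[i][j], proportional to q[i][j]^2 / f[j], with f[j] = sum_i q[i][j]."""
    k = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(k)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )

# Hypothetical 2-D embeddings forming two loose groups, and two centroids.
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
print("hardening loss:", kl_divergence(p, q))
```

Because the target sharpens each point's dominant assignment, minimizing this loss pushes the soft labels toward harder, more confident cluster memberships.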