Sarajevo
Safer Skin Lesion Classification with Global Class Activation Probability Map Evaluation and SafeML
Paxton, Kuniko, Aslansefat, Koorosh, Akagić, Amila, Thakker, Dhavalkumar, Papadopoulos, Yiannis
Recent advancements in skin lesion classification models have significantly improved accuracy, with some models even surpassing dermatologists' diagnostic performance. However, in medical practice, distrust in AI models remains a challenge. Beyond high accuracy, trustworthy, explainable diagnoses are essential. Existing explainability methods have reliability issues, with LIME-based methods suffering from inconsistency, while CAM-based methods failing to consider all classes. To address these limitations, we propose Global Class Activation Probabilistic Map Evaluation, a method that analyses all classes' activation probability maps probabilistically and at a pixel level. By visualizing the diagnostic process in a unified manner, it helps reduce the risk of misdiagnosis. Furthermore, the application of SafeML enhances the detection of false diagnoses and issues warnings to doctors and patients as needed, improving diagnostic reliability and ultimately patient safety. We evaluated our method using the ISIC datasets with MobileNetV2 and Vision Transformers.
- Europe > United Kingdom (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
Performance comparison of medical image classification systems using TensorFlow Keras, PyTorch, and JAX
Bećirović, Merjem, Kurtović, Amina, Smajlović, Nordin, Kapo, Medina, Akagić, Amila
Medical imaging plays a vital role in early disease diagnosis and monitoring. Specifically, blood microscopy offers valuable insights into blood cell morphology and the detection of hematological disorders. In recent years, deep learning-based automated classification systems have demonstrated high potential in enhancing the accuracy and efficiency of blood image analysis. However, a detailed performance analysis of specific deep learning frameworks appears to be lacking. This paper compares the performance of three popular deep learning frameworks, TensorFlow with Keras, PyTorch, and JAX, in classifying blood cell images from the publicly available BloodMNIST dataset. The study primarily focuses on inference time differences, but also classification performance for different image sizes. The results reveal variations in performance across frameworks, influenced by factors such as image resolution and framework-specific optimizations. Classification accuracy for JAX and PyTorch was comparable to current benchmarks, showcasing the efficiency of these frameworks for medical image classification.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.05)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Ukraine > Khmelnytskyi Oblast > Khmelnytskyi (0.04)
- (2 more...)
- Research Report (0.84)
- Instructional Material > Online (0.62)
- Instructional Material > Course Syllabus & Notes (0.62)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Immunology (0.68)
Improving Diagnostic Accuracy of Pigmented Skin Lesions With CNNs: an Application on the DermaMNIST Dataset
Kadric, Nerma, Akagic, Amila, Kapo, Medina
Pigmented skin lesions represent localized areas of increased melanin and can indicate serious conditions like melanoma, a major contributor to skin cancer mortality. The MedMNIST v2 dataset, inspired by MNIST, was recently introduced to advance research in biomedical imaging and includes DermaMNIST, a dataset for classifying pigmented lesions based on the HAM10000 dataset. This study assesses ResNet-50 and EfficientNetV2L models for multi-class classification using DermaMNIST, employing transfer learning and various layer configurations. One configuration achieves results that match or surpass existing methods. This study suggests that convolutional neural networks (CNNs) can drive progress in biomedical image analysis, significantly enhancing diagnostic accuracy.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.05)
- North America > United States (0.04)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.59)
Comparative Analysis of CNN Performance in Keras, PyTorch and JAX on PathMNIST
Nezović, Anida, Romano, Jalal, Marić, Nada, Kapo, Medina, Akagić, Amila
Deep learning has significantly advanced the field of medical image classification, particularly with the adoption of Convolutional Neural Networks (CNNs). Various deep learning frameworks such as Keras, PyTorch and JAX offer unique advantages in model development and deployment. However, their comparative performance in medical imaging tasks remains underexplored. This study presents a comprehensive analysis of CNN implementations across these frameworks, using the PathMNIST dataset as a benchmark. We evaluate training efficiency, classification accuracy and inference speed to assess their suitability for real-world applications. Our findings highlight the trade-offs between computational speed and model accuracy, offering valuable insights for researchers and practitioners in medical image analysis.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- South America > Brazil > Rio Grande do Sul (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
Search-Based Robot Motion Planning With Distance-Based Adaptive Motion Primitives
Kraljusic, Benjamin, Ajanovic, Zlatan, Covic, Nermin, Lacevic, Bakir
This work proposes a motion planning algorithm for robotic manipulators that combines sampling-based and search-based planning methods. The core contribution of the proposed approach is the usage of burs of free configuration space (C-space) as adaptive motion primitives within the graph search algorithm. Due to their feature to adaptively expand in free C-space, burs enable more efficient exploration of the configuration space compared to fixed-sized motion primitives, significantly reducing the time to find a valid path and the number of required expansions. The algorithm is implemented within the existing SMPL (Search-Based Motion Planning Library) library and evaluated through a series of different scenarios involving manipulators with varying number of degrees-of-freedom (DoF) and environment complexity. Results demonstrate that the bur-based approach outperforms fixed-primitive planning in complex scenarios, particularly for high DoF manipulators, while achieving comparable performance in simpler scenarios.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
Rethinking Group Recommender Systems in the Era of Generative AI: From One-Shot Recommendations to Agentic Group Decision Support
Jannach, Dietmar, Delić, Amra, Ricci, Francesco, Zanker, Markus
More than twenty-five years ago, first ideas were developed on how to design a system that can provide recommendations to groups of users instead of individual users. Since then, a rich variety of algorithmic proposals were published, e.g., on how to acquire individual preferences, how to aggregate them, and how to generate recommendations for groups of users. However, despite the rich literature on the topic, barely any examples of real-world group recommender systems can be found. This lets us question common assumptions in academic research, in particular regarding communication processes in a group and how recommendation-supported decisions are made. In this essay, we argue that these common assumptions and corresponding system designs often may not match the needs or expectations of users. We thus call for a reorientation in this research area, leveraging the capabilities of modern Generative AI assistants like ChatGPT. Specifically, as one promising future direction, we envision group recommender systems to be systems where human group members interact in a chat and an AI-based group recommendation agent assists the decision-making process in an agentic way. Ultimately, this shall lead to a more natural group decision-making environment and finally to wider adoption of group recommendation systems in practice.
- Europe > Italy (0.04)
- Europe > Austria (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Leisure & Entertainment (0.67)
- Information Technology (0.46)
- Media (0.46)
- Health & Medicine (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.71)
Toward Safety-First Human-Like Decision Making for Autonomous Vehicles in Time-Varying Traffic Flow
Wang, Xiao, Yu, Junru, Huang, Jun, Wu, Qiong, Vacic, Ljubo, Sun, Changyin
Despite the recent advancements in artificial intelligence technologies have shown great potential in improving transport efficiency and safety, autonomous vehicles(AVs) still face great challenge of driving in time-varying traffic flow, especially in dense and interactive situations. Meanwhile, human have free wills and usually do not make the same decisions even situate in the exactly same scenarios, leading to the data-driven methods suffer from poor migratability and high search cost problems, decreasing the efficiency and effectiveness of the behavior policy. In this research, we propose a safety-first human-like decision-making framework(SF-HLDM) for AVs to drive safely, comfortably, and social compatiblely in effiency. The framework integrates a hierarchical progressive framework, which combines a spatial-temporal attention (S-TA) mechanism for other road users' intention inference, a social compliance estimation module for behavior regulation, and a Deep Evolutionary Reinforcement Learning(DERL) model for expanding the search space efficiently and effectively to make avoidance of falling into the local optimal trap and reduce the risk of overfitting, thus make human-like decisions with interpretability and flexibility. The SF-HLDM framework enables autonomous driving AI agents dynamically adjusts decision parameters to maintain safety margins and adhering to contextually appropriate driving behaviors at the same time.
- Asia > Macao (0.14)
- Asia > China > Anhui Province > Hefei (0.04)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- (7 more...)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology (0.89)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (4 more...)
HealthBench: Evaluating Large Language Models Towards Improved Human Health
Arora, Rahul K., Wei, Jason, Hicks, Rebecca Soskin, Bowman, Preston, Quiñonero-Candela, Joaquin, Tsimpourlas, Foivos, Sharman, Michael, Shah, Meghan, Vallone, Andrea, Beutel, Alex, Heidecke, Johannes, Singhal, Karan
HealthBench consists of 5,000 multi-turn conversations between a model and an individual user or healthcare professional. Responses are evaluated using conversation-specific rubrics created by 262 physicians. Unlike previous multiple-choice or short-answer benchmarks, Health-Bench enables realistic, open-ended evaluation through 48,562 unique rubric criteria spanning several health contexts (e.g., emergencies, transforming clinical data, global health) and behavioral dimensions (e.g., accuracy, instruction following, communication). HealthBench performance over the last two years reflects steady initial progress (compare GPT-3.5 Turbo's 16% to GPT-4o's 32%) and more rapid recent improvements (o3 scores 60%). Smaller models have especially improved: GPT-4.1 nano outperforms GPT-4o and is 25 times cheaper. We additionally release two HealthBench variations: HealthBench Consensus, which includes 34 particularly important dimensions of model behavior validated via physician consensus, and HealthBench Hard, where the current top score is 32%. We hope that HealthBench grounds progress towards model development and applications that benefit human health.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- (34 more...)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.92)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Surgery (1.00)
- (5 more...)
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code
Bappy, Md. Azizul Hakim, Mustafa, Hossen A, Saha, Prottoy, Salehat, Rajinus
Large Language Models (LLMs) have demonstrated significant capabilities in understanding and analyzing code for security vulnerabilities, such as Common Weakness Enumerations (CWEs). However, their reliance on cloud infrastructure and substantial computational requirements pose challenges for analyzing sensitive or proprietary codebases due to privacy concerns and inference costs. This work explores the potential of Small Language Models (SLMs) as a viable alternative for accurate, on-premise vulnerability detection. We investigated whether a 350-million parameter pre-trained code model (codegen-mono) could be effectively fine-tuned to detect the MITRE Top 25 CWEs specifically within Python code. To facilitate this, we developed a targeted dataset of 500 examples using a semi-supervised approach involving LLM-driven synthetic data generation coupled with meticulous human review. Initial tests confirmed that the base codegen-mono model completely failed to identify CWEs in our samples. However, after applying instruction-following fine-tuning, the specialized SLM achieved remarkable performance on our test set, yielding approximately 99% accuracy, 98.08% precision, 100% recall, and a 99.04% F1-score. These results strongly suggest that fine-tuned SLMs can serve as highly accurate and efficient tools for CWE detection, offering a practical and privacy-preserving solution for integrating advanced security analysis directly into development workflows.
- North America > Mexico > Veracruz (0.04)
- Europe > Greece > Attica > Athens (0.04)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- (2 more...)
Interview with Amina Mević: Machine learning applied to semiconductor manufacturing
In a series of interviews, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. In this latest interview, we hear from Amina Mević who is applying machine learning to semiconductor manufacturing. Find out more about her PhD research so far, what makes this field so interesting, and how she found the AAAI Doctoral Consortium experience. I am currently pursuing my PhD at the University of Sarajevo, Faculty of Electrical Engineering, Department of Computer Science and Informatics. My research is being carried out in collaboration with Infineon Technologies Austria as part of the Important Project of Common European Interest (IPCEI) in Microelectronics.
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.25)
- Europe > Austria (0.25)
- Semiconductors & Electronics (1.00)
- Information Technology > Hardware (0.75)
- Education > Educational Setting > K-12 Education (0.32)