Jana, Ananya
Comparative Analysis of OpenAI GPT-4o and DeepSeek R1 for Scientific Text Categorization Using Prompt Engineering
Maiti, Aniruddha, Adewumi, Samuel, Tikure, Temesgen Alemayehu, Wang, Zichun, Sengupta, Niladri, Sukhanova, Anastasiia, Jana, Ananya
This study examines how large language models categorize sentences from scientific papers using prompt engineering. We use two advanced web-based models, GPT-4o (by OpenAI) and DeepSeek R1, to classify sentences into predefined relationship categories. DeepSeek R1 has been tested on benchmark datasets in its technical report. However, its performance in scientific text categorization remains unexplored. To address this gap, we introduce a new evaluation method designed specifically for this task. We also compile a dataset of cleaned scientific papers from diverse domains. This dataset provides a platform for comparing the two models. Using this dataset, we analyze their effectiveness and consistency in categorization.
A Critical Analysis of the Limitation of Deep Learning based 3D Dental Mesh Segmentation Methods in Segmenting Partial Scans
Jana, Ananya, Maiti, Aniruddha, Metaxas, Dimitris N.
Tooth segmentation from intraoral scans is a crucial part of digital dentistry. Many deep learning based tooth segmentation algorithms have been developed for this task, and most of them achieve high accuracy. However, most of the available tooth segmentation techniques make an implicit, restrictive assumption of a full jaw model and report accuracy based on full jaw models. Medically, in certain cases a full jaw tooth scan is not required or may not be available. Given this practical issue, it is important to understand the robustness of currently available, widely used deep learning based tooth segmentation techniques. For this purpose, we applied available segmentation techniques to partial intraoral scans and found that the available deep learning techniques underperform drastically. The analysis and comparison presented in this work help in understanding the severity of the problem and support the development of robust tooth segmentation techniques that do not rely on the strong assumption of a full jaw model.
3D Tooth Mesh Segmentation with Simplified Mesh Cell Representation
Jana, Ananya, Subhash, Hrebesh Molly, Metaxas, Dimitris N.
Manual tooth segmentation of 3D tooth meshes is tedious and there are variations among dentists. Several deep learning based methods have been proposed to perform automatic tooth mesh segmentation. Many of the proposed tooth mesh segmentation algorithms summarize the mesh cell as the cell center or barycenter, the normal at the barycenter, the cell vertices, and the normals at the cell vertices. Summarizing the mesh cell/triangle in this manner imposes an implicit structural constraint and makes it difficult to work with multiple resolutions, which is done in many point cloud based deep learning algorithms. A vital task in computer aided orthodontic treatment is automated and accurate segmentation of teeth from intraoral scans. The intraoral scanners produce 3D surface reconstructions of the teeth either in the form of a point cloud or in a mesh format or both. A highly accurate automated tooth mesh segmentation can help in downstream tasks such as recognising and classifying different dental/oral conditions like gingivitis, caries, and white lesions. There are multiple challenges involved in tooth mesh segmentation, such as crowded teeth, misaligned teeth, and missing teeth. The size and shape of teeth can also vary widely across subjects. The second
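The mesh cell summary described in this abstract (barycenter, normal at the barycenter, cell vertices, and normals at the cell vertices) can be illustrated with a small sketch. This is not the authors' code: the function names, the 24-value feature layout, and the area-weighted averaging used for vertex normals are assumptions made for illustration only.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a vector to unit length; a zero vector is returned as zeros."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v) if n > 0 else (0.0, 0.0, 0.0)

def vertex_normals(vertices, faces):
    """Assumed scheme: sum the unnormalized (area-weighted) face normals
    of all triangles touching each vertex, then normalize."""
    acc = [[0.0, 0.0, 0.0] for _ in vertices]
    for f in faces:
        p0, p1, p2 = (vertices[i] for i in f)
        n = cross(sub(p1, p0), sub(p2, p0))
        for i in f:
            for k in range(3):
                acc[i][k] += n[k]
    return [normalize(a) for a in acc]

def cell_features(vertices, faces):
    """Return one 24-value row per triangle (hypothetical layout):
    barycenter (3) + cell normal (3) + vertices (9) + vertex normals (9)."""
    vnorm = vertex_normals(vertices, faces)
    rows = []
    for f in faces:
        pts = [vertices[i] for i in f]
        bary = [sum(p[k] for p in pts) / 3.0 for k in range(3)]
        normal = normalize(cross(sub(pts[1], pts[0]), sub(pts[2], pts[0])))
        row = bary + list(normal)
        for p in pts:
            row.extend(p)
        for i in f:
            row.extend(vnorm[i])
        rows.append(row)
    return rows
```

Because every row is tied to one triangle and its three vertices, the representation has a fixed per-cell structure, which is the kind of constraint the abstract refers to.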
Global and Local Interpretation of black-box Machine Learning models to determine prognostic factors from early COVID-19 data
Jana, Ananya, Minacapelli, Carlos D., Rustgi, Vinod, Metaxas, Dimitris
The COVID-19 coronavirus has claimed 4.1 million lives as of July 24, 2021. A variety of machine learning models have been applied to related data to predict important factors such as the severity of the disease and the infection rate, and to discover important prognostic factors. Often the usefulness of the findings from these techniques is reduced by a lack of method interpretability. Recent progress on the interpretability of machine learning models has the potential to yield more insights while using conventional machine learning models. In this work, we analyze COVID-19 blood work data with some of the popular machine learning models; we then apply state-of-the-art post-hoc local interpretability techniques (e.g., SHAP, LIME) and global interpretability techniques (e.g., symbolic metamodeling) to the trained black-box models to draw interpretable conclusions. Among machine learning algorithms, regressions remain one of the simplest and most explainable models, with a clear mathematical formulation. We explore a recent technique called symbolic metamodeling to find the mathematical expression of the machine learning models for COVID-19. We identify Acute Kidney Injury (AKI), initial Albumin level (ALBI), Aspartate aminotransferase (ASTI), initial Total Bilirubin (TBILI) and initial D-Dimer (DIMER) as major prognostic factors of disease severity. Our contributions are: (i) we uncover the underlying mathematical expression for the black-box models on the COVID-19 severity prediction task, (ii) we are the first to apply symbolic metamodeling to this task, and (iii) we discover important features and feature interactions.
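SHAP, one of the local interpretability techniques named in this abstract, is built on Shapley values. The following is a minimal sketch of exact Shapley values for a toy model, not the authors' pipeline and not the SHAP library's API: the function name is hypothetical, and "removing" a feature is approximated by replacing it with a fixed baseline value, which is a simplifying assumption. Exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline).

    model    : callable taking a list of feature values
    x        : the instance to explain
    baseline : reference values used for "absent" features (assumption)
    """
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi
```

For a linear model the attribution of each feature reduces to its weight times its offset from the baseline, and the attributions always sum to the difference between the prediction for `x` and the prediction for `baseline`.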