AITopics | Data Science

RelDenClu: A Relative Density based Biclustering Method for identifying non-linear feature relations

Jain, Namita, Ghosh, Susmita, Murthy, C. A.

arXiv.org Artificial IntelligenceMar-28-2025

The existing biclustering algorithms for finding feature relation based biclusters often depend on assumptions like monotonicity or linearity. Though a few algorithms overcome this problem by using density-based methods, they tend to miss out many biclusters because they use global criteria for identifying dense regions. The proposed method, RelDenClu uses the local variations in marginal and joint densities for each pair of features to find the subset of observations, which forms the bases of the relation between them. It then finds the set of features connected by a common set of observations, resulting in a bicluster. To show the effectiveness of the proposed methodology, experimentation has been carried out on fifteen types of simulated datasets. Further, it has been applied to six real-life datasets. For three of these real-life datasets, the proposed method is used for unsupervised learning, while for other three real-life datasets it is used as an aid to supervised learning. For all the datasets the performance of the proposed method is compared with that of seven different state-of-the-art algorithms and the proposed algorithm is seen to produce better results. The efficacy of proposed algorithm is also seen by its use on COVID-19 dataset for identifying some features (genetic, demographics and others) that are likely to affect the spread of COVID-19.

artificial intelligence, bicluster, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1811.04661

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

GroundHog: Revolutionizing GLDAS Groundwater Storage Downscaling for Enhanced Recharge Estimation in Bangladesh

Ahmed, Saleh Sakib, Zzaman, Rashed Uz, Jony, Saifur Rahman, Himel, Faizur Rahman, Sharmin, Afroza, Rahman, A. H. M. Khalequr, Rahman, M. Sohel, Nowreen, Sara

arXiv.org Artificial IntelligenceMar-28-2025

Long-term groundwater level (GWL) measurement is vital for effective policymaking and recharge estimation using annual maxima and minima. However, current methods prioritize short-term predictions and lack multi-year applicability, limiting their utility. Moreover, sparse in-situ measurements lead to reliance on low-resolution satellite data like GLDAS as the ground truth for Machine Learning models, further constraining accuracy. To overcome these challenges, we first develop an ML model to mitigate data gaps, achieving $R^2$ scores of 0.855 and 0.963 for maximum and minimum GWL predictions, respectively. Subsequently, using these predictions and well observations as ground truth, we train an Upsampling Model that uses low-resolution (25 km) GLDAS data as input to produce high-resolution (2 km) GWLs, achieving an excellent $R^2$ score of 0.96. Our approach successfully upscales GLDAS data for 2003-2024, allowing high-resolution recharge estimations and revealing critical trends for proactive resource management. Our method allows upsampling of groundwater storage (GWS) from GLDAS to high-resolution GWLs for any points independently of officially curated piezometer data, making it a valuable tool for decision-making.

artificial intelligence, groundwater level, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.22771

Country:

North America > United States (1.00)
Africa (0.93)
Asia > Bangladesh > Dhaka Division > Dhaka District (0.15)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Food & Agriculture (0.67)
Water & Waste Management > Water Management > Water Supplies & Services (0.52)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

TS: A Unified Multi-Task Time Series Model

Neural Information Processing SystemsMar-27-2025, 16:28:03 GMT

Although pre-trained transformers and reprogrammed text-based LLMs have shown strong performance on time series tasks, the best-performing architectures vary widely across tasks, with most models narrowly focused on specific areas, such as time series forecasting. Unifying predictive and generative time series tasks within a single model remains challenging.

data mining, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (0.92)

Industry:

Health & Medicine (1.00)
Government > Military (0.92)
Energy (0.67)
Government > Regional Government (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

On the Powerfulness of Textual Outlier Exposure for Visual OoD Detection

Neural Information Processing SystemsMar-27-2025, 16:26:43 GMT

Successful detection of Out-of-Distribution (OoD) data is becoming increasingly important to ensure safe deployment of neural networks. One of the main challenges in OoD detection is that neural networks output overconfident predictions on OoD data, make it difficult to determine OoD-ness of data solely based on their predictions. Outlier exposure addresses this issue by introducing an additional loss that encourages low-confidence predictions on OoD data during training. While outlier exposure has shown promising potential in improving OoD detection performance, all previous studies on outlier exposure have been limited to utilizing visual outliers.

data mining, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East (0.46)
North America > United States > Minnesota (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Add feedback

Achievable Fairness on Your Data With Utility Guarantees

Neural Information Processing SystemsMar-27-2025, 16:23:58 GMT

In machine learning fairness, training models that minimize disparity across different sensitive groups often leads to diminished accuracy, a phenomenon known as the fairness-accuracy trade-off. The severity of this trade-off inherently depends on dataset characteristics such as dataset imbalances or biases and therefore, using a uniform fairness requirement across diverse datasets remains questionable. To address this, we present a computationally efficient approach to approximate the fairness-accuracy trade-off curve tailored to individual datasets, backed by rigorous statistical guarantees. By utilizing the You-Only-Train-Once (YOTO) framework, our approach mitigates the computational burden of having to train multiple models when approximating the trade-off curve. Crucially, we introduce a novel methodology for quantifying uncertainty in our estimates, thereby providing practitioners with a robust framework for auditing model fairness while avoiding false conclusions due to estimation errors. Our experiments spanning tabular (e.g., Adult), image (CelebA), and language (Jigsaw) datasets underscore that our approach not only reliably quantifies the optimum achievable trade-offs across various data modalities but also helps detect suboptimality in SOTA fairness methods.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

case, please provide a description

Neural Information Processing SystemsMar-27-2025, 16:23:03 GMT

This document is based on Datasheets for Datasets by and edges)? Please see the most updated version The instances of this graph-based dataset comprise here. Link prediction on this dataset is a multi-instance prediction task [3]. For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that How many instances are there in total (of each type, needed to be filled?

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Law (0.94)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.68)
Information Technology > Data Science > Data Mining (0.67)
Information Technology > Information Management (0.67)

Add feedback

MOTIVE: A Drug-Target Interaction Graph For Inductive Link Prediction

Neural Information Processing SystemsMar-27-2025, 16:22:59 GMT

Drug-target interaction (DTI) prediction is crucial for identifying new therapeutics and detecting mechanisms of action.

data mining, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
(2 more...)

Add feedback

c5cf13bfd3762821ef7607e63ee90075-Supplemental-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 16:18:45 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.27)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

a15032f8199511ced4d7a8e2bbb487a5-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsMar-27-2025, 16:13:40 GMT

artificial intelligence, disperse, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology (0.68)
Government > Regional Government > North America Government > United States Government (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.68)

Add feedback

Diverse Community Data for Benchmarking Data Privacy Algorithms

Neural Information Processing SystemsMar-27-2025, 16:13:36 GMT

The Collaborative Research Cycle (CRC) is a National Institute of Standards and Technology (NIST) benchmarking program intended to strengthen understanding of tabular data deidentification technologies. Deidentification algorithms are vulnerable to the same bias and privacy issues that impact other data analytics and machine learning applications, and it can even amplify those issues by contaminating downstream applications. This paper summarizes four CRC contributions: theoretical work on the relationship between diverse populations and challenges for equitable deidentification; public benchmark data focused on diverse populations and challenging features; a comprehensive open source suite of evaluation metrology for deidentified datasets; and an archive of more than 450 deidentified data samples from a broad range of techniques. The initial set of evaluation results demonstrate the value of the CRC tools for investigations in this field.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.34)

Industry: