AITopics | Southern Denmark

The theory of Local Intrinsic Dimensionality (LID) has become a valuable tool for characterizing local complexity within and across data manifolds, supporting a range of data mining and machine learning tasks. Accurate LID estimation requires samples drawn from small neighborhoods around each query to avoid biases from nonlocal effects and potential manifold mixing, yet limited data within such neighborhoods tends to cause high estimation variance. As a variance reduction strategy, we propose an ensemble approach that uses subbagging to preserve the local distribution of nearest neighbor (NN) distances. The main challenge is that the uniform reduction in total sample size within each subsample increases the proximity threshold for finding a fixed number k of NNs around the query. As a result, in the specific context of LID estimation, the sampling rate has an additional, complex interplay with the neighborhood size: together they determine the sample size as well as the locality and resolution considered for estimation. We analyze both theoretically and experimentally how the choice of the sampling rate and the k-NN size used for LID estimation, alongside the ensemble size, affects performance, enabling informed prior selection of these hyper-parameters depending on application-based preferences. Our results indicate that within broad and well-characterized regions of the hyper-parameter space, using a bagged estimator will most often significantly reduce variance as well as the mean squared error when compared to the corresponding non-bagged baseline, with controllable impact on bias. We additionally propose and evaluate different ways of combining bagging with neighborhood smoothing for substantial further improvements in LID estimation performance.
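The subbagging idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the base estimator, so the sketch assumes the common maximum-likelihood (Hill-type) LID estimator, and the function names `lid_mle`, `knn_distances`, and `bagged_lid` as well as the parameters `rate` and `n_bags` are hypothetical.

```python
import numpy as np

def lid_mle(distances):
    """MLE (Hill-type) LID estimate from k positive NN distances (assumed base estimator)."""
    r = np.sort(np.asarray(distances, dtype=float))
    r_k = r[-1]  # distance to the k-th (farthest) neighbor
    return -1.0 / np.mean(np.log(r / r_k))

def knn_distances(query, data, k):
    """Distances from the query to its k nearest neighbors in data."""
    d = np.linalg.norm(data - query, axis=1)
    d = d[d > 0]  # drop the query itself and exact duplicates
    return np.partition(d, k - 1)[:k]

def bagged_lid(query, data, k, rate, n_bags, rng):
    """Average the LID estimate over n_bags subsamples drawn without replacement."""
    n = len(data)
    m = max(k + 1, int(round(rate * n)))  # subsample size set by the sampling rate
    estimates = []
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)  # subbagging: no replacement
        estimates.append(lid_mle(knn_distances(query, data[idx], k)))
    return float(np.mean(estimates))
```

Note how a lower `rate` thins every subsample, so the same `k` reaches farther from the query. This is the interplay between sampling rate and neighborhood size that the abstract highlights.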

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

2603.24384

Country:

Europe > Denmark > Southern Denmark (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New Jersey > Essex County > Newark (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

82240d93542b74d0c4fdffca39cb779f-Paper-Conference.pdf

Neural Information Processing Systems, Feb-16-2026, 05:19:26 GMT

machine learning, reinforcement, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > Denmark > Southern Denmark (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

8db0d67d22e0ec08c95b810be3a66907-Paper-Conference.pdf

Neural Information Processing Systems, Feb-15-2026, 19:43:30 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

3be14af22f0b311325664277f48111f4-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 10:54:39 GMT

gaussian process, international conference, neural information processing system, (13 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)
Europe > Denmark > Southern Denmark (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

11faf17bf7e5412d9cded369f97db23d-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 03:15:22 GMT

algorithm, privacy, privacy loss, (15 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Towards a pretrained deep learning estimator of the Linfoot informational correlation

Berg, Stéphanie M. van den, Halekoh, Ulrich, Möller, Sören, Jensen, Andreas Kryger, Hjelmborg, Jacob von Bornemann

arXiv.org Machine Learning, Dec-16-2025

We develop a supervised deep-learning approach to estimate mutual information between two continuous random variables. As labels, we use the Linfoot informational correlation, a transformation of mutual information that has many important properties. Our method is based on ground-truth labels for Gaussian and Clayton copulas. We compare our method with kernel density, k-nearest-neighbour, and neural estimators, and find that it generally shows lower bias and lower variance. As this is a proof of principle, future research could look into training the model with a more diverse set of examples from other copulas for which ground-truth labels are available.
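For orientation, the Linfoot informational correlation maps a mutual information I (in nats) to r = sqrt(1 - exp(-2I)). The sketch below is illustrative only and is not taken from the paper. The function names are hypothetical, and the Gaussian case uses the standard closed form I = -0.5 ln(1 - rho^2), for which the Linfoot correlation reduces to |rho|.

```python
import math

def linfoot_correlation(mi_nats):
    """Map mutual information (in nats) to the Linfoot informational correlation in [0, 1)."""
    if mi_nats < 0:
        raise ValueError("mutual information must be non-negative")
    # 1 - exp(-2 I), written with expm1 for accuracy near I = 0
    return math.sqrt(-math.expm1(-2.0 * mi_nats))

def gaussian_mutual_information(rho):
    """Closed-form mutual information (nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log1p(-rho * rho)
```

Closed forms like this are one way ground-truth labels can be obtained for a Gaussian copula. The Clayton case mentioned in the abstract is not covered by this sketch.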

correlation, estimator, linfoot correlation, (13 more...)

arXiv.org Machine Learning

2512.12358

Country:

North America > United States > New York (0.04)
Europe > Denmark > Southern Denmark (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Who built Scandinavia's oldest wooden plank boat? An ancient fingerprint offers clues.

Archaeologists are closer to solving the Hjortspring Boat's mysteries. Archaeologists examining an ancient boat discovered in Denmark over a century ago are getting some help from a clue usually associated with crime scenes.

artificial intelligence, boat, denmark, (12 more...)

Popular Science

Country:

Europe > Sweden (0.72)
Europe > Norway (0.72)
Europe > Northern Europe (0.06)
(3 more...)

Genre: Research Report > New Finding (0.36)

Industry: Media > Photography (0.31)

Technology: Information Technology > Artificial Intelligence (0.36)

Benchmarking of Clustering Validity Measures Revisited

Simpson, Connor, Campello, Ricardo J. G. B., Stojanovski, Elizabeth

arXiv.org Machine Learning, Nov-11-2025

Clustering is an unsupervised learning technique that aims to identify patterns that consist of similar or interrelated observations within data [39, 87]. Existing clustering algorithms are often categorised into three primary groups [39, 82]: partitioning algorithms such as K-Means [39] and Spectral Clustering [88], hierarchical algorithms such as Single Linkage [39] and HDBSCAN* [7, 8], and soft (fuzzy or probabilistic) algorithms such as Fuzzy c-Means (FCM) [4] and Expectation Maximisation with Gaussian Mixture Models (EM-GMM) [20]. Partitioning clustering algorithms partition data into a given number k of clusters, while hierarchical clustering algorithms produce a sequence of nested partitions with incrementally varying numbers of clusters. Soft clustering algorithms are similar to partitioning techniques except that each data observation is assigned a degree of membership or probability to each cluster, rather than a full assignment to a single cluster. It is worth mentioning that within the aforementioned categories there are clustering algorithms that may not necessarily assign all observations to clusters, due to outlier trimming or noise detection. Two examples of such algorithms are trimmed K-means [14] and the previously mentioned HDBSCAN*, each of which may produce solutions where not all observations are assigned to clusters. Clustering validation or validity is an important step of the clustering process irrespective of the algorithm used [39, 25], as it is crucial to determine the best produced partition(s) and number of clusters within the data [23].
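The difference between a full (hard) assignment and a degree of membership can be illustrated with a small sketch. It assumes fixed, already-known cluster centres and uses the standard Fuzzy c-Means membership formula with fuzzifier m. The function names are hypothetical and no fitting loop is shown.

```python
import numpy as np

def hard_assignment(X, centers):
    """Partitioning view: each observation gets exactly one cluster label."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Soft view: FCM-style membership degrees; each row sums to 1."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))  # closer centres receive larger weight
    return inv / inv.sum(axis=1, keepdims=True)
```

With centres at (0, 0) and (4, 0), an observation at (1, 0) is hard-assigned to the first cluster, while its soft memberships for m = 2 are 0.9 and 0.1.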

artificial intelligence, machine learning, partition, (18 more...)

arXiv.org Machine Learning

2511.05983

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning

Neural Information Processing Systems, Oct-10-2025, 07:44:20 GMT

We present a theoretical result demonstrating the strong dependency of suboptimality on the number of Monte Carlo samples taken per Bellman target calculation.

dataset, offline reinforcement, reinforcement, (14 more...)

Neural Information Processing Systems

Country: