AITopics | cumulative histogram

Collaborating Authors

cumulative histogram

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deep-Learning Based Docking Methods: Fair Comparisons to Conventional Docking Workflows

Jain, Ajay N., Cleves, Ann E., Walters, W. Patrick

arXiv.org Artificial IntelligenceDec-9-2024

The diffusion learning method, DiffDock, for docking small-molecule ligands into protein binding sites was recently introduced. Results included comparisons to more conventional docking approaches, with DiffDock showing superior performance. Here, we employ a fully automatic workflow using the Surflex-Dock methods to generate a fair baseline for conventional docking approaches. Results were generated for the common and expected situation where a binding site location is known and also for the condition of an unknown binding site. For the known binding site condition, Surflex-Dock success rates at 2.0 Angstroms RMSD far exceeded those for DiffDock (Top-1/Top-5 success rates, respectively, were 68/81% compared with 45/51%). Glide performed with similar success rates (67/73%) to Surflex-Dock for the known binding site condition, and results for AutoDock Vina and Gnina followed this pattern. For the unknown binding site condition, using an automated method to identify multiple binding pockets, Surflex-Dock success rates again exceeded those of DiffDock, but by a somewhat lesser margin. DiffDock made use of roughly 17,000 co-crystal structures for learning (98% of PDBBind version 2020, pre-2019 structures) for a training set in order to predict on 363 test cases (2% of PDBBind 2020) from 2019 forward. DiffDock's performance was inextricably linked with the presence of near-neighbor cases of close to identical protein-ligand complexes in the training set for over half of the test set cases. DiffDock exhibited a 40 percentage point difference on near-neighbor cases (two-thirds of all test cases) compared with cases with no near-neighbor training case. DiffDock has apparently encoded a type of table-lookup during its learning process, rendering meaningful applications beyond its reach. Further, it does not perform even close to competitively with a competently run modern docking workflow.

artificial intelligence, diffdock, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.02889

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Rheinland-Pfalz > Mainz (0.04)

Genre:

Research Report (0.82)
Workflow (0.70)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression

List, Florian

arXiv.org Machine LearningJun-3-2021

Although ubiquitous in the sciences, histogram data have not received much attention by the Deep Learning community. Whilst regression and classification tasks for scalar and vector data are routinely solved by neural networks, a principled approach for estimating histogram labels as a function of an input vector or image is lacking in the literature. We present a dedicated method for Deep Learning-based histogram regression, which incorporates cross-bin information and yields distributions over possible histograms, expressed by $\tau$-quantiles of the cumulative histogram in each bin. The crux of our approach is a new loss function obtained by applying the pinball loss to the cumulative histogram, which for 1D histograms reduces to the Earth Mover's distance (EMD) in the special case of the median ($\tau = 0.5$), and generalizes it to arbitrary quantiles. We validate our method with an illustrative toy example, a football-related task, and an astrophysical computer vision problem. We show that with our loss function, the accuracy of the predicted median histograms is very similar to the standard EMD case (and higher than for per-bin loss functions such as cross-entropy), while the predictions become much more informative at almost no additional computational cost.

earth mover, histogram, quantile, (15 more...)

arXiv.org Machine Learning

2106.02051

Country:

Oceania > Australia (0.14)
North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Exploratory Data Analysis with pandas

#artificialintelligenceNov-22-2019, 11:25:45 GMT

As a Data Scientist, I use pandas daily and I am always amazed by how many functionalities it has. These 5 pandas tricks will make you better with Exploratory Data Analysis, which is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. Many complex visualizations can be achieved with pandas and usually, there is no need to import other libraries. To run the examples download this Jupyter notebook. In case you've missed 5 lesser-known pandas tricks.

cumulative histogram, exploratory data analysis, histogram, (15 more...)

#artificialintelligence

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.75)

Add feedback