AITopics | hyper-geometric distribution




A Note on Optimal Sampling Strategy for Structural Variant Detection Using Optical Mapping

Li, Weiwei, Hannig, Jan, Jones, Corbin

arXiv.org Machine Learning, Oct-4-2019

Weiwei Li, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill (weiweili@live.unc.edu)

Abstract: Structural variants compose the majority of human genetic variation, but are difficult to assess using current genomic sequencing technologies. Optical mapping technologies, which measure the size of chromosomal fragments between labeled markers, offer an alternative approach. As these technologies mature towards becoming clinical tools, there is a need for a method to determine the optimal strategy for sampling biological material in order to detect a variant at some threshold. Here we develop an optimization approach based on a simple, yet realistic, model of the genomic mapping process that uses a hyper-geometric distribution and probabilistic concentration inequalities. Our approach is both computationally and analytically tractable and includes a novel way of obtaining tail bounds for the hyper-geometric distribution. We show that if a genomic mapping technology can sample most of the chromosomal fragments within a sample, comparatively little biological material is needed to detect a variant at high confidence.

1 Introduction: Structural variants (SVs), such as insertions, deletions, translocations, and copy number variants, are by far the most common types of human genetic variation (Chaisson et al., 2015). They have been linked to a large number of heritable disorders (Hurles et al., 2008). Technology to assay the presence or absence of these variants has steadily improved in ease and resolution (Huddleston and Eichler, 2016; Audano et al., 2019).
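As a rough illustration of the kind of calculation the abstract describes, the sketch below treats detection as drawing n fragments without replacement from a pool of N, of which K carry the variant, and asks how large n must be to observe at least t variant fragments with a given confidence. This setup and the names `prob_at_least` and `min_sample_size` are assumptions made for illustration; the paper's actual model and its tail bounds are not reproduced here.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when n items are drawn without replacement from N, K of them marked."""
    # Outside the support the probability is zero (also avoids negative comb arguments).
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def prob_at_least(t, N, K, n):
    """Exact upper tail P(X >= t) of the hyper-geometric distribution."""
    return sum(hypergeom_pmf(k, N, K, n) for k in range(t, min(n, K) + 1))


def min_sample_size(t, N, K, confidence):
    """Smallest n with P(X >= t) >= confidence, or None if no n <= N achieves it.

    Hypothetical helper: a plain linear search, not the paper's optimization.
    """
    for n in range(t, N + 1):
        if prob_at_least(t, N, K, n) >= confidence:
            return n
    return None
```

For example, with N = 10, K = 1 and t = 1, the single variant fragment is caught with probability n/10, so `min_sample_size(1, 10, 1, 0.5)` returns 5.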

hyper-geometric distribution, sequence, target sequence, (15 more...)

arXiv.org Machine Learning

arXiv:1910.04067

Country: North America > United States > North Carolina (0.24)

Genre: Research Report (0.70)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.75)
Education > Educational Setting > Higher Education (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)


New probabilistic interest measures for association rules

Hahsler, Michael, Hornik, Kurt

arXiv.org Machine Learning, Mar-6-2008

Mining association rules is an important technique for discovering meaningful patterns in transaction databases. Many different measures of interestingness have been proposed for association rules. However, these measures fail to take the probabilistic properties of the mined data into account. In this paper, we start by presenting a simple probabilistic framework for transaction data which can be used to simulate transaction data when no associations are present. We use such data and a real-world database from a grocery outlet to explore the behavior of confidence and lift, two popular interest measures used for rule mining. The results show that confidence is systematically influenced by the frequency of the items in the left-hand side of rules and that lift performs poorly at filtering random noise in transaction data. Based on the probabilistic framework, we develop two new interest measures, hyper-lift and hyper-confidence, which can be used to filter or order mined association rules. The new measures show significantly better performance than lift for applications where spurious rules are problematic.
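The abstract does not give formulas for the new measures. The sketch below assumes one common reading: for a rule X → Y in m transactions, with c_x transactions containing X, c_y containing Y and c_xy containing both, the co-occurrence count under independence is modeled as hyper-geometric. Hyper-lift is then taken as c_xy divided by the delta-quantile of that distribution, and hyper-confidence as the probability of seeing fewer than c_xy co-occurrences. These definitions, the default delta = 0.99 and all function names are assumptions for illustration and should be checked against the paper.

```python
from math import comb


def confidence(c_xy, c_x):
    """Classical confidence of X -> Y: share of X-transactions that also contain Y."""
    return c_xy / c_x


def lift(c_xy, m, c_x, c_y):
    """Classical lift: observed co-occurrence over its expectation under independence."""
    return m * c_xy / (c_x * c_y)


def cooccurrence_pmf(k, m, c_x, c_y):
    """P(C_XY = k) under the assumed hyper-geometric null model."""
    if k < max(0, c_x + c_y - m) or k > min(c_x, c_y):
        return 0.0
    return comb(c_x, k) * comb(m - c_x, c_y - k) / comb(m, c_y)


def hyper_confidence(c_xy, m, c_x, c_y):
    """Assumed definition: P(C_XY < c_xy) under the null model."""
    return sum(cooccurrence_pmf(k, m, c_x, c_y) for k in range(c_xy))


def hyper_lift(c_xy, m, c_x, c_y, delta=0.99):
    """Assumed definition: c_xy divided by the delta-quantile of C_XY."""
    cdf = 0.0
    q = 0
    for q in range(min(c_x, c_y) + 1):
        cdf += cooccurrence_pmf(q, m, c_x, c_y)
        if cdf >= delta:
            break
    # A zero quantile would divide by zero; report infinity in that edge case.
    return c_xy / q if q > 0 else float("inf")
```

With m = 10, c_x = c_y = 5 and c_xy = 5, lift is 2.0, while hyper-lift at delta = 0.99 is 1.25 and hyper-confidence is 251/252 (about 0.996).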

artificial intelligence, database, expert system, (19 more...)

arXiv.org Machine Learning

arXiv:0803.0966

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
