In this post we will look at two probability distributions you will encounter almost every time you do data science, statistics, or machine learning. Imagine that we are doing research on the height of various people in a city. We go down the street and measure a bunch of random people. Now we decide that some Exploratory Data Analysis won't hurt. But statistical software like R isn't available at the moment, so we just make a histogram out of people.
The original goal of this post was to explore the relationship between the softmax and sigmoid functions. In truth, this relationship had always seemed just out of reach: "One has an exponent in the numerator! One has a 1 in the denominator!" And of course, the two have different names. Once I had derived it, I quickly realized how this relationship backed out into a more general modeling framework motivated by the conditional probability axiom itself.
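To make the relationship concrete, here is a minimal numerical sketch. It relies on the standard identity that a two-class softmax whose second logit is fixed at 0 reduces to the sigmoid, since e^z / (e^z + e^0) = 1 / (1 + e^(-z)). The function names and the test value z = 0.7 are illustrative choices, not taken from the post.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Softmax over a list of logits, shifted by the max for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative value; any real z works.
z = 0.7

# Two-class softmax with the second logit pinned to 0.
two_class = softmax([z, 0.0])

# The first softmax output and the sigmoid agree up to rounding.
print(sigmoid(z), two_class[0])
```

More generally, for two logits a and b the first softmax output equals sigmoid(a - b), so only the difference between the logits matters.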
I hope you found the last few posts on search easy to learn yet challenging enough to keep you going. I'd love to hear your feedback so I can improve these tutorials. So far we've been discussing the topic of search, but the breadth-first search algorithm we implemented is hardly 'intelligent'; the algorithm follows a simple set of rules to reach its goal state. To have the machine make more reasoned 'choices', we need to go beyond blindly following these rules. This week we'll put more of the I into AI with a new topic: stochastic models.
This work analyses the potential of restarts for probSAT, a quite successful algorithm for k-SAT, by estimating its runtime distributions on random 3-SAT instances that are close to the phase transition. We estimate an optimal restart time from empirical data, reaching a potential speedup factor of 1.39. Calculating restart times from fitted probability distributions reduces this factor to a maximum of 1.30. A spin-off result is that the Weibull distribution approximates the runtime distribution well for over 93% of the instances used. A machine learning pipeline is presented to compute a restart time for a fixed-cutoff strategy to exploit this potential. The main components of the pipeline are a random forest for determining the distribution type and a neural network for the distribution's parameters. ProbSAT performs statistically significantly better than Luby's restart strategy and the policy without restarts when using the presented approach. The structure is particularly advantageous on hard problems.
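As an illustration of what estimating a restart time from empirical data can look like, here is a minimal sketch. It is not the paper's pipeline. It assumes independent runtime samples and uses the standard expected-runtime formula for a fixed-cutoff strategy, E[T_t] = (t (1 - F(t)) + E[X; X <= t]) / F(t), evaluated on the empirical distribution with each observed runtime as a candidate cutoff. The function name `best_fixed_cutoff` and the sample data are hypothetical.

```python
from itertools import accumulate

def best_fixed_cutoff(runtimes):
    """Return (cutoff, expected_runtime, speedup) for the best fixed cutoff.

    Candidate cutoffs are the observed runtimes. For the k-th smallest value
    x_k, the empirical estimate of the expected total runtime is
    (x_k * (n - k) + x_1 + ... + x_k) / k.
    """
    xs = sorted(runtimes)
    n = len(xs)
    prefix = list(accumulate(xs))      # prefix[k-1] = x_1 + ... + x_k
    mean = prefix[-1] / n              # expected runtime without restarts
    best_t, best_cost = xs[-1], mean   # cutoff at the maximum means no restarts
    for k in range(1, n + 1):
        cost = (xs[k - 1] * (n - k) + prefix[k - 1]) / k
        if cost < best_cost:
            best_t, best_cost = xs[k - 1], cost
    return best_t, best_cost, mean / best_cost

# Hypothetical heavy-tailed sample: most runs are fast, one is very slow.
sample = [1, 1, 1, 100]
cutoff, expected, speedup = best_fixed_cutoff(sample)
print(cutoff, expected, speedup)
```

On heavy-tailed samples like this one, a short cutoff wins by a wide margin. On light-tailed samples the best cutoff is the maximum observed runtime and the speedup is 1, meaning restarts do not help.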
Machine learning provides algorithms that can learn from data and make inferences or predictions on data. Bayesian networks are a class of graphical models that represent a collection of random variables and their conditional dependencies by directed acyclic graphs. In this paper, an inference algorithm for the hidden random variables of a Bayesian network is given by using the tropicalization of the marginal distribution of the observed variables. By restricting the topological structure to graded networks, an inference algorithm for graded Bayesian networks will be established that evaluates the hidden random variables rank by rank and in this way yields the most probable states of the hidden variables. This algorithm can be viewed as a generalized version of the Viterbi algorithm for graded Bayesian networks.
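For orientation, here is a minimal sketch of the classical Viterbi algorithm on a hidden Markov chain, the special case that the paper generalizes. It is not the paper's algorithm for graded networks. Working with log probabilities turns products into sums and marginalization into maximization, which is the max-plus (tropical) arithmetic the abstract alludes to. The function name, the two-state model, and all of its parameters are hypothetical.

```python
import math

def viterbi(log_init, log_trans, log_emit, observations):
    """Most probable hidden state path of an HMM, computed in log space."""
    n_states = len(log_init)
    # Best log score of any path ending in each state after the first observation.
    score = [log_init[s] + log_emit[s][observations[0]] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Tropical "sum": take the max over predecessors instead of adding.
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[prev] + log_trans[prev][s] + log_emit[s][obs])
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]

def logs(rows):
    """Elementwise natural log of a list of probability rows."""
    return [[math.log(p) for p in row] for row in rows]

# Hypothetical two-state model with two observation symbols.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = logs([[0.9, 0.1], [0.1, 0.9]])
log_emit = logs([[0.8, 0.2], [0.2, 0.8]])

path, log_prob = viterbi(log_init, log_trans, log_emit, [0, 0, 1, 1])
print(path, math.exp(log_prob))
```

The chain processes one time step after another, which loosely mirrors the rank-by-rank evaluation described for graded networks.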