lognormal distribution
Mini-Game Lifetime Value Prediction in WeChat
Chen, Aochuan, Niu, Yifan, Gao, Ziqi, Sun, Yujie, Liu, Shoujun, Chen, Gong, Liu, Yang, Li, Jia
The LifeTime Value (LTV) prediction, which endeavors to forecast the cumulative purchase contribution of a user to a particular item, remains a vital challenge that advertisers are keen to resolve. A precise LTV prediction system enhances the alignment of user interests with meticulously designed advertisements, thereby generating substantial profits for advertisers. Nonetheless, this issue is complicated by the paucity of data typically observed in real-world advertising scenarios. The purchase rate among registered users is often as critically low as 0.1%, resulting in a dataset where the majority of users make only several purchases. Consequently, there is insufficient supervisory signal for effectively training the LTV prediction model. An additional challenge emerges from the interdependencies among tasks with high correlation. It is a common practice to estimate a user's contribution to a game over a specified temporal interval. Varying the lengths of these intervals corresponds to distinct predictive tasks, which are highly correlated. For instance, predictions over a 7-day period are heavily reliant on forecasts made over a 3-day period, where exceptional cases can adversely affect the accuracy of both tasks. In order to comprehensively address the aforementioned challenges, we introduce an innovative framework denoted as Graph-Represented Pareto-Optimal LifeTime Value prediction (GRePO-LTV). Graph representation learning is initially employed to address the issue of data scarcity. Subsequently, Pareto-Optimization is utilized to manage the interdependence of prediction tasks.
- North America > Canada > Ontario > Toronto (0.05)
- Asia > China > Guangdong Province > Guangzhou (0.05)
- Asia > China > Hong Kong (0.05)
- Marketing (1.00)
- Information Technology > Services (0.83)
- Leisure & Entertainment > Games > Computer Games (0.46)
Stochastic Processes with Modified Lognormal Distribution Featuring Flexible Upper Tail
Hristopulos, Dionissios T., Baxevani, Anastassia, Kaniadakis, Giorgio
Asymmetric, non-Gaussian probability distributions are often observed in the analysis of natural and engineering datasets. The lognormal distribution is a standard model for data with skewed frequency histograms and fat tails. However, the lognormal law severely restricts the asymptotic dependence of the probability density and the hazard function for high values. Herein we present a family of three-parameter non-Gaussian probability density functions that are based on generalized kappa-exponential and kappa-logarithm functions and investigate its mathematical properties. These kappa-lognormal densities represent continuous deformations of the lognormal with lighter right tails, controlled by the parameter kappa. In addition, bimodal distributions are obtained for certain parameter combinations. We derive closed-form analytic expressions for the main statistical functions of the kappa-lognormal distribution. For the moments, we derive bounds that are based on hypergeometric functions as well as series expansions. Explicit expressions for the gradient and Hessian of the negative log-likelihood are obtained to facilitate numerical maximum-likelihood estimates of the kappa-lognormal parameters from data. We also formulate a joint probability density function for kappa-lognormal stochastic processes by applying Jacobi's multivariate theorem to a latent Gaussian process. Estimation of the kappa-lognormal distribution based on synthetic and real data is explored. Furthermore, we investigate applications of kappa-lognormal processes with different covariance kernels in time series forecasting and spatial interpolation using warped Gaussian process regression. Our results are of practical interest for modeling skewed distributions in various scientific and engineering fields.
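The abstract above builds on the generalized kappa-exponential and kappa-logarithm functions. As a rough sketch only, the standard Kaniadakis definitions of these two functions can be written as below. The function names `kappa_exp` and `kappa_log` are illustrative, and the paper's actual kappa-lognormal density is not reproduced here, since the abstract does not state it.

```python
import math


def kappa_exp(x: float, kappa: float) -> float:
    """Kaniadakis kappa-exponential; tends to exp(x) as kappa -> 0."""
    if kappa == 0.0:
        return math.exp(x)
    return (math.sqrt(1.0 + kappa**2 * x**2) + kappa * x) ** (1.0 / kappa)


def kappa_log(x: float, kappa: float) -> float:
    """Kaniadakis kappa-logarithm (inverse of kappa_exp); tends to ln(x) as kappa -> 0."""
    if kappa == 0.0:
        return math.log(x)
    return (x**kappa - x ** (-kappa)) / (2.0 * kappa)


if __name__ == "__main__":
    # Compare the deformed logarithm with the ordinary one for growing x.
    for x in (2.0, 10.0, 100.0):
        print(x, math.log(x), kappa_log(x, 0.5))
```

For kappa > 0 the kappa-logarithm grows like a power of x rather than logarithmically, which gives some intuition for how a deformation of this kind can change the behavior of the upper tail.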
Using Sequential Runtime Distributions for the Parallel Speedup Prediction of SAT Local Search
Arbelaez, Alejandro, Truchet, Charlotte, Codognet, Philippe
This paper presents a detailed analysis of the scalability and parallelization of local search algorithms for the Satisfiability problem. We propose a framework to estimate the parallel performance of a given algorithm by analyzing the runtime behavior of its sequential version. Indeed, by approximating the runtime distribution of the sequential process with statistical methods, the runtime behavior of the parallel process can be predicted by a model based on order statistics. We apply this approach to study the parallel performance of two SAT local search solvers, namely Sparrow and CCASAT, and compare the predicted performances to the results of an actual experimentation on parallel hardware up to 384 cores. We show that the model is accurate and predicts performance close to the empirical data. Moreover, as we study different types of instances (random and crafted), we observe that the local search solvers exhibit different behaviors and that their runtime distributions can be approximated by two types of distributions: exponential (shifted and non-shifted) and lognormal.
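The order-statistics idea in the abstract above can be illustrated with a small Monte Carlo sketch. It assumes an independent multi-walk setup in which the parallel runtime on n cores is the minimum of n independent sequential runtimes. The distribution parameters, the function name `predicted_speedup`, and the trial counts are illustrative assumptions, not values from the paper.

```python
import random


def predicted_speedup(sample_runtime, cores, trials=20000, seed=0):
    """Estimate E[sequential runtime] / E[min of `cores` independent runtimes]."""
    rng = random.Random(seed)
    sequential = sum(sample_runtime(rng) for _ in range(trials)) / trials
    parallel = sum(
        min(sample_runtime(rng) for _ in range(cores)) for _ in range(trials)
    ) / trials
    return sequential / parallel


# Three candidate sequential runtime distributions (parameters are assumptions).
def exponential(rng):
    return rng.expovariate(1.0)


def shifted_exponential(rng):
    return 0.1 + rng.expovariate(1.0)


def lognormal(rng):
    return rng.lognormvariate(0.0, 1.0)


if __name__ == "__main__":
    for name, dist in [
        ("exponential", exponential),
        ("shifted exponential", shifted_exponential),
        ("lognormal", lognormal),
    ]:
        print(name, round(predicted_speedup(dist, cores=16), 2))
```

For a non-shifted exponential, the minimum of n independent runtimes is again exponential with n times the rate, so the predicted speedup is linear in n. A positive shift bounds the achievable speedup, because no run can finish before the shift has elapsed.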
- North America > United States > Nevada > Clark County > Las Vegas (0.05)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
Towards an Understanding of Long-Tailed Runtimes of SLS Algorithms
Lorenz, Jan-Hendrik, Wörz, Florian
The satisfiability problem is one of the most famous problems in computer science. Its NP-completeness has been used to argue that SAT is intractable. However, there have been tremendous advances that allow SAT solvers to solve instances with millions of variables. A particularly successful paradigm is stochastic local search (SLS). In most cases, there are different ways of formulating the underlying problem. While it is known that this has an impact on the runtime of solvers, finding a helpful formulation is generally non-trivial. The recently introduced GapSAT solver [Lorenz and Wörz 2020] demonstrated a successful way to improve the performance of an SLS solver on average by learning additional information that is logically entailed by the original problem. Still, there were cases in which the performance slightly deteriorated. This justifies in-depth investigations into how learning logical implications affects runtimes for SLS. In this work, we propose a method for generating logically equivalent problem formulations, generalizing the ideas of GapSAT. This allows a rigorous mathematical study of the effect on the runtime of SLS solvers. If the modification process is treated as random, Johnson SB distributions provide a perfect characterization of the hardness. Since the observed Johnson SB distributions approach lognormal distributions, our analysis also suggests that the hardness is long-tailed. As a second contribution, we theoretically prove that restarts are useful for long-tailed distributions. This implies that additional restarts can further refine all algorithms employing the above-mentioned modification technique. Since the empirical studies compellingly suggest that the runtime distributions follow Johnson SB distributions, we investigate this property theoretically. We succeed in proving that the runtimes for Schöning's random walk algorithm are approximately Johnson SB.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
A theory of learning with constrained weight-distribution
Zhong, Weishun, Sorscher, Ben, Lee, Daniel D, Sompolinsky, Haim
A central question in computational neuroscience is how structure determines function in neural networks. The emerging high-quality large-scale connectomic datasets raise the question of what general functional principles can be gleaned from structural information such as the distribution of excitatory/inhibitory synapse types and the distribution of synaptic weights. Motivated by this question, we developed a statistical mechanical theory of learning in neural networks that incorporates structural information as constraints. We derived an analytical solution for the memory capacity of the perceptron, a basic feedforward model of supervised learning, with constraint on the distribution of its weights. Our theory predicts that the reduction in capacity due to the constrained weight-distribution is related to the Wasserstein distance between the imposed distribution and that of the standard normal distribution. To test the theoretical predictions, we use optimal transport theory and information geometry to develop an SGD-based algorithm to find weights that simultaneously learn the input-output task and satisfy the distribution constraint. We show that training in our algorithm can be interpreted as geodesic flows in the Wasserstein space of probability distributions. We further developed a statistical mechanical theory for teacher-student perceptron rule learning and ask for the best way for the student to incorporate prior knowledge of the rule. Our theory shows that it is beneficial for the learner to adopt different prior weight distributions during learning, and shows that distribution-constrained learning outperforms unconstrained and sign-constrained learning. Our theory and algorithm provide novel strategies for incorporating prior knowledge about weights into learning, and reveal a powerful connection between structure and function in neural networks.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education (0.93)
Probability Distributions To Be Aware Of For Data Science (With Code)
Probability and statistics knowledge is at the core of data science and machine learning; you'll require both to effectively gather, review, analyze and communicate with data. This means it's essential for you to have a good grasp of some fundamental terminology, what it means, and how to identify it. One such term you'll hear thrown around a lot is 'distribution.' All this refers to is the properties of the data. There are several instances of phenomena in the real world that are considered to be statistical in nature. This means there are several instances in which we've been able to develop methodologies that help us model nature through mathematical functions that can describe the characteristics of the data.
Evidence for Long-Tails in SLS Algorithms
Wörz, Florian, Lorenz, Jan-Hendrik
Stochastic local search (SLS) is a successful paradigm for solving the satisfiability problem of propositional logic. A recent development in this area involves solving not the original instance, but a modified, yet logically equivalent one. Empirically, this technique was found to be promising as it improves the performance of state-of-the-art SLS solvers. Currently, there is only a shallow understanding of how this modification technique affects the runtimes of SLS solvers. Thus, we model this modification process and conduct an empirical analysis of the hardness of logically equivalent formulas. Our results are twofold. First, if the modification process is treated as a random process, a lognormal distribution perfectly characterizes the hardness; implying that the hardness is long-tailed. This means that the modification technique can be further improved by implementing an additional restart mechanism. Thus, as a second contribution, we theoretically prove that all algorithms exhibiting this long-tail property can be further improved by restarts. Consequently, all SAT solvers employing this modification technique can be enhanced.
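The claim that restarts help with long-tailed runtimes can be illustrated with a small simulation. This is a sketch under assumptions, not the paper's proof or experimental setup: runtimes are drawn from a lognormal distribution with illustrative parameters (mu = 0, sigma = 2), and a run is abandoned and restarted from scratch whenever it exceeds a fixed cutoff. The function names and the cutoff value are hypothetical.

```python
import random


def runtime_with_restarts(rng, cutoff, mu=0.0, sigma=2.0):
    """Total time until one run finishes within `cutoff`; failed runs cost `cutoff` each."""
    total = 0.0
    while True:
        run = rng.lognormvariate(mu, sigma)
        if run <= cutoff:
            return total + run
        total += cutoff  # the run was aborted at the cutoff and restarted


def mean_runtime(cutoff, trials=50000, seed=0):
    """Monte Carlo estimate of the expected total runtime for a given cutoff."""
    rng = random.Random(seed)
    return sum(runtime_with_restarts(rng, cutoff) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("no restarts:   ", mean_runtime(float("inf")))
    print("cutoff at 1.0: ", mean_runtime(1.0))
```

With these assumed parameters the mean without restarts is dominated by rare, very long runs, whereas the restart strategy caps the cost of each attempt at the cutoff. How much is gained depends on the cutoff and on how heavy the tail actually is.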
- Europe > Finland > Uusimaa > Helsinki (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- North America > United States > New York (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.49)
A System for Generating Non-Uniform Random Variates using Graphene Field-Effect Transistors
Tye, Nathaniel Joseph, Meech, James Timothy, Bilgin, Bilgesu Arif, Stanley-Marbell, Phillip
We introduce a new method for hardware non-uniform random number generation based on the transfer characteristics of graphene field-effect transistors (GFETs) which requires as few as two transistors and a resistor (or transimpedance amplifier). The method could be integrated into a custom computing system to provide samples from arbitrary univariate distributions. We also demonstrate the use of wavelet decomposition of the target distribution to determine GFET bias voltages in a multi-GFET array. We implement the method by fabricating multiple GFETs and experimentally validating that their transfer characteristics exhibit the nonlinearity on which our method depends. We use the characterization data in simulations of a proposed architecture for generating samples from dynamically-selectable non-uniform probability distributions. Using a combination of experimental measurements of GFETs under a range of biasing conditions and simulation of the GFET-based non-uniform random variate generator architecture, we demonstrate a speedup of Monte Carlo integration by a factor of up to 2×. This speedup assumes the analog-to-digital converters reading the outputs from the circuit can produce samples in the same amount of time that it takes to perform memory accesses.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Quebec > Montreal (0.14)
- Europe > Finland (0.04)
Fishing in Fortnite: decoding the Algorithm
For example, thermal fish give players a short competitive advantage by granting a thermal view, which allows spotting other players in the surroundings, even behind buildings. To understand how Fortnite generates fish, I need to estimate the probability distribution function used to generate them, then convert it into code to simulate the same fish generator. The first choice I had in mind for the fish size distribution was a lognormal distribution. The majority of the fish sizes would be clustered around the average size. However, finding the right tuning to create a lognormal distribution with a peak at around 35 to 40 and an upper limit of 100 has proven challenging: the more I play with the parameters, the more the distribution mimics a normal distribution.
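One way to experiment with the tuning problem described above is sketched below. It uses the fact that a lognormal distribution with parameters mu and sigma has its mode at exp(mu - sigma^2), so mu can be solved for a chosen peak, and it enforces the upper limit by rejecting samples above it. The values `mode=37.5` and `sigma=0.35` and the name `sample_fish_size` are assumptions for illustration; nothing here reflects how the game actually generates fish.

```python
import math
import random


def sample_fish_size(rng, mode=37.5, sigma=0.35, upper=100.0):
    """Draw from a lognormal with the given mode, truncated at `upper` by rejection."""
    mu = math.log(mode) + sigma**2  # from mode = exp(mu - sigma^2)
    while True:
        size = rng.lognormvariate(mu, sigma)
        if size <= upper:
            return size


def histogram(sizes, width=5):
    """Count samples per bin of the given width, keyed by the bin's lower edge."""
    counts = {}
    for s in sizes:
        edge = int(s // width) * width
        counts[edge] = counts.get(edge, 0) + 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = [sample_fish_size(rng) for _ in range(20000)]
    for edge, count in sorted(histogram(sizes).items()):
        print(f"{edge:3d}-{edge + 5:3d} {'#' * (count // 50)}")
```

Truncation only rescales the density below the cap, so the peak stays where the untruncated mode is. Smaller values of sigma make the shape more symmetric and closer to a normal curve, which matches the observation in the text; larger values keep the right skew but push more mass toward the cap.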