crossover point
- Asia > Japan > Honshū > Tōhoku (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Norway (0.04)
- (2 more...)
- Law (0.68)
- Education (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks
Zhang, Tianyi, Cai, Linrong, Li, Jeffrey, Roberts, Nicholas, Guha, Neel, Lee, Jinoh, Sala, Frederic
Weak supervision (WS) is a popular approach for label-efficient learning, leveraging diverse sources of noisy but inexpensive weak labels to automatically annotate training data. Despite its wide usage, WS and its practical value are challenging to benchmark due to the many knobs in its setup, including data sources, labeling functions (LFs), aggregation techniques (called label models), and end model pipelines. Existing evaluation suites tend to be limited, focusing on particular components or specialized use cases. Moreover, they often involve simplistic benchmark tasks or de facto LF sets that are suboptimally written, producing insights that may not generalize to real-world settings. We address these limitations by introducing a new benchmark, BOXWRENCH, designed to more accurately reflect real-world uses of WS. This benchmark features tasks with (1) higher class cardinality and imbalance, (2) notable domain expertise requirements, and (3) multilingual variations across parallel corpora. For all tasks, LFs are written using a careful procedure aimed at mimicking real-world settings. In contrast to existing WS benchmarks, we show that supervised learning requires substantial amounts (1000+) of labeled examples to match WS in many settings.
- Asia > Japan > Honshū > Tōhoku (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Law (0.68)
- Education (0.68)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
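As a rough illustration of the labeling-function and label-model vocabulary in the abstract above, here is a minimal sketch of a majority-vote label model. It is not the BOXWRENCH pipeline or any method from the paper: the toy sentiment task, the three keyword LFs (`lf_good`, `lf_bad`, `lf_refund`), the `ABSTAIN` convention, and the abstain-on-tie rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical label encoding for a toy sentiment task (assumption).
ABSTAIN, NEG, POS = -1, 0, 1

# Hypothetical keyword labeling functions: each votes or abstains.
def lf_good(text):
    return POS if "good" in text.lower() else ABSTAIN

def lf_bad(text):
    return NEG if "bad" in text.lower() else ABSTAIN

def lf_refund(text):
    return NEG if "refund" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [lf_good, lf_bad, lf_refund]

def majority_vote(votes):
    """Aggregate LF votes; abstain when no LF fires or the top vote is tied."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

def weak_label(text):
    """Apply every labeling function and aggregate the votes."""
    return majority_vote([lf(text) for lf in LABELING_FUNCTIONS])

if __name__ == "__main__":
    for sentence in ["Good product", "Bad, I want a refund", "It arrived"]:
        print(sentence, "->", weak_label(sentence))
```

Majority vote is only the simplest aggregation baseline; learned label models weight LFs by estimated accuracy instead of counting votes equally.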
Expected Runtime Comparisons Between Breadth-First Search and Constant-Depth Restarting Random Walks
Platnick, Daniel, Valenzano, Richard Anthony
When greedy search algorithms encounter a local minimum or plateau, the search typically devolves into a breadth-first search (BrFS), or a local search technique is used in an attempt to find a way out. In this work, we formally analyze the performance of BrFS and constant-depth restarting random walks (RRW) -- two methods often used for finding exits from a plateau or local minimum -- to better understand when each is best suited. In particular, we formally derive the expected runtime for BrFS in the case of a uniformly distributed set of goals at a given goal depth. We then prove RRW will be faster than BrFS on trees if there are enough goals at that goal depth. We refer to this threshold as the crossover point. Our bound shows that the crossover point grows linearly with the branching factor of the tree, the goal depth, and the error in the random walk depth, while the size of the tree grows exponentially in branching factor and goal depth. Finally, we discuss the practical implications and applicability of this bound.
- North America > Canada > Ontario > Toronto (0.05)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > Nevada > Clark County > Las Vegas (0.04)
- (7 more...)
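The crossover idea in the abstract above can be made concrete with a deliberately simplified cost model. This is a sketch under stated assumptions, not the paper's derivation or bound: the tree is a complete b-ary tree, the g goals are placed uniformly at random among the b^d nodes at depth d, BrFS is charged for every shallower node plus the depth-d nodes it examines up to the first goal, and each random walk is exactly d steps deep (zero depth error) and ends at a uniformly random depth-d node. The function names are hypothetical.

```python
def brfs_expected_cost(b, d, g):
    """Expected nodes examined by BrFS under the assumptions above (b >= 2)."""
    n = b ** d                      # nodes at the goal depth
    shallower = (n - 1) // (b - 1)  # nodes at depths 0 .. d-1
    # Expected position of the first of g goals among n uniformly ordered nodes.
    return shallower + (n + 1) / (g + 1)

def rrw_expected_cost(b, d, g):
    """Expected steps for restarting random walks of depth exactly d."""
    n = b ** d
    # Each walk succeeds with probability g/n, so n/g walks of d steps each.
    return d * n / g

def crossover_goals(b, d):
    """Smallest goal count at which RRW is cheaper than BrFS in this model."""
    for g in range(1, b ** d + 1):
        if rrw_expected_cost(b, d, g) < brfs_expected_cost(b, d, g):
            return g
    return None

if __name__ == "__main__":
    for b, d in [(2, 10), (3, 5), (4, 6)]:
        print(f"b={b} d={d} tree leaves={b ** d} crossover g={crossover_goals(b, d)}")
```

In this toy model the threshold stays close to d(b-1) while the number of depth-d nodes is b^d, which mirrors the linear-versus-exponential contrast described in the abstract.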
SMC Is All You Need: Parallel Strong Scaling
Liang, Xinzhu, Lohani, Sanjaya, Lukens, Joseph M., Kirby, Brian T., Searles, Thomas A., Law, Kody J. H.
In the general framework of Bayesian inference, the target distribution can only be evaluated up to a constant of proportionality. Classical consistent Bayesian methods such as sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) have unbounded time complexity requirements. We develop a fully parallel sequential Monte Carlo (pSMC) method which provably delivers parallel strong scaling, i.e. the time complexity (and per-node memory) remains bounded if the number of asynchronous processes is allowed to grow. More precisely, the pSMC has a theoretical convergence rate of $\mathrm{MSE} = O(1/(NR))$, where $N$ denotes the number of communicating samples in each processor and $R$ denotes the number of processors. In particular, for suitably large problem-dependent $N$, as $R \rightarrow \infty$ the method converges to infinitesimal accuracy $\mathrm{MSE} = O(\varepsilon^2)$ with a fixed finite time complexity $\mathrm{Cost} = O(1)$ and with no efficiency leakage, i.e. computational complexity $\mathrm{Cost} = O(\varepsilon^{-2})$. A number of Bayesian inference problems are taken into consideration to compare the pSMC and MCMC methods.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- (4 more...)
- Government > Regional Government > North America Government > United States Government (0.68)
- Energy (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
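To see why averaging R independent N-sample estimators can give an error that shrinks with both N and R, the sketch below uses self-normalised importance sampling as a stand-in. It is not the pSMC algorithm from the abstract above: the Gaussian target and proposal, the test function x^2, and the function names are assumptions chosen so the true answer (1.0) is known.

```python
import math
import random

def snis_estimate(n, rng):
    """Self-normalised importance sampling estimate of E[x^2] under N(0, 1).

    Proposal is N(0, 2^2); the target is used only up to a constant,
    so normalising constants are dropped from the weights.
    """
    xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
    ws = [math.exp(-x * x / 2.0 + x * x / 8.0) for x in xs]
    return sum(w * x * x for w, x in zip(ws, xs)) / sum(ws)

def parallel_estimate(n, r, rng):
    """Average r independent n-sample estimators (no communication between them)."""
    return sum(snis_estimate(n, rng) for _ in range(r)) / r

def empirical_mse(n, r, trials=200, seed=0):
    """Monte Carlo estimate of the mean squared error against the true value 1.0."""
    rng = random.Random(seed)
    return sum((parallel_estimate(n, r, rng) - 1.0) ** 2 for _ in range(trials)) / trials

if __name__ == "__main__":
    for r in (1, 4, 16):
        print(f"N=50 R={r} empirical MSE={empirical_mse(50, r):.5f}")
```

Each N-sample estimator carries a small bias that averaging over R does not remove, so in this sketch the error only keeps falling with R when N is large enough for that bias to be negligible.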
Introduction to Genetic Algorithms -- Including Example Code
A genetic algorithm is a search heuristic that is inspired by Charles Darwin's theory of natural evolution. This algorithm reflects the process of natural selection, where the fittest individuals are selected for reproduction in order to produce offspring of the next generation. The process of natural selection starts with the selection of the fittest individuals from a population. They produce offspring which inherit the characteristics of the parents and will be added to the next generation. If parents have better fitness, their offspring tend to be fitter as well and have a better chance of surviving.
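The selection and reproduction loop described above can be sketched as follows. This is a minimal illustration, not the article's own example code: the OneMax fitness (count of 1 bits), tournament selection, single-point crossover, bit-flip mutation, one-individual elitism, and all parameter values are assumptions chosen for brevity.

```python
import random

def fitness(genome):
    """OneMax: fitness is the number of 1 bits (assumed toy objective)."""
    return sum(genome)

def select(population, rng, k=3):
    """Tournament selection: the fittest of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(parent_a, parent_b, rng):
    """Single-point crossover at a random crossover point."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genome, rng, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in genome]

def evolve(genome_len=20, pop_size=30, generations=40, seed=0):
    """Run the loop; return the best genome and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = [
            mutate(crossover(select(population, rng), select(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + children  # elitism keeps the current best
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

if __name__ == "__main__":
    best, history = evolve()
    print("best fitness:", fitness(best), "of", len(best))
```

Because the best individual is copied unchanged into each new generation, the best fitness never decreases from one generation to the next, even though any single child may be worse than its parents.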
Electric Vehicle Adoption -- About To Explode? Or Slow & Steady?
I wrote a version of the article below almost a year ago for another company in order to explain the EV market and its future potential. The electric vehicle market 10 years ago was basically nonexistent. Almost no market analysts or investors were on the lookout for a promising electric vehicle startup. You couldn't find one person out of 100, probably not one out of 1,000, and maybe not even one out of 1 million, who expected an electric car to be the best-selling automobile in some notable country and regional markets by 2020. Chances are good that you did not predict a Silicon Valley automaker would be outselling BMW, Mercedes, and Audi in the United States in the luxury car market.
- North America > United States > California (0.24)
- Europe > Netherlands (0.06)
- Europe > Norway (0.05)
- (8 more...)
- Transportation > Ground > Road (1.00)
- Transportation > Electric Vehicle (1.00)
Building a Deep Learning Model for Process Optimisation
The objective of this paper is to present the process of building a Deep Learning Model for optimising the output of a Production Process from a Training sample using Weka Multilayer Perceptron. The scope is limited to implementation only and does not cover the theory behind Artificial Neural Networks. This work is the outcome of a comprehensive prototyping and proof-of-concept exercise conducted at Turing Point (http://www.turing-point.com/), a consulting company focused on providing genuine Enterprise Machine Learning solutions based on highly advanced techniques such as 3D discrete event simulation, deep learning and genetic algorithms. Predictive Analytics is the process of extracting information from data in order to predict future trends. There are a number of Machine Learning approaches available to model the behaviour.
Problem Difficulty and the Phase Transition in Heuristic Search
Cohen, Eldan (University of Toronto) | Beck, J. Christopher (University of Toronto)
In recent years, there has been significant work on the difficulty of heuristic search problems, identifying different problem instance characteristics that can have a significant impact on search effort. Phase transitions in the solubility of random problem instances have proved useful in the study of problem difficulty for other classes of computational problems, notably SAT and CSP, and it has been shown that the hardest problems typically occur during this rapid transition. In this work, we perform the first empirical investigation of phase transition phenomena for heuristic search. We establish the existence of a rapid transition in the solubility of an abstract model of heuristic search problems and show that, for greedy best-first search, the hardest instances are associated with the phase transition region. We then perform a novel investigation of the behavior of heuristics of different strength across the solubility spectrum. Finally, we demonstrate that the behavior of our abstract model carries over to commonly used benchmark problems including the Pancake Problem, Grid Navigation, TopSpin, and the Towers of Hanoi. An interesting deviation is observed and explained in the Sliding Puzzle.
- Asia > Vietnam > Hanoi > Hanoi (0.24)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)