 Decision Tree Learning



ZDNet

New Zealand fixed-line telecommunications provider Chorus has announced its full-year financial results for 2015-16, reporting earnings before interest, tax, depreciation, and amortisation (EBITDA) of NZ$594 million, down by NZ$8 million due to the regulator's copper pricing decision in December. In December, the New Zealand Commerce Commission released its final pricing determination for broadband services being delivered over Chorus' legacy copper network. A breakdown of revenue saw basic copper services contribute NZ$489 million, NZ$2 million less than last year; enhanced copper contribute NZ$242 million, down from NZ$268 million; and fibre contribute NZ$133 million, up by 35.7 percent year on year from NZ$98 million. Value Added Network Services, Field Services, and Infrastructure were all down by NZ$1 million to contribute NZ$35 million, NZ$83 million, and NZ$20 million, respectively.


Random Forest for Label Ranking

arXiv.org Machine Learning

Label ranking aims to learn a mapping from instances to rankings over a finite number of predefined labels. Random forest is one of the most powerful and successful general-purpose machine learning algorithms of modern times. There appears to be no research in the literature yet on applying random forest to label ranking. In this paper, we present a powerful random forest label ranking method which uses random decision trees to retrieve nearest neighbors that are similar not only in the feature space but also in the ranking space. We have developed a novel two-step rank aggregation strategy to effectively aggregate the neighboring rankings discovered by the random forest into a final predicted ranking. Compared with existing methods, the new random forest method has many advantages, including its intrinsically scalable tree data structure, highly parallelizable computational architecture, and much superior performance. We present extensive experimental results to demonstrate that our new method achieves the best predictive accuracy compared with state-of-the-art methods, both for datasets with complete rankings and for datasets with only partial ranking information.
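The abstract does not spell out its two-step aggregation strategy, so the following is only a minimal sketch of the general idea of merging neighboring rankings into one prediction. It uses the classic Borda count as a stand-in, not the authors' method; the function name `borda_aggregate` and the toy rankings are hypothetical.

```python
def borda_aggregate(rankings):
    """Merge several complete rankings into one using the Borda count.

    Assumption: Borda count is a generic rank aggregation rule used here
    for illustration; it is NOT the paper's two-step strategy.
    Each ranking lists labels from most to least preferred.
    """
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for position, label in enumerate(ranking):
            # The top label earns m - 1 points, the last label earns 0.
            scores[label] = scores.get(label, 0) + (m - 1 - position)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda label: (-scores[label], label))


# Hypothetical rankings attached to three retrieved neighbors.
neighbor_rankings = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
]
predicted = borda_aggregate(neighbor_rankings)
print(predicted)
```

Here label "a" collects the most points across the three neighbors, so it is placed first in the aggregated ranking.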


Get Immersed in AI with the Complete Machine Learning Bundle

#artificialintelligence

Why guess what will happen in the future? Put your computer to work and allow it to predict it for you. Machine Learning allows computers to learn from and make predictions on data, saving you the headache of trying to do it. With the Complete Machine Learning Bundle you can learn all you need to know about Machine Learning. You'll get over sixty hours of learning about artificial intelligence with courses on quantitative trading, R, Hadoop and MapReduce, Java, decision trees and random forests, deep learning and computer vision, and Python.


Random Forest Tutorial: Predicting Crime in San Francisco

#artificialintelligence

Announcement: Layman Tutorials for Data Science site Annalyzin is now called Algobeans! We're creating a new mailing list to deliver tutorials to your inbox. If you'd like to be included, sign up below. If you're already subscribed, signing up to this new mailing list will remove you from the old one. Can several wrongs make a right?


Unifying Decision Trees Split Criteria Using Tsallis Entropy

arXiv.org Machine Learning

The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. Many heuristic algorithms have been proposed to construct near-optimal decision trees. ID3, C4.5 and CART are classical decision tree algorithms, and the split criteria they use are Shannon entropy, Gain Ratio and Gini index, respectively. These split criteria seem to be independent, but they can in fact be unified in a Tsallis entropy framework. Tsallis entropy is a generalization of Shannon entropy and provides a new approach to enhancing decision trees' performance with an adjustable parameter $q$. In this paper, a Tsallis Entropy Criterion (TEC) algorithm is proposed to unify Shannon entropy, Gain Ratio and Gini index, which generalizes the split criteria of decision trees. More importantly, we reveal the relations between Tsallis entropy with different $q$ and other split criteria. Experimental results on UCI data sets indicate that the TEC algorithm achieves statistically significant improvement over the classical algorithms.
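As a rough illustration of how one parameter can cover several split criteria, the sketch below uses the standard definition of Tsallis entropy, $S_q(p) = (1 - \sum_i p_i^q)/(q - 1)$, which the abstract itself does not write out. Under that definition, $q \to 1$ gives Shannon entropy (in nats) and $q = 2$ gives the Gini index. The function names and the example distribution are hypothetical, and this is not the TEC algorithm itself.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i**q) / (q - 1).

    At q == 1 the formula is undefined, so the limiting value
    (Shannon entropy in nats) is returned instead.
    """
    probs = [p for p in probs if p > 0]  # zero-probability classes contribute nothing
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def gini_index(probs):
    """Gini impurity as used by CART: 1 - sum_i p_i**2."""
    return 1.0 - sum(p * p for p in probs)


# Hypothetical class distribution at a tree node.
p = [0.5, 0.3, 0.2]

shannon = tsallis_entropy(p, 1.0)          # limiting case q -> 1
near_one = tsallis_entropy(p, 1.0 + 1e-6)  # general formula just above q = 1
gini_like = tsallis_entropy(p, 2.0)        # q = 2 coincides with the Gini index

print(shannon, near_one, gini_like, gini_index(p))
```

Varying `q` between these values interpolates between the familiar criteria, which is what makes it usable as a tunable parameter.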


Decision tree visualization in python - Titanic: Machine Learning from Disaster

#artificialintelligence

Hi friends, I was struggling with decision tree visualization in Python. Sometimes there is an error due to pydot and sometimes due to graphviz... even though I have installed both on my Windows machine, still no luck. Please let me know if you know any easy method for this visualization in an IPython notebook.


Reweighting with Boosted Decision Trees

arXiv.org Machine Learning

Machine learning tools are commonly used in modern high energy physics (HEP) experiments. Different models, such as boosted decision trees (BDT) and artificial neural networks (ANN), are widely used in analyses and even in the software triggers. In most cases, these are classification models used to select the "signal" events from data. Monte Carlo simulated events are typically used to train these models. While the results of the simulation are expected to be close to real data, in practice there is notable disagreement between simulated and observed data. In order to use the available simulation in training, corrections must be applied to the generated data. One common approach is reweighting - assigning weights to the simulated events. We present a novel method of event reweighting based on boosted decision trees. The problem of checking the quality of the reweighting step in analyses is also discussed.
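To make the idea of assigning weights to simulated events concrete, here is a minimal sketch of a simple binned (histogram) reweighting in one variable. This is a swapped-in baseline chosen for illustration, not the boosted-decision-tree method the abstract proposes; the function name, bin edges, and toy event values are all hypothetical.

```python
def histogram_weights(simulated, observed, edges):
    """Weight each simulated event by the ratio of observed to simulated
    bin fractions, so the weighted simulation matches the observed histogram.

    Assumption: one-dimensional binned reweighting, used only as an
    illustrative baseline (not the BDT-based method).
    """
    n_bins = len(edges) - 1

    def bin_index(x):
        # Half-open bins [edges[i], edges[i + 1]); values outside get None.
        for i in range(n_bins):
            if edges[i] <= x < edges[i + 1]:
                return i
        return None

    def fractions(values):
        counts = [0] * n_bins
        for x in values:
            b = bin_index(x)
            if b is not None:
                counts[b] += 1
        return [c / len(values) for c in counts]

    sim_frac = fractions(simulated)
    obs_frac = fractions(observed)

    weights = []
    for x in simulated:
        b = bin_index(x)
        if b is None or sim_frac[b] == 0:
            weights.append(0.0)  # no basis for a weight outside the binned range
        else:
            weights.append(obs_frac[b] / sim_frac[b])
    return weights


# Toy data: simulation over-populates the low bin relative to observation.
simulated = [0.1, 0.2, 0.3, 0.6]
observed = [0.1, 0.7, 0.8, 0.9]
weights = histogram_weights(simulated, observed, edges=[0.0, 0.5, 1.0])
print(weights)
```

After weighting, the low bin carries a quarter of the total simulated weight, matching its share of the observed events. Binned schemes like this become hard to apply as the number of variables grows, which is one reason to consider model-based reweighting.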


Master the Basics of Machine Learning With These 6 Resources

#artificialintelligence

It seems like machine learning and artificial intelligence are topics at the top of everyone's mind in tech. Be it autonomous cars, robots, or machine intelligence in general, everyone's talking about machines getting smarter and being able to do more. At the same time, for many developers, machine learning and artificial intelligence are nebulous terms representing complex mathematical and data problems they just don't have the time to explore and learn. As I've spoken with lots of developers and CTOs about Fuzzy.io and our mission to make it easy for developers to start bringing intelligent decision-making to their software without needing huge amounts of data or AI expertise, some were curious to learn more about the greater landscape of machine learning. Here are some of the links to articles, podcasts and courses discussing some of the basics of machine learning that I've shared with them.


aloysius-lim/bigrf

#artificialintelligence

This is an R implementation of Leo Breiman's and Adele Cutler's Random Forest algorithms for classification and regression, with optimizations for performance and for handling of data sets that are too large to be processed in memory. Forests can be built in parallel at two levels. First, trees can be grown in parallel on a single machine using foreach. Second, multiple forests can be built in parallel on multiple machines, then merged into one. For large data sets, disk-based big.matrix's may be used for storing data and intermediate computations, to prevent excessive virtual memory swapping by the operating system.


8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset - Machine Learning Mastery

#artificialintelligence

You are working on your dataset. You create a classification model and get 90% accuracy immediately. You dive a little deeper and discover that 90% of the data belongs to one class. This is an example of an imbalanced dataset and the frustrating results it can cause. In this post you will discover the tactics that you can use to deliver great results on machine learning datasets with imbalanced data.
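The 90% figure above can be reproduced with a model that ignores its input entirely. The sketch below compares plain accuracy with balanced accuracy (the mean of per-class recall) for such a majority-class predictor. Balanced accuracy is chosen here purely for illustration and is not necessarily one of the tactics in the post; the data and function names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical dataset: 90% of the examples belong to class 0.
y_true = [0] * 90 + [1] * 10

# A "model" that always predicts the most common class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

print(accuracy(y_true, y_pred))           # high, despite learning nothing
print(balanced_accuracy(y_true, y_pred))  # exposes the missed minority class
```

The gap between the two numbers shows why accuracy alone can be misleading on imbalanced data.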