While hierarchical machine learning approaches have been used to classify texts into different content areas, this approach has, to our knowledge, not been used in the automated assessment of text difficulty. This study compared the accuracy of four machine learning classification approaches (flat, one-vs-one, one-vs-all, and hierarchical) using natural language processing features in predicting human ratings of text difficulty for two sets of texts. The hierarchical classification was the most accurate for the two text sets considered individually (Set A, 77.78%; Set B, 82.05%), while the non-hierarchical approaches, one-vs-one and one-vs-all, performed similarly to the hierarchical classification for the combined set (71.43%). These findings suggest both promise and limitations for applying hierarchical approaches to text difficulty classification. It may be beneficial to apply a recursive top-down approach to discriminate the subsets of classes that are at the top of the hierarchy and less related, and then further separate the classes into subsets that may be more similar to one another. These results also suggest that a single approach may not always work for all types of datasets and that it is important to evaluate which machine learning approach and algorithm work best for particular datasets. The authors encourage more work in this area to help suggest which types of algorithms work best as a function of the type of dataset.
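The recursive top-down idea described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two features, the four difficulty levels, the grouping into "easier" and "harder", and the tiny nearest-centroid classifier standing in for whatever algorithm the study actually used.

```python
from statistics import mean

class NearestCentroid:
    """Tiny stand-in classifier: predicts the label whose mean feature vector is closest."""

    def fit(self, X, y):
        self.centroids = {
            label: [mean(col) for col in zip(*[x for x, t in zip(X, y) if t == label])]
            for label in set(y)
        }
        return self

    def predict_one(self, x):
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])),
        )

class TopDownClassifier:
    """Two-level hierarchy: first pick a group of classes, then a class inside that group."""

    def __init__(self, groups):
        self.groups = groups  # hypothetical grouping, e.g. {"easier": [1, 2], "harder": [3, 4]}
        self.parent = {leaf: g for g, leaves in groups.items() for leaf in leaves}

    def fit(self, X, y):
        # Root classifier separates the coarse, less related groups.
        self.root = NearestCentroid().fit(X, [self.parent[t] for t in y])
        # One child classifier per group separates the more similar classes inside it.
        self.children = {}
        for group, leaves in self.groups.items():
            rows = [(x, t) for x, t in zip(X, y) if t in leaves]
            self.children[group] = NearestCentroid().fit(
                [x for x, _ in rows], [t for _, t in rows]
            )
        return self

    def predict_one(self, x):
        group = self.root.predict_one(x)
        return self.children[group].predict_one(x)

# Toy data (invented): [mean word length, mean sentence length] -> difficulty level 1..4
X = [[4.0, 8.0], [4.2, 9.0], [4.5, 12.0], [4.6, 13.0],
     [5.2, 18.0], [5.3, 19.0], [5.8, 24.0], [6.0, 25.0]]
y = [1, 1, 2, 2, 3, 3, 4, 4]

clf = TopDownClassifier({"easier": [1, 2], "harder": [3, 4]}).fit(X, y)
predictions = [clf.predict_one(x) for x in ([4.1, 8.5], [4.5, 12.5], [5.9, 24.5])]
```

A flat classifier would instead fit a single model over all four levels at once; comparing the two on held-out data is how one would check which approach suits a particular dataset.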
I'm working on a PoC of some system, and part of the project is to classify images. Let's say there are many (hundreds of) "tags", and for each tag there are about a hundred images. Each image is produced by some technological process and looks like a set of lines and dots. What would be the simplest way to "learn" the existing images so that, for a new image/tag pair, I could tell whether it matches the classification? Is there a ready-to-use library or solution for such a problem?
Machine Learning is the most in-demand technical skill in today's business environment. Most of the time, though, it is reserved for professionals who know how to code. But Microsoft Azure Machine Learning Studio changed that. It brings a drag-and-drop, easy-to-use environment to anyone's fingertips. Microsoft is known for its ease-of-use tools, and Azure ML Studio is no different.
Valerio, Vinicius D. (State University of Maringa (UEM)) | Pereira, Rodolfo M. (Pontifical Catholic University of Parana (PUCPR) and Federal Institute of Education, Science and Technology of Parana (IFPR)) | Costa, Yandre M. G. (State University of Maringa (UEM)) | Bertoini, Diego (Federal Technological University of Parana - Campo Mourao) | Jr., Carlos N. Silla (Pontifical Catholic University of Parana)
In real-world problems modeled as machine learning tasks, the datasets are typically unbalanced, meaning that some classes have many more instances than others. The Music Information Retrieval field is no different, and song datasets are usually very unbalanced. Considering this scenario, we propose a novel approach to the class imbalance problem, applied to music genre classification. The proposed method uses vertically sliced spectrograms extracted from the songs' audio signals to apply oversampling and undersampling to the minority and majority classes, respectively. The experimental results for the F-Score measure showed that our approach beat the best result of the Random Undersampling technique by 0.086, using MultiLayer Perceptrons. In addition, compared to the baseline results, our approach significantly increased the individual results for all the minority classes.
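One possible reading of the slice-based resampling idea can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: the abstract does not say how slices are selected, so the rule used here (draw more slices per song from genres with few songs, fewer from genres with many) and the names `vertical_slices`, `rebalance`, `width`, and `target` are all hypothetical.

```python
import math
import random

def vertical_slices(spectrogram, width):
    """Split a spectrogram (list of frequency rows, each a list of time frames)
    into non-overlapping vertical slices that are `width` frames wide."""
    n_frames = len(spectrogram[0])
    return [
        [row[start:start + width] for row in spectrogram]
        for start in range(0, n_frames - width + 1, width)
    ]

def rebalance(songs, width, target, seed=0):
    """Return (slice, genre) pairs with roughly `target` slices per genre.

    Genres with few songs contribute more slices per song (oversampling);
    genres with many songs contribute fewer slices per song (undersampling).
    """
    rng = random.Random(seed)
    songs_by_genre = {}
    for spectrogram, genre in songs:
        songs_by_genre.setdefault(genre, []).append(spectrogram)

    balanced = []
    for genre, spectrograms in songs_by_genre.items():
        per_song = math.ceil(target / len(spectrograms))
        for spectrogram in spectrograms:
            slices = vertical_slices(spectrogram, width)
            keep = min(per_song, len(slices))
            balanced.extend((s, genre) for s in rng.sample(slices, keep))
    return balanced

def toy_spectrogram(value):
    # Invented data: 2 frequency bins x 12 time frames.
    return [[value] * 12 for _ in range(2)]

# Unbalanced toy set: 6 "rock" songs versus 2 "jazz" songs.
songs = [(toy_spectrogram(1.0), "rock")] * 6 + [(toy_spectrogram(2.0), "jazz")] * 2
balanced = rebalance(songs, width=3, target=6)

counts = {}
for _, genre in balanced:
    counts[genre] = counts.get(genre, 0) + 1
```

In this toy run, each rock song contributes one slice and each jazz song contributes three, so both genres end up with six training instances.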
In this paper, we present some results of evidential reasoning in understanding multispectral images from remote sensing systems. The Dempster-Shafer approach to combining evidence is pursued to yield contextual classification results, which are compared with previous results from Bayesian context-free classification and from contextual classification using dynamic programming and stochastic relaxation approaches.
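For readers unfamiliar with the Dempster-Shafer approach, the basic rule for combining two bodies of evidence can be sketched as below. This shows only the standard combination rule; how the paper applies it to contextual classification is not described in the abstract, and the class labels and mass values are invented for illustration.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of class labels to a mass in [0, 1],
    with the masses summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical evidence about one pixel from two spectral bands.
WATER, FOREST, URBAN = "water", "forest", "urban"
theta = frozenset({WATER, FOREST, URBAN})  # frame of discernment ("don't know")

band_a = {frozenset({WATER}): 0.6, frozenset({WATER, FOREST}): 0.3, theta: 0.1}
band_b = {frozenset({FOREST}): 0.5, frozenset({WATER}): 0.3, theta: 0.2}

fused = combine(band_a, band_b)
best = max(fused, key=fused.get)
```

Unlike a Bayesian update, mass can sit on sets of classes such as {water, forest}, which lets a source express partial ignorance instead of forcing a probability onto each single class.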