Microsoft releases LightGBM


Microsoft has been steadily expanding its tools for predictive analytics and machine learning. One tool it released recently is LightGBM. From the GitHub site: "LightGBM is a fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks." Microsoft is clearly trying to capitalize on the machine learning and big data movement. I hope they continue to develop tools such as LightGBM and R with SQL Server.
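To make "gradient boosting based on decision tree algorithms" concrete, here is a minimal sketch in plain Python. It is not LightGBM's implementation (LightGBM uses histogram-based, leaf-wise tree growth and many other optimizations). It only illustrates the core loop for squared loss: start from the mean, then repeatedly fit a one-split tree (a stump) to the current residuals and add a shrunken copy of it to the model. The function names and the single-feature setup are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree on a single feature (least squares)."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so both sides of the split are always non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return t, lmean, rmean


def predict_stump(stump, x):
    t, lmean, rmean = stump
    return lmean if x <= t else rmean


def fit_gbm(xs, ys, n_rounds=50, lr=0.1):
    """Gradient boosting for squared loss with stumps as base learners."""
    base = sum(ys) / len(ys)          # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, lr


def predict_gbm(model, x):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)


# Toy step-function data (made up for this example).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, 1, 5, 5, 5, 5]
model = fit_gbm(xs, ys)
print(predict_gbm(model, 2), predict_gbm(model, 7))
```

The learning rate `lr` shrinks each stump's contribution, so many weak trees are combined rather than one tree fitting the data in a single step.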

Clashes break out in Yemen's key port city of Hodeida after cease-fire

The Japan Times

SANAA - Fighting erupted in Yemen's key port city of Hodeida on Sunday, the first significant clashes since warring sides agreed to a U.N.-brokered cease-fire deal in December, security officials and eyewitnesses said. Fires burned on the main front lines in the city's east and south, while exchanges of artillery fire shook buildings in combat that broke out overnight, they said. Both the Shiite Houthi rebels who hold the city and the government-backed troops who oppose them have been seen erecting barricades and digging defensive trenches. "All night long, we hear the loud roar of machine guns and artillery, which had been silent for the past two weeks," said resident Ahmed Nasser, adding that he was worried for relatives who had returned to the July 7 neighborhood on the city's eastern front. The officials spoke on condition of anonymity as they weren't authorized to brief journalists, while witnesses did so for fear of their safety.

Machine Learning for Retail Price Recommendation with Python


It is clear that the average price is higher when the buyer pays shipping. The average price also appears to vary across item condition IDs. After the exploratory data analysis above, I decided to use all the features to build our model. LightGBM, part of Microsoft's DMTK project, is a gradient boosting framework that uses tree-based learning algorithms, so we are going to give it a try.
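The kind of comparison described above can be sketched as a grouped average. The rows below are made-up toy values, not the article's data, and the column names `shipping` (assumed here to be 0 when the buyer pays and 1 when the seller pays), `item_condition_id`, and `price` are assumptions about how the data set is laid out.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy rows, for illustration only.
rows = [
    {"shipping": 0, "item_condition_id": 1, "price": 30.0},
    {"shipping": 0, "item_condition_id": 2, "price": 20.0},
    {"shipping": 0, "item_condition_id": 3, "price": 40.0},
    {"shipping": 1, "item_condition_id": 1, "price": 10.0},
    {"shipping": 1, "item_condition_id": 2, "price": 20.0},
]


def mean_price_by(rows, key):
    """Return the average price for each distinct value of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["price"])
    return {value: mean(prices) for value, prices in groups.items()}


by_shipping = mean_price_by(rows, "shipping")
by_condition = mean_price_by(rows, "item_condition_id")
print(by_shipping)
print(by_condition)
```

The same helper works for any categorical column, which makes it easy to check each feature against price before deciding what to feed the model.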

Predicting movie revenue with AdaBoost, XGBoost and LightGBM


Marvel's Avengers: Endgame recently dethroned Avatar as the highest-grossing movie in history, and while there was never any doubt that this movie would be very successful, I want to understand what makes any given movie a success. I am using data from The Movie Database, provided through Kaggle. The data set is split into a train set containing 3,000 movies and a test set containing 4,398. The train set also contains the target variable, revenue. Prequels and sequels: perhaps unsurprisingly, movies that are prequels or sequels to related movies earn, on average, more revenue than standalone movies.

Tell Me Something New: A New Framework for Asynchronous Parallel Learning

We present a novel approach for parallel computation in the context of machine learning that we call "Tell Me Something New" (TMSN). This approach involves a set of independent workers that use broadcast to update each other when they observe "something new". TMSN does not require synchronization or a head node and is highly resilient against failing machines or laggards. We demonstrate the utility of TMSN by applying it to learning boosted trees. We show that our implementation is 10 times faster than XGBoost and LightGBM on the splice-site prediction problem.
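The broadcast-on-improvement idea described in the abstract can be illustrated with a toy, single-process simulation. This is a sketch under strong assumptions, not the authors' implementation: the "model" is a single number, the loss is a made-up quadratic, and the `margin` threshold that decides what counts as "something new" is a hypothetical parameter. What it does show is the communication pattern: workers run in arbitrary order with no head node and no barrier, and a worker sends a message only when it finds a meaningfully better model.

```python
import random


def loss(w):
    # Toy objective standing in for a model's training loss (assumption).
    return (w - 3.0) ** 2


def simulate(n_workers=4, steps=200, margin=1e-3, seed=0):
    rng = random.Random(seed)
    models = [0.0] * n_workers                  # each worker's current model
    inboxes = [[] for _ in range(n_workers)]    # messages waiting per worker
    broadcasts = 0
    for _ in range(steps):
        # Workers are scheduled in arbitrary order; nobody waits for anyone.
        i = rng.randrange(n_workers)

        # 1. Read pending messages and adopt any model better than our own.
        for received in inboxes[i]:
            if loss(received) < loss(models[i]):
                models[i] = received
        inboxes[i].clear()

        # 2. Do a bit of local work: try a random perturbation.
        candidate = models[i] + rng.uniform(-0.5, 0.5)

        # 3. Broadcast only when the improvement exceeds the margin,
        #    that is, when there is "something new" to tell the others.
        if loss(candidate) < loss(models[i]) - margin:
            models[i] = candidate
            for j in range(n_workers):
                if j != i:
                    inboxes[j].append(candidate)
            broadcasts += 1
    return models, broadcasts


models, broadcasts = simulate()
print(min(loss(m) for m in models), broadcasts)
```

Because each worker only ever reads its own inbox, removing a worker from the schedule would slow progress but not block the others, which is the kind of resilience to failures and laggards the abstract describes.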