feature-engine
GitHub - feature-engine/feature_engine: Feature engineering package with sklearn like functionality
Feature-engine is a Python library with multiple transformers to engineer and select features for use in machine learning models. Feature-engine's transformers follow the Scikit-learn API: fit() learns the transformation parameters from the data, and transform() then applies them to the data. Find more examples in our Jupyter Notebook Gallery or in the documentation. Feature-engine documentation is built using Sphinx and is hosted on Read the Docs. To build the documentation, first install its dependencies from the root directory: pip install -r docs/requirements.txt.
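The fit() and transform() pattern described above can be sketched in plain Python. This is a minimal illustration only: the MedianImputer class, its variables argument, and its medians_ attribute are hypothetical names written for this example and are not Feature-engine's own API.

```python
from statistics import median


class MedianImputer:
    """Hypothetical transformer for illustration; not part of Feature-engine."""

    def __init__(self, variables):
        # Names of the variables (dictionary keys) to impute.
        self.variables = variables

    def fit(self, rows):
        # Learn one median per variable from the non-missing training values.
        self.medians_ = {
            var: median(row[var] for row in rows if row[var] is not None)
            for var in self.variables
        }
        return self

    def transform(self, rows):
        # Fill missing values with the learned medians.
        # New dictionaries are returned; the input rows are left unmodified.
        return [
            {
                key: self.medians_[key]
                if key in self.medians_ and value is None
                else value
                for key, value in row.items()
            }
            for row in rows
        ]


train = [
    {"age": 20, "fare": 7.5},
    {"age": None, "fare": 8.0},
    {"age": 40, "fare": None},
]
test = [{"age": None, "fare": None}]

# Parameters are learned from the training rows only, then reused on new data.
imputer = MedianImputer(variables=["age", "fare"]).fit(train)
test_imputed = imputer.transform(test)
print(imputer.medians_)
print(test_imputed)
```

Keeping the learned parameters on the object is what lets the same transformation be applied consistently to training data and to data seen later.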
The Challenges of Creating Features for Machine Learning - KDnuggets
When I decided to leave academia and re-train as a data scientist, I quickly found out that I had to learn R or Python, or well… both. That's probably the first time I heard about Python. I never imagined that 3 years later I would be maintaining an increasingly popular open source Python library for feature engineering: Feature-engine. In this article, I want to discuss the challenges of feature engineering and selection both from the technical and operational side, and then lay out how Feature-engine, an open source Python library, can help us mitigate those challenges. I will also highlight the advantages and shortcomings of Feature-engine in the context of other Python libraries. Feature-engine is an open-source Python library for feature engineering and feature selection. It works like Scikit-learn, with methods fit() and transform() that learn parameters from the data and then use those parameters to transform the data.
Alternative Feature Selection Methods in Machine Learning - KDnuggets
You've probably done your online searches on "Feature Selection", and you've probably found tons of articles describing the three umbrella terms that group selection methodologies, i.e., "Filter Methods", "Wrapper Methods" and "Embedded Methods". Under the "Filter Methods", we find statistical tests that select features based on their distributions. These methods are computationally very fast, but in practice they do not yield good features for our models. In addition, when we have big datasets, p-values for statistical tests tend to be very small, flagging as significant tiny differences in distributions that may not really matter. The "Wrapper Methods" category includes search algorithms that evaluate feature combinations using a step forward, step backward, or exhaustive search.
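The remark about p-values on big datasets can be illustrated with a short calculation. The sketch below assumes a two-sided, two-sample z-test with a known common standard deviation; the function name and the numbers (a mean difference of 0.01 and a standard deviation of 1) are chosen for illustration and do not come from the article.

```python
import math


def two_sample_z_pvalue(mean_diff, sigma, n):
    """Two-sided p-value for a difference in means between two groups.

    Assumes both groups have n observations and a known common
    standard deviation sigma (illustrative assumptions).
    """
    # Standard error of the difference between two sample means.
    standard_error = sigma * math.sqrt(2.0 / n)
    z = mean_diff / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same tiny difference in means, tested at increasing sample sizes.
p_values = {
    n: two_sample_z_pvalue(mean_diff=0.01, sigma=1.0, n=n)
    for n in (100, 10_000, 1_000_000)
}

for n, p in p_values.items():
    print(f"n={n:>9,}  p={p:.3g}")
```

The difference between the groups never changes, yet the p-value shrinks as the sample grows, so a significant result on a very large dataset says little about whether the feature is useful.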
Feature engine python package for feature engineering
Feature engineering is the process of using domain knowledge of the data to transform existing features or to create new variables from existing ones, for use in machine learning. Using feature engineering, we can pre-process raw data and make it suitable for use in machine learning algorithms. Feature-engine is a Python library with multiple transformers to engineer features for use in machine learning models. Feature-engine's transformers follow Scikit-learn functionality, with fit() and transform() methods that first learn the transforming parameters from the data and then transform the data. I plan to contribute to this package. In August, at Data Science Central, I also plan to create a mini e-book on feature engineering which will use this page (co-authored with Aysa Tajeri).