 feature creation


Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering

arXiv.org Artificial Intelligence

Data-driven modeling is an increasingly popular approach in energy systems modeling. It applies machine learning methods such as linear regression, neural networks, or decision-tree-based methods. While these methods do not require domain knowledge, they are sensitive to data quality, so improving the quality of a dataset benefits the resulting models. Data quality can be improved through preprocessing. One type of preprocessing is feature engineering, which focuses on evaluating and improving the quality of individual features in the dataset. Feature engineering includes feature creation, feature expansion, and feature selection. This work presents a Python framework that implements methods for feature creation, expansion, and selection, along with methods for transforming and filtering data. The framework is built on the Python library scikit-learn. It is demonstrated in a case study on energy demand prediction, in which a data-driven model is created using selected feature engineering methods. The results show that the engineered features improve prediction accuracy.
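As a rough illustration of what feature creation can look like for energy demand prediction, the sketch below derives cyclic time-of-day features, a weekend flag, and lagged demand values from a timestamped series. It uses only the Python standard library and is not the framework's API: the function name `create_features`, the feature names, and the sample values are all assumptions made for this example.

```python
import math
from datetime import datetime, timedelta


def create_features(timestamps, demand, lags=(1, 2)):
    """Build one feature row per sample that has all requested lags available.

    Hypothetical helper for illustration only; not part of the framework
    described in the abstract.
    """
    max_lag = max(lags)
    rows = []
    for i in range(max_lag, len(demand)):
        # Encode the hour on a circle so that 23:00 and 00:00 end up close together.
        angle = 2 * math.pi * timestamps[i].hour / 24
        row = {
            "hour_sin": math.sin(angle),
            "hour_cos": math.cos(angle),
            # weekday() returns 5 for Saturday and 6 for Sunday.
            "is_weekend": int(timestamps[i].weekday() >= 5),
        }
        # Lagged demand values serve as additional created features.
        for lag in lags:
            row[f"demand_lag_{lag}"] = demand[i - lag]
        row["target"] = demand[i]
        rows.append(row)
    return rows


# Made-up hourly demand values, starting on Monday 2024-01-01 at 00:00.
start = datetime(2024, 1, 1, 0)
timestamps = [start + timedelta(hours=h) for h in range(6)]
demand = [10.0, 9.5, 9.0, 9.2, 10.1, 12.3]

rows = create_features(timestamps, demand)
for row in rows:
    print(row)
```

The first two samples are dropped because they lack a full set of lags. In practice the resulting rows would be passed to a regression model, and the same logic could be wrapped in a scikit-learn transformer.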


6 Important Steps to build a Machine Learning System

#artificialintelligence

Creating a great machine learning system is an art. There are a lot of things to consider while building a great machine learning system. But often it happens that we as data scientists only worry about certain parts of the project. Most of the time that happens to be modeling, but in reality, the success or failure of a Machine Learning project depends on a lot of other factors. It is essential to understand what happens before training a model and after training the model and deploying it in production.


For Better or Worse Analytics and Data Science are Converging

#artificialintelligence

Summary: Analytic Platforms are rapidly being augmented with features previously reserved for data scientists. They are presented as easy to use but require substantial data literacy and advanced DS skills for the most complex. Business users and analysts can pursue more complex problems on their own, but need good oversight. Data Science Platform developers and Analytics Platform developers have been circling each other for years. The DSP folks see the analyst market presenting a much larger customer base.


Learning Feature Selection for Building and Improving your Machine Learning Model 7wData

#artificialintelligence

Usually, the task of model building gets reduced to trying all sorts of fancy algorithms, from standard machine learning algorithms to deep learning models. But if we feed garbage to our machine learning algorithm, garbage is going to come out of it (GIGO). In model building, feature selection/creation is the step where the most time should be spent. Feature selection is somewhat easier than feature creation. Feature selection is a well-researched area, and most of the data science algorithms offered under Python or R have automated this process. Feature creation is a bigger dragon to slay.
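To make the idea of automated feature selection concrete, here is a minimal sketch of one simple filter approach: keep only the features whose absolute Pearson correlation with the target reaches a threshold. This is just one of many selection techniques and is not taken from the article. The function names, the threshold of 0.5, and the sample data are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.5):
    """Return the names of features with |correlation| >= threshold, strongest first."""
    scores = {name: abs(pearson(column, target)) for name, column in features.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


# Made-up data: one informative feature, one noisy one, one constant one.
features = {
    "temperature": [1, 2, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "constant": [7, 7, 7, 7, 7],
}
target = [2, 4, 6, 8, 10.5]

selected = select_features(features, target)
print(selected)
```

A correlation filter only captures linear relationships with the target and ignores interactions between features, so it is best treated as a first pass rather than a complete selection strategy.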


Removing Obstacles to Production Machine Learning with OpnIDS and Dragonfly MLE

#artificialintelligence

Machine learning promises to address many of the challenges faced by network security analysts; however, there are still many obstacles that prevent widespread adoption of machine learning within security operations centers (SOC). The first major challenge is one of trust, as discussed in our previous post. The second major set of challenges is around the complexity of deploying machine learning in a production environment. Once a machine-learning model has been trained and validated in the lab, there is often an equal if not larger effort required to deploy that model in a repeatable production environment. Transitioning: Data science typically operates using an iterative batch process.


Machine Learning Engineer/siliconarmada.com

#artificialintelligence

The work:
- Help us build the next personalization platform for one of the largest populations in the world!
- Work with massive data from multiple applications and 125 Million customers
- Understand the theory and application of theory for common classification, clustering, NLP, and collaborative filtering
- Have experience or aptitude in graph databases or graph analytics
- Care about designing the full machine learning pipeline
- Feature creation, feature creation, feature creation
- Design and implement A/B Testing and other validation processes

The skills:
- You have demonstrable software engineering experience in Java, Go, or Scala.

Should you require accommodations during the recruitment and selection process, please let us know. Paytm Labs is an equal opportunity employer. We thank all applicants, however, only those selected for an interview will be contacted.



BECCA: Reintegrating AI for Natural World Interaction

AAAI Conferences

Natural world interaction (NWI), the pursuit of arbitrary goals in unstructured physical environments, is an excellent motivating problem for the reintegration of artificial intelligence. It is the problem set that humans struggle to solve. At a minimum it entails perception, learning, planning, and control, and can also involve language and social behavior. An agent's fitness in NWI is achieved by being able to perform a wide variety of tasks, rather than being able to excel at one. In an attempt to address NWI, a brain-emulating cognition and control architecture (BECCA) was developed. It uses a combination of feature creation and model-based reinforcement learning to capture structure in the environment in order to maximize reward. BECCA avoids making common assumptions about its world, such as stationarity, determinism, and the Markov assumption. BECCA has been demonstrated performing a set of tasks which is non-trivially broad, including a vision-based robotics task. Current development activity is focused on applying BECCA to the problem of general Search and Retrieve, a representative natural world interaction task.