Learn the best practices for building data pipelines
According to Oracle, feature extraction is an attribute reduction process that yields a much smaller and richer set of attributes. Depending on the requirements, identifying and extracting an informative, compact data set for an ML model may involve structured data such as numbers and dates, or less structured data such as categorical fields and raw text. If the data volume is large, feature extraction can be handled as a separate step, with the generated features written to the storage layer in a format ready for direct consumption by the ML training process in the next phase. Feature extraction serves a wide range of applications, such as a simple ETL process, a model prediction pipeline, or retraining a model on new data to improve its accuracy.
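To make the idea concrete, here is a minimal sketch of a separate feature extraction step, using only the Python standard library. The record fields (`amount`, `created`, `channel`, `note`), the `CHANNELS` vocabulary, and the helper names are hypothetical and chosen for illustration. An in-memory buffer stands in for the storage layer. This is one possible shape for such a step, not a reference implementation.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw records; the field names are illustrative assumptions.
RAW_RECORDS = [
    {"amount": "120.50", "created": "2021-06-28", "channel": "web",
     "note": "Late delivery, refund requested"},
    {"amount": "15.00", "created": "2021-06-26", "channel": "store",
     "note": "OK"},
]

# Assumed closed vocabulary used to one-hot encode the categorical field.
CHANNELS = ["web", "store"]


def extract_features(record):
    """Turn one raw record into a flat dict of numeric features."""
    created = datetime.strptime(record["created"], "%Y-%m-%d")
    features = {
        "amount": float(record["amount"]),               # number
        "day_of_week": created.weekday(),                # date (Monday == 0)
        "note_word_count": len(record["note"].split()),  # raw text
    }
    for channel in CHANNELS:                             # categorical field
        features[f"channel_{channel}"] = int(record["channel"] == channel)
    return features


def write_features(rows, handle):
    """Write feature rows as CSV so a training job can read them directly."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


feature_rows = [extract_features(r) for r in RAW_RECORDS]
buffer = io.StringIO()  # stand-in for the storage layer
write_features(feature_rows, buffer)
stored_csv = buffer.getvalue()
```

Keeping extraction and storage in separate functions means the same `extract_features` logic could be reused at prediction time, while only the writer changes if the storage layer does.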
Jun-28-2021, 01:06:04 GMT