Computers have become adept at extracting patterns from very large collections of data. For example, shopping transactions can reveal consumers' preferences and message traffic on social networks can reveal political trends.
For example, for personalized recommendations, we have been working with learning to rank methods that learn individual rankings over item sets. Figure 1: Typical data science workflow, starting with raw data that is turned into features and fed into learning algorithms, resulting in a model that is applied on future data. This means that this pipeline is iterated and improved many times, trying out different features, different forms of preprocessing, different learning methods, or maybe even going back to the source and trying to add more data sources. Probably the main difference between production systems and data science systems is that production systems are real-time systems that are continuously running.