The Large Hadron Collider at CERN is one of the world's largest scientific instruments. It captures 5 trillion bits of data every second, and the Geneva-based lab employs a dedicated group of experts to manage the flow. In contrast, the instrument shown here – known as a time-stretch quantitative phase imaging microscope – fits on a bench top, and is managed by a team of one. However, it is also capable of capturing an immense amount of data: 0.8 trillion bits per second. These two examples illustrate just how ubiquitous "big data" has become in physics.
I am a data scientist, big data engineer, and full stack software engineer. For my masters thesis I worked on brain-computer interfaces using near-infrared spectroscopy. These assist non-verbal and non-mobile persons communicate with their family and caregivers. I have worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. I've created new big data pipelines using Hadoop/Pig/MapReduce. I've created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.
As the amount of data continues to grow at an almost incomprehensible rate, being able to understand and process data is becoming a key differentiator for competitive organizations. Machine Learning applications are everywhere, from self-driving cars to spam detection, document search, and trading strategies, to speech recognition. This makes machine learning well-suited to the present-day era of Big Data and Data Science. The main challenge is how to transform data into actionable knowledge. In this course, you'll be introduced to the Natural Processing Language and Recommendation Systems, which help you run multiple algorithms simultaneously.
Elections are a vital part of democracy allowing people to vote for the candidate they think can best lead the country. A candidate's campaign aims to demonstrate to the public why they think they are the best choice. However, in this age of constant media coverage and digital communications, the candidate is scrutinized at every step. A single misquote or negative news about a candidate can be the difference between him winning or losing the election. It becomes crucial to have a public relations manager who can guide and direct the candidate's campaign by prioritizing specific campaign activities. One critical aspect of the PR manager's work is to understand the public perception of their candidate and improve public sentiment about the candidate.
When we are in the outdoors, many of us often feel the need for a camera- that is intelligent enough to follow us, adjust to the terrain heights and visually navigate through the obstacles, while capturing panoramic videos. Here, I am talking about autonomous self-flying drones, very similar to cars on auto pilot. The difference is that we are starting to see proliferation of artificial intelligence into affordable, everyday use cases, compared to relatively expensive cars. This helps them distinguish between objects and get better with more data. Recently, Roni Fontaine at Hortonworks published a blog titled "How Apache Hadoop 3 Adds Value Over Apache Hadoop 2", capturing the high-level themes.