Data Science: Overviews


Understanding the Potential of Artificial Intelligence

#artificialintelligence

In 2008, Daniel Hulme started Satalia, a company that uses data science, machine learning, and optimization (making the best use of resources) to build customized platforms that solve tough logistics problems involving products, services, and people. Lately, Hulme has spent a good portion of his time explaining the ins and outs of artificial intelligence to other CEOs. He sees a big information gap at the top of most companies -- yet this is where technology investment decisions are made. Misunderstanding AI, Hulme believes, can mean both overestimating its value and underestimating its impact. Satalia's work is a leading example of what AI is currently good at. Not coincidentally, it is also the commercialization of Hulme's research at University College London (UCL), where he is the director of the business analytics master's degree program. Satalia's clients are household names in the U.K.; they include Tesco, DFS, and the British Broadcasting Corporation. The increasingly competitive market for AI expertise is both a blessing and a curse for Satalia.


An Overview of Business Problems and Data Science Solutions -- Part 2

#artificialintelligence

There is an important distinction in data mining: the difference between, first, mining the data to find patterns and build models and, second, using the results of data mining. Data mining results also inform the data mining process itself. The Cross-Industry Standard Process for Data Mining, known as CRISP-DM, is an open standard process model that describes common approaches used by data mining experts. It is the most widely used analytics model and breaks the process of data mining into six major phases.
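As a rough illustration, the six CRISP-DM phases can be written down as an ordered enumeration. The phase names below come from the CRISP-DM standard rather than from the text above, and the helper `next_phase` is a hypothetical name. Its wrap-around from deployment back to business understanding is a simplification of the cyclical nature of the process, which in practice also allows moving back and forth between phases.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, in their usual order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(phase: Phase) -> Phase:
    """Return the phase that usually follows `phase`.

    Assumption: the cycle is modeled as a simple loop, so deployment
    leads back to business understanding for the next iteration.
    """
    phases = list(Phase)
    return phases[(phases.index(phase) + 1) % len(phases)]


if __name__ == "__main__":
    # Walk once around the cycle, starting from the first phase.
    current = Phase.BUSINESS_UNDERSTANDING
    for _ in range(len(Phase)):
        print(current.name)
        current = next_phase(current)
```

Modeling the phases as a loop reflects the point made above that the results of data mining feed back into the process itself.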


Quantum Machine Learning: A look at myths, realities, and future projections

#artificialintelligence

Despite recent advances and press regarding the field, quantum computing is still veiled in mystery and myth, even within the field of data science and technology. Even those within the field of quantum computing and quantum machine learning are still learning the potential for progress and the stark limitations of current systems. However, quantum computing has arrived, if only in its infancy, and many major companies are pouring money into related R&D efforts. D-Wave's system has been commercially available for a couple of years already (albeit at a price tag of $10 million), and other systems have been opened for research purposes and commercial partnerships with quantum machine learning companies. Quantum computing hardware can theoretically take on several different forms, each of which is suited to a different type of machine learning problem.


DataCamp's Data Science And Machine Learning Programs: A Review

#artificialintelligence

One of my favorite places to learn data science is an under-the-radar educational website, DataCamp. DataCamp doesn't get nearly the attention that some of the larger, better-funded online coding schools get, but I often find myself on one of their tutorials whenever I'm learning something new related to statistics or machine learning. Over the past few months, I've dedicated at least a few hours a week to learning the underpinnings of automation and, where I find something interesting, to blogging about my experience. Unlike almost every other school or tutorial I've encountered, DataCamp has a delightfully distinct and powerful approach to education: every single piece of instruction is paired with a simple example and interactive tutorial. There are no long lectures; there are no complicated diagrams.


Top September Stories: Essential Math for Data Science: Why and How; Machine Learning Cheat Sheets

#artificialintelligence

Here are the most popular posts on KDnuggets in September, based on the number of unique page views (UPV) and social share counts from Facebook, Twitter, and AddThis.

Most Shareable (Viral) Blogs: Among the top blogs, here are the 5 blogs with the highest ratio of shares to unique views, which suggests that people who read them really liked them.

You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
A Winning Game Plan For Building Your Data Science Team, by William Schmarzo
What on earth is data science?, by Cassie Kozyrkov
Everything You Need to Know About AutoML and Neural Architecture Search, by George Seif
The Data Science of "Someone Like You" or Sentiment Analysis of Adele's Songs, by Preetish Panda

How many data scientists are there and is there a shortage?, by Gregory Piatetsky
Neural Networks and Deep Learning: A Textbook, by Charu Aggarwal
5 Resources to Inspire Your Next Data Science Project, by Conor Dewey
Hadoop for Beginners, by Aafreen Dabhoiwala
6 Steps To Write Any Machine Learning Algorithm From Scratch: Perceptron Case Study, by John Sullivan
Deep Learning for NLP: An Overview of Recent Trends, by Elvis Saravia (*)
Ultimate Guide to Getting Started with TensorFlow, by Brian Zhang (*)
How many data scientists are there and is there a shortage?, by Gregory Piatetsky
Essential Math for Data Science: 'Why' and 'How', by Tirthajyoti Sarkar
Journey to Machine Learning - 100 Days of ML Code, by Avik Jain
You Aren't So Smart: Cognitive Biases are Making Sure of It, by Matthew Mayo
Neural Networks and Deep Learning: A Textbook, by Charu Aggarwal (*)


Machine learning and AI – ensuring fairness in smart cities

#artificialintelligence

Digital technologies and AI offer a new wave of opportunities to turn data into actionable insights – creating a balance between social, environmental, and economic opportunities. In 2018, it's safe to say that the Internet, the World Wide Web, and the myriad of technologies derived from their development are all here to stay. With the ceaseless amalgamation of these various innovations, engineers are creating a cyber-physical world where pervasively interconnected objects, things, and processes can potentially unlock a breadth of unprecedented opportunities. However, I should point out that encapsulating the entire medley of possibilities afforded by these technologies is a considerable endeavour requiring a far longer and more comprehensive overview – perhaps in the form of a book, or three – than this article can offer in isolation. More specifically, I'll be focusing on the potential for us to optimally – and transparently – manage and operate city-wide infrastructure.


Towards Differentially Private Truth Discovery for Crowd Sensing Systems

arXiv.org Artificial Intelligence

Crowd sensing is becoming increasingly popular due to the ubiquitous usage of mobile devices. However, the quality of such human-generated sensory data varies significantly among different users. To better utilize sensory data, the problem of truth discovery, whose goal is to estimate user quality and infer reliable aggregated results through quality-aware data aggregation, has emerged as a hot topic. Although the existing truth discovery approaches can provide reliable aggregated results, they fail to protect the private information of individual users. Moreover, crowd sensing systems typically involve a large number of participants, making solutions based on encryption or secure multi-party computation difficult to deploy. To address these challenges, in this paper, we propose an efficient privacy-preserving truth discovery mechanism with theoretical guarantees of both utility and privacy. The key idea of the proposed mechanism is to perturb data from each user independently and then conduct weighted aggregation among users' perturbed data. The proposed approach is able to assign user weights based on information quality, and thus the aggregated results will not deviate much from the true results even when large noise is added. We adapt the definition of local differential privacy to this privacy-preserving task and demonstrate that the proposed mechanism can satisfy local differential privacy while preserving high aggregation accuracy. We formally quantify the trade-off between utility and privacy and further verify the claim by experiments on both synthetic data and a real-world crowd sensing system.
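The "perturb locally, then aggregate with quality-based weights" idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's mechanism: the Laplace noise, the log-ratio weight update (in the style of generic iterative truth discovery), and the names `perturb` and `truth_discovery` are all assumptions, and no differential privacy guarantee is claimed for these particular parameters.

```python
import math
import random


def perturb(values, scale, rng):
    """Add zero-mean Laplace noise to each reading, locally on the user's side.

    Assumption: Laplace noise is used purely for illustration. The difference
    of two exponential samples with mean `scale` is Laplace(0, scale).
    """
    return [
        v + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for v in values
    ]


def truth_discovery(reports, iterations=20):
    """Iteratively estimate truths and user weights.

    reports[u][j] is user u's (possibly perturbed) reading for object j.
    Weights follow a generic log-ratio rule: users whose reports lie far
    from the current estimates get smaller weights.
    """
    if len(reports) < 2:
        raise ValueError("need at least two users")
    n_objects = len(reports[0])
    weights = [1.0] * len(reports)
    truths = []
    for _ in range(iterations):
        total_weight = sum(weights)
        # Weighted aggregation of all users' reports, per object.
        truths = [
            sum(w * r[j] for w, r in zip(weights, reports)) / total_weight
            for j in range(n_objects)
        ]
        # Squared distance of each user's reports from the current truths.
        errors = [
            sum((r[j] - truths[j]) ** 2 for j in range(n_objects)) + 1e-12
            for r in reports
        ]
        total_error = sum(errors)
        weights = [math.log(total_error / e) for e in errors]
    return truths, weights


if __name__ == "__main__":
    rng = random.Random(0)
    true_values = [10.0, 20.0, 30.0]
    # Hypothetical setup: four careful users and one very noisy user.
    scales = [0.2, 0.2, 0.2, 0.2, 5.0]
    reports = [perturb(true_values, s, rng) for s in scales]
    estimates, user_weights = truth_discovery(reports)
    print("estimates:", estimates)
    print("weights:  ", user_weights)
```

Because weights are derived from each user's distance to the aggregate, a user whose reports carry heavy noise contributes little to the final estimate, which is the intuition behind the claim that results stay close to the truth even under large perturbation.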


Big data in GIS environment - Geospatial World

#artificialintelligence

GIS is a virtual world, one represented by points, polygons, lines, and graphs. Processing these datasets has been a challenge ever since GIS was established as a field. Processing huge volumes of data has been a long-standing problem not only in traditional information technology (IT) sectors but also in the geospatial domain. However, recent developments in both hardware and software infrastructure have enabled the processing of huge data sets. This has given a big push and a new direction to industries that were hampered by slow data processing capabilities.


Contextual Bandits with Cross-learning

arXiv.org Machine Learning

In the classical contextual bandits problem, in each round $t$, a learner observes some context $c$, chooses some action $a$ to perform, and receives some reward $r_{a,t}(c)$. We consider the variant of this problem where in addition to receiving the reward $r_{a,t}(c)$, the learner also learns the values of $r_{a,t}(c')$ for all other contexts $c'$; i.e., the rewards that would have been achieved by performing that action under different contexts. This variant arises in several strategic settings, such as learning how to bid in non-truthful repeated auctions (in this setting the context is the decision maker's private valuation for each auction). We call this problem the contextual bandits problem with cross-learning. The best algorithms for the classical contextual bandits problem achieve $\tilde{O}(\sqrt{CKT})$ regret against all stationary policies, where $C$ is the number of contexts, $K$ the number of actions, and $T$ the number of rounds. We demonstrate algorithms for the contextual bandits problem with cross-learning that remove the dependence on $C$ and achieve regret $O(\sqrt{KT})$ (when contexts are stochastic with known distribution), $\tilde{O}(K^{1/3}T^{2/3})$ (when contexts are stochastic with unknown distribution), and $\tilde{O}(\sqrt{KT})$ (when contexts are adversarial but rewards are stochastic).
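The cross-learning feedback structure can be illustrated with the auction example mentioned above. The sketch below assumes a first-price auction with a small discrete set of valuations, which is one possible instance of a non-truthful repeated auction; it shows only how one round's outcome yields rewards for every context, not any of the regret-minimizing algorithms from the paper. The names `cross_learned_rewards` and `CrossLearningEstimates` are hypothetical.

```python
from collections import defaultdict


def cross_learned_rewards(bid, won, contexts):
    """Rewards for playing `bid` under every possible valuation (context).

    Assumption: first-price auction, so the reward is (valuation - bid)
    if the bid won and 0 otherwise. Whether the bid wins does not depend
    on the bidder's own valuation, so one outcome reveals r(c') for all c'.
    """
    return {c: (c - bid) if won else 0.0 for c in contexts}


class CrossLearningEstimates:
    """Running mean reward per (context, action), updated for all contexts."""

    def __init__(self, contexts):
        self.contexts = list(contexts)
        self.pulls = defaultdict(int)      # action -> number of rounds played
        self.totals = defaultdict(float)   # (context, action) -> summed reward

    def update(self, bid, won):
        # A single round updates the estimate of this action for EVERY context,
        # which is why the dependence on the number of contexts can disappear.
        self.pulls[bid] += 1
        for c, r in cross_learned_rewards(bid, won, self.contexts).items():
            self.totals[(c, bid)] += r

    def mean_reward(self, context, bid):
        n = self.pulls[bid]
        return self.totals[(context, bid)] / n if n else 0.0


if __name__ == "__main__":
    est = CrossLearningEstimates(contexts=[0.25, 0.5, 1.0])
    est.update(bid=0.5, won=True)
    est.update(bid=0.5, won=False)
    for c in est.contexts:
        print(c, est.mean_reward(c, 0.5))
```

In the classical setting, each round would update only the single (context, action) pair that was actually observed, so estimates for rare contexts would improve slowly.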


Artificial Intelligence, Machine Learning and Big Data - A Comprehensive Report

#artificialintelligence

Artificial Intelligence and Machine Learning are among the hottest career fields in the industry right now. For instance, did you know that more than 50,000 positions related to Data and Analytics are currently vacant in India? We are excited to release a comprehensive report together with Great Learning on how AI, ML and Big Data are changing the world around us. Additionally, this report aims to provide an overview of the kind of career opportunities available in these fields right now, and the different roles we might see in the future. The aim behind creating this report is to give our Data Science community context on the changes happening at a macro level, and on how they can best prepare for them.