Mausam is an Associate Professor in the Computer Science department at IIT Delhi, and an affiliate faculty member at the University of Washington, Seattle. His research explores several threads in artificial intelligence, including scaling probabilistic planning algorithms, large-scale information extraction over the Web, and enabling complex computation over crowdsourced platforms. He received his PhD from the University of Washington in 2007 and a B.Tech. from IIT Delhi. ArnetMiner, a global citation aggregator, rated Mausam the 25th most influential scholar in AI for 2019. He was recently awarded AAAI Senior Member status for his long-term participation in AAAI and distinction in the field of artificial intelligence.
With the mass reach social media has these days, the power that comes with riding its wave is hard to deny. With thousands of posts and tweets, there is seemingly no end to the chatter. But it is important to know whether all that chatter is in favor of or against your agenda. Imagine launching a product that becomes the talk of the town. But is all that talk good or bad?
AI is already widely used to gauge consumer sentiment. Amazon mines huge amounts of data about search and online purchasing behavior to uncover insights about what customers buy and to predict what they might want to buy in the future. Netflix bases its movie recommendations on what you've watched in the past, for how long, and how favorably you reviewed it. In all of those cases, consumer sentiment is being gauged not by what consumers say they want but by their actual behavior. That's a crucial difference, and one that gives AI a significant advantage over traditional polls.
Sentiment Analysis can be defined as the process of analyzing text data and categorizing it as Positive, Negative, or Neutral in sentiment. It is used in many settings, such as social media monitoring, customer service, brand monitoring, and political campaigns. Analyzing customer feedback such as social media conversations, product reviews, and survey responses helps companies better understand their customers' emotions, which is increasingly essential to meeting their needs. It is almost impossible to manually sort thousands of social media conversations, customer reviews, and surveys, so we have to build an ML or DL model that analyzes the text data and performs the required operations. The problem I am trying to solve here is part of this Kaggle competition.
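To make the Positive/Negative/Neutral categorization concrete, here is a minimal, library-free sketch of a lexicon-based classifier. The word lists and function name are illustrative assumptions, not part of any real model; production systems use learned models or much larger lexicons.

```python
# Minimal lexicon-based sentiment scorer (illustrative sketch only):
# counts positive and negative words and maps the balance to a label.
POSITIVE = {"good", "great", "love", "excellent", "amazing"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "awful"}

def classify_sentiment(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(classify_sentiment("I love this product, it is great"))  # Positive
print(classify_sentiment("terrible battery, poor build"))      # Negative
print(classify_sentiment("the package arrived on Tuesday"))    # Neutral
```

Even this toy version shows the shape of the task: text in, one of three labels out.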
Being new to text analytics, I haven't gotten the hang of my typical ML workflow, given how long processes take to run in the commonly large feature space of text analytics. I would like to know what the typical strategy is for balancing effort/time across optimizing transformation decisions, feature down-selection, and model tuning. To get a sense of which of the decision points above I should tune further, I ran untuned RF, Logistic Regression, Naive Bayes, SGD, and KNN models (with cross-validation). No decision point was consistently "better" in the resulting F1 scores, and the differences were often noteworthy. As I have no bias toward a particular algorithm type (only the best F1 score), I'm stuck in a quandary: I have not successfully narrowed my decision space enough.
Natural Language Processing (NLP) is the area of research in Artificial Intelligence focused on processing text and speech data to create smart machines and derive insights. One of the most interesting NLP applications today is creating machines able to discuss complex topics with humans; IBM's Project Debater represents one of the most successful approaches in this area so far. Standard preprocessing techniques can be easily applied to different types of text using Python NLP libraries such as NLTK and spaCy. Additionally, in order to extract the syntax and structure of our text, we can make use of techniques such as Part-of-Speech (POS) Tagging and Shallow Parsing (Figure 1).
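A few of those preprocessing steps can be sketched without any external library: lowercasing, punctuation stripping, tokenization, and stopword removal. The stopword list below is a tiny illustrative subset; NLTK and spaCy ship much more complete versions of every step shown here, plus POS tagging and parsing.

```python
# Library-free sketch of common text preprocessing steps.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}

def preprocess(text: str) -> list[str]:
    text = text.lower()                    # normalize case
    text = re.sub(r"[^a-z\s]", " ", text)  # strip punctuation and digits
    tokens = text.split()                  # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("The movie was great, and the plot is amazing!"))
# ['movie', 'was', 'great', 'plot', 'amazing']
```

The output token list is what then feeds into vectorizers, taggers, or parsers downstream.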
Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that studies how machines understand human language. Its goal is to build systems that can make sense of text and perform tasks like translation, grammar checking, or topic classification. Companies increasingly use NLP-equipped tools to gain insights from data and to automate routine tasks. A sentiment analyzer, for instance, can help brands detect emotions in text, such as negative comments on social media. But what exactly is Natural Language Processing?
Millions of people have been ditching Facebook and switching to Mountain View, CA-based social media network MeWe, touted to be the ad-free future of social networking. Advised by Sir Tim Berners-Lee (the inventor of the World Wide Web), MeWe has surged to 9 million users worldwide since its inception in 2013, and has zero paid marketing ads. MeWe CEO Mark Weinstein said in his recent TEDx talk that although we check our phones 150 times per day, our phones are more dependent on us than we are on them. He says that we are participating in the "greatest socio-economic event in human history": 'surveillance capitalism'. The business model of Facebook and the other current social media giants is to track, analyse, and monetise our data.
Helen Dixon, head of Ireland's Data Protection Commission, in May submitted a draft decision to more than two dozen of the bloc's privacy regulators for review, as required under the law. Eleven regulators objected to the proposed ruling, sparking a lengthy dispute-resolution mechanism, she said. The contents of the draft decision haven't been disclosed. Twitter's European operations are based in Dublin. "It's a long process," Ms. Dixon said at The Wall Street Journal's virtual CIO Network conference.