By some estimates, 80% of an organization's data is unstructured content. This content includes web pages, call center transcripts, surveys, feedback forms, legal documents, forums, social media, and blog articles. To gain insight and boost performance, organizations must therefore analyze not just transactional information but also textual content. A powerful way to analyze this textual content is text mining, which typically applies machine learning techniques such as clustering, classification, association rules, and predictive modeling.
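To make the idea of classifying text concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any particular text mining tool works internally: it turns each document into word counts and assigns a new document the label of the most similar labeled example, using cosine similarity. The labels, the example sentences, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, labeled_examples):
    """Return the label of the labeled example most similar to `text`."""
    vector = Counter(tokenize(text))
    best_label, best_score = None, -1.0
    for label, example in labeled_examples:
        score = cosine_similarity(vector, Counter(tokenize(example)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled snippets, e.g. from call center transcripts.
LABELED = [
    ("billing", "I was charged twice on my invoice and need a refund"),
    ("shipping", "my package arrived late and its delivery box was damaged"),
]

label = classify("The invoice shows I was charged the wrong amount", LABELED)
print(label)
```

Real text mining pipelines usually add steps such as stop-word removal, stemming, and term weighting before a model is trained, but the flow is the same: text is converted into numeric features, and a learning technique is then applied to those features.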
Although reading books and watching lectures are great ways to learn analytics, the best way to learn is by doing. However, getting started with languages such as Python and R can be tricky for someone without a coding background. Not only do you need to understand the analytical procedures, but you also need to learn the nuances of a programming language, which adds to the list of things to master before you can begin. The best middle ground between knowledge acquisition (books, videos, etc.) and conducting advanced analytics in code (Python, R, etc.) is therefore open-source analytics software. Such software supports both learning and real analysis: documentation is built into the tool, and you can carry out relatively complex tasks with only mouse clicks.
The RapidMiner Inc. data science platform enables users to perform predictive analytics and other advanced data analytics, including machine learning, data mining, text analytics, business analytics, and visualization, with little or no coding required. RapidMiner integrates with many data source types, including Excel, Access, Oracle, IBM DB2, Microsoft SQL Server, Sybase, Ingres, MySQL, Postgres, IBM SPSS, dBASE, text files, and many other structured and unstructured data formats. Among the platform's products, RapidMiner Studio provides an intuitive GUI client that enables users to design code-free analysis processes. It helps users explore, blend, and cleanse data more easily, as well as build and validate models.
Recently, Tom (@neuralmarket) and I had the chance to work with Amanda Shiga (@AmandaShiga) from Nonlinear Digital to build a web analytics process using RapidMiner. Amanda has an ongoing pilot project that applies data mining techniques to clickstream and user behavior data collected from her client's website. The website has a number of value-weighted micro-conversions, such as signing up for a newsletter, downloading a whitepaper, or registering for an event. The focus of web analytics has shifted from simply attracting visitors to a website to turning those visitors into high-value customers. For online retailers, the ultimate goal is to see visitors convert to paying customers.
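As a rough illustration of what "value-weighted micro-conversions" can mean in practice, the sketch below sums a weight for each conversion event a visitor triggers. The event names, the weights, and the `score_visitors` function are hypothetical and are not taken from Amanda's project; in a real setting, the weights would reflect how much each action is worth to the business.

```python
from collections import defaultdict

# Hypothetical weights: how much each micro-conversion is worth.
CONVERSION_VALUES = {
    "newsletter_signup": 1.0,
    "whitepaper_download": 3.0,
    "event_registration": 5.0,
}


def score_visitors(events, values=CONVERSION_VALUES):
    """Sum the weights of the micro-conversions triggered by each visitor.

    `events` is an iterable of (visitor_id, event_name) pairs, such as rows
    from a clickstream log. Events without a weight (for example plain page
    views) contribute nothing to the score.
    """
    totals = defaultdict(float)
    for visitor_id, event_name in events:
        totals[visitor_id] += values.get(event_name, 0.0)
    return dict(totals)


# Hypothetical clickstream sample.
events = [
    ("v1", "page_view"),
    ("v1", "newsletter_signup"),
    ("v2", "whitepaper_download"),
    ("v2", "event_registration"),
    ("v1", "whitepaper_download"),
]

scores = score_visitors(events)
print(scores)
```

A per-visitor score like this could then serve as the target or as a feature when mining clickstream data for the behaviors that distinguish high-value visitors.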