Data Mining: Market Basket Analysis with Apriori Algorithm


Some of us go to the grocery store with a standard list, while some of us have a hard time sticking to our shopping list, no matter how determined we are. Whichever type of person you are, retailers are experts at creating temptations that inflate your budget. Remember the time you thought, "Ohh, I might need this as well"? Retailers boost their sales by relying on this one simple intuition: people who buy this will most likely want to buy that as well. People who buy bread have a higher chance of buying butter too, so an experienced assortment manager knows that a discount on bread pushes the sales of butter as well.
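To make the bread-and-butter intuition concrete, here is a minimal sketch of the three measures commonly used to score an association rule: support, confidence, and lift. The transactions are invented toy data, and the helper names (`count_containing`, `rule_metrics`) are hypothetical. This is not the Apriori algorithm itself, only the scoring applied to the rules such an algorithm would surface.

```python
# Toy baskets, invented purely for illustration (not real retail data).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count_containing(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)


def rule_metrics(antecedent, consequent, baskets):
    """Score the rule antecedent -> consequent.

    support    = share of baskets containing both sides of the rule
    confidence = share of antecedent baskets that also contain the consequent
    lift       = confidence divided by the consequent's overall share
    """
    n = len(baskets)
    n_antecedent = count_containing(antecedent, baskets)
    n_consequent = count_containing(consequent, baskets)
    n_both = count_containing(set(antecedent) | set(consequent), baskets)
    if n == 0 or n_antecedent == 0 or n_consequent == 0:
        raise ValueError("rule is undefined: one side never appears in the data")
    confidence = n_both / n_antecedent
    return {
        "support": n_both / n,
        "confidence": confidence,
        "lift": confidence / (n_consequent / n),
    }


metrics = rule_metrics({"bread"}, {"butter"}, transactions)
print(metrics)
```

A lift above 1 suggests that butter shows up alongside bread more often than its overall popularity would predict, which is the pattern the assortment manager is betting on. On five toy baskets the numbers are only illustrative.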

Data Mining with Rattle


Rattle and R deliver a very sophisticated data mining environment. Data Mining with Rattle is a unique course that teaches both the concepts of data mining and the hands-on use of a popular, contemporary data mining software tool, "Data Miner," also known as the 'Rattle' package in R. Rattle is a popular GUI-based software tool that 'fits on top of' R. The course focuses on life-cycle issues, processes, and tasks related to supporting a 'cradle-to-grave' data mining project. These include: data exploration and visualization; testing data for random variable family characteristics and distributional assumptions; transforming data by scale or by data type; performing cluster analyses; creating, analyzing, and interpreting association rules; and creating and evaluating predictive models that may use regression, generalized linear models (GLMs), decision trees, recursive partitioning, random forests, boosting, and/or support vector machines (SVMs). It is both a conceptual and a practical course: it teaches the concepts of data mining and provides ample demonstrations of data mining tasks carried out with the Rattle R package.


AAAI Conferences

Abstracts of the invited talks presented at the AAAI Fall Symposium on Discovery Informatics: AI Takes a Science-Centered View on Big Data. Talks include A Data Lifecycle Approach to Discovery Informatics, Generating Biomedical Hypotheses Using Semantic Web Technologies, Socially Intelligent Science, Representing and Reasoning with Experimental and Quasi-Experimental Designs, Bioinformatics Computation of Metabolic Models from Sequenced Genomes, Climate Informatics: Recent Advances and Challenge Problems for Machine Learning in Climate Science, Predictive Modeling of Patient State and Therapy Optimization, Case Studies in Data-Driven Systems: Building Carbon Maps to Finding Neutrinos, Computational Analysis of Complex Human Disorders, and Look at This Gem: Automated Data Prioritization for Scientific Discovery of Exoplanets, Mineral Deposits, and More.

Top 100 Most Read Interviews of Influential Tech Leaders by Analytics Insight


'Business is an art and business leaders are artists' is a well-put statement that proves true every time a top leader makes an outstanding decision for their organization. Although businesses rise and fall as times change, leaders never fail to be at the forefront to give their best. However, the key to long-term sustained success is great leadership and the ability of an executive to embrace evolving trends. When we talk about trends, the first thing that comes to mind is artificial intelligence and the disruptive technologies that are driving the next generation towards major digitization. The idea of technology came into practical use when people decided that they needed machines to take over human activities. The core purpose of such machines is to mimic or outperform human cognition. Although the concept of artificial intelligence came into existence in the 1950s, it did not come to fruition until the 1990s, when the technology reached mainstream applications. Since then, the rise of technology has been enabled by exponentially faster and more powerful computers and by large, complex datasets. Today, we have many futuristic technologies like machine learning, autonomous systems, data analytics, data science, and AR/VR in play. The enormous inflow of data has also contributed to this growth. In the digital world, development is highly reliant on technological advancement. Organizations across diverse industries are processing data to find insights and data-driven answers. Apart from laymen and consumers, business leaders and corporate executives have joined the bandwagon in using artificial intelligence to the fullest. These trailblazing leaders are now increasingly using technology to optimize performance and experiment with new explorations. Their success story is what the world needs to hear. Analytics Insight has listed the top 100 such interviews that describe the journey of tech leaders and companies.

Engineering and mining companies have faced a growing range of pressures in recent years, including price volatility, the need to drill down deeper to find new resources, and an industry-wide skills shortage. To address these challenges, many mining companies have embraced digital technology to enhance engineering design and develop 'smart mines'. Ausenco is a tech-savvy engineering company that delivers innovative, value-add consulting services, project delivery, asset operations, and maintenance solutions to the mining and metals, oil and gas, and industrial sectors….

KnAC: an approach for enhancing cluster analysis with background knowledge and explanations Artificial Intelligence

Pattern discovery in multidimensional data sets has been a subject of research for decades. There exists a wide spectrum of clustering algorithms that can be used for that purpose. However, their practical applications have in common the post-clustering phase, which concerns expert-based interpretation and analysis of the obtained results. We argue that this can be a bottleneck of the process, especially in cases where domain knowledge exists prior to clustering. Such a situation requires not only a proper analysis of automatically discovered clusters, but also conformance checking against existing knowledge. In this work, we present Knowledge Augmented Clustering (KnAC), whose main goal is to confront expert-based labelling with automated clustering for the sake of updating and refining the former. Our solution neither depends on any ready-made clustering algorithm nor introduces one. Instead, KnAC can serve as an augmentation of an arbitrary clustering algorithm, making the approach robust and model-agnostic. We demonstrate the feasibility of our method on artificial, reproducible examples and on a real-life use case scenario.

Why I'm joining Graphcore to lead our Developer Relations


Created video streaming and network acceleration algorithms that have found their way into screens both large and small. It was also the first patent I had granted with Samsung, and it was an equally challenging and fulfilling development crunch. Developed a mixed-reality tool for connected classrooms. It was a project with Samsung's Advanced Solutions Lab where one day I was prototyping hardware for its Tangible User Interface and the next I was coding pattern recognition. While Harvard Business Review's D.J. Patil and Thomas Davenport declared Data Scientist "The Sexiest Job of the 21st Century", I have been fortunate to lead companies in various industries that were going through their own big data and data science transformations.

Check Out the Top 10 Data Science Ideas for AI Startups in 2022


In the domain of artificial intelligence, data science has resonated strongly for the last few years. As more industries and sectors realize the need for data science, more opportunities are emerging. This article lists the top 10 data science ideas for AI startups in 2022. While modern-day digital marketing is an enormous system of channels onto which marketers simply must onboard their brands, advertising online is much more complex than the channels alone. Artificial intelligence and data science are at the core of digital marketing.

cgSpan: Pattern Mining in Conceptual Graphs Artificial Intelligence

Conceptual Graphs (CGs) are a graph-based knowledge representation formalism. In this paper we propose cgSpan, a CG frequent pattern mining algorithm. It extends the DMGM-GSM algorithm, which takes taxonomy-based labeled graphs as input, and includes three more kinds of knowledge of the CG formalism: (a) the fixed arity of relation nodes, handling graphs of neighborhoods centered on relations rather than graphs of nodes; (b) the signatures, avoiding patterns with concept types more general than the maximal types specified in signatures; and (c) the inference rules, applying them during the pattern mining process. The experimental study highlights that cgSpan is a functional CG frequent pattern mining algorithm and that including CG specificities results in a faster algorithm with more expressive results and less redundancy with vocabulary.

A Mining Software Repository Extended Cookbook: Lessons learned from a literature review Artificial Intelligence

The main purpose of Mining Software Repositories (MSR) is to discover the latest enhancements and provide an insight into how to make improvements in a software project. In light of it, this paper updates the MSR findings of the original MSR Cookbook, by first conducting a systematic mapping study to elicit and analyze the state-of-the-art, and then proposing an extended version of the Cookbook. This extended Cookbook was built on four high-level themes, which were derived from the analysis of a list of 112 selected studies. Hence, it was used to consolidate the extended Cookbook as a contribution to practice and research in the following areas by: 1) including studies published in all available and relevant publication venues; 2) including and updating recommendations in all four high-level themes, with an increase of 84% in comments in this study when compared with the original MSR Cookbook; 3) summarizing the tools employed for each high-level theme; and 4) providing lessons learned for future studies. Thus, the extended Cookbook examined in this work can support new research projects, as upgraded recommendations and the lessons learned are available with the aid of samples and tools.

Modelling the transition to a low-carbon energy supply Artificial Intelligence

A transition to a low-carbon electricity supply is crucial to limit the impacts of climate change. Reducing carbon emissions could help prevent the world from reaching a tipping point, where runaway emissions are likely. Runaway emissions could lead to extremes in weather conditions around the world, especially in vulnerable regions unable to cope with these conditions. However, the move to a low-carbon energy supply cannot happen instantaneously, due to the existing fossil-fuel infrastructure and the requirement to maintain a reliable energy supply. A gradual low-carbon transition is therefore required. However, the decisions various stakeholders should make over the coming decades to reduce these carbon emissions are not obvious. This is due to many long-term uncertainties, such as electricity, fuel and generation costs, human behaviour and the size of electricity demand. A well-choreographed low-carbon transition is, therefore, required between all of the heterogeneous actors in the system, as opposed to changing the behaviour of a single, centralised actor. The objective of this thesis is to create a novel, open-source agent-based model to better understand the manner in which the whole electricity market reacts to different factors, using state-of-the-art machine learning and artificial intelligence methods. In contrast to other works, this thesis looks at both the long-term and short-term impact that different behaviours have on the electricity market by using these state-of-the-art methods.