Forecasting Granular Audience Size for Online Advertising

arXiv.org Artificial Intelligence

Orchestration of campaigns for online display advertising requires marketers to forecast audience size at the granularity of specific attributes of web traffic, where all attributes are categorical (e.g., {US, Chrome, Mobile}). With each attribute taking many values, the very large set of attribute combinations makes estimating audience size for any specific combination challenging. We modify Eclat, a frequent itemset mining (FIM) algorithm, to accommodate categorical variables. For the resulting frequent and infrequent itemsets, we then provide forecasts using time series analysis, with conditional probabilities to aid approximation. An extensive simulation, based on typical characteristics of audience data, is built to stress test our modified-FIM approach. On two real datasets, comparison with baselines, including neural network models, shows that our method lowers the computation time of FIM for categorical data. On hold-out samples, we show that the proposed forecasting method outperforms these baselines.
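The idea of running Eclat over categorical attributes can be sketched as follows. This is a minimal illustration, not the authors' modified algorithm: it assumes each record is a dict mapping an attribute name to exactly one value, treats every (attribute, value) pair as an item, and skips candidate combinations that reuse an attribute, since two values of the same attribute cannot co-occur in one record. The function name `eclat_categorical` and the sample records are hypothetical.

```python
from collections import defaultdict


def eclat_categorical(records, min_support):
    """Return {frozenset of (attribute, value) items: support count}.

    Assumption: each record is a dict with one value per attribute.
    """
    # Vertical layout: each item maps to the set of record ids containing it.
    tidsets = defaultdict(set)
    for tid, record in enumerate(records):
        for attr, value in record.items():
            tidsets[(attr, value)].add(tid)

    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            # Support of the extended itemset is the size of the tidset intersection.
            new_tids = (prefix_tids & tids) if prefix else tids
            if len(new_tids) < min_support:
                continue
            itemset = prefix | {item}
            frequent[itemset] = len(new_tids)
            # Categorical constraint: never pair two values of the same attribute.
            used = {attr for attr, _ in itemset}
            rest = [(it, t) for it, t in candidates[i + 1:] if it[0] not in used]
            extend(itemset, new_tids, rest)

    extend(frozenset(), set(), sorted(tidsets.items()))
    return frequent


records = [
    {"country": "US", "browser": "Chrome", "device": "Mobile"},
    {"country": "US", "browser": "Chrome", "device": "Desktop"},
    {"country": "US", "browser": "Firefox", "device": "Mobile"},
    {"country": "DE", "browser": "Chrome", "device": "Mobile"},
]
frequent = eclat_categorical(records, min_support=2)
```

With these four records and a minimum support of 2, the frequent itemsets are the three single items US, Chrome, and Mobile plus their three pairwise combinations; the full triple {US, Chrome, Mobile} appears only once and is therefore infrequent, which is the kind of combination the abstract says is handled by approximation.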


FCP-Growth: Class Itemsets for Class Association Rules

AAAI Conferences

In this research, we focus on the supervised learning task using association rule algorithms (association-based classification). These algorithms, developed for unsupervised learning, extract all rules whose support and confidence exceed prefixed thresholds. After extracting the frequent itemsets (i.e., those whose support exceeds the support threshold), the algorithms subdivide these itemsets to build rules and keep only the rules whose confidence exceeds the confidence threshold. Extracting class association rules with these algorithms has several problems because the rules are filtered a posteriori. The first stage extracts useless frequent itemsets, namely those that do not contain a class, whereas the second stage can be simplified, since an itemset containing a class gives rise to only one class rule. To be able to work with a low support threshold, we propose FCP-Growth, an adaptation of FP-Growth that eliminates frequent itemsets not containing a class. Moreover, to favor the minority class during the construction of class itemsets, we adapt the support threshold so that the same threshold is applied inside each class.
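The effect of applying the support threshold inside each class can be illustrated with a small sketch. It is not FCP-Growth and builds no FP-tree; it enumerates itemsets by brute force and is meant only to show class-relative support, under the assumption that support for an (itemset, class) pair is measured as a fraction of that class's transactions. The function name `class_itemsets`, the `max_len` cap, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def class_itemsets(transactions, min_support, max_len=2):
    """Return {(itemset, label): class-relative support} for pairs passing the threshold.

    transactions: list of (set_of_items, class_label).
    Assumption: support is count(itemset and label) / count(label).
    """
    class_sizes = Counter(label for _, label in transactions)
    counts = Counter()
    for items, label in transactions:
        # Every counted itemset is tied to a class, so class-free itemsets never appear.
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                counts[(frozenset(combo), label)] += 1
    return {
        key: n / class_sizes[key[1]]
        for key, n in counts.items()
        if n / class_sizes[key[1]] >= min_support
    }


transactions = [
    ({"a", "b"}, "pos"),
    ({"a"}, "neg"),
    ({"a", "c"}, "neg"),
    ({"b", "c"}, "neg"),
    ({"a", "c"}, "neg"),
]
result = class_itemsets(transactions, min_support=0.5)
```

In this toy data the minority class `pos` has a single transaction, so the itemset {a, b} with class `pos` would have a global support of only 1/5 and be discarded at a 0.5 threshold, yet its support inside its own class is 1.0 and it is kept.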


HybridMiner: Mining Maximal Frequent Itemsets Using Hybrid Database Representation Approach

arXiv.org Artificial Intelligence

In this paper we present a novel hybrid (array-based layout and vertical bitmap layout) database representation approach for mining the complete set of Maximal Frequent Itemsets (MFI) on sparse and large datasets. Our work is novel in terms of scalability, item search order, and two horizontal and vertical projection techniques. We also present a maximal algorithm using this hybrid database representation approach. Experimental results on real and sparse benchmark datasets show that our approach is better than previous state-of-the-art maximal algorithms.


Ramp: Fast Frequent Itemset Mining with Efficient Bit-Vector Projection Technique

arXiv.org Artificial Intelligence

Mining frequent itemsets using a bit-vector representation is very efficient for dense datasets, but highly inefficient for sparse datasets due to the lack of an efficient bit-vector projection technique. In this paper we present a novel, efficient bit-vector projection technique for sparse and dense datasets. To check the efficiency of our bit-vector projection technique, we present a new frequent itemset mining algorithm, Ramp (Real Algorithm for Mining Patterns), built upon it. The performance of Ramp is compared with the current best (all, maximal, and closed) frequent itemset mining algorithms on benchmark datasets. Experimental results on sparse and dense datasets show that mining frequent itemsets using Ramp is faster than the current best algorithms, which shows the effectiveness of our bit-vector projection idea. We also present FastLMFI, a new approach to local maximal frequent itemset propagation and maximal itemset superset checking, built upon our PBR bit-vector projection technique. Our computational experiments suggest that itemset maximality checking using FastLMFI is faster and more efficient than a previous well-known progressive focusing approach.
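For readers unfamiliar with the bit-vector representation the abstract builds on, the following sketch shows how support is counted with it. It does not implement Ramp's projection technique or FastLMFI, neither of which is specified here; it only assumes the standard vertical layout in which each item owns one bit per transaction, stored in a Python integer. The function names `build_bitvectors` and `support` are hypothetical.

```python
def build_bitvectors(transactions):
    """Map each item to an int whose bit `tid` is set if transaction `tid` contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors):
    """Support of a non-empty itemset: AND the item bit-vectors, then count set bits."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must contain at least one item")
    acc = vectors.get(items[0], 0)
    for item in items[1:]:
        # An item never seen in the data has an all-zero vector.
        acc &= vectors.get(item, 0)
    return bin(acc).count("1")


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
vectors = build_bitvectors(transactions)
ab_support = support(["a", "b"], vectors)
```

On dense data most bits are set, so each AND does useful work; on sparse data the vectors are mostly zeros, and much of each AND is spent on empty regions, which is the inefficiency a projection technique aims to reduce.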