A Study on Using Uncertain Time Series Matching Algorithms in MapReduce Applications

Rizvandi, Nikzad Babaii, Taheri, Javid, Zomaya, Albert Y., Moraveji, Reza

Jan-17-2013, arXiv.org Artificial Intelligence

This paper was originally published as "A study on using uncertain time series matching algorithms for MapReduce applications" in Journal of Concurrency and Computation: Practice and Experience - Special Issue in Cloud Computing Scalability, John Wiley Publisher. We later realized that the original title is not appropriate and is hard to find for people working in this area. This note therefore changes the title; the original paper follows in the rest of this text (starting from the next page).

For citation, please cite the original title as: NB Rizvandi, J Taheri, R Moraveji, AY Zomaya, "A study on using uncertain time series matching algorithms for MapReduce applications", Journal of Concurrency and Computation: Practice and Experience - Special Issue in Cloud Computing Scalability, John Wiley Publisher (2012)

A Study on Using Uncertain Time Series Matching Algorithms for MapReduce Applications

Abstract: In this paper, we study CPU utilization time patterns of several MapReduce applications. After the running patterns of several applications are extracted, the patterns and their statistical information are saved in a reference database, to be used later to tune system parameters so that future unknown applications execute efficiently. To achieve this goal, the CPU utilization patterns of new applications, along with their statistical information, are compared with the known patterns in the reference database to find or predict their most probable execution patterns. Because the patterns differ in length, Dynamic Time Warping (DTW) is used for this comparison; a statistical analysis is then applied to the DTW outcomes to select the most suitable candidates. Furthermore, under a hypothesis, we also propose another algorithm to classify applications with similar CPU utilization patterns. Finally, the dependency between the minimum distance/maximum similarity of applications and their scalability (in both input size and number of virtual nodes) is studied.
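The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes the textbook DTW recurrence with an absolute-difference local cost; the paper's exact formulation and its follow-up statistical analysis are not given in this abstract and are not reproduced here. The names `dtw_distance`, `closest_reference`, and `reference_db`, as well as the sample traces, are hypothetical and chosen only for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with |x - y| as the local cost.

    Works for sequences of different lengths; runs in O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched an earlier b element
                cost[i][j - 1],      # b[j-1] also matched an earlier a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def closest_reference(new_pattern, reference_db):
    """Return (name, distance) of the stored pattern nearest to new_pattern."""
    scored = ((name, dtw_distance(new_pattern, pattern))
              for name, pattern in reference_db.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical CPU utilization traces (percent), invented for this example.
reference_db = {
    "app_a": [10, 50, 90, 50, 10],
    "app_b": [80, 80, 80, 80],
}
new_trace = [10, 10, 50, 90, 90, 50, 10]  # same shape as app_a, but longer

name, dist = closest_reference(new_trace, reference_db)
print(name, dist)
```

Here the new trace is a stretched copy of the `app_a` pattern, so warping aligns the two despite their different lengths. A plain point-by-point comparison would not be defined for sequences of unequal length, which is the motivation the abstract gives for using DTW.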

data mining, machine learning, natural language


Country:
- Europe (1.00)
- North America > United States (0.93)

Genre:
- Research Report (0.64)

Industry:
- Energy (0.68)

Technology:
- Information Technology
  - Cloud Computing (1.00)
  - Data Science > Data Mining (0.94)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language (0.68)
