Generating Performance Models for Big Data Systems
- Proactively evaluate the performance of big data applications and clusters
- Estimate performance changes caused by new or modified applications, hardware, and data workloads
- Gain insight into performance bottlenecks and visualize components and their resource demands
BD.PMG enables performance engineers to proactively evaluate the performance of big data applications and clusters. It estimates how performance changes when applications, hardware resources, or data workloads are modified or newly introduced. BD.PMG also visualizes application components (e.g., a map operation) and their resource demands, which helps to locate performance bottlenecks in a cluster.
In particular, BD.PMG fetches measurements from big data clusters and software frameworks and automatically generates performance models for the applications selected by the performance engineer. Currently, Apache YARN, Apache MapReduce, and Apache Spark are supported.
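The idea of turning measurements into model parameters can be illustrated with a minimal sketch. It is not BD.PMG's actual implementation: the `TaskSample` record, the `resource_demands` and `predict_cpu_ms` helpers, and the sample values are all hypothetical, and the sketch assumes that CPU demand scales linearly with the input size of a component.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskSample:
    """One hypothetical task measurement fetched from a cluster."""
    component: str    # application component, e.g. "map" or "reduce"
    cpu_ms: float     # CPU time consumed by the task, in milliseconds
    input_bytes: int  # amount of data the task processed


def resource_demands(samples):
    """Return the mean CPU demand (ms per input megabyte) per component."""
    grouped = {}
    for s in samples:
        if s.input_bytes <= 0:
            continue  # skip samples that cannot be normalized
        per_mb = s.cpu_ms / (s.input_bytes / 1_000_000)
        grouped.setdefault(s.component, []).append(per_mb)
    return {component: mean(values) for component, values in grouped.items()}


def predict_cpu_ms(demands, component, input_bytes):
    """Estimate CPU time for a new workload size (assumes linear scaling)."""
    return demands[component] * input_bytes / 1_000_000


# Illustrative measurements only; these are not real cluster data.
samples = [
    TaskSample("map", 1200.0, 64_000_000),
    TaskSample("map", 1360.0, 64_000_000),
    TaskSample("reduce", 900.0, 30_000_000),
]
demands = resource_demands(samples)
```

With demands of this kind, a call such as `predict_cpu_ms(demands, "map", 128_000_000)` gives a rough estimate for a changed data workload. A full performance model would also have to cover memory, disk, network, and scheduling effects.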
Kroß, Johannes; Krcmar, Helmut (2017):
"Model-based Performance Evaluation of Batch and Stream Applications for Big Data". In: Proceedings of the 2017 IEEE 25th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), September 20-22, 2017, Banff, Canada.
Kroß, Johannes; Krcmar, Helmut (2016):
"Modeling and Simulating Apache Spark Streaming Applications". Softwaretechnik-Trends 36 (4), 2016.
Kroß, Johannes; Brunnert, Andreas; Prehofer, Christian; Runkler, Thomas; Krcmar, Helmut (2015):
"Stream Processing On Demand for Lambda Architectures". In: Computer Performance Engineering 12th European Workshop on Performance Engineering (EPEW), August 31 - September 1, 2015, Madrid, Spain.