Generating Performance Models for Big Data Systems

  • Proactively evaluate the performance of big data applications and clusters
  • Estimate performance changes for (newly introduced) applications, hardware, and data workloads
  • Gain insight into performance bottlenecks and visualize components and their resource demands

BD.PMG enables performance engineers to proactively evaluate the performance of big data applications and clusters. It estimates how performance changes when applications, hardware resources, or data workloads are modified or newly introduced. BD.PMG also visualizes performance bottlenecks in a cluster and highlights application components (e.g., a map operation) together with their resource demands.
In particular, BD.PMG fetches measurements from big data clusters and software frameworks and automatically generates performance models for applications selected by the performance engineer. Currently, Apache YARN, Apache MapReduce, and Apache Spark are supported.
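As a rough illustration of what fetching measurements and deriving resource demands can look like, the following sketch summarizes per-application figures from a payload shaped like the YARN ResourceManager REST response for `/ws/v1/cluster/apps`. This is not BD.PMG's implementation: the function name `summarize_demands` and all sample values are hypothetical, and a real performance model would need far more detail than these averages.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical sample shaped like the YARN ResourceManager response for
# GET /ws/v1/cluster/apps. All values are made up for illustration.
SAMPLE_RESPONSE = json.dumps({
    "apps": {"app": [
        {"id": "application_1_0001", "name": "wordcount",
         "applicationType": "MAPREDUCE", "elapsedTime": 60000,
         "memorySeconds": 120000, "vcoreSeconds": 100},
        {"id": "application_1_0002", "name": "wordcount",
         "applicationType": "MAPREDUCE", "elapsedTime": 80000,
         "memorySeconds": 160000, "vcoreSeconds": 140},
        {"id": "application_1_0003", "name": "pagerank",
         "applicationType": "SPARK", "elapsedTime": 30000,
         "memorySeconds": 90000, "vcoreSeconds": 60},
    ]}
})


def summarize_demands(response_text):
    """Average elapsed time and resource usage per application name."""
    payload = json.loads(response_text)
    # YARN reports "apps": null when no application matches.
    apps = (payload.get("apps") or {}).get("app") or []

    grouped = defaultdict(list)
    for app in apps:
        grouped[app["name"]].append(app)

    summary = {}
    for name, runs in grouped.items():
        summary[name] = {
            "runs": len(runs),
            # elapsedTime is reported in milliseconds.
            "mean_elapsed_s": mean(r["elapsedTime"] for r in runs) / 1000.0,
            # vcoreSeconds: allocated virtual cores times seconds.
            "mean_vcore_s": mean(r["vcoreSeconds"] for r in runs),
            # memorySeconds: allocated megabytes times seconds.
            "mean_memory_mb_s": mean(r["memorySeconds"] for r in runs),
        }
    return summary


if __name__ == "__main__":
    for app_name, stats in summarize_demands(SAMPLE_RESPONSE).items():
        print(app_name, stats)
```

Against a live cluster, the same function could be fed the body of an HTTP GET on the ResourceManager's `/ws/v1/cluster/apps` endpoint. Host, port, and authentication depend on the installation.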

Performance Management Work Tools


Kroß, Johannes; Brunnert, Andreas; Krcmar, Helmut (2016):
"Modeling Big Data Systems by Extending the Palladio Component Model." In: Softwaretechnik-Trends, 2016. Accepted / to be published.

Kroß, Johannes; Brunnert, Andreas; Prehofer, Christian; Runkler, Thomas; Krcmar, Helmut (2015):
"Stream Processing On Demand for Lambda Architectures." In: Computer Performance Engineering, 12th European Workshop on Performance Engineering (EPEW), August 31 - September 1, 2015, Madrid, Spain.


Johannes Kroß

+49 89 3603522 18