Continuous Performance Testing for Microservices
Vincenzo Ferme
Reliable Software Systems
Institute for Software Technologies
University of Stuttgart, Germany
HPI Future SOC Lab Day
info@vincenzoferme.it
Outline
1. Continuous Software Development Environments
2. Continuous Performance Testing of Microservices:
   a. Challenges
   b. Our Approach
   c. Preliminary Results
   d. BenchFlow Automation Framework
3. Approach Extensions:
   a. ContinuITy Framework
4. Future Work
Challenges
The system changes frequently:
- Need to keep the test definitions updated (manual updates are not an option)
Too many tests to be executed:
- Need to reduce the number of executed tests
Need to integrate tests into delivery pipelines:
- Need complete automation of test execution
- Need clear KPIs to be evaluated in the pipeline
BenchFlow Automation Framework
[Figure: BenchFlow architecture. The BenchFlow DSL covers Workload Modeling, SUT Configuration, and SUT Deployment; BenchFlow Adapters, Faban, and Minio support Test Execution, Data Collection, and Data Analysis.]
https://github.com/benchflow
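The DSL is declarative: a test definition states the workload, the SUT configuration, and the SUT deployment, and the framework derives the executable experiments. As a rough sketch (all field names are invented for illustration, not the actual BenchFlow DSL), such a definition could look like:

```python
# Illustrative only: invented field names sketching a declarative test
# definition with the three concerns the DSL covers. Not the BenchFlow DSL.
test_definition = {
    "workload": {                      # Workload Modeling
        "users": 300,
        "operation_mix": {"browse": 0.7, "checkout": 0.3},
    },
    "sut_configuration": {             # SUT Configuration
        "cpus": 2, "memory_gb": 4, "replicas": 3,
    },
    "sut_deployment": {                # SUT Deployment
        "services": ["frontend", "cart", "payment"],
    },
}

# Minimal sanity check: all three concerns must be declared.
assert {"workload", "sut_configuration", "sut_deployment"} <= test_definition.keys()
print("definition complete")
```

The point of the declarative style is that the same definition can drive generation, execution, and analysis without hand-written test scripts.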
BenchFlow Automation Framework
[Figure: three phases, Generation, Execution, and Analysis. Test Generation produces a Test Bundle; Experiment Generation derives an Experiment Bundle; Experiment Execution runs it; Result Analysis reports Metrics on Success, and Failures otherwise.]
[Ferme et al.] Vincenzo Ferme and Cesare Pautasso. A Declarative Approach for Performance Tests Execution in Continuous Software Development Environments. In Proc. of ICPE 2018, 261-272.
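The Generation, Execution, and Analysis flow can be sketched end to end; every name below is an illustrative stand-in, not the actual BenchFlow API, and the execution step is stubbed:

```python
# Hedged sketch of the Generation/Execution/Analysis flow from the figure.
# All names are illustrative stand-ins, not the real BenchFlow API.

def generate_test(definition):
    # Test Generation: turn a declarative definition into a test bundle.
    return {"name": definition["name"], "load_levels": definition["load_levels"]}

def generate_experiments(bundle):
    # Experiment Generation: one experiment per load level in the bundle.
    return [{"name": bundle["name"], "load": lvl} for lvl in bundle["load_levels"]]

def execute(experiment):
    # Experiment Execution: stubbed; a real run drives the SUT at `load` users.
    return {"load": experiment["load"], "failed": False, "avg_rt_ms": 120}

def analyze(results):
    # Result Analysis: split results into metrics (successes) and failures.
    failures = [r for r in results if r["failed"]]
    metrics = [r for r in results if not r["failed"]]
    return metrics, failures

definition = {"name": "checkout", "load_levels": [100, 300, 500]}
experiments = generate_experiments(generate_test(definition))
metrics, failures = analyze([execute(e) for e in experiments])
print("SUCCESS" if not failures else "FAILURE")  # → SUCCESS
```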
Approach: Summary
[Figure: from Production, an Empirical Distribution of Load situations over sampled load levels (100, 300, 500; aggregated relative frequencies) (1, 2); a Baseline Test, Sampled Load Tests with Scalability Criteria, and a Deployment Conf. are executed with BenchFlow (3); the results are aggregated into a Domain Metric, e.g. 0.73 (4).]
Examples of use of the Domain Metric KPI:
1. Decide if the system is ready for deployment
2. Assess different deployment configurations
3. Configure scaling plans and autoscalers
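One plausible reading of the Domain Metric (a sketch under assumptions, not the published definition; see [Ferme et al.] for the actual formulation) is the total aggregated relative frequency of the production load levels whose sampled load test met the scalability criteria:

```python
# Hedged sketch of a domain-metric-style KPI: the mass of the production
# load distribution covered by load levels that passed their scalability
# criteria. The exact formula and names are assumptions for illustration.

def domain_metric(per_level_results):
    """per_level_results: list of (aggregated_rel_freq, passed) pairs."""
    return round(sum(freq for freq, passed in per_level_results if passed), 2)

# Illustrative numbers from the slides: frequencies 0.12, 0.14, 0.20,
# 0.16, 0.11; if all five levels pass, the metric is their sum, 0.73.
results = [(0.12, True), (0.14, True), (0.20, True), (0.16, True), (0.11, True)]
print(domain_metric(results))  # → 0.73
```

Under this reading, a value close to 1 would mean the tested configuration sustains almost all load situations observed in production, which is what makes the metric usable as a single PASS/FAIL gate in a pipeline.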
Approach: Extensions
[Figure: observed load situations in Production over time (1) yield an Empirical Distribution of Load situations over load levels 50-500 (2), aggregated over the sampled load levels 100, 300, 500 (3); a Baseline Test and Sampled Load Tests with Scalability Criteria and a Deployment Conf. produce Test Results (aggregated relative frequencies 0.12, 0.14, 0.20, 0.16, 0.11), summarised as a Domain Metric of 0.73 (4).]
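Steps 1 to 3 of the figure, from observed production load to an aggregated distribution over a few sampled load levels, can be sketched as follows; the nearest-level aggregation rule is an assumption for illustration:

```python
# Hedged sketch (not the actual BenchFlow/ContinuITy code): derive an
# empirical distribution of load situations from observed production load
# levels, then aggregate its mass over a few sampled load levels.
from collections import Counter

def empirical_distribution(observed_loads):
    """Relative frequency of each observed load level."""
    counts = Counter(observed_loads)
    total = len(observed_loads)
    return {level: counts[level] / total for level in sorted(counts)}

def aggregate(dist, sampled_levels):
    """Assign each load level's mass to the nearest sampled level (assumed rule)."""
    agg = {s: 0.0 for s in sampled_levels}
    for level, freq in dist.items():
        nearest = min(sampled_levels, key=lambda s: abs(s - level))
        agg[nearest] += freq
    return agg

observed = [50, 100, 100, 200, 300, 300, 300, 350, 500, 500]
dist = empirical_distribution(observed)
print(aggregate(dist, [100, 300, 500]))
```

Sampling only a handful of representative load levels is what keeps the number of executed tests small while the tests stay representative of production.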
ContinuITy Framework
[Figure: Production observations of Load Levels, API Calls, and the Op. Mix feed a Workload Model + APIs Specification + Annotations Model, from which the Sampled Load Tests are generated.]
[Schulz et al.] Henning Schulz, Tobias Angerstein, and André van Hoorn. Towards Automating Representative Load Testing in Continuous Software Engineering. In Proc. of ICPE 2018 Companion, 123-126.
Future Work
Automated update of test definitions:
- Automatically update load levels, API calls, and the operation mix
Declarative definitions for load testing:
- Make performance engineering activities more accessible
Declarative performance test regression:
- Extend the proposed approach to continuously assess performance regressions and to PASS/FAIL CI/CD pipelines
Prioritisation and selection of load tests:
- Exploit performance models to further reduce the number of tests
Highlights
Challenges:
- The system changes frequently: keep the test definition updated
- Too many tests to be executed: reduce the number of executed tests
- Integrate tests into delivery pipelines: complete automation of test execution, clear KPIs evaluated in the pipeline
Our Approach:
[Figure: observed load situations in Production (1) yield an Empirical Distribution of Load situations (2), aggregated over sampled load levels; a Baseline Test, Sampled Load Tests with Scalability Criteria, and a Deployment Conf. produce Test Results (3), summarised as a Domain Metric, e.g. 0.73 (4).]
Preliminary Experiments:
- Scalability criterion: Stab = avg + 3σ
- Production: 12 microservices, custom Op. Mix
- Sampled Load Tests: 6 load levels (50, 100, 150, 200, 250, 300 users)
- Deployment Config.: 10 configurations, varying replicas, CPU, and RAM
Extensions and Future Work:
[Figure: the Approach Extensions pipeline, from observed production load to the Domain Metric, as in the "Approach: Extensions" slide.]
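The scalability criterion Stab = avg + 3σ from the preliminary experiments can be sketched as a simple check: a sampled load test counts as stable if its response times stay below the baseline mean plus three standard deviations (the sample data and the max-based check are illustrative assumptions):

```python
# Sketch of the Stab = avg + 3σ criterion from the Preliminary Experiments:
# a load level is considered stable if its observed response times stay
# below the baseline mean plus three standard deviations. Data is invented.
from statistics import mean, stdev

def stability_threshold(baseline_rt_ms):
    """Stab = avg + 3σ over the baseline test's response times."""
    return mean(baseline_rt_ms) + 3 * stdev(baseline_rt_ms)

def is_stable(test_rt_ms, threshold):
    """Illustrative check: every observed response time is under Stab."""
    return max(test_rt_ms) <= threshold

baseline = [100, 110, 95, 105, 100]       # baseline test response times (ms)
stab = stability_threshold(baseline)      # 102 + 3 * ~5.7 ≈ 119.1 ms
print(is_stable([101, 112, 108], stab))   # → True
```

Per-level stability results like this are what feed the Domain Metric aggregation described in the approach summary.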