The growth in popularity of the Internet, along with the rapid development of processing and storage technologies, has brought a paradigm shift in the way computing resources are provisioned. The technological trend today is to offer computing resources as services, leased and exposed via the Internet in a pay-as-you-go and on-demand fashion, called cloud computing...
Cloud infrastructure providers try to reduce their operating costs while offering higher-quality services, which helps them stand out among competitors. This is becoming challenging, however, because providing such services requires operating large-scale, geographically distributed data centers. Customers, on the other hand, use clouds mainly to achieve a high quality of service (QoS) while reducing their overall costs. Given the variety of offered services in terms of quality and cost, customers are encouraged to use services from multiple cloud providers simultaneously, an approach known as multi-cloud. However, multi-cloud brings a new set of open challenges, such as selecting and composing the most appropriate services. Furthermore, although customers critically need predictable service performance, cloud providers in general do not yet offer any performance guarantees. This gap is due to the complexity of addressing the issue practically and cost-effectively. The complexity mainly comes from the dynamic nature of the cloud, unpredictable workloads, and the non-linearity of mapping performance measurements to required cloud resources. Hence, controlling the trade-off between QoS and cost is a challenging goal for both cloud infrastructure providers and customers.
This thesis investigates models, algorithms, and mechanisms to tackle this trade-off from both perspectives. More specifically, in the scope of this thesis, we first take the cloud provider viewpoint by proposing an approach for virtual machine placement across geographically distributed infrastructures. In this approach, a Bayesian network model is used to address decision making under uncertainty. Then, we address the trade-off between QoS and cost from the cloud customer point of view by facilitating the utilization of the multi-cloud paradigm. We propose a service selection approach using prospect theory to rank the comparable service offerings. Furthermore, to guarantee the performance objectives of customers, we propose autonomic resource provisioning techniques. To this aim, control theory is used to design resource provisioning controllers, and fuzzy control is utilized to coordinate multiple controllers toward meeting the service performance objectives in a cost-effective manner. Finally, the evaluations of these contributions
Quality of Service Control Mechanisms in Cloud Computing Environments
1. Soodeh Farokhi
PhD Defense Presentation - 21 Jan 2016
Vienna University of Technology
Advisor: Dr. Ivona Brandic
External Examiner: Prof. Erich Schikuta
2. Cloud Computing Concepts
Cloud computing is a computing paradigm
• Exposing and leasing computing resources (e.g., server, storage, application) as services
• Cloud services are accessible via the Internet in a pay-as-you-go manner
Cloud Infrastructure-as-a-Service (IaaS)
• Offering virtual machines (VMs)
Service cost | Quality of Service (QoS)
• Measurable levels of non-functional attributes such as availability and performance
Introduction (C1) VM Placement (C2) Multi-Cloud Service Selection (C3) Performance-based Controller (C4) Hybrid Controller (C5) Fuzzy Coordination Conclusion
3. Cloud Computing Concepts
Cloud elasticity
• Provisioning and releasing resources on demand according to workload changes
o To meet QoS requirements in a cost-efficient way
• The main selling point of cloud computing
Types of elasticity
1. Horizontal elasticity (add or remove VMs)
2. Vertical elasticity (adjusting the capacity of each VM in terms of resources)
4. Problem Statement
How to control the trade-off between QoS and cost
Cloud provider perspective
• Attracting more customers by offering high QoS | Reducing the operating cost
Cloud customer perspective
• Utilizing cloud services to achieve high QoS | Reducing the overall cost
[Figure: Two different perspectives of QoS and cost: cloud customers (e.g., application owners) and cloud infrastructure providers (cloud data centers)]
5. Research Questions
RQ 1. How to efficiently control geographically distributed cloud data centers?
RQ 2. How to select the best combination of services from multiple cloud providers?
RQ 3. How to guarantee performance of cloud applications?
Structure of each RQ
• Motivation
• Contribution(s)
• Evaluation(s)
• Publication(s)
6. Efficient Control of Geo-distributed Data Centers
Cloud provider offering high QoS | reducing the operating cost
• Needs to enlarge and geographically distribute data centers
Considering time- and location-dependent parameters
• E.g., regional electricity price, power outage statistics, regional temperature
Decision making under uncertainty
[Figure: Google data centers distribution map]
8. Solution Overview
VM placement across geo-distributed cloud data centers (using Bayesian Networks)
Bayesian Networks (BN): A practical form of knowledge representation
1. Modelling the expert domain knowledge as a Bayesian Network (the decision model)
2. Quantifying the benefit of each decision through multi-criteria decision analysis on the BN
• Allocating newly arriving VM requests
• Migrating the current VMs
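The idea of quantifying each placement decision under uncertainty can be illustrated with a very small sketch. It is not the thesis's Bayesian network: the data center names, probabilities, and cost figures are hypothetical, and the three uncertain variables (power outage, high electricity price, high temperature) are assumed independent, which is the simplest possible network structure.

```python
from itertools import product

# Hypothetical discrete model; every number here is illustrative only.
DATA_CENTERS = {
    "dc_north": {"p_outage": 0.01, "p_high_price": 0.6, "p_hot": 0.1},
    "dc_south": {"p_outage": 0.05, "p_high_price": 0.2, "p_hot": 0.7},
}
ENERGY_COST = {False: 1.0, True: 2.0}   # cost by electricity price level (high?)
COOLING_COST = {False: 0.2, True: 0.8}  # cost by regional temperature (hot?)
SLA_PENALTY = 50.0                      # penalty paid when an outage occurs


def expected_cost(dc):
    """Marginalise the cost over all discrete states (outage, high price, hot)."""
    total = 0.0
    for outage, high_price, hot in product([False, True], repeat=3):
        prob = 1.0
        for flag, p in ((outage, dc["p_outage"]),
                        (high_price, dc["p_high_price"]),
                        (hot, dc["p_hot"])):
            prob *= p if flag else 1.0 - p
        cost = ENERGY_COST[high_price] + COOLING_COST[hot]
        cost += SLA_PENALTY if outage else 0.0
        total += prob * cost
    return total


def place_vm(data_centers):
    """Pick the data center with the lowest expected cost for a new VM."""
    return min(data_centers, key=lambda name: expected_cost(data_centers[name]))


if __name__ == "__main__":
    for name, dc in DATA_CENTERS.items():
        print(name, round(expected_cost(dc), 2))
    print("place new VM in:", place_vm(DATA_CENTERS))
```

The same expected-cost comparison could in principle be used for a migration decision by adding a migration cost term to every candidate other than the VM's current location.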
Simulated Evaluation
• Using real-world data traces
• Using two VM placement baselines
• Cost & QoS violation rate as metrics
Soodeh Farokhi*, Dmytro Grygorenko*, and Ivona Brandic. Virtual Machine Placement across Distributed Data Centers using Bayesian Networks. 12th International Conference on Economics of Grids, Clouds, Systems, and Services (GECON 2015), Cluj-Napoca, Romania, 15-17 Sep, 2015 (*contributed equally).
10. Selecting Services from Multiple Clouds
Cloud customers gain access to a wider variety of cost and QoS offerings
Adopting a Multi-Cloud model
• Simultaneously using services from multiple cloud providers
Considering functional and non-functional requirements (QoS, cost | priority)
[Figure: A computer-aided design (CAD) application deployed across multiple cloud providers]
• Processing using GPU (4 large VMs): cost
• CAD application UI (1 small VM): low latency
• CAD models (2 large storages): highly available
12. Solution Overview
Soodeh Farokhi, Foued Jrad, Ivona Brandic, and Achim Streit. HS4MC: Hierarchical SLA-based Service Selection for Multi-Cloud Environments. In Proceedings of the 4th International Conference on Cloud Computing and Services Science (CLOSER’14), Multi-Cloud Special Session, pp. 722-734, Barcelona, Spain, April 3-5, 2014.
Soodeh Farokhi. Towards an SLA-based Service Allocation in Multi-Cloud Environments. Doctoral Symposium, 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid’14), pp. 591-594, USA, May 26-29, 2014.
QoS-aware Multi-Cloud service selection (using Prospect Theory)
Prospect Theory as a behavioral economic theory
• Describing the way people choose between different options under uncertainty
• An alternative decision-making model to Utility Theory
• More realistic in calculating user satisfaction
Ranking comparable service offerings using Prospect Theory
• Selecting and composing the best set of services (cost and QoS)
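A minimal sketch of ranking offerings with a prospect-theory value function follows. It assumes the standard Kahneman-Tversky form (concave for gains, steeper for losses) with the commonly cited parameter estimates 0.88 and 2.25; the reference point is taken to be the customer's requested value. The offering names, attributes, and the relative-deviation normalisation are hypothetical choices made for illustration, not the formulation used in the thesis.

```python
ALPHA = 0.88   # curvature for gains (commonly cited estimate)
BETA = 0.88    # curvature for losses
LAMBDA = 2.25  # loss aversion: losses weigh more than equal-sized gains


def prospect_value(outcome, reference):
    """Value of an outcome relative to a reference point."""
    x = outcome - reference
    if x >= 0:
        return x ** ALPHA
    return -LAMBDA * ((-x) ** BETA)


def score(offering, requested, higher_is_better):
    """Sum prospect values of each attribute's relative deviation from the request."""
    total = 0.0
    for attr, ref in requested.items():
        value = offering[attr]
        # Relative deviation makes attributes with different units comparable (assumption).
        if higher_is_better[attr]:
            deviation = (value - ref) / ref
        else:
            deviation = (ref - value) / ref
        total += prospect_value(deviation, 0.0)
    return total


def rank(offerings, requested, higher_is_better):
    """Return offering names ordered from most to least attractive."""
    return sorted(offerings,
                  key=lambda name: score(offerings[name], requested, higher_is_better),
                  reverse=True)


# Hypothetical request and offerings.
REQUESTED = {"availability": 99.9, "cost": 1.0}
HIGHER_IS_BETTER = {"availability": True, "cost": False}
OFFERINGS = {
    "A": {"availability": 99.95, "cost": 1.2},
    "B": {"availability": 99.9, "cost": 1.0},
    "C": {"availability": 99.5, "cost": 0.8},
}

if __name__ == "__main__":
    print(rank(OFFERINGS, REQUESTED, HIGHER_IS_BETTER))
```

Because losses are amplified by the loss-aversion factor, offering A is penalised for exceeding the requested cost more strongly than it is rewarded for its slightly better availability.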
Simulated Evaluation
• Using real-world cloud data (12 commercial clouds)
• A utility-based algorithm [Jrad et al., CSE 2013] as a baseline
• Cost and QoS satisfaction as metrics
[Figure: CAD composite service in a Multi-Cloud]
13. Research Questions
RQ 1. How to efficiently control geographically distributed cloud data centers?
RQ 2. How to select the best combination of services from multiple cloud providers?
RQ 3. How to guarantee performance of cloud applications?
• Performance is key to success
“A page load slowdown of just 1 sec could cost Amazon $1.6 billion in sales each year” [Eaton, FC 2012]
• No cloud provider guarantees it, due to the complexity of a practical solution
• Unpredictable workload
14. Performance Guarantee
Over-provisioning
• Performance is satisfied, but resources are wasted, and how can the peak be predicted?
Provisioning for normal workload
• Performance is violated during peak periods
Adaptive solutions (resource elasticity)
• Provisioning resources on demand
[Figure: Unpredictable workload of web applications [Jamshidi, Dagstuhl talk 2014]; each panel plots the workload against the provisioned resource]
15. Self-adaptive Cloud Applications
Self-adaptive systems achieved by autonomic computing (a closed control loop)
• MAPE loop (Monitor-Analyze-Plan-Execute): a software engineering viewpoint
• Feedback control loop: a control engineering perspective
Using Control Theory
• As a systematic way of designing feedback loops that handle unpredictable runtime changes quickly and robustly
• Realizing self-adaptive cloud applications via a feedback control loop
How to design the controller?
[Figure: A standard feedback control loop]
Soodeh Farokhi, Pooyan Jamshidi, Ivona Brandic, and Erik Elmroth. Self-adaptation Challenges for Cloud-based Applications: A Control Theoretic Perspective. 10th International Workshop on Feedback Computing (Feedback Computing’15), Seattle, USA, April 13, 2015.
17. Performance-based Memory Controller (PMC)
Following a control design process to guarantee the performance of cloud applications
Based on our experiments, choosing the allocated memory as a control knob
Devising the system model (α)
• Capturing the relationship between the allocated memory and response time
o Using linear regression on the past measured data
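As an illustration of how the model parameter α might be obtained, the sketch below fits the slope of response time against allocated memory by ordinary least squares. The function name and the sample measurements are hypothetical; the slide only states that linear regression on past measured data is used.

```python
def fit_alpha(memory, response_time):
    """Least-squares slope of response time with respect to allocated memory."""
    n = len(memory)
    mean_mem = sum(memory) / n
    mean_rt = sum(response_time) / n
    covariance = sum((m - mean_mem) * (rt - mean_rt)
                     for m, rt in zip(memory, response_time))
    variance = sum((m - mean_mem) ** 2 for m in memory)
    return covariance / variance


# Hypothetical past measurements: normalised memory allocation vs. response time (ms).
MEMORY = [0.2, 0.4, 0.6, 0.8]
RESPONSE_TIME_MS = [900.0, 700.0, 500.0, 300.0]

if __name__ == "__main__":
    print("alpha =", fit_alpha(MEMORY, RESPONSE_TIME_MS))
```

With data like these the slope is negative, since allocating more memory lowers the response time; any controller built on α has to respect that sign.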
Designing the controller
• Control formula [1, 2]
error_i: the difference between the desired RT and the measured RT rt_i at control step i
control_i = control_{i-1} - \frac{1 - pole}{\alpha} \, error_i
mem_i = control_i \cdot (mem_{max} - mem_{min}) + mem_{min}
[1] [Filieri et al., SEAMS 2015]
[2] [Filieri et al., ICSE 2013]
[Figure: PMC feedback loop. The error between the desired RT and the measured RT feeds the memory controller (PMC); its output control_i is mapped to the memory size mem_i of the cloud application, which runs under the workload.]
[Figure: The control design process [Filieri et al., SEAMS 2015]: identify the goals, identify the control knobs, devise the system model, design the controller, implement the controller, validate the controlled system.]
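One step of the PMC update can be sketched as follows: the control signal is the previous signal minus (1 - pole)/α times the error, and the memory size is the control signal scaled between the minimum and maximum memory. Two points are assumptions of this sketch rather than statements from the slides: the error is taken as measured RT minus desired RT with a negative α (so that a slow response raises the allocation), and the control signal is clamped to [0, 1]. The function name and the numbers in the example are hypothetical.

```python
def pmc_step(control_prev, measured_rt, desired_rt, alpha, pole, mem_min, mem_max):
    """One control step: return the new control signal and the memory size."""
    # Assumed sign convention: positive error means the application is too slow.
    error = measured_rt - desired_rt
    control = control_prev - (1.0 - pole) / alpha * error
    # Clamping keeps the allocation inside [mem_min, mem_max] (assumption).
    control = min(1.0, max(0.0, control))
    mem = control * (mem_max - mem_min) + mem_min
    return control, mem


if __name__ == "__main__":
    # Hypothetical values: RT in ms, memory in GB, alpha as a negative slope.
    control, mem = pmc_step(control_prev=0.5, measured_rt=800.0, desired_rt=600.0,
                            alpha=-1000.0, pole=0.5, mem_min=1.0, mem_max=5.0)
    print(round(control, 3), round(mem, 3))
```

A pole closer to 1 makes each step smaller, trading reaction speed for robustness to noisy measurements; a pole closer to 0 reacts faster.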
18. PMC Evaluation: Experimental Setup
An interactive benchmark application
• RUBBoS (a Slashdot-like app.)
• Sending memory-intensive requests
Real-world workload
• Wikipedia & FIFA traces
Process (3 experiments)
• Time-series analysis
• Aggregate analysis
Baseline: static provisioning
[Figure: The patterns of user requests for the Wikipedia and FIFA websites]
[Figure: Overview of the experimental setup. Client side: a client sends HTTP GET requests (the workload). Server side: a physical machine with the KVM hypervisor hosts two VMs (VM1, VM2); one runs RUBBoS MySQL with fixed CPU and fixed memory, the other runs RUBBoS Apache with fixed CPU and elastic memory. Control side: the application monitor reports the response time to the memory controller, which adjusts the elastic memory.]
19. PMC Evaluation: Results
Aggregate analysis of RUBBoS (desired RT < 600ms)
Summary of Results
• Meeting the desired performance while saving memory (cost) compared to the over-provisioning strategy
o Up to 47% (Wikipedia) | 57% (FIFA)
• Outperforming baselines in terms of meeting performance with lower memory usage
[Figures: Results of the FIFA workload trace; results of the Wikipedia workload trace]
Soodeh Farokhi, Pooyan Jamshidi, Drazen Lucanin, and Ivona Brandic. Performance-based Vertical Memory Elasticity. 12th IEEE International Conference on Autonomic Computing (ICAC’15), Grenoble, France, July 7-10, 2015.
20. Research Questions
RQ 1. How to efficiently control geographically distributed cloud data centers?
RQ 2. How to select the best combination of services from multiple cloud providers?
RQ 3. How to guarantee performance of cloud applications?
• PMC does not have an insight into the current value of resource utilization
How to achieve a more efficient resource usage?
22. Resource elasticity decision making strategies
Capacity-based
• Based on resource utilization, ignoring performance
Performance-based
• Considering performance
Hybrid
• Performance and resource utilization
Hybrid memory controller
F(response time, memory utilization) → allocated memory
Following the introduced control design process [Filieri et al., SEAMS 2015]
• Using an adaptive system model
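The difference between the three decision strategies can be made concrete with a toy sketch. None of the functions below is the HMC design from the thesis, whose formula is not given on the slide; the target utilization, the gain, and the rule used to combine both signals are hypothetical and chosen only to show what "considering both performance and utilization" can mean.

```python
def capacity_based(mem_used, target_utilization=0.8):
    """Size memory so that utilization meets a target; performance is ignored."""
    return mem_used / target_utilization


def performance_based(mem_alloc, rt, desired_rt, gain=0.002):
    """Grow memory when RT exceeds the target, shrink when below; utilization is ignored."""
    return mem_alloc + gain * (rt - desired_rt)


def hybrid(mem_alloc, mem_used, rt, desired_rt):
    """Illustrative combination: follow performance when the RT target is violated,
    otherwise reclaim memory down to what the measured utilization justifies."""
    if rt > desired_rt:
        return performance_based(mem_alloc, rt, desired_rt)
    return min(mem_alloc, capacity_based(mem_used))


if __name__ == "__main__":
    # Hypothetical measurements: memory in GB, response time in ms.
    print(hybrid(mem_alloc=4.0, mem_used=2.0, rt=800.0, desired_rt=600.0))
    print(hybrid(mem_alloc=4.0, mem_used=2.0, rt=500.0, desired_rt=600.0))
```

In the first call the response time is violated, so the allocation grows; in the second the target is met, so unused memory is released.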
[Figure: Hybrid Memory Controller (HMC). A performance-based resource controller (driven by the desired and the measured application performance) and a capacity-based resource controller (driven by the measured resource utilization) are combined into a hybrid resource elasticity controller. An application & VM monitor supplies the measurements, and the resulting resource capacity is applied to the application VM via the hypervisor.]
23. HMC Evaluation: Experimental Setup
Using RUBBoS under
• Real-world workload: Wikipedia & FIFA traces
• Synthetic workload: Open & closed user loop models [Schroeder et al., NSDI 2006]
Baselines
• A performance-based memory controller (PMC) [Contribution #3]
• A capacity-based memory controller (CMC) [Moltó et al., PCS 2013]
Process (8 experiments)
• Time-series analysis | Aggregate analysis
24. HMC Evaluation: Results
Aggregate analysis of RUBBoS under Wikipedia workload (desired RT < 20ms)
Summary of Results
• Meeting the desired RT by allocating the right amount of memory
• HMC achieves
o The best RT stability | lowest memory usage | relatively high memory utilization
Soodeh Farokhi, Pooyan Jamshidi, Ewnetu Bayuh Lakew, Ivona Brandic, and Erik Elmroth. A Hybrid Cloud Controller for Vertical Memory Elasticity: A Control-theoretic Approach. Future Generation Computer Systems (FGCS Journal), The International Journal of Grid Computing and eScience, Elsevier (under revision).
25. Research Questions
RQ 1. How to efficiently control geographically distributed cloud data centers?
RQ 2. How to select the best combination of services from multiple cloud providers?
RQ 3. How to guarantee performance of cloud applications?
• Performance-based and hybrid controllers focus on only a single resource (memory)
How to detect, at runtime, when and how much of which resource is needed?
27. Fuzzy Coordination Approach
Elasticity reasoning for multiple resources
1. Building the fuzzy knowledge base
• Extracting the fuzzy rules
2. Elasticity reasoning by Fuzzy Controller
3. Determining the final resource capacities by CPU Controller [3] & Memory Controller [4]
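[Figure: The architectural overview of the fuzzy coordination approach. The application & VM monitor feeds measurements to the Fuzzy Controller, which uses the fuzzy knowledge base to produce a CPU coefficient and a memory coefficient. The CPU Controller and the Memory Controller turn these into the number of CPUs and the memory size, applied to the application VM via the hypervisor.]
[3] [Lakew et al., UCC 2014]
[4] [Contribution #3]
A sample fuzzy rule (from the fuzzy knowledge base)
Input variables, range [0,100]: response time, CPU utilization, memory utilization
Output variables, range [-1,1]: CPU coefficient, memory coefficient
response time | CPU utilization | memory utilization | CPU coefficient | memory coefficient
Slow          | High            | Low                | 1.0             | -0.3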
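To show how a rule such as "response time Slow, CPU utilization High, memory utilization Low → CPU coefficient 1.0, memory coefficient -0.3" could be evaluated, here is a small sketch. It uses a zero-order Sugeno-style weighted average for brevity; the slides do not say which inference method the thesis uses. The membership breakpoints and the second rule are hypothetical; only the first rule comes from the slide.

```python
def ramp_up(x, lo, hi):
    """Membership that rises linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def ramp_down(x, lo, hi):
    """Membership that falls linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)


# Hypothetical membership functions over the input range [0, 100].
MEMBERSHIP = {
    "Slow": lambda x: ramp_up(x, 40, 80),
    "Fast": lambda x: ramp_down(x, 20, 60),
    "High": lambda x: ramp_up(x, 50, 90),
    "Low": lambda x: ramp_down(x, 10, 50),
}

# (response time, CPU utilization, memory utilization) -> (CPU coeff, memory coeff)
RULES = [
    (("Slow", "High", "Low"), (1.0, -0.3)),   # the sample rule from the slide
    (("Fast", "Low", "Low"), (-0.5, -0.5)),   # hypothetical second rule
]


def infer(rt, cpu_util, mem_util):
    """Return (CPU coefficient, memory coefficient), both within [-1, 1]."""
    sum_cpu = sum_mem = sum_weight = 0.0
    for (rt_label, cpu_label, mem_label), (cpu_coeff, mem_coeff) in RULES:
        # Rule firing strength: minimum of the three antecedent memberships.
        weight = min(MEMBERSHIP[rt_label](rt),
                     MEMBERSHIP[cpu_label](cpu_util),
                     MEMBERSHIP[mem_label](mem_util))
        sum_cpu += weight * cpu_coeff
        sum_mem += weight * mem_coeff
        sum_weight += weight
    if sum_weight == 0.0:
        return 0.0, 0.0  # no rule fires: leave both resources unchanged
    return sum_cpu / sum_weight, sum_mem / sum_weight


if __name__ == "__main__":
    print(infer(rt=90, cpu_util=95, mem_util=5))
```

A positive coefficient asks the corresponding controller for more of its resource and a negative one for less, which matches the [-1, 1] output range on the slide.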
28. Evaluation: Experimental Setup
3 interactive benchmark applications
• Olio (Amazon-like app.)
• RUBiS (eBay-like app.)
• RUBBoS
Synthetic workloads
• Open & closed user loop models
Process (10 experiments)
• Aggregate analysis
• Time-series analysis
Baseline
• CPU & memory controllers without coordination
[Figure: Overview of the experimental setup. Server side: a Xen hypervisor hosts the VM running the benchmark applications (Apache) with elastic CPU and elastic memory. Control side: the app/VM monitor reports the response time and the memory & CPU utilization to the Fuzzy Controller, which coordinates the CPU Controller and the Memory Controller.]
29. Evaluation: Results
Olio benchmark application | closed user loop workload | desired RT < 0.5 sec
Summary of Results
• Careful coordination of the elasticity controllers meets the desired performance while using resources more efficiently, avoiding both over- and under-provisioning
[Figure: Response time (RT), CPU usage (#cores), and memory usage (GB), each plotted for the coordination approach and for the baseline]
Soodeh Farokhi*, Ewnetu Bayuh Lakew*, Cristian Klein, Ivona Brandic, and Erik Elmroth. Coordinating CPU and Memory Elasticity Controllers to Meet Response Time Constraints. IEEE International Conference on Cloud and Autonomic Computing (CAC’15), Cambridge, MA, USA, 21-24 Sep, 2015 (* contributed equally).
31. Summary
Controlling the trade-off between QoS and cost
Adapting Bayesian Networks: handling the uncertainty of cloud control decisions
Applying Prospect Theory to Multi-Cloud service selection: better user satisfaction
Adapting Control Theory: guaranteeing the performance of cloud applications
• Designing a controller to realize performance-based vertical memory elasticity
Proposing hybrid elasticity: efficient resource usage and performance guarantee
Utilizing Fuzzy Logic: multiple-resource elasticity reasoning under uncertainty
• Proposing a generic coordination technique that can be used for other cloud resources
32. Conclusion
Self-adaptive cloud applications
• Enabler for large-scale and distributed modern applications (performance-sensitive users)
Vertical resource elasticity needs more attention in both research and industry
• Enabler for efficient resource utilization and realizing Resource-as-a-Service
Future Work
Covering the runtime activities of Multi-Cloud service allocation
Supporting both vertical and horizontal elasticity as a comprehensive solution
Applying the proposed elasticity solutions on a Multi-Cloud environment
33. Academic Activities
Projects
• HALEY (holistic energy efficient approach for the management of hybrid clouds), led by Dr. Ivona Brandic
• ARiSE (Austrian Society for Rigorous Systems Engineering), Prof. Helmut Veith
Main collaborators
• Dr. Pooyan Jamshidi (post-doc)
• Prof. Erik Elmroth
• Dr. Ewnetu Lakew (post-doc)
Research visit
• Carried out at Umeå University, Sweden, Feb-March 2015, granted by STSM COST Action (ACROSS)
Other activities
• Attending 4 conferences and 5 workshops | 5 scientific talks
• Co-advising 1 master student
• Providing reviewing service for 4 journals and 12 international conferences
34. Scientific Publications
Soodeh Farokhi, Pooyan Jamshidi, Ewnetu Bayuh Lakew, Ivona Brandic, and Erik Elmroth. A Hybrid Cloud Controller for Vertical Memory Elasticity: A Control-theoretic Approach. Future Generation Computer Systems (FGCS), The International Journal of Grid Computing and eScience, Elsevier (under revision).
Soodeh Farokhi*, Ewnetu Bayuh Lakew*, Cristian Klein, Ivona Brandic, and Erik Elmroth. Coordinating CPU and Memory Elasticity Controllers to Meet Response Time Constraints. IEEE International Conference on Cloud and Autonomic Computing (CAC 2015), Cambridge, MA, USA, 21-24 Sep, 2015 (* contributed equally).
Soodeh Farokhi*, Dmytro Grygorenko*, and Ivona Brandic. VM Placement across Distributed Data Centers using Bayesian Networks. 12th International Conference on Economics of Grids, Clouds, Systems, and Services (GECON 2015), Cluj-Napoca, Romania, 15-17 Sep, 2015 (* contributed equally). rate: 33%
Soodeh Farokhi, Pooyan Jamshidi, Ivona Brandic, and Erik Elmroth. Self-adaptation Challenges for Cloud-based Applications: A Control Theoretic Perspective. 10th International Workshop on Feedback Computing (Feedback Computing 2015), Seattle, USA, April 13, 2015.
Soodeh Farokhi, Pooyan Jamshidi, Drazen Lucanin, and Ivona Brandic. Performance-based Vertical Memory Elasticity. 12th IEEE International Conference on Autonomic Computing (ICAC 2015), Grenoble, France, July 7-10, 2015. rate: 27%
Soodeh Farokhi. Towards an SLA-based Service Allocation in Multi-Cloud Environments. Doctoral Symposium, 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014), pp. 591-594, USA, May 26-29, 2014. rate: 19%
Soodeh Farokhi, Foued Jrad, Ivona Brandic, and Achim Streit. HS4MC: Hierarchical SLA-based Service Selection for Multi-Cloud Environments. In Proceedings of the 4th International Conference on Cloud Computing and Services Science (CLOSER 2014), Multi-Cloud Special Session, pp. 722-734, SciTePress, Barcelona, Spain, April 3-5, 2014.
35. Soodeh Farokhi
Vienna University of Technology
soodeh.farokhi@tuwien.ac.at
http://www.ec.tuwien.ac.at/~soodehFa
/soodehFa
Quality of Service Control Mechanisms
in Cloud Computing Environments
38. List of Referenced Papers
[Hevner et al., MIS 2004] Alan R. Hevner, Salvatore T. March, Jinsoo Park, and Sudha Ram. “Design Science in Information Systems Research”. MIS Quarterly, 28(1):75–105, 2004.
[Jrad et al., CSE 2013] Foued Jrad, Jie Tao, Achim Streit, Rico Knapper, and Christoph Flath. “A utility-based approach for customized cloud service selection”. International Journal of Computational Science and Engineering, 10(1-2):32-44, 2015.
[Eaton, FC 2012] Kit Eaton. “How One Second Could Cost Amazon $1.6 Billion In Sales”. Fast Company, 14 March 2012, http://www.fastcompany.com/1825005/impatient-america-needs-faster-intertubes .
[Jamshidi et al., SEAMS 2014] Pooyan Jamshidi, Aakash Ahmad, and Claus Pahl. “Autonomic resource provisioning for cloud-based software”. International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2014.
[Filieri et al., SEAMS 2015] Antonio Filieri, Martina Maggio, et al. “Software Engineering Meets Control Theory”. International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS), 2015.
[Filieri et al., ICSE 2013] Antonio Filieri, Henry Hoffmann, and Martina Maggio. “Automated Design of Self-Adaptive Software with Control-Theoretical Formal Guarantees”. International Conference on Software Engineering (ICSE), 2014.
[Schroeder et al., NSDI 2006] Bianca Schroeder, Adam Wierman, and Mor Harchol-Balter. “Open Versus Closed: A Cautionary Tale”. Networked Systems Design and Implementation (NSDI), 2006.
[Moltó et al., PCS 2013] Germán Moltó, Miguel Caballer, Eloy Romero, and Carlos de Alfonso. “Elastic Memory Management of Virtualized Infrastructures for Applications with Dynamic Memory Requirements”. Procedia Computer Science, 2013.
[Lakew et al., UCC 2014] Ewnetu Bayuh Lakew, Cristian Klein, Francisco Hernandez, and Erik Elmroth. “Towards Faster Response Time Models for Vertical Elasticity”. IEEE Conference on Utility and Cloud Computing (UCC), 2014.
39. Research Methodology
Following the design-science methodology [Hevner et al., MIS 2004]
[Figure: Research methodology]
40. Limitations of the Thesis
1. The Bayesian network used supports only discrete values
2. Only the design-time activities of Multi-Cloud service allocation are covered
3. The focus is mainly on vertical elasticity, which may be insufficient for accommodating large changes in the runtime workload
4. The controllers are reactive; a proactive approach that uses workload prediction methods could improve resource usage and reduce possible performance degradation
5. The fuzzy knowledge base used (i.e., fuzzy rules and membership functions) is extracted at design time and is not updated at runtime
41. Research Contributions
C1. VM placement across geo-distributed cloud data centers (using Bayesian Network)
C2. QoS-aware Multi-Cloud service selection (using Prospect Theory)
C3. Performance-based memory elasticity controller (using Control Theory)
C4. Hybrid memory elasticity controller (considering performance and utilization)
C5. A coordination technique for CPU and memory elasticity controllers (using Fuzzy logic)
42. Evaluation Results
Evaluation Setup
• Simulated evaluation using real-world data traces
• Baseline
o NoM: support no migration
o FFD: First-Fit heuristic algorithm
• Evaluation metrics
o Total cost (energy & cooling cost + SLA violation penalty)
o Number of migrations
Summary of Results
• Decreasing the total cost by
o up to 69% ($124 vs. $407) in comparison with the NoM approach
o up to 45% ($124 vs. $225) in comparison with FFD-R (the lowest number of migrations)
o up to 18% ($124 vs. $151) in comparison with FFD-A (the highest number of migrations)
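The reported reductions can be checked against the dollar figures on the slide (a total cost of $124 for the proposed approach versus $407, $225, and $151 for NoM, FFD-R, and FFD-A). The helper name below is hypothetical; the computation is the usual relative reduction, (baseline - proposed) / baseline.

```python
def reduction_percent(baseline_cost, proposed_cost):
    """Relative cost reduction of the proposed approach, in percent."""
    return 100.0 * (baseline_cost - proposed_cost) / baseline_cost


PROPOSED_COST = 124.0
BASELINE_COSTS = {"NoM": 407.0, "FFD-R": 225.0, "FFD-A": 151.0}

if __name__ == "__main__":
    for name, cost in BASELINE_COSTS.items():
        print(f"{name}: {reduction_percent(cost, PROPOSED_COST):.1f}%")
```

The computed values agree with the reported 69%, 45%, and 18% to within one percentage point, which is plausible if the dollar figures on the slide are themselves rounded.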
[Figures: Regional electricity price; regional temperature; aggregate evaluation results]
43. What are Bayesian Networks?
Bayesian Networks
• A practical form of knowledge representation
• Variables of interest and their probabilistic dependencies
• Finding hidden relationships between variables
• Dealing with systems where uncertainty is inherent (where the correlation between variables cannot be clearly observed)
[Figure: A snapshot of the designed Bayesian Network]
44. What is Prospect Theory?
Prospect Theory is a behavioral economic theory
• By Daniel Kahneman and Amos Tversky in 1979 (Kahneman received the Nobel prize in 2002)
• A descriptive model for decision making under uncertainty
• Based on the potential value of losses and gains rather than the final outcome
Alternative decision-making model to Utility Theory
• More realistic in calculating user satisfaction
• Psychologically more accurate
First application to the cloud service selection problem
45. QoS-aware Multi-Cloud Service Selection
Step I: SLA construction
• Proposing the concepts of sub-SLA and meta-SLA
• Using Model Driven Architectures (MDA) models
Step II: Service Selection
• Selecting the best combination of services
o Ranking comparable service offerings using Prospect Theory
[Figure: Steps of Multi-Cloud Service Selection]
46. Multi-Cloud SLA Management Framework
47. Evaluation Results
Evaluation Setup (simulation)
• Using data of 12 commercial cloud providers
• Baseline: a utility-based algorithm
• 3 different software editions (standard, professional, and enterprise) with different sub- and meta-SLAs
• Focusing on 3 aspects of the selected services (cost, meta-SLA, and sub-SLA)
• Evaluation metrics: availability, throughput, latency, reputation, cost
Summary of Results
• Behaving similarly in the case of cost
• Outperforming in:
o Satisfying sub-SLAs, too
o No SLA violation
o Providing QoS & cost close to the requested values
[Figure: Aggregate evaluation results for the meta-SLA of the enterprise edition, including SLA violation]
48. Self-adaptation of Cloud Applications
Main aspects of self-adaptation of cloud software applications
Designing controllers for software systems
1) Uncertainty (e.g., due to measurement imprecision and noise)
2) Methodological procedures to synthesize controllers
Deploying the controlled software systems in cloud environments
3) Heterogeneous interfaces of cloud services (e.g., different control levels)
4) Unpredictable workloads
5) Detecting the applications’ resource bottlenecks
6) Controlling multi-tier applications
7) Different desired QoS sensitivity levels
8) Using resources from multiple clouds
9) Scalability (e.g., the need for distributed controllers and coordination)
49. Vertical vs. Horizontal Elasticity
Horizontal Elasticity
Coarse-grained (fixed-size VMs)
Slow (minutes)
Needs application support
• State synchronization
• Load-balancing
Vertical Elasticity
Fine-grained (e.g., a portion of a CPU core)
Fast (sub-second)
Needs little application support
• Multi-threading
Only needs Hypervisor support
50. Hypervisor’s Mechanisms for Vertical Memory Elasticity
1. Hot memory add or remove
Adding or removing resources without having to reboot the system
Not widely used, since guest operating systems cannot support it
without restarting the VM
2. Memory ballooning
Instead of adding or removing memory, a ballooning driver running in
the VM's kernel blocks the use of a portion of the memory that was
initially allocated to the VM
Reacts almost instantaneously; the guest OS reflects the memory
change a few moments after the operation is executed via the hypervisor
Supported by all recent Linux kernels; no additional features are required
51. Control Theory meets Software Engineering
A new trend to apply Control Theory to realize self-adaptive software systems
• A systematic way to design feedback control loops to handle unpredictable runtime changes
• Enhancing the software engineering process with mathematically grounded control formulas
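A feedback control loop of this kind can be sketched in a few lines. This is only an illustration under stated assumptions, not the controller designed in the thesis: it uses a plain integral control law, the gain and memory limits are hypothetical, and toy_plant is a made-up application model in which response time falls as memory grows.

```python
def integral_step(mem_mb, measured_rt_ms, desired_rt_ms,
                  gain=0.5, min_mb=512.0, max_mb=4096.0):
    """One control interval: add memory when RT is above target, release it when below."""
    error = measured_rt_ms - desired_rt_ms
    new_mem = mem_mb + gain * error
    # Clamp to the range the VM is allowed to use (hypothetical limits).
    return max(min_mb, min(max_mb, new_mem))


def toy_plant(mem_mb, load=600_000.0):
    """Hypothetical application model: response time (ms) is inversely proportional to memory."""
    return load / mem_mb


def simulate(steps=200, desired_rt_ms=600.0):
    """Run the closed loop: measure, compare with the target, actuate."""
    mem = 512.0
    for _ in range(steps):
        rt = toy_plant(mem)                          # sensor
        mem = integral_step(mem, rt, desired_rt_ms)  # controller + actuator
    return mem, toy_plant(mem)
```

In this toy setting the loop settles where the measured response time equals the target; with a real application the gain would have to be tuned or adapted at run time.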
52. Motivation for Using the Allocated Memory
Relationship between the allocated memory and the cloud application
response time
• Choosing the allocated memory as a control knob
[Figure: The effect of memory elasticity on RUBiS benchmark application response time. x-axis: number of concurrent users (100 to 1000); y-axis: response time (ms, 0 to 4000); one curve each for 1G, 2G, and 4G of memory]
53. PMC Evaluation: Time-series Analysis
RUBBoS application under Wikipedia workload traces
Desired RT ≤ 600 ms
54. Limitation of Memory Elasticity
1. Depending on the memory allocation strategy used in the hypervisor,
reducing the memory size may not benefit the host operating
system.
• We use the ballooning mechanisms of the KVM and Xen hypervisors, which let
other co-located VMs use the released memory
2. Even when the memory size can be changed at the operating system
level, some applications, such as Java virtual machine (JVM)
applications, do not support dynamic memory allocation and
eventually need to be restarted to take advantage of the newly
allocated memory.
• We focus on the BL tier, and in particular on the Apache Web server, which can
use the newly allocated memory dynamically, without a restart
55. Evaluation: Time-series Analysis (CMC results)
[Figure: RUBBoS under Wikipedia workload | Comparison results of 3 controllers | desired RT < 20 ms | x-axis: time (sec)]
56. Evaluation: Time-series Analysis (PMC results)
[Figure: RUBBoS under Wikipedia workload | Comparison results of 3 controllers | desired RT < 20 ms | x-axis: time (sec)]
57. Evaluation: Time-series Analysis (HMC results)
[Figure: RUBBoS under Wikipedia workload | Comparison results of 3 controllers | desired RT < 20 ms | x-axis: time (sec)]
58. Motivation of Multiple Resources Elasticity
Experiment with RUBiS (eBay-like benchmark application)
• Under different workload patterns
Elasticity reasoning for multiple resources under uncertainty using fuzzy logic
[Figure, two panels: RUBiS CPU utilization and RUBiS memory utilization. x-axis: time [seconds], 0 to 1250, every 250 sec is an interval; y-axes: CPU utilization [%] and memory utilization [%], 0 to 60. The five intervals are labeled with (cc, tt) pairs: (100, 1), (1500, 5), (100, 0.1), (100, 1), (1000, 1)]
59. Fuzzy Logic vs. Probability Theory
The two are very closely related and use almost the same tools!
The key difference is meaning.
• In probability, we talk about events, not facts; those events will either occur or not occur.
• There is nothing fuzzy about it: the ultimate truth is not fuzzy!
Fuzzy logic is all about degrees of truth.
Probability theory says nothing about how to reason about things that are not entirely true
or false!
Summary
• Probability theory does not capture partial truth, which is the goal of
fuzzy logic.
• Fuzzy logic does not capture partial knowledge, which is the goal of
probability theory.
60. The Used CPU & Memory Controllers
Both are performance-based, adaptive, and reactive controllers
CPU Controller [Lakew et al., UCC 2014]
• Follows an inverse model
• Adopts RLS filtering to adaptively estimate β
o β is the system model parameter
o The estimate is based on past measurements
Memory Controller
• The performance-based memory controller presented earlier [contribution #3]
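How RLS (recursive least squares) filtering refines a single model parameter from past measurements can be sketched as follows. This is a generic scalar RLS with a forgetting factor, not the controller of Lakew et al.: the linear model y ≈ β·x, the meaning of x and y, and the initial values are all assumptions made for illustration.

```python
class ScalarRLS:
    """Recursive least-squares estimate of beta in the assumed model y ≈ beta * x."""

    def __init__(self, beta0=0.0, p0=1000.0, forgetting=0.98):
        self.beta = beta0       # current estimate of the model parameter
        self.p = p0             # estimate covariance (large = low confidence)
        self.lam = forgetting   # < 1 discounts old measurements

    def update(self, x, y):
        """Fold one new measurement pair (x, y) into the estimate."""
        gain = self.p * x / (self.lam + x * self.p * x)
        self.beta += gain * (y - self.beta * x)            # correct by the prediction error
        self.p = (self.p - gain * x * self.p) / self.lam   # shrink the covariance, then discount
        return self.beta
```

A forgetting factor below 1 lets the estimate follow a parameter that drifts as the workload changes, at the price of more sensitivity to measurement noise.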
The used CPU controller [Lakew et al., UCC 2014]
61. Fuzzy Controller Design
A fuzzy logic control approach is defined by
• Membership functions (MF)
o Degree of truth with a value between 0 and 1
• Fuzzy rules
o A collection of “IF THEN ELSE” rules
Fuzzy controller design process
1. Membership function construction
2. Fuzzy rule elicitation
3. Fuzzy reasoning
MIMO (multi-input multi-output) fuzzy logic system (FLS)
• Three Input variables (Response time, CPU utilization, Memory utilization)
• Two output variables
o CPU and memory coefficients (values between -1 and +1)
– -1 or +1: fully allocate the resource
– -1 < value < +1: partially allocate the resource
Membership function of response time
A sample fuzzy rule
62. Step 1: MFs Construction
Input variables
1. Response time
2. CPU utilization
3. Memory utilization
Linguistic terms represent the values of the input variables
1. Slow (RT), Low (utilizations)
2. Medium (RT), Medium (utilizations)
3. Fast (RT), High (utilizations)
We need to define an MF for each linguistic term
• 9 MFs in total, some triangular, some trapezoidal
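Triangular and trapezoidal membership functions are simple piecewise-linear maps from a measured value to a degree of truth in [0, 1]. The sketch below shows both shapes and applies them to CPU utilization; the breakpoints (30, 50, 70) are hypothetical, since the slides only show the MFs as figures.

```python
def triangular(x, a, b, c):
    """Triangle with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoid with feet at a and d and plateau [b, c]; a == b or c == d gives a shoulder."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_utilization(u):
    """Degrees of truth of Low / Medium / High for a utilization in percent (assumed breakpoints)."""
    return {
        "low": trapezoidal(u, 0.0, 0.0, 30.0, 50.0),
        "medium": triangular(u, 30.0, 50.0, 70.0),
        "high": trapezoidal(u, 50.0, 70.0, 100.0, 100.0),
    }
```

A utilization of 40% is then half Low and half Medium, which is exactly the kind of partial truth the fuzzy rules operate on.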
MFs of Response time
MFs of CPU utilization
63. Step 2: Fuzzy Rule Elicitation
3 input variables with 3 linguistic terms each => 3^3 = 27 combinations => 27 fuzzy rules
… extracted using expert knowledge and then empirically updated at run time
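The rule count follows directly from enumerating every combination of linguistic terms. The sketch below builds the 27 antecedents; the consequents are left empty because they come from expert knowledge, and the single filled-in rule is a hypothetical example, not one taken from the thesis.

```python
from itertools import product

RT_TERMS = ("slow", "medium", "fast")
UTIL_TERMS = ("low", "medium", "high")

# One antecedent per (response time, CPU utilization, memory utilization) combination.
antecedents = list(product(RT_TERMS, UTIL_TERMS, UTIL_TERMS))

# Rule base: antecedent -> (CPU coefficient, memory coefficient), to be filled in by an expert.
rules = {antecedent: None for antecedent in antecedents}

# Hypothetical example: IF RT is slow AND CPU is high AND memory is low
# THEN apply the full CPU change and leave memory untouched.
rules[("slow", "high", "low")] = (1.0, 0.0)
```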
64. Step 3: Fuzzy Elasticity Reasoning
The process of elasticity reasoning (using a fuzzy knowledge base)
1. The measured values of the input variables are fuzzified using the MFs
2. The fuzzy controller reasons and produces the output variables
o using the fuzzified input variables and the fuzzy rules
3. The output variables are fed into the CPU and memory controllers
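The three reasoning steps can be sketched end to end. The slides do not state the inference or defuzzification method, so this example assumes min for AND and a weighted average of singleton consequents (zero-order Sugeno style); the breakpoints and the five rules are hypothetical and cover only part of the 27 combinations.

```python
def falling(x, full, zero):
    """1 up to `full`, linearly down to 0 at `zero`."""
    if x <= full:
        return 1.0
    return 0.0 if x >= zero else (zero - x) / (zero - full)


def rising(x, zero, full):
    """0 up to `zero`, linearly up to 1 at `full`."""
    return 1.0 - falling(x, zero, full)


def peak(x, a, b, c):
    """Triangle with feet a, c and peak b."""
    return min(rising(x, a, b), falling(x, b, c))


def fuzzify(measured_rt, desired_rt, cpu_util, mem_util):
    r = measured_rt / desired_rt  # above 1 means slower than desired
    rt = {"fast": falling(r, 0.5, 1.0), "medium": peak(r, 0.5, 1.0, 1.5),
          "slow": rising(r, 1.0, 1.5)}

    def util(u):
        return {"low": falling(u, 30, 50), "medium": peak(u, 30, 50, 70),
                "high": rising(u, 50, 70)}

    return rt, util(cpu_util), util(mem_util)


# (RT term, CPU term, memory term) -> (CPU coefficient, memory coefficient); all hypothetical.
RULES = {
    ("slow", "high", "low"): (1.0, 0.0),
    ("slow", "low", "high"): (0.0, 1.0),
    ("slow", "high", "high"): (1.0, 1.0),
    ("fast", "low", "low"): (-1.0, -1.0),
    ("medium", "medium", "medium"): (0.0, 0.0),
}


def reason(measured_rt, desired_rt, cpu_util, mem_util):
    """Return (C_cpu, C_mem) in [-1, +1] to be fed to the CPU and memory controllers."""
    rt, cpu, mem = fuzzify(measured_rt, desired_rt, cpu_util, mem_util)
    total = c_cpu = c_mem = 0.0
    for (t_rt, t_cpu, t_mem), (out_cpu, out_mem) in RULES.items():
        w = min(rt[t_rt], cpu[t_cpu], mem[t_mem])  # firing strength of the rule
        total += w
        c_cpu += w * out_cpu
        c_mem += w * out_mem
    return (c_cpu / total, c_mem / total) if total > 0 else (0.0, 0.0)
```

For instance, reason(900, 600, 90, 20) sees a slow response with a busy CPU and idle memory, so only the first rule fires and the CPU controller alone is told to act.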
[Figure: Fuzzy coordination approach. App-/VM-level sensors on the application deployed on the VM report the measured RT, Ucpu[%], and Umem[%]. The fuzzy controller, backed by the fuzzy knowledge base, takes these together with the desired RT and outputs the coefficients CCPU and Cmem. The CPU controller and the memory controller each take the desired RT, the measured RT, and their coefficient, and set #CPU and #mem]
65. Evaluation: Results
Aggregate analysis of 3 benchmark applications using the fuzzy controller
Summary of Results
• Improvements of up to 60% in memory usage (RUBiS), 56% in CPU usage (Olio), and 79% in RT stability (RUBiS)
• Without coordination of the elasticity controllers, system behavior is unpredictable
• With careful coordination of the elasticity controllers, the application performance is met with
fewer resources
Editor's notes
The results show that the proposed approach decreases the energy cost by up to 69% in comparison with the first baseline approach, and by up to 45% compared to the second baseline approach.
The proposed approach highlights the necessity of supporting a migration strategy when managing distributed data centers in order to decrease the operating costs.
The reduction in operating costs is achieved by taking into account the time- and location-dependent input parameters of the management decisions, such as applying the migration actions at a suitable time of day or to a suitable data center location.
The proposed VM placement approach incurs a lower energy cost while keeping the penalty cost under control, so that the total operating cost is lower for all three policies used.
This is due to the use of the workload prediction policies, the extracted cloud management knowledge, the consideration of input parameters such as power outage statistics of cloud data centers modeled as Bayesian networks, the effectiveness of the multi-criteria decision analysis method applied to Bayesian network reasoning, and the support for different SLA models.
The proposed cost-aware VM placement approach achieves better results under all workload prediction policies, in terms of both operating costs, than the two baseline approaches.
Applying the theory of controlling industrial plants (i.e., control theory) to the software engineering domain to design self-adaptive software systems can enhance the software engineering process with a variety of mathematically grounded adaptation formulas.
Dagstuhl seminars https://www.dagstuhl.de/programm/dagstuhl-seminare
Cloud control workshop series: http://cloudresearch.org/workshops
as an emerging design approach
control & moving average window = 10 sec
Now at the next slide we explore the aggregate analysis results
1) However, with the hypervisors used and their ballooning mechanism, the memory released by a virtual machine can be used by other co-located virtual machines.
2) However, in the scope of this thesis, we focus on the business logic tier that hosts the Web server, which can use the newly allocated memory dynamically, without a restart.