Microservices and containers are now influencing application design and deployment patterns. By some industry estimates, sixty percent of all new applications will use cloud-enabled continuous delivery, microservice architectures, and containers. Service discovery, registration, and routing are fundamental tenets of microservices. Kubernetes provides a platform for running microservices: it can automate their deployment and offers features such as kube-dns, ConfigMaps, and Ingress for managing them. This configuration works fine for deployments up to a certain size. However, complex deployments consisting of a large fleet of microservices require additional features to augment Kubernetes.
4. Microservices
An engineering approach focused on decomposing an application into single-function modules with well-defined interfaces, which are independently deployed and operated by a small team that owns the entire lifecycle of the service. Microservices accelerate delivery by minimizing communication and coordination between people while reducing the scope and risk of change.
9. Typically microservices are encapsulated inside containers: a one-to-one relationship between a microservice and a container. Everyone's container journey starts with one container…
IBM Bluemix Container Service
10. At first the growth is easy to handle….
11. But soon it is overwhelming… we need container and microservices management.
17. What is Kubernetes?
• Container orchestrator
• Runs and manages containers
• Supports multiple cloud and bare-metal environments
• Inspired and informed by Google's experiences and internal systems
• 100% open source, written in Go
• Manage applications, not machines
• Rich ecosystem of plug-ins for scheduling, storage, networking
18.
• Intelligent scheduling
• Self-healing
• Horizontal scaling
• Service discovery & load balancing
• Automated rollouts and rollbacks
• Secret and configuration management
19. Kubernetes Architecture
[Architecture diagram: API, UI, and CLI clients talk to the Kubernetes master (etcd, API server, controller manager, scheduler server), which manages worker nodes 1…n and a registry.]
20. Kubernetes Architecture
[Same architecture diagram as slide 19.]
Nodes: hosts that run Kubernetes applications
Master nodes:
• Controls and manages the cluster
• Kubectl (command line)
• REST API (communication with workers)
• Scheduling and replication logic
Worker nodes:
• Host the K8s services
• Kubelet (K8s agent that accepts commands from the master)
• Kubeproxy (network proxy service responsible for routing activities for inbound or ingress traffic)
• Docker host
21. Kubernetes Architecture
[Same architecture diagram as slide 19.]
Pods:
• Smallest deployment unit in K8s
• Collection of containers that run on a worker node
• Each has its own IP
• A pod shares a PID namespace, network, and hostname
Replication controller:
• Ensures availability and scalability
• Maintains the number of pods requested by the user
• Uses a template that describes specifically what each pod should contain
Labels:
• Metadata assigned to K8s resources
• Key-value pairs for identification
• Critical to K8s, which relies on querying the cluster for resources that have certain labels
Service:
• Collection of pods exposed as an endpoint
• Information stored in the K8s cluster state, with networking info propagated to all worker nodes
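To make the label/selector relationship concrete, here is a minimal Service manifest; the names are illustrative, not from the deck. The Service exposes whichever pods carry the matching label:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend          # illustrative name
spec:
  selector:
    app: frontend         # label query: any pod labeled app=frontend backs this Service
  ports:
  - port: 80              # port exposed at the Service endpoint
    targetPort: 8080      # container port inside the pods
```

Because the Service is defined purely by a label query, pods can come and go (scaling, rescheduling) without the Service definition changing.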
22. IBM Bluemix Container Service
[Same architecture diagram as slide 19.]
• Single-tenant K8s master
  • Running in an IBM-managed account
  • Keys managed to maintain account separation
• Single-tenant worker nodes
  • Dedicated nodes are single tenant top to bottom (worker nodes, hypervisor, hardware)
  • No core or RAM oversubscription, to ensure allocated resources and no noisy neighbors (https://www.ibm.com/cloud-computing/bluemix/virtual-servers)
• Multi-tenant services with per-tenant isolation
24. Developer Journeys: Scalable WordPress on Kubernetes
In addition to running MySQL inside a container, we also show advanced capabilities like Bluemix service binding by leveraging the Compose for MySQL service.
developerWorks Code: https://developer.ibm.com/code/journey/scalable-wordpress-on-kubernetes
GitHub: https://github.com/IBM/scalable-wordpress-deployment-on-kubernetes
25. Developer Journeys: Deploy a Distributed GitLab on Kubernetes
This project shows how a common multi-component application can be deployed. GitLab represents a typical multi-tier app, and each component has its own container(s).
developerWorks Code: https://developer.ibm.com/code/journey/run-gitlab-kubernetes/
GitHub: https://github.com/IBM/kubernetes-container-service-gitlab-sample
26. Developer Journeys: Scalable Apache Cassandra on Kubernetes
Leverages Kubernetes Pods, Services, Replication Controllers, and StatefulSets.
developerWorks Code: https://developer.ibm.com/code/journey/deploy-a-scalable-apache-cassandra-database-on-kubernetes
GitHub: https://github.com/IBM/scalable-cassandra-deployment-on-kubernetes
28. Developer Journeys: Spring Boot Microservices on Kubernetes
This journey shows you how to create and deploy Spring Boot microservices within a polyglot application and then deploy the app to a Kubernetes cluster.
developerWorks Code: https://developer.ibm.com/code/journey/deploy-spring-boot-microservices-on-kubernetes/
GitHub: https://github.com/IBM/spring-boot-microservices-on-kubernetes
29. Developer Journeys: Deploy Java microservices on Kubernetes within a polyglot ecosystem
With current application architectures, microservices need to coexist in polyglot environments. In this developer journey, you'll learn how to deploy a Java microservices application that runs alongside other polyglot microservices, leveraging service discovery, registration, and routing.
developerWorks Code: https://developer.ibm.com/code/journeys/deploy-java-microservices-on-kubernetes-with-polyglot-support/
GitHub: https://github.com/IBM/GameOn-Java-Microservices-on-Kubernetes
30. Developer Journeys: Java MicroProfile Microservices on Kubernetes
A Java-based microservices application using MicroProfile (a baseline for Java microservices architecture) and Microservice Builder on Kubernetes.
developerWorks Code: https://developer.ibm.com/code/journey/deploy-microprofile-java-microservices-on-kubernetes
GitHub: https://github.com/IBM/java-microprofile-on-kubernetes
31. Developer Journeys: Java MicroProfile Microservices on Kubernetes
DEMO
32. Kubernetes is great for microservices… Why do we need a service mesh, and what is it?
34. What else do we need for microservices?
• Visibility
• Resiliency & Efficiency
• Traffic Control
• Security
• Policy Enforcement
Enter Service Mesh
35. What is a 'Service Mesh'?
A network for services, not bytes:
• Visibility
• Resiliency & Efficiency
• Traffic Control
• Security
• Policy Enforcement
36. [Diagram: Service Mesh Control Plane (service discovery, service registry, routing rules, telemetry, access control, resiliency features) and Service Mesh Data Plane (Microservice-1, Microservice-2, and Microservice-3, each with a sidecar).]
• Lightweight sidecars to manage traffic between services
• Sidecars can do much more than just load balancing
How to build a 'Service Mesh'?
38. Istio Concepts
• Pilot – Configures Istio deployments and propagates configuration to the other components of the system. Routing and resiliency rules go here.
• Mixer – Responsible for policy decisions and for aggregating telemetry data from the other components in the system, using a flexible plugin architecture.
• Proxy – Based on Envoy; mediates inbound and outbound traffic for all Istio-managed services. It enforces access control and usage policies, and provides rich routing, load balancing, and protocol conversion.
[Diagram: Istio control plane (Mixer, Istio Pilot, Istio Auth, routing rules, Grafana/Zipkin) and Istio data plane (microservices, each fronted by an Envoy proxy).]
40. Istio Architecture
[Diagram: a Kubernetes cluster containing an Istio ingress controller, Service A (pod running appA with a Proxy sidecar), and Service B (appB with a Proxy sidecar). The Istio control plane (Pilot, Mixer, Auth) exposes a control plane REST API and talks to the Kube API server; Prometheus collects metrics and reports from the proxies.]
1. All traffic entering and leaving a pod is transparently routed via the Proxy, without requiring any application changes.
2. The Proxy implements intelligent L7 routing and circuit breakers, enforces policies, and reports metrics to the control plane.
User/application traffic: HTTP/1.1, HTTP/2, gRPC, or TCP, with or without TLS.
Istio control plane traffic: request routing rules, resilience configuration (circuit breakers, timeouts, retries), policies (ACLs, rate limits, auth), and metrics/reports from proxies.
Proxy: based on Envoy, a high-performance L7 proxy from Lyft, currently being used at large scale in production. https://github.com/lyft/envoy
42. What is a 'Service Mesh'?
A network for services, not bytes:
• Resiliency & Efficiency
• Traffic Control
• Visibility
• Security
• Policy Enforcement
43. Resiliency
• Istio adds fault tolerance to your application without any changes to code
• Resilience features:
  ❖ Timeouts
  ❖ Retries with timeout budget
  ❖ Circuit breakers
  ❖ Health checks
  ❖ AZ-aware load balancing with automatic failover
  ❖ Control of connection pool size and request load
  ❖ Systematic fault injection
• Circuit-breaker example:
  destination: serviceB.example.cluster.local
  policy:
  - tags:
      version: v1
    circuitBreaker:
      simpleCb:
        maxConnections: 100
        httpMaxRequests: 1000
        httpMaxRequestsPerConnection: 10
        httpConsecutiveErrors: 7
        sleepWindow: 15m
        httpDetectionInterval: 5m
44. Resiliency Testing
• Systematic fault injection to identify weaknesses in failure recovery policies
  – HTTP/gRPC error codes
  – Delay injection
[Diagram: Service A calls Service B, which calls Service C, each fronted by an Envoy sidecar. A→B: timeout 100ms, 3 retries (300ms worst case); B→C: timeout 200ms, 2 retries (400ms worst case).]
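As a sketch of what such an injected fault might look like, here is a delay-injection rule in the same v1alpha1-era style as the deck's circuit-breaker example; the service name is illustrative, and exact field names varied across early Istio releases:

```yaml
# Inject a 7s fixed delay into 10% of requests headed for serviceB,
# to verify that callers' timeout/retry budgets actually hold up.
destination: serviceB.example.cluster.local
route:
- tags:
    version: v1
httpFault:
  delay:
    percent: 10
    fixedDelay: 7s
```

Because the fault is injected by the sidecar, the failure-recovery behavior of callers can be tested without touching application code.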
45. Developer Journey: Leverage Istio to create resilient and fault-tolerant microservices
Twelve-factor apps make a strong case for designing and implementing your microservices for failure. With the proliferation of microservices, failure is inevitable, and applications should be fault-tolerant. Istio, a service mesh, can help make your microservices resilient without changing application code.
developerWorks Code: https://developer.ibm.com/code/journey/make-java-microservices-resilient-with-istio/
GitHub: https://github.com/IBM/resilient-java-microservices-with-istio
46. Resilient Microservices with Istio
[Demo diagram: a web-application plus the speaker, session, schedule, and vote microservices, each fronted by an Envoy sidecar, with the Istio control plane (Mixer, Pilot, Auth) pushing resiliency and fault-tolerance rules to the data plane. Annotated failure scenarios: a service that is not responding triggers a circuit break and ejection from the load-balancing pool; a delayed response triggers a timeout; exceeding max connections triggers a circuit break.]
53. Visibility
• Monitoring & tracing should not be an afterthought in the infrastructure
• Goals:
  • Metrics without instrumenting apps
  • Consistent metrics across the fleet
  • Trace the flow of requests across services
  • Portable across metric backend providers
[Screenshots: Istio Zipkin tracing dashboard; Istio Grafana dashboard with a Prometheus backend]
54. Metric Flow
• Mixer collects metrics emitted by Envoys
• Adapters in the Mixer normalize the metrics and forward them to monitoring backends
• The metrics backend can be swapped at runtime
[Diagram: Service A (pod svcA with Envoy) calls Service B (svcB with Envoy). Envoy reports attributes such as API: /svcB, Latency: 10ms, Status Code: 503, Src: 10.0.0.1, Dst: 10.0.0.2 to the Mixer; Prometheus, InfluxDB, and custom adapters in the Mixer forward metrics to a Prometheus, InfluxDB, or custom backend.]
55. Visibility: Tracing
• Applications do not have to deal with generating spans or correlating causality
• Envoys generate spans
• Applications need to *forward* context headers on outbound calls:
  X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId, X-B3-Sampled, X-B3-Flags
• Envoys send traces to the Mixer
• Adapters at the Mixer send traces to the respective backends
[Diagram: Service A calls Service B, which calls Service C, each with an Envoy sidecar. Trace headers propagate along the call chain; Envoys send spans to the Mixer, whose Zipkin, Stackdriver, and custom adapters forward traces to a Zipkin, Stackdriver, or custom backend.]
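The one thing the application must still do, forwarding the B3 context headers, can be sketched as follows. This is a minimal illustration with a plain map standing in for a real HTTP client's header API, not code from the journey:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TraceHeaders {
    // The Zipkin B3 headers listed on the slide. Keys are lowercased here
    // for simplicity; real HTTP header lookup is case-insensitive.
    static final List<String> B3_HEADERS = List.of(
            "x-b3-traceid", "x-b3-spanid", "x-b3-parentspanid",
            "x-b3-sampled", "x-b3-flags");

    // Copy whichever B3 headers arrived on the inbound request onto the
    // outbound request, so Envoy can stitch the spans into one trace.
    public static Map<String, String> forward(Map<String, String> inbound) {
        Map<String, String> outbound = new HashMap<>();
        for (String name : B3_HEADERS) {
            if (inbound.containsKey(name)) {
                outbound.put(name, inbound.get(name));
            }
        }
        return outbound;
    }
}
```

If the headers are dropped anywhere in the chain, the Envoys still emit spans, but they can no longer be correlated into a single trace.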
56. Developer Journey: Manage microservices traffic using Istio on Kubernetes
Microservices and containers have changed application design and deployment patterns. They have also introduced new challenges, such as service discovery, routing, failure handling, and visibility into microservices. Kubernetes can handle multiple container-based workloads, including microservices, but for more sophisticated features like traffic management, failure handling, and resiliency, a service mesh like Istio is required.
developerWorks Code: https://developer.ibm.com/code/journey/manage-microservices-traffic-using-istio/
GitHub: https://github.com/IBM/microservices-traffic-management-using-istio
58. Developer Journey: Manage microservices traffic using Istio on Kubernetes
DEMO
Communication between microservices is typically REST-based. The payload may be JSON or, where serialization efficiency is important, other options such as Apache Thrift or Google Protocol Buffers may be appropriate. Typically applications begin with synchronous communication but, depending on the interaction style, it may be appropriate to introduce asynchronous messaging. Due to the potentially large number of service interactions (in a typical customer-facing application, a single front-end invocation could spawn 20-30 calls to services and data sources), parallel invocation may be required to keep latency to a minimum. Options here might include a reactive streams implementation such as Netflix's RxJava or, for those using Java 8, CompletableFuture.
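A minimal sketch of the CompletableFuture approach: fan the calls out in parallel and join the results, so total latency approaches the slowest single call rather than the sum. The service-call stub here is hypothetical; a real implementation would make an HTTP request:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class ParallelInvocation {
    // Hypothetical stand-in for a remote REST call; a real version would
    // use an HTTP client and deserialize the response.
    static String callService(String name) {
        return name + "-result";
    }

    // Kick off one async call per service, then wait for all of them.
    public static List<String> invokeAll(List<String> services) {
        List<CompletableFuture<String>> futures = services.stream()
                .map(s -> CompletableFuture.supplyAsync(() -> callService(s)))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

The same fan-out shape also composes with `thenCombine`/`allOf` when downstream results must be merged rather than just collected.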
Now, when we scale the application, we can scale up and down instances of each service independently. When a service fails, it can fail independently rather than causing the whole application to fail.
Having lots of microservices brings its own challenges though. For example, you now need to be more rigorous about deployment. We also have to consider how the services will discover one another.
Now, when we want to make an update, whether that’s a simple code change or a complete change of the technology stack, this can be achieved at the service level. We even have the option to perform canary testing, slowly migrating traffic to the new implementation until we’re ready to remove the old.
Image is https://pixabay.com/en/plaid-coaster-bast-colorful-color-1173135/ (CC0 license)
Today application developers think of the network as big pipes of bits that don't do very much else. They don't really help with service-level concerns. We need a network that operates at the same level as the services we build and deploy. To OSI geeks this is a layer-7 network. So, what should this kind of network do?
• Route traffic away from failures to increase the aggregate reliability of the cluster
• Avoid needless overheads like high-latency routes or servers with cold caches
• Provide insight: highlight unexpected dependencies and root causes of failure
• Let me impose policies at the granularity of service behaviors, not just at the connection level
• Ensure that the traffic flowing between services is secure against trivial attack
… so let's walk through a little bit of this in action.
Handoff -
If we were to reimagine the network that connects our microservices, what would we want out of it?
Think of the kernel’s TCP/IP stack today.
Do we care where in the planet an IP address is or how to route to it? No
How about discovering MAC address associated with the IP or the next hop router? Nope.
Do we care about packet loss or congestion control? Heck no.
Essentially, the kernel provides a reliable communication fabric at Layer 4. It frees you from having to deal with discovery, failure recovery, flow control, and a host of other issues that you may not even be aware of.
Isn’t this a nice property to have at the services layer, that is, layer-7? We seem to be having some similar issues: discovery, resiliency, routing, etc. and other issues specific such as load balancing, monitoring, policy enforcement, authentication and authorization, etc.
Handoff -
Let's look at the operational visibility aspect of the service mesh.
What is so different about Istio when compared to the monitoring solutions you are used to?
In our experience working with a diverse set of clients, 95% of them are happy with an out-of-process sidecar implementing all the resiliency features, such as circuit breakers, instead of having to build them into the application code.
Secondly, in most cases there is a fair amount of parameter tweaking: for example, making sure that timeouts across services are not too tight, or ensuring that a circuit doesn't open too early.
Istio provides dynamically configurable resilience features that you can tweak at runtime, *in production*.
Systematic resilience testing allows you to test the ability of your services to recover from failures. You can isolate failures to specific subset of requests (such as those coming from QA teams), while running these tests in production.
If you recall the excellent Linkerd talk that Oliver gave this morning, he talked about a scenario where a chain of service calls had incompatible timeouts and retries. Well, you can always recover from these issues by using timeouts/retries/circuit breakers on the very first service in the call chain. But think of all the wasted work in the backend services (resources being held up).
Worse yet, think of the impact on your users. For example, consider Netflix recommendations. When the system is running well, you get personalized recommendations. That is a “feature”. When the personalized recommendations system is down, Netflix falls back to “generic recommendations”. This is great as long as the outage is temporary. But what if this becomes the permanent user experience for weeks at a stretch, because some frontend system had a misconfigured timeout that caused it to prematurely terminate API calls to the recommendation engine.
Circuit breaker: max connections and pending requests.
Circuit breaker: load-balancing pool ejection.
Timeout and fault tolerance.
Handoff -
Traffic control is decoupled from infrastructure scaling; that is, the proportion of traffic routed to a version is independent of the number of instances supporting that version.
You can program these traffic-routing rules with Istio on the fly, using simple rules like the one shown on the slide. The rules get translated into low-level Envoy configs; Envoy periodically fetches configs from the manager and updates its internal routing table. There are no hot restarts and no impact on existing traffic.
We call this tag-based routing.
Content-based traffic steering: the content of a request can be used to determine the destination of the request.
Another way to route requests is the age-old method: look at the contents of a request and route it to a specific set of instances. This opens up all sorts of possibilities. For a user-facing service, you could route a portion of mobile traffic from a particular device, say an iPhone, to a specific version of the service. For internal services, callers would have the ability to explicitly request a particular version of the service. In fact, Verizon is using this particular technique in production today.
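A sketch of such a content-based rule, in the same v1alpha1-era style as the circuit-breaker example earlier in the deck; the service name and header match are illustrative, and exact field names varied across early Istio releases:

```yaml
# Steer iPhone traffic to version v2 of serviceA, leaving everything
# else on the default route.
destination: serviceA.example.cluster.local
match:
  httpHeaders:
    user-agent:
      regex: ".*iPhone.*"
route:
- tags:
    version: v2
```

Because the match happens in the sidecar, neither the caller nor serviceA needs any code change to participate in this kind of steering.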
Baseline visibility even for black boxes. A holistic view of the entire fleet. No need to rely on the development process to instrument code to emit metrics. Metrics are not special; it's just that every component emits metrics.
Consistent metrics enable automation, things like autoscaling for example.
Policy engines have to operate on the same data that telemetry systems have. Right now, we operate on two inconsistent data streams.
Log scraping is an afterthought.
In addition, the sidecars are ideally positioned to trace the flow of requests across services, when you need to drill down on performance issues or troubleshoot problems in the application.
We can write a variety of adapters for various metrics backends.
You can plug different backends into the Mixer. What's nice about this design is that you can switch from one metrics backend to another, as your requirements change, without ever having to rewire your entire infrastructure.