Netflix Architecture and Open Source
Andrew Spyker
Senior Software Engineer, Netflix
About Netflix
69M members
2000+ employees (1400 tech)
80+ countries
> 100M hours watched per day
> ⅓ NA internet download traffic
500+ Microservices
Many tens of thousands of VMs
3 regions across the world
About the Speaker
Cloud platform technologies
Distributed configuration, service discovery, RPC, application frameworks, non-Java sidecar
Container cloud
Resource management and scheduling, making Docker containers operational in Amazon EC2/ECS
Open Source
Organize @NetflixOSS meetups & internal group
Performance
Assist across Netflix, but focused mainly on cloud platform perf
With Netflix for ~1 year. Previously at IBM here in Raleigh/Durham (RTP)
@aspyker
ispyker.blogspot.com
Agenda
NetflixOSS
Netflix Cloud Architecture
Getting Started
Why does Netflix open source?
Allows engineers to gather feedback
Openly talk, through code, on our approach
Collaboration on key projects with the world
Happily use proven outside open source
And improve it for Netflix scale and availability
Netflix culture of freedom and responsibility
Want to open source?
Go for it, be responsible!
Recruiting and Retention
Candidates know exactly what they can work on
NetflixOSS engineers choose to stay at Netflix
NetflixOSS is widely used
The architecture has shaped public cloud usage
Immutability, Red/Black Deploys, Chaos,
Regional and worldwide high availability
Offerings
Pivotal Spring Cloud
Large usage
IBM Watson as a Service (on IBM Cloud)
Nike Digital is hiring NetflixOSS experts
Interesting usage
“To help locate new troves of data claiming to be the files stolen from AshleyMadison, the company’s forensics team has been using a tool that Netflix released last year called Scumblr”
NetflixOSS Website Relaunch
http://netflix.github.io
Key aspects of NetflixOSS website
Show how the pieces fit together
Projects now discussed with each other in context
OSS categories mirror internal teams
No artificial categories, focal points for each area
Focus on projects that are core to Netflix
Projects mentioned are core and strategic
Agenda
NetflixOSS
Netflix Cloud Architecture
Getting Started
Elastic, Web and Hyper Scale
Doing this
Not doing that
Elastic, Web and Hyper Scale
[Diagram labels: Load Balancers; Front end API; Recommendation Microservice; Another Microservice; Temporal caching; Durable Storage]
Strategy | Benefit
Make deployments automated | Without automation, impossible
Expose well designed API to users | Offloads presentation complexity to clients
Remove state for mid tier services | Allows easy elastic scale out
Push temporal state to client and caching tier | Leverage clients, avoids data tier overload
Use partitioned data storage | Data design and storage scales with HA
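The "use partitioned data storage" strategy can be sketched in a few lines. This is only an illustration, not how any Netflix data store is implemented: the function names, the node names, and the choice of MD5 as a stable hash are all assumptions made for the example. It shows a key being mapped to a partition and then to several distinct replica nodes, so that losing one node does not lose the data.

```python
import hashlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition using a stable hash.

    Python's built-in hash() is salted per process, so a digest is used
    instead to keep the mapping identical across instances.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replicas_for(key: str, nodes: List[str], replication_factor: int) -> List[str]:
    """Pick replication_factor distinct nodes, starting at the key's partition."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    start = partition_for(key, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


if __name__ == "__main__":
    # Hypothetical node names, one per storage node.
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    print(replicas_for("member-42", nodes, 3))
```

Simple modulo placement like this reshuffles most keys when the node count changes; real partitioned stores usually use a token ring or consistent hashing to limit that movement.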
HA and Automatic Recovery
Feeling This
Not Feeling That
Highly Available Service Runtime Recipe
[Diagram labels: Microservice implementation; Call microservice #2; Ribbon REST client with Eureka; Microservice #1 (REST services); App Service; Microservice #2; Execute call; Hystrix; Eureka Server(s); Karyon; Fallback implementation]
Implementation Detail | Benefits
Decompose into microservices | Key user path always available; failure does not propagate across service boundaries
Karyon w/ automatic Eureka registration | New instances are quickly found; failing individual instances disappear
Ribbon client with Eureka awareness | Load balances & retries across instances with “smarts”; handles temporary instance failure
Hystrix as dependency circuit breaker | Allows for fast failure; provides graceful cross service degradation/recovery
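The circuit breaker with fallback idea behind the Hystrix row can be sketched as follows. Hystrix itself is a Java library with a much richer model (thread pools, rolling statistics, and so on); this Python sketch does not use or mirror its API. The class name, the thresholds, and the injectable clock are assumptions chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Fail fast to a fallback after repeated failures of a dependency."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        # Once the timeout has elapsed, let one trial call through (half-open).
        return self.clock() - self.opened_at < self.reset_timeout

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            return fallback()  # fast failure: the dependency is not called
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The fallback is what keeps the key user path available: the caller gets a degraded answer (for example, a cached or default list) instead of an error that propagates across the service boundary.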
IaaS High Availability
[Diagram labels: Region (us-east-1); us-east-1c; us-east-1d; us-east-1e; Eureka; Web App; Service1; Service2; Cluster Auto Recovery and Scaling Services (Auto Scaling Groups); ELBs]
Rule | Why?
Always > 2 of everything | 1 is SPOF, 2 doesn’t web scale and slow DR recovery
Including IaaS and cloud services | You’re only as strong as your weakest dependency
Use auto scaler/recovery monitoring | Clusters guarantee availability and service latency
Use application level health checks | Instance on the network != healthy
Worldwide availability | Data replication, global front-end routing, cross region traffic
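The rule "instance on the network != healthy" can be illustrated with a small application-level health check. This is a sketch under assumptions: the function name, the check names, and the "UP"/"DOWN" strings are invented for the example and are not taken from Karyon or Eureka. The point is that an instance reports healthy only when the things it depends on actually work, not merely because it answers on a port.

```python
from typing import Callable, Dict, Tuple


def health_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, Dict[str, str]]:
    """Run every named check and report overall health plus per-check detail."""
    details: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            details[name] = "UP" if check() else "DOWN"
        except Exception as exc:
            # A check that raises counts as a failed check, not a crashed endpoint.
            details[name] = f"DOWN ({type(exc).__name__})"
    healthy = all(state == "UP" for state in details.values())
    return healthy, details


if __name__ == "__main__":
    def cache_ping() -> bool:
        # Hypothetical dependency check that fails.
        raise TimeoutError("cache did not answer")

    ok, details = health_status({"config_loaded": lambda: True, "cache": cache_ping})
    print(ok, details)
```

An auto scaler or service registry that consumes a result like this can take the instance out of rotation even though it is still reachable on the network.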
A truly global service
Replicate data across regions
Be able to redirect traffic from region to region
Be able to migrate regional traffic to other regions
Have automated control across regions
Flux Demo
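Migrating regional traffic to other regions can be pictured as recomputing routing weights. The sketch below is an assumption-laden illustration, not Netflix's traffic tooling: the function name, the region names other than us-east-1, and the proportional redistribution policy are all chosen for the example.

```python
from typing import Dict


def evacuate(weights: Dict[str, float], failed_region: str) -> Dict[str, float]:
    """Return new routing weights with failed_region removed.

    The failed region's share is spread over the remaining regions in
    proportion to the traffic they already serve.
    """
    if failed_region not in weights:
        raise KeyError(failed_region)
    remaining = {r: w for r, w in weights.items() if r != failed_region}
    total = sum(remaining.values())
    if total <= 0:
        raise ValueError("no healthy region left to absorb traffic")
    return {region: weight / total for region, weight in remaining.items()}


if __name__ == "__main__":
    # Hypothetical steady-state split across three regions.
    steady = {"us-east-1": 0.5, "us-west-2": 0.3, "eu-west-1": 0.2}
    print(evacuate(steady, "us-east-1"))
```

A proportional split assumes the surviving regions have, or can scale to, enough capacity for the extra load, which is why replicated data and automated cross-region control appear on the same slide.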
Testing is the only way to prove HA
Chaos Monkey
Kills instances in production - runs regularly
Chaos Gorilla
Kills availability zones (single datacenter)
Testing for split brain is also important
Chaos Kong
Kills an entire region and shifts traffic globally
Runs frequently but with prior scheduling
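The instance-level idea behind Chaos Monkey can be sketched as a selection step: pick one random instance per cluster to terminate. This is not the Chaos Monkey implementation or its configuration model; the function name, the opt-out set, and the minimum cluster size guard are assumptions for illustration, and the sketch only selects victims without terminating anything.

```python
import random
from typing import Dict, List, Set


def pick_victims(clusters: Dict[str, List[str]], opted_out: Set[str],
                 rng: random.Random, min_size: int = 2) -> Dict[str, str]:
    """Choose at most one instance per cluster to terminate.

    Clusters that opted out, or that are smaller than min_size, are skipped
    (the size guard is a hypothetical safety rule for this sketch).
    """
    victims: Dict[str, str] = {}
    for cluster, instances in sorted(clusters.items()):
        if cluster in opted_out or len(instances) < min_size:
            continue
        victims[cluster] = rng.choice(instances)
    return victims


if __name__ == "__main__":
    # Hypothetical cluster names and instance ids.
    clusters = {
        "api": ["i-1", "i-2", "i-3"],
        "billing": ["i-4", "i-5"],
        "batch": ["i-6"],
    }
    print(pick_victims(clusters, opted_out={"billing"}, rng=random.Random()))
```

Passing the random generator in makes a run reproducible in tests while staying unpredictable in regular use.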
Continuous Delivery
Reading This
Not This
Continuous Delivery
[Diagram labels: Continuous Build Server; Baked to images (AMIs); Cluster v1; Canary v2; Cluster v2]
Step | Technology
Developers test locally | Unit test frameworks
Continuous build | Continuous build server based on Gradle builds
Build “bakes” full instance image | Aminator and deployment pipeline bake images from build artifacts
Developers work across dev and test | Archaius allows for environment based context
Developers do canary tests, red/black deployments in prod | Asgard console provides app cluster common devops approach, security patterns, and visibility
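The canary and red/black steps can be sketched as a small decision: compare the canary's error rate with the running cluster's, then decide which cluster takes traffic. This is an illustrative assumption, not Asgard's or Spinnaker's logic; the names, the 10% tolerance, and the minimum request count are invented for the example, and real canary analysis looks at many more metrics than error rate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ClusterStats:
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_passes(baseline: ClusterStats, canary: ClusterStats,
                  max_relative_increase: float = 0.10,
                  min_requests: int = 100) -> bool:
    """Pass only if the canary saw enough traffic and is not clearly worse."""
    if canary.requests < min_requests:
        return False  # not enough traffic to judge
    allowed = baseline.error_rate * (1 + max_relative_increase)
    return canary.error_rate <= allowed


def red_black_step(active: str, candidate: str, canary_ok: bool) -> Tuple[str, str]:
    """Return (cluster serving traffic, cluster with traffic disabled).

    On success the old cluster stays around, disabled, for fast rollback.
    """
    return (candidate, active) if canary_ok else (active, candidate)


if __name__ == "__main__":
    baseline = ClusterStats(requests=10_000, errors=100)
    canary = ClusterStats(requests=1_000, errors=10)
    print(red_black_step("cluster-v1", "cluster-v2", canary_passes(baseline, canary)))
```

Because instances come from baked images, rolling back in this model is a traffic switch to the still-running old cluster rather than a redeploy.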
From Asgard to Spinnaker
Spinnaker is our CI/CD solution
CI/CD solution including baking and Jenkins integration
Workflow engine for continuous delivery
Pipeline based deployment including baking
Global visibility across all of our AWS regions
Provides an API first design
A microservices runtime HA architecture
More flexible cloud model so the community can contribute back improvements not related to AWS
Asgard continues to work side-by-side
Spinnaker is this new end to end CI/CD tool
Spinnaker Examples
Works at Netflix scale
Views of global pipelines
From simple Asgard-like deployment to advanced CI/CD pipelines
Operational Visibility
If you can’t see it, you can’t improve it
Operational Visibility
[Diagram labels: Microservice #1; Microservice #2; Servo/Spectator; Hystrix/Turbine; External Uptime Monitoring; Metric/Event Repositories; Logstash/Elasticsearch/Kibana; Incidents; Atlas; Vector]
Visibility Point | Technology
Basic IaaS instance monitoring | Not enough (not scalable, not app specific)
User-like external monitoring | SaaS offerings or OSS like Uptime
Targeted performance, sampling | Vector performance and app level metrics
Service to service interconnects | Hystrix streams ➔ Turbine aggregation ➔ Hystrix dashboard
Application centric metrics | Servo/Spectator gauges, counters, timers sent to metrics store like Atlas
Remote logging | Logstash/Kibana or similar log aggregation and analysis frameworks
Threshold monitoring and alerts | Services like Atlas and PagerDuty for incident management
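The three application-centric metric types named above (gauges, counters, timers) can be sketched with a tiny in-process registry. Servo and Spectator are Java libraries and this sketch does not reproduce their APIs; the class, the method names, and the metric names are assumptions for illustration. A real setup would periodically publish the snapshot to a metrics store.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List


class MetricsRegistry:
    """In-memory counters, gauges, and timers for one application instance."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self.counters: Dict[str, int] = defaultdict(int)
        self.gauges: Dict[str, float] = {}
        self.timers: Dict[str, List[float]] = defaultdict(list)
        self._clock = clock

    def increment(self, name: str, amount: int = 1) -> None:
        self.counters[name] += amount  # monotonically increasing count

    def set_gauge(self, name: str, value: float) -> None:
        self.gauges[name] = value  # last observed value wins

    @contextmanager
    def time(self, name: str) -> Iterator[None]:
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the timed block raised.
            self.timers[name].append(self._clock() - start)

    def snapshot(self) -> Dict[str, dict]:
        return {
            "counters": dict(self.counters),
            "gauges": dict(self.gauges),
            "timers": {n: {"count": len(v), "total_seconds": sum(v)}
                       for n, v in self.timers.items()},
        }


if __name__ == "__main__":
    registry = MetricsRegistry()
    registry.increment("requests")
    registry.set_gauge("queue_depth", 12)
    with registry.time("handle_request"):
        sum(range(1000))
    print(registry.snapshot())
```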
Security
Dynamic
Security
Done in new ways
NOT
Dynamic, Web Scale & Simpler Security
Security Monkey
Monitors security policies, tracks changes, alerts on situations
Scumblr
Searches internet for security “nuggets” (credentials, hacking discussions)
Sketchy
A safe way to collect text and screenshots from websites
FIDO
Automated event detection, analysis, enrichment and enforcement
Sleepy Puppy
Delayed cross site scripting propagation testing framework
Lemur
X.509 certificate orchestration framework
What did we not cover?
Over 50 GitHub projects
NetflixOSS is “Technical indigestion as a service”
Big Data, Data Persistence and UI Engineering
Big Data tools used well beyond Netflix
Ephemeral, semi and fully persistent data systems
Recent addition of UI OSS and Falcor
Agenda
NetflixOSS
Netflix Cloud Architecture
Getting Started
How do I get started?
All of the previous slides show NetflixOSS components
Code: http://netflix.github.io
Announcements: http://techblog.netflix.com/
Want to get running a bit faster?
ZeroToCloud
Workshop for getting started with build/bake/deploy in Amazon EC2
ZeroToDocker
Docker images containing running Netflix technologies (not production ready, but easy to understand)
ZeroToDocker Demo
[Diagram labels: Mac OS X; VirtualBox; Ubuntu 14.04 single kernel; Container #1 (filesystem + process); Eureka container; Zuul container; Another container; ...]
Docker running instances
Single kernel
Contained processes
Zookeeper and Exhibitor
A microservices app and surrounding NetflixOSS services (Zuul to Karyon with Eureka)
Questions?
