ONE MAN OPS
      Reliability & Scale in AWS while letting you sleep through the night
                                                         Jos Boumans - @jiboumans
http://www.fwallpaper.net/picture_pics-Sleepy-cat.html
Tuesday 26 March 13
RIPE NCC
                      Engineering manager for RIPE Database
                                                              http://www.ripe.net/db
CANONICAL
                    Engineering manager for Ubuntu Server 10.04 & 10.10

http://lukeroberts.deviantart.com/art/Destroy-Ubuntu-93235775          http://www.ubuntu.com/business/server/overview
KRUX
                      VP of Operations & Infrastructure

                                                          http://www.krux.com/
GOOD GUYS OF DATA PRIVACY
SOME OF OUR CUSTOMERS
LOTS OF TRAFFIC
http://www.americapictures.net/buenos-aires-traffic-city-night-argentina.html
AVERAGE REQUESTS* / SEC
[Bar chart, axis from 0 to 10,000]
*Twitter: new tweets; Wikipedia: articles read; Krux: new data points
https://twitter.com/tps_watcher
http://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm
MONTHLY UNIQUE USERS
[Bar chart, axis from 0 to 600,000,000]
http://techcrunch.com/2012/12/18/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months/
http://technorati.com/technology/article/wikipedias-nonprofit-parent-raises-20-million/
WE CHOSE 'THE CLOUD'
http://previewnetworks.com/blog/
THERE ARE DOWNSIDES
http://modernsavage.hubpages.com/hub/10-springfield-shopper-headlines
FOCUS ON AWS
                                     http://aws.amazon.com/
APRIL 21, 2011
                                                                                                                    http://aws.amazon.com/message/65648/
http://businessnerds.wordpress.com/2011/05/28/so-far-so-good…-the-review/   http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html
... SOME OUTAGES ...
                 ... SKIPPED FOR BREVITY ...
JUNE 14, 2012
http://www.laczik.org/BMW/repair/E38_wiring_harness/E38_wiring_harness.html   http://blog.pagerduty.com/2012/06/outage-post-mortem-june-14/
JUNE 29, 2012
http://www.fanpop.com/spots/thunderstorm/images/25416163/title/thunderstorms-wallpaper   http://aws.amazon.com/message/67457/
AWS OUTAGE = YOUR OUTAGE
http://it.mario.wikia.com/wiki/Lakitu
THE RULES HAVE CHANGED
                                                        You're not in Kansas anymore

http://entreatmenot.blogspot.com/2011/04/shattered-dreams.html
NETWORK WILL PARTITION
                                                              And it will happen often

http://thevinylvillain.blogspot.com/2010_04_01_archive.html
DISK IO WILL FLUCTUATE
                                                     On a good day, it's mediocre

http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
IP ADDRESSES WILL CHANGE
                       IP lease is 8 hours
                      DNS TTL is 60 seconds
www.fantom-xp.com
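With a 60 second DNS TTL, a client that resolves a hostname once at startup and keeps the address forever will eventually talk to the wrong machine. The following is a minimal sketch of a resolver cache that re-resolves after the TTL expires. The class name `TTLResolver` and the injectable `lookup` and `clock` parameters are illustrative assumptions, not something from the talk.

```python
import socket
import time


class TTLResolver:
    """Cache hostname lookups, but never for longer than `ttl` seconds."""

    def __init__(self, ttl=60.0, lookup=None, clock=time.monotonic):
        self.ttl = ttl
        # `lookup` and `clock` are injectable so the class can be exercised offline.
        self.lookup = lookup or self._system_lookup
        self.clock = clock
        self._cache = {}  # hostname -> (expiry time, list of addresses)

    @staticmethod
    def _system_lookup(hostname):
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
        return sorted({info[4][0] for info in infos})

    def resolve(self, hostname):
        now = self.clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[0] > now:
            return cached[1]
        # Entry is missing or expired: ask again and remember the fresh answer.
        addresses = self.lookup(hostname)
        self._cache[hostname] = (now + self.ttl, addresses)
        return addresses
```

Callers would ask the resolver for an address on every new connection instead of storing the IP themselves.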
INSTANCES WILL DIE
                                  And it will always be your Database Master

http://room57.deviantart.com/art/Hangman-188353196
HUMANS MAKE MISTAKES
                      Including your humans

EMBRACE FAILURE
                                Hardware will fail. Humans will make errors.
                                   Nature will produce thunderstorms.
http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm
OR, COLLOQUIALLY

ADJUST YOUR STRATEGY
                                                      Don't bring a knife to a gun fight

http://www.flickr.com/photos/statlerhotel/6628770499/sizes/l/in/photostream/
DATA STORES
                                                     Some work better than others

http://gustavhoiland.com/2010/03/10/stacked-boxes/
CAP THEOREM
[Diagram labels: RDBMS, CouchDB, BigTable based, Dynamo based, Master / Slave based]
Your choice: sacrifice availability or consistency.
Orange is a lie.
MYSQL / ORACLE VS RDS
                      See: Network partitioning & instances dying

AMAZON REDSHIFT
                                      Great for analytics/reports, bad for OLTP
                                           Unburden your RDS instances
http://www.flitemedia.com/music.php                                               http://aws.amazon.com/redshift
BIGTABLE BASED STORES
                                 HBase, Accumulo, Hypertable
                      Still suffer when network partitioning happens
                                                                       http://www.cloudera.com/cdh4/

DYNAMO BASED STORES
                                                         Cassandra, Riak, DynamoDB

http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html   http://aws.amazon.com/dynamodb/faqs/
GO HOSTED?
                                 CouchDB, MongoDB, Riak, Cassandra, HBase
                                          Your Latency May Vary
http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html
CLIENT SIDE STORAGE
                                          Keep a copy of your users' data locally

http://www.wired.com/gadgetlab/2012/03/badass-gadget-ammo-lunch-box/       http://www.w3.org/2001/tag/2010/09/ClientSideStorage.html
FILE STORES
                                                                EBS vs Instance Store ...
                                                                     ... vs RamFS
http://homedezine.blogspot.com/2011/04/day-my-cat-removed-carpet-photo-studio.html
SIMPLE STORAGE SERVICE
                                                        S3: Arguably AWS' best feature

http://www.iwallpaper.us/gold-star-fo-christmas-wallpaper-140/
TRAFFIC SHAPING
                                                Control every part of the request

http://www.visualphotos.com/image/2x4154765/man_standing_with_traffic_cones_in_shape_of_u-turn
STAY LOCAL IF YOU CAN
                 Going off box exposes you to risks you need to mitigate

http://southshorewoman.com/issue/june-2010/article/local-character
CACHE WHAT YOU CAN
                                  HTTP Responses, DB Queries, User content
                                         Browsers have caches too!
http://theoatmeal.com/blog/charity_money
USE ELASTIC LOAD BALANCERS
                                                They will save you more than once

http://wallpapers5.com/wallpaper/Balance-Green-Tree-Frog/
USE GLOBAL LOAD BALANCING
                      Fail over to the closest data center on region failure
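The failover decision itself is small: among the regions that currently pass health checks, send the user to the one with the lowest latency. A rough sketch of that rule follows. In practice the decision is made by the DNS provider, and the function name, the latency table, and the health map are illustrative assumptions.

```python
def pick_region(latency_ms, healthy):
    """Return the lowest-latency region that is currently healthy, or None."""
    candidates = [region for region in latency_ms if healthy.get(region, False)]
    if not candidates:
        return None
    return min(candidates, key=lambda region: latency_ms[region])


# Hypothetical latencies as seen from one user's resolver.
latency = {"us-east-1": 20, "us-west-2": 80, "eu-west-1": 110}

# Normal operation: the closest region wins.
print(pick_region(latency, {"us-east-1": True, "us-west-2": True, "eu-west-1": True}))

# Region failure: traffic moves to the next closest healthy region.
print(pick_region(latency, {"us-east-1": False, "us-west-2": True, "eu-west-1": True}))
```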

SHOUT OUT: DYN
                      DNS for Bit.ly, Quora, Twitter, Wikia, etc

USE A CDN
                                        Critical items should always be available

http://kadanthuponanimidangal.blogspot.com/2010/12/blog-post_6992.html
MEASURE EVERYTHING
                Find outliers, deviants & trends before they cause trouble

http://www.themoviedb.org/movie/629-the-usual-suspects
GRAPHITE, STATSD & COLLECTD
                       Use Statsd & Collectd for application/system metrics
                           Use graphite to store, aggregate & visualize
                                                                                                                    http://hostedgraphite.com/
http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html   http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
GRAPH EVENTS
         Deployments, outages, CDN reconfigurations, failed builds, etc
          Anything that's important to the health of your ecosystem
http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
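One common way to graph events, in the spirit of the Etsy post linked above, is to send a statsd counter when the event happens and draw it in Graphite as a vertical line with `drawAsInfinite()`. A minimal sketch follows. The `events.` naming scheme and the `stats` prefix (statsd's legacy counter namespace) are assumptions that depend on your statsd configuration.

```python
import socket


def event_line(name):
    """statsd counter line that records one occurrence of the event."""
    return f"events.{name}:1|c"


def send_event(name, host="localhost", port=8125):
    """Fire-and-forget: send the counter to statsd over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(event_line(name).encode("ascii"), (host, port))
    finally:
        sock.close()


def event_target(name, prefix="stats"):
    """Graphite target that renders each occurrence as a vertical line."""
    return f"drawAsInfinite({prefix}.events.{name})"
```

A deploy script would call `send_event("deploy")` as its last step, and any dashboard graph can add `event_target("deploy")` as an extra target.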
COMPARE WEEK TO WEEK
                          Overlay week to week graphs using timeShift()
                         Quickly identifies trends and deviations from trends
http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-10
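As a sketch of the overlay, the snippet below builds a Graphite render URL with two targets: the metric itself and the same metric shifted back seven days with `timeShift()`. The host name and metric path are hypothetical.

```python
from urllib.parse import urlencode


def week_over_week_url(base_url, metric):
    """Render URL that overlays this week's series on last week's."""
    params = [
        ("target", metric),
        ("target", f'timeShift({metric},"7d")'),  # same series, one week earlier
        ("from", "-7d"),
        ("format", "json"),
    ]
    return f"{base_url}/render?{urlencode(params)}"


# Hypothetical Graphite host and metric path.
print(week_over_week_url("http://graphite.example.com", "stats.timers.http.beacon.mean"))
```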
FORECASTING
                                 Use Holt-Winters confidence bands
                        Verify that your metrics are within normal tolerance
https://github.com/ripienaar/graphite-graph-dsl/wiki/Creating-Holt-Winters-Forecasts
FIND INDIVIDUAL OUTLIERS
                                                      Absolute numbers mean very little
                                                       Use mean & standard deviation
http://en.wikipedia.org/wiki/File:Black_sheep-1.jpg
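To make the mean and standard deviation idea concrete, here is a small sketch that flags hosts whose value sits more than a chosen number of standard deviations from the fleet mean. The host names, the sample numbers, and the default threshold of 2 are illustrative assumptions.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return {host: z-score} for hosts more than `threshold` deviations from the mean."""
    samples = list(values.values())
    avg = mean(samples)
    spread = pstdev(samples)
    if spread == 0:
        return {}  # every host reports the same value, so nothing stands out
    return {
        host: (value - avg) / spread
        for host, value in values.items()
        if abs(value - avg) > threshold * spread
    }


# Hypothetical response times in milliseconds, one per host.
latency = {
    "web1": 100, "web2": 102, "web3": 98, "web4": 101,
    "web5": 99, "web6": 103, "web7": 97, "web8": 250,
}
print(sorted(find_outliers(latency)))
```

With very small fleets a single bad host inflates the standard deviation so much that it can hide itself, so the threshold needs tuning against real data.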
ALERT ON TRENDS
                                Once you go over a threshold, it's too late
                              Alert on unwanted trends and preemptively fix
http://sub-second.blogspot.com/2012/06/reporting-response-times-percentile.html   http://aphyr.github.com/riemann/
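One way to alert on a trend rather than a threshold is to fit a straight line through recent samples and estimate how long until the line reaches the limit. The sketch below uses an ordinary least-squares fit. The function name, the sample data, and the choice of a linear model are assumptions made for illustration.

```python
def minutes_until_threshold(samples, threshold):
    """Estimate minutes until a rising metric reaches `threshold`.

    `samples` is a list of (minute, value) pairs, oldest first.
    Returns None when there is too little data or the trend is not rising.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope  # minute at which the fitted line hits the limit
    return max(0.0, crossing - samples[-1][0])


# Hypothetical disk usage in percent, sampled once a minute.
usage = [(0, 70), (1, 72), (2, 74), (3, 76), (4, 78)]
print(minutes_until_threshold(usage, 90))
```

An alert rule could then page when the estimate drops below the time it takes to respond, instead of waiting for the disk to fill.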
MEASURE WITHOUT RETROFIT
                                          LogFormat "http.beacon:%D|ms" stats
                                         CustomLog "|nc -u localhost 8125" stats
                                                                               http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
http://absinthemindedhero.blogspot.com/2012/03/victory-nonetheless.html   http://jiboumans.wordpress.com/2013/02/27/realtime-stats-from-varnish/
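One detail worth knowing about the config above: Apache's `%D` reports the request time in microseconds, while the statsd `|ms` type expects milliseconds. A hedged sketch of a small forwarder that could take the place of `nc` and convert the unit on the way through is shown below. The function names are illustrative, and the statsd address is assumed to be the same `localhost:8125` used on the slide.

```python
import socket


def micros_to_millis(line):
    """Rewrite 'name:<microseconds>|ms' so the value is in whole milliseconds."""
    name, _, rest = line.strip().partition(":")
    value, _, metric_type = rest.partition("|")
    return f"{name}:{int(value) // 1000}|{metric_type}"


def forward(lines, host="localhost", port=8125):
    """Send each converted log line to statsd as its own UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            if line.strip():
                sock.sendto(micros_to_millis(line).encode("ascii"), (host, port))
    finally:
        sock.close()
```

Wrapped in a script that calls `forward(sys.stdin)`, this could be used as the piped `CustomLog` program.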
SHOUT OUT: NEW RELIC
             Java, but also Python, Ruby, .NET, PHP & NodeJS support
             In depth profiling of your app for performance & errors.
CONFIGURATION MANAGEMENT
                                                             Unique snowflakes are bad

http://www.torange.us/Plants/Conifers/spruce-needles-in-hoarfrost-424.html
PUPPET VS CHEF
                            Yes.

                                               http://puppetlabs.com/
                                       http://www.opscode.com/chef
INFRASTRUCTURE AS CODE
                                            Use different environments
                                            Measure and report on it
http://americansingercanary.com/green.htm
SHOUT OUT: UBUNTU
                                      Ubuntu + cloud-init + boto = awesome*
                                                                         *I am biased

http://www.123rf.com/photo_4871141_food-pyramid-isolated-on-white.html                  https://github.com/krux/ops-tools
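As an illustration of the cloud-init piece, the sketch below renders a small `#cloud-config` document that installs packages and runs commands on first boot. The helper name and the example package and command are assumptions, and the actual launch call is left out.

```python
import json


def cloud_config(packages, commands):
    """Render a minimal #cloud-config document for cloud-init."""
    lines = ["#cloud-config", "packages:"]
    # JSON strings are valid YAML scalars, so json.dumps handles the quoting.
    lines += [f"  - {json.dumps(package)}" for package in packages]
    lines.append("runcmd:")
    lines += [f"  - {json.dumps(command)}" for command in commands]
    return "\n".join(lines) + "\n"


# Hypothetical bootstrap: install the agent, then run it once.
print(cloud_config(["puppet"], ["puppet agent --test"]))
```

The resulting string is what would be handed to the instance as user data, for example through the `user_data` argument of boto's `run_instances`.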

AWS OPSWORKS
                                  Hosted Chef, No extra charge, Ubuntu 12.04 or Amazon Linux
                                                 Still rough around the edges.

http://thebrandbuilder.files.wordpress.com/2011/08/gordon-01.jpg                               http://aws.amazon.com/opsworks/

DEV = PRODUCTION
                          "I dunno, it worked on my laptop"
                                 Instead, use vagrant
http://vagrantup.com/
ROLL YOUR OWN AMIS
                                                Instantly boot up new deployments
                                                     Reduce Time to Respond
http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html   http://puppetlabs.com/blog/rapid-scaling-with-auto-generated-amis-using-puppet/
CONFIDENT DEPLOYS
                                                   That human error could be yours

http://www.etsy.com/listing/37178125/stormtrooper-regrets-those-were-the
CONTINUOUS INTEGRATION
                         Ours: GitHub + Jenkins + FPM + apt::s3
                      From commit to deployable in one command
http://github.com/
http://jenkins-ci.org/
https://github.com/thekad/apt-s3
https://github.com/jordansissel/fpm/wiki/
ONE CLICK DEPLOYMENTS
                                        Deployments should not be exciting.
                                      Don't create a checklist; automate & track
                                                                                             https://checkmarkable.com
http://www.thegreenhead.com/2012/07/one-click-butter-cutter.php               https://github.com/jib/aws-analysis-tools/
DARK LAUNCHES
               Exercise the code without impacting the user experience
                                                                          http://www.kissmetrics.com/
http://www.layoutsparks.com/pictures/moon-23                   https://github.com/yahoo/boomerang/
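A dark launch can be as simple as running the new code path next to the current one and throwing its result away. The sketch below always returns the current response, and it reports (rather than raises) any difference or failure in the candidate. The function and parameter names are illustrative assumptions.

```python
def dark_launch(current, candidate, request, report=None):
    """Serve `current`, exercise `candidate`, and never let the latter affect the user."""
    response = current(request)
    try:
        shadow = candidate(request)
    except Exception as exc:  # a failing candidate must not reach the user
        shadow = exc
    if shadow != response and report is not None:
        report(request, response, shadow)
    return response


# Hypothetical handlers: the new one has a bug on empty input.
def old_handler(text):
    return text.upper()


def new_handler(text):
    return text[0].upper() + text[1:].upper()


mismatches = []
print(dark_launch(old_handler, new_handler, "", report=lambda *args: mismatches.append(args)))
print(len(mismatches))
```

A real deployment would run the candidate asynchronously, so that it cannot add latency either.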
SHADOW TRAFFIC
                                                    Test new code against live traffic

http://doppelthingers.tumblr.com/post/12839979386/traffic-light-shadow-hangman-and-possibly-his   https://gist.github.com/3125323
SLEEP TIGHT
                                           Slides at: www.Slideshare.net/jiboumans
                                                 We're hiring: www.krux.com
http://raafay-awan.blogspot.com/2011/08/cats-cutest-of-creatures.html

Devoxx UK: Reliability & Scale in AWS while letting you sleep through the night

  • 1. ONE MAN OPS Reliability & Scale in AWS while letting you sleep through the night Jos Boumans - @jiboumans http://www.fwallpaper.net/picture_pics-Sleepy-cat.html Tuesday 26 March 13
  • 2. RIPE NCC Engineering manager for RIPE Database http://www.ripe.net/db Tuesday 26 March 13
  • 3. CANONICAL Engineering manager for Ubuntu Server 10.04 & 10.10 http://lukeroberts.deviantart.com/art/Destroy-Ubuntu-93235775 http://www.ubuntu.com/business/server/overview Tuesday 26 March 13
  • 4. KRUX VP of Operations & Infrastructure http://www.krux.com/ Tuesday 26 March 13
  • 5. GOOD GUYS OF DATA PRIVACY Tuesday 26 March 13
  • 6. SOME OF OUR CUSTOMERS Tuesday 26 March 13
  • 8. 0 2,500 5,000 7,500 10,000 AVERAGE REQUESTS* / SEC *Twitter: New tweets Wikipedia: Articles read https://twitter.com/tps_watcher Krux: New data points http://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm Tuesday 26 March 13
  • 9. 0 150,000,000 300,000,000 450,000,000 600,000,000 MONTHLY UNIQUE USERS http://techcrunch.com/2012/12/18/twitter-passes-200m-monthly-active-users-a-42-increase-over-9-months/ http://technorati.com/technology/article/wikipedias-nonprofit-parent-raises-20-million/ Tuesday 26 March 13
  • 10. WE CHOSE 'THE CLOUD' http://previewnetworks.com/blog/ Tuesday 26 March 13
  • 12. FOCUS ON AWS http://aws.amazon.com/ Tuesday 26 March 13
  • 13. APRIL 21, 2011 http://aws.amazon.com/message/65648/ http://businessnerds.wordpress.com/2011/05/28/so-far-so-good…-the-review/ http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.html Tuesday 26 March 13
  • 14. ... SOME OUTAGES ... ... SKIPPED FOR BREVITY ... Tuesday 26 March 13
  • 15. JUNE 14, 2012 http://www.laczik.org/BMW/repair/E38_wiring_harness/E38_wiring_harness.html http://blog.pagerduty.com/2012/06/outage-post-mortem-june-14/ Tuesday 26 March 13
  • 17. AWS OUTAGE = YOUR OUTAGE http://it.mario.wikia.com/wiki/Lakitu Tuesday 26 March 13
  • 18. THE RULES HAVE CHANGED You're not in Kansas anymore http://entreatmenot.blogspot.com/2011/04/shattered-dreams.html Tuesday 26 March 13
  • 19. NETWORK WILL PARTITION And it will happen often http://thevinylvillain.blogspot.com/2010_04_01_archive.html Tuesday 26 March 13
  • 20. DISK IO WILL FLUCTUATE On a good day, it's mediocre http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm Tuesday 26 March 13
  • 21. IP ADDRESSES WILL CHANGE IP lease is 8 hours DNS TTL is 60 seconds www.fantom-xp.com Tuesday 26 March 13
  • 22. INSTANCES WILL DIE And it will always be your Database Master http://room57.deviantart.com/art/Hangman-188353196 Tuesday 26 March 13
  • 23. HUMANS MAKE MISTAKES Including your humans Tuesday 26 March 13
  • 24. EMBRACE FAILURE Hardware will fail. Humans will make errors. Nature will produce thunderstorms. http://www.freeguidetonwcamping.com/oregon_washington_main/washington/southwest_wa/cape_disappointment_sp.htm Tuesday 26 March 13
  • 26. ADJUST YOUR STRATEGY Don't bring a knife to a gun fight http://www.flickr.com/photos/statlerhotel/6628770499/sizes/l/in/photostream/ Tuesday 26 March 13
  • 27. DATA STORES: Some work better than others. http://gustavhoiland.com/2010/03/10/stacked-boxes/
  • 28. CAP THEOREM: You must choose to sacrifice either availability or consistency. Orange is a lie. (Diagram labels: RDBMS, CouchDB, BigTable based, Dynamo based, Master/Slave based.)
  • 29. MYSQL / ORACLE VS RDS: See network partitioning & instances dying.
  • 30. AMAZON REDSHIFT: Great for analytics/reports, bad for OLTP. Unburden your RDS instances. http://www.flitemedia.com/music.php http://aws.amazon.com/redshift
  • 31. BIGTABLE BASED STORES: HBase, Accumulo, Hypertable. Still suffer when network partitioning happens. http://www.cloudera.com/cdh4/
  • 32. DYNAMO BASED STORES: Cassandra, Riak, DynamoDB. http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html http://aws.amazon.com/dynamodb/faqs/
  • 33. GO HOSTED? CouchDB, MongoDB, Riak, Cassandra, HBase. Your latency may vary. http://www.fromoldbooks.org/Walker-ElectricLightingForShips/pages/015-Siemens-Alternate-Current-Dynamo//1552x1175-q75.html
  • 34. CLIENT SIDE STORAGE: Keep a copy of your users' data locally. http://www.wired.com/gadgetlab/2012/03/badass-gadget-ammo-lunch-box/ http://www.w3.org/2001/tag/2010/09/ClientSideStorage.html
  • 35. FILE STORES: EBS vs Instance Store ... vs RamFS. http://homedezine.blogspot.com/2011/04/day-my-cat-removed-carpet-photo-studio.html
  • 36. SIMPLE STORAGE SERVICE (S3): Arguably AWS' best feature. http://www.iwallpaper.us/gold-star-fo-christmas-wallpaper-140/
  • 37. TRAFFIC SHAPING: Control every part of the request. http://www.visualphotos.com/image/2x4154765/man_standing_with_traffic_cones_in_shape_of_u-turn
  • 38. STAY LOCAL IF YOU CAN: Going off box exposes you to risks you need to mitigate. http://southshorewoman.com/issue/june-2010/article/local-character
  • 39. CACHE WHAT YOU CAN: HTTP responses, DB queries, user content. Browsers have caches too! http://theoatmeal.com/blog/charity_money
  • 40. USE ELASTIC LOAD BALANCERS: They will save you more than once. http://wallpapers5.com/wallpaper/Balance-Green-Tree-Frog/
  • 41. USE GLOBAL LOAD BALANCING: Fail over to the closest data center on region failure.
  • 42. SHOUT OUT: DYN. DNS for Bit.ly, Quora, Twitter, Wikia, etc.
  • 43. USE A CDN: Critical items should always be available. http://kadanthuponanimidangal.blogspot.com/2010/12/blog-post_6992.html
  • 44. MEASURE EVERYTHING: Find outliers, deviants & trends before they cause trouble. http://www.themoviedb.org/movie/629-the-usual-suspects
  • 45. GRAPHITE, STATSD & COLLECTD: Use StatsD & collectd for application/system metrics. Use Graphite to store, aggregate & visualize. http://hostedgraphite.com/ http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/
  • 46. GRAPH EVENTS: Deployments, outages, CDN reconfigurations, failed builds, etc. Anything that's important to the health of your ecosystem. http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
  • 47. COMPARE WEEK TO WEEK: Overlay week to week graphs using timeShift(). Quickly identifies trends and deviations from trends. http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-10
  • 48. FORECASTING: Use Holt-Winters confidence bands. Verify that your metrics are within normal tolerance. https://github.com/ripienaar/graphite-graph-dsl/wiki/Creating-Holt-Winters-Forecasts
  • 49. FIND INDIVIDUAL OUTLIERS: Absolute numbers mean very little. Use mean & standard deviation. http://en.wikipedia.org/wiki/File:Black_sheep-1.jpg
  • 50. ALERT ON TRENDS: Once you go over a threshold, it's too late. Alert on unwanted trends and preemptively fix. http://sub-second.blogspot.com/2012/06/reporting-response-times-percentile.html http://aphyr.github.com/riemann/
  • 51. MEASURE WITHOUT RETROFIT: Two Apache directives: LogFormat "http.beacon:%D|ms" stats and CustomLog "|nc -u localhost 8125" stats http://jiboumans.wordpress.com/2012/07/02/measure-all-the-things/ http://absinthemindedhero.blogspot.com/2012/03/victory-nonetheless.html http://jiboumans.wordpress.com/2013/02/27/realtime-stats-from-varnish/
  • 52. SHOUT OUT: NEW RELIC. Java, but also Python, Ruby, .NET, PHP & NodeJS support. In depth profiling of your app for performance & errors.
  • 53. CONFIGURATION MANAGEMENT: Unique snowflakes are bad. http://www.torange.us/Plants/Conifers/spruce-needles-in-hoarfrost-424.html
  • 54. PUPPET VS CHEF: Yes. http://puppetlabs.com/ http://www.opscode.com/chef
  • 55. INFRASTRUCTURE AS CODE: Use different environments. Measure and report on it. http://americansingercanary.com/green.htm
  • 56. SHOUT OUT: UBUNTU. Ubuntu + cloud-init + boto = awesome* (*I am biased). http://www.123rf.com/photo_4871141_food-pyramid-isolated-on-white.html https://github.com/krux/ops-tools
  • 57. AWS OPSWORKS: Hosted Chef, no extra charge, Ubuntu 12.04 or Amazon Linux. Still rough around the edges. http://thebrandbuilder.files.wordpress.com/2011/08/gordon-01.jpg http://aws.amazon.com/opsworks/
  • 58. DEV = PRODUCTION: "I dunno, it worked on my laptop". Instead, use Vagrant. http://vagrantup.com/
  • 59. ROLL YOUR OWN AMIS: Instantly boot up new deployments. Reduce time to respond. http://bakingismyzen.blogspot.com/2011/07/beignets-cant-have-just-one.html http://puppetlabs.com/blog/rapid-scaling-with-auto-generated-amis-using-puppet/
  • 60. CONFIDENT DEPLOYS: That human error could be yours. http://www.etsy.com/listing/37178125/stormtrooper-regrets-those-were-the
  • 61. CONTINUOUS INTEGRATION: Ours is Github + Jenkins + FPM + apt::s3. From commit to deployable in one command. http://github.com/ http://jenkins-ci.org/ https://github.com/thekad/apt-s3 https://github.com/jordansissel/fpm/wiki/
  • 62. ONE CLICK DEPLOYMENTS: Deployments should not be exciting. Don't create a checklist; automate & track. https://checkmarkable.com http://www.thegreenhead.com/2012/07/one-click-butter-cutter.php https://github.com/jib/aws-analysis-tools/
  • 63. DARK LAUNCHES: Exercise the code without impacting the user experience. http://www.kissmetrics.com/ http://www.layoutsparks.com/pictures/moon-23 https://github.com/yahoo/boomerang/
  • 64. SHADOW TRAFFIC: Test new code against live traffic. http://doppelthingers.tumblr.com/post/12839979386/traffic-light-shadow-hangman-and-possibly-his https://gist.github.com/3125323
  • 65. SLEEP TIGHT: Slides at www.Slideshare.net/jiboumans and we're hiring at www.krux.com http://raafay-awan.blogspot.com/2011/08/cats-cutest-of-creatures.html
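
Slide 49 recommends finding individual outliers with mean & standard deviation rather than absolute numbers. The following is a minimal sketch of that idea, not the tooling used in the talk: the function name `find_outliers`, the host names, the sample latencies, and the default threshold of 2 standard deviations are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_outliers(samples, threshold=2.0):
    """Return (host, value, z_score) for hosts far from the fleet mean.

    `samples` maps a host name to one metric value (hypothetical data).
    A host is flagged when its value lies more than `threshold`
    population standard deviations away from the mean of all hosts.
    """
    values = list(samples.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every host reports the same value, so nothing stands out.
        return []
    return [
        (host, value, (value - mu) / sigma)
        for host, value in samples.items()
        if abs(value - mu) > threshold * sigma
    ]


if __name__ == "__main__":
    # Hypothetical per-host response times in milliseconds.
    latencies = {
        "web1": 100, "web2": 102, "web3": 98,
        "web4": 101, "web5": 99, "web6": 400,
    }
    for host, value, z in find_outliers(latencies):
        print(f"{host}: {value} ms ({z:.1f} standard deviations from mean)")
```

With a small fleet, a single extreme host also inflates the standard deviation itself, so the threshold may need tuning, or a more robust statistic, for real data.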
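
Slide 51 pipes lines of the form `http.beacon:<value>|ms` to a UDP listener on localhost port 8125. As a rough sketch of what that pipe does, the same StatsD timer line can be built and sent from Python. The helper names `format_timer` and `send_metric` are hypothetical, and the code assumes a StatsD-compatible daemon is listening on that port; UDP is fire-and-forget, so nothing fails if no daemon is running.

```python
import socket


def format_timer(name, millis):
    """Build a StatsD timer line: '<name>:<value>|ms'."""
    return f"{name}:{millis}|ms"


def send_metric(payload, host="localhost", port=8125):
    """Send one metric line as a single UDP datagram.

    host and port default to the values on the slide; no reply is
    expected, and delivery is not confirmed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Report a hypothetical 42 ms request under the metric name from the slide.
    send_metric(format_timer("http.beacon", 42))
```

Because the metric travels over UDP, the application never blocks on the metrics daemon, which is what makes this kind of measurement safe to bolt onto an existing request path.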