Scalable Streaming Data Pipelines with Redis
Avram Lyon
@ajlyon / github.com/avram
redisconf / May 10, 2016
MOBILE GAMES - PUBLISHER AND DEVELOPER
What kind of data?
• App opened
• Killed a walker
• Bought something
• Heartbeat
• Memory usage report
• App error
• Declined a review prompt
• Finished the tutorial
• Clicked on that button
• Lost a battle
• Found a treasure chest
• Received a push message
• Finished a turn
• Sent an invite
• Scored a Yahtzee
• Spent 100 silver coins
• Anything else any game designer or developer wants to learn about
How much?
Recently:
• Peak: 2.8 million events / minute
• 2.4 billion events / day
Primary Data Stream
Collection
Kinesis
Warehousing
Enrichment
Realtime Monitoring
Kinesis
Public API
Collection
HTTP
Collection
SQS
SQS
SQS
Studio A
Studio B
Studio C
Kinesis
SQS Failover
Redis
Caching App Configurations
System Configurations
Kinesis
SQS Failover
S3
Kinesis
Elasticsearch?
Enricher
Data
Warehouse
Forwarder
Ariel
(Realtime)
Idempotence
Aggregation
Idempotence
Idempotence
What’s in the box?
Where does this flow?
Ariel / Real-Time
Operational monitoring
Business alerts
Dashboarding
Data Warehouse
Funnel analysis
Ad-hoc batch analysis
Reporting
Behavior analysis
Elasticsearch
Ad-hoc realtime analysis
Fraud detection
Top-K summaries
Exploration
Ad-Hoc Forwarding
Data integration with partners
Game-specific systems
Kinesis
a short aside
Kinesis
• Distributed, sharded streams. Akin to Kafka.
• Get an iterator over the stream, and occasionally checkpoint the current stream position.
• Workers coordinate shard leases and checkpoints in DynamoDB (via KCL)
Shard 0
Shard 1
Shard 2
Shard 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Checkpointing
Given: Worker checkpoints every 5
Checkpoint for Shard 0: 10
Worker A 🔥
Worker B
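The checkpoint picture above can be played through as a small simulation. This is plain Python, not KCL code, and the crash point (Worker A dying after record 13) is an illustrative assumption; the slide only gives the checkpoint interval of 5 and the stored checkpoint of 10.

```python
def run_worker(records, start_after, checkpoint_every, crash_after=None):
    """Process records with sequence numbers greater than start_after.

    Returns (processed, checkpoint). The checkpoint only advances on every
    `checkpoint_every`-th sequence number, as in the slide.
    """
    processed = []
    checkpoint = start_after
    for seq in records:
        if seq <= start_after:
            continue
        processed.append(seq)
        if seq % checkpoint_every == 0:
            checkpoint = seq
        if crash_after is not None and seq == crash_after:
            break  # worker dies before it can checkpoint again
    return processed, checkpoint


shard_0 = list(range(1, 23))  # sequence numbers 1..22

# Worker A crashes after record 13 (hypothetical crash point).
seen_by_a, checkpoint = run_worker(shard_0, 0, 5, crash_after=13)

# Worker B takes over the lease and resumes from the stored checkpoint.
seen_by_b, _ = run_worker(shard_0, checkpoint, 5)

replayed = sorted(set(seen_by_a) & set(seen_by_b))
print(checkpoint, replayed)  # 10 [11, 12, 13]
```

Records 11 to 13 are delivered twice, which is why each downstream stage needs its own duplicate protection.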
Auxiliary Idempotence
• Idempotence keys at each stage
• Redis sets of idempotence keys by time window
• Gives resilience against various types of failures
Auxiliary Idempotence
• Gotcha: Set expiry is O(N)
• Broke the sets up into small sets, partitioned by the first 2 bytes of the md5 of the idempotence key
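A minimal sketch of the partitioned idempotence sets, assuming a redis-py style client. The key layout (`idem:<stage>:<window>:<prefix>`), the stage and window strings, and the TTL are hypothetical; only the "first 2 bytes of md5" partitioning comes from the slide.

```python
import hashlib


def idempotence_set_key(stage, window, idempotence_key):
    """Name of the small Redis set that holds this key.

    The set is chosen by time window and by the first 2 bytes of the key's
    md5 digest, so each window is split into up to 65,536 small sets.
    """
    prefix = hashlib.md5(idempotence_key.encode("utf-8")).hexdigest()[:4]
    return f"idem:{stage}:{window}:{prefix}"


def first_time_seen(client, stage, window, idempotence_key, ttl_seconds=7200):
    """Return True if the key is new for this stage and window.

    `client` is anything with redis-py style `sadd` and `expire` methods.
    SADD returns the number of members actually added, so 0 means duplicate.
    """
    set_key = idempotence_set_key(stage, window, idempotence_key)
    added = client.sadd(set_key, idempotence_key)
    client.expire(set_key, ttl_seconds)  # hypothetical TTL of two hours
    return added == 1
```

Because each set stays small, expiring one of them frees only a small number of members at a time instead of one very large set.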
Collection
Kinesis
Warehousing
Enrichment
Realtime Monitoring
Kinesis
Public API
1. Deserialize event batch
2. Apply changes to application properties
3. Get current device and application properties
4. Get known facts about sending device
5. Emit each enriched event to Kinesis
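The five steps can be sketched as one function. The batch wire format, the field names, and the plain dicts standing in for the property and device stores are all assumptions made for illustration; the talk does not show the real enricher.

```python
import json


def enrich_batch(raw_batch, app_properties, device_facts, emit):
    """Run the five enrichment steps over one serialized batch.

    raw_batch: JSON text of {"device_token", "property_changes", "events"}
               (a hypothetical wire format).
    app_properties: dict of current properties per device token.
    device_facts: dict of known facts per device token.
    emit: callable that receives each enriched event, e.g. a Kinesis producer.
    """
    batch = json.loads(raw_batch)                         # 1. deserialize
    token = batch["device_token"]
    current = app_properties.setdefault(token, {})
    current.update(batch.get("property_changes", {}))     # 2. apply changes
    for event in batch["events"]:
        enriched = dict(event)
        enriched["properties"] = dict(current)            # 3. current properties
        enriched["device"] = dict(device_facts.get(token, {}))  # 4. known facts
        emit(enriched)                                    # 5. emit downstream
```

Passing `emit` in as a callable keeps the sketch independent of any particular Kinesis client.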
Collection
Kinesis
Enrichment
Kinesis
SQS Failover
Kinesis
S3
Elasticsearch?
S3 Backups
to HDFS
Enricher
Data
Warehouse
Forwarder
Idempotence
Ariel
Realtime
Idempotence
Aggregation
Idempotence
Now we have a stream of well-described, denormalized event facts.
Pipeline to HDFS
• Partitioned by event name and game, buffered in-memory and written to S3
• Picked up every hour by Spark job
• Converts to Parquet, loaded to HDFS
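A rough sketch of the buffering step, under stated assumptions: the object key layout, the flush threshold, and the `write_object` callable (standing in for an S3 upload) are hypothetical, and the hourly Spark job that converts to Parquet is not shown.

```python
from collections import defaultdict


class PartitionedBuffer:
    """Buffer events in memory per (game, event name) and flush in chunks."""

    def __init__(self, write_object, max_events=1000):
        self.write_object = write_object  # callable(key, list_of_events)
        self.max_events = max_events
        self.buffers = defaultdict(list)
        self.sequence = 0

    def add(self, event):
        partition = (event["game"], event["name"])
        self.buffers[partition].append(event)
        if len(self.buffers[partition]) >= self.max_events:
            self.flush(partition)

    def flush(self, partition):
        events = self.buffers.pop(partition, [])
        if not events:
            return
        game, name = partition
        self.sequence += 1
        # Hypothetical key layout; the talk does not show the real one.
        key = f"events/game={game}/name={name}/part-{self.sequence:06d}.json"
        self.write_object(key, events)

    def flush_all(self):
        for partition in list(self.buffers):
            self.flush(partition)
```

Keeping one buffer per partition means each written object holds a single event type for a single game, which is the shape a later columnar conversion would want.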
A closer look at Ariel
Dashboards
Alarms
Ariel Goals
• Low time-to-visibility
• Easy configuration
• Low cost per configured metric
Configuration
Live Metrics (Ariel)
Enriched Event Data
name: game_end
time: 2015-07-15 10:00:00.000 UTC
_devices_per_turn: 1.0
event_id: 12345
device_token: AAAA
user_id: 100
name: game_end
time: 2015-07-15 10:01:00.000 UTC
_devices_per_turn: 14.1
event_id: 12346
device_token: BBBB
user_id: 100
name: Cheating Games
predicate: _devices_per_turn > 1.5
target: event_id
type: DISTINCT
id: 1
name: Cheating Players
predicate: _devices_per_turn > 1.5
target: user_id
type: DISTINCT
id: 2
name: game_end
time: 2015-07-15 10:01:00.000 UTC
_devices_per_turn: 14.1
event_id: 12347
device_token: BBBB
user_id: 100
PFADD /m/1/2015-07-15-10-00 12346
PFADD /m/1/2015-07-15-10-00 12347
PFADD /m/2/2015-07-15-10-00 100
PFADD /m/2/2015-07-15-10-00 100
PFCOUNT /m/1/2015-07-15-10-00
2
PFCOUNT /m/2/2015-07-15-10-00
1
Configured Metrics
Collector
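The mapping from configured metrics to PFADD calls can be sketched as follows. Predicates are written as Python callables because the real predicate language is not shown, exact Python sets stand in for Redis HyperLogLog keys, and the 30-minute bucket width is taken from the Ariel 1.0 slide. The member added is whatever field the metric names as its target.

```python
from datetime import datetime

# Metric definitions copied from the slide; predicates are written as Python
# callables here because the real predicate language is not shown.
METRICS = [
    {"id": 1, "name": "Cheating Games", "target": "event_id",
     "predicate": lambda e: e["_devices_per_turn"] > 1.5},
    {"id": 2, "name": "Cheating Players", "target": "user_id",
     "predicate": lambda e: e["_devices_per_turn"] > 1.5},
]


def bucket(ts, minutes=30):
    """Truncate a timestamp to the start of its bucket, e.g. 2015-07-15-10-00."""
    floored = ts.replace(minute=ts.minute - ts.minute % minutes,
                         second=0, microsecond=0)
    return floored.strftime("%Y-%m-%d-%H-%M")


def pfadd_commands(event, metrics=METRICS):
    """Yield (key, member) pairs, one per metric whose predicate matches."""
    for metric in metrics:
        if metric["predicate"](event):
            key = f"/m/{metric['id']}/{bucket(event['time'])}"
            yield key, str(event[metric["target"]])


events = [
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 0),
     "_devices_per_turn": 1.0, "event_id": 12345,
     "device_token": "AAAA", "user_id": 100},
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 1),
     "_devices_per_turn": 14.1, "event_id": 12346,
     "device_token": "BBBB", "user_id": 100},
    {"name": "game_end", "time": datetime(2015, 7, 15, 10, 1),
     "_devices_per_turn": 14.1, "event_id": 12347,
     "device_token": "BBBB", "user_id": 100},
]

# Exact Python sets stand in for Redis HyperLogLog keys in this sketch.
counters = {}
for event in events:
    for key, member in pfadd_commands(event):
        counters.setdefault(key, set()).add(member)

for key in sorted(counters):
    print(key, len(counters[key]))
```

The first event does not satisfy either predicate, so only the two later events contribute: two distinct games and one distinct player.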
HyperLogLog
• High-level algorithm (four-bullet-point version stolen from my colleague, Cristian)
• b bits of the hash value are used as a register index (Redis uses b = 14, i.e. m = 16384 registers)
• The rest of the hash is inspected for its run of leading zeroes (N)
• The register pointed to by the index is replaced with max(currentValue, N + 1)
• An estimator function is used to calculate the approximate cardinality
http://content.research.neustar.biz/blog/hll.html
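The four bullets can be turned into a toy implementation. This is a teaching sketch, not Redis's implementation: it uses SHA-1 truncated to 64 bits as the hash and the textbook estimator with the small-range (linear counting) correction, both of which are choices made here for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog following the four bullets above."""

    def __init__(self, b=14):
        self.b = b
        self.m = 1 << b                    # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        index = h & (self.m - 1)                   # b bits pick the register
        rest = h >> self.b                         # remaining 64 - b bits
        width = 64 - self.b
        n = width - rest.bit_length()              # run of leading zeroes
        self.registers[index] = max(self.registers[index], n + 1)

    def merge(self, other):
        """Union with another HLL: element-wise max of the registers."""
        if other.b != self.b:
            raise ValueError("register counts differ")
        self.registers = [max(x, y)
                          for x, y in zip(self.registers, other.registers)]

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)         # small-range correction
        return raw
```

Adding the same member twice leaves the registers unchanged, and `merge` yields the HLL of the union, which is what makes per-bucket HLL sets combinable into longer time ranges.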
We can count
different things
Collector
Kinesis
Aggregation
Ariel
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Pipeline Delay
• Pipelines back up
• Dashboards get outdated
• Alarms fire!
Alarm Clocks
• Push timestamp of current events to per-game pub/sub channel
• Worker takes 99th percentile age of last N events per title as delay
• Use that time for alarm calculations
• Overlay delays on dashboards
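A small sketch of the worker side of the event clock. Timestamps are epoch seconds, the pub/sub subscription itself is left out, and the class name, the window size, and the nearest-rank percentile method are assumptions rather than details from the talk.

```python
from collections import defaultdict, deque


class EventClock:
    """Track pipeline delay per game from the age of recent events."""

    def __init__(self, last_n=1000):
        self.ages = defaultdict(lambda: deque(maxlen=last_n))

    def observe(self, game, event_time, received_time):
        """Record how old an event was when the pipeline delivered it."""
        self.ages[game].append(max(0.0, received_time - event_time))

    def delay(self, game):
        """99th percentile age (nearest rank) of the last N events, in seconds."""
        ages = sorted(self.ages[game])
        if not ages:
            return 0.0
        rank = (99 * len(ages) + 99) // 100  # ceil(0.99 * n) in integer math
        return ages[rank - 1]

    def alarm_time(self, game, wall_clock):
        """The time alarms should treat as 'now' for this game."""
        return wall_clock - self.delay(game)
```

Evaluating alarms against `alarm_time` instead of the wall clock keeps a backed-up pipeline from looking like a sudden drop in events.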
Ariel, now with clocks
Event Clock
Kinesis
Aggregation
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Ariel 1.0
• ~30K metrics configured
• Aggregation into 30-minute buckets
• 12 kilobytes per HLL set (plus overhead)
Challenges
• Dataset size. RedisLabs non-cluster max = 100GB
• Packet/s limits: 250K in EC2-Classic
• Alarm granularity
Hybrid Datastore:
Requirements
• Need to keep HLL sets to count distinct
• Redis is relatively finite
• HLL outside of Redis is messy
Hybrid Datastore: Plan
• Move older HLL sets to DynamoDB
• They’re just strings!
• Cache reports aggressively
• Fetch backing HLL data from DynamoDB as needed on web layer, merge using on-instance Redis
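Because a Redis HLL is stored as an ordinary string value, an archived copy can be written back with SET and merged with PFMERGE. A sketch of the scratchpad merge, assuming a redis-py style client for the on-instance Redis; the scratch key names are hypothetical, and fetching the blobs from DynamoDB is left to the caller.

```python
import uuid


def count_distinct(client, archived_blobs):
    """Count distinct members across archived HLL sets.

    client: redis-py style client for the on-instance scratchpad Redis.
    archived_blobs: raw HLL strings fetched from the archive (e.g. DynamoDB).
    """
    if not archived_blobs:
        return 0
    run_id = uuid.uuid4().hex  # keeps concurrent merges from colliding
    scratch_keys = []
    for i, blob in enumerate(archived_blobs):
        key = f"scratch:{run_id}:{i}"
        client.set(key, blob)              # an HLL is stored as a plain string
        scratch_keys.append(key)
    merged = f"scratch:{run_id}:merged"
    try:
        client.pfmerge(merged, *scratch_keys)
        return client.pfcount(merged)
    finally:
        client.delete(merged, *scratch_keys)  # leave the scratchpad clean
```

The resulting count is the kind of value that can then be cached as a report, so the archive is only read again when a report is recomputed.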
Ariel, now with hybrid datastore
DynamoDB
Report Caches
Old Data Migration
Event Clock
Kinesis
Aggregation
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Merge Scratchpad
Much less memory…
Redis Roles
• Idempotence
• Configuration Caching
• Aggregation
• Clock
• Scratchpad for merges
• Cache of reports
• Staging of DWH extracts
Other Considerations
• Multitenancy. We run parallel stacks and give games an assigned affinity, to insulate from pipeline delays
• Backfill. System is forward-looking only; can replay Kinesis backups to backfill, or backfill from warehouse
Why Not _____?
• Druid
• Flink
• InfluxDB
• RethinkDB
Thanks!
Questions?
scopely.com/jobs
@ajlyon
avram@scopely.com
github.com/avram

How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 

Scalable Streaming Data Pipelines with Redis

  • 1. Scalable Streaming Data Pipelines with Redis Avram Lyon @ajlyon / github.com/avram redisconf / May 10, 2016
  • 2. MOBILE GAMES - PUBLISHER AND DEVELOPER
  • 3. What kind of data? • App opened • Killed a walker • Bought something • Heartbeat • Memory usage report • App error • Declined a review prompt • Finished the tutorial • Clicked on that button • Lost a battle • Found a treasure chest • Received a push message • Finished a turn • Sent an invite • Scored a Yahtzee • Spent 100 silver coins • Anything else any game designer or developer wants to learn about
  • 4. How much? Peak: 2.8 million events / minute. Recently: 2.4 billion events / day.
  • 6. Collection HTTP Collection SQS SQS SQS Studio A Studio B Studio C Kinesis SQS Failover Redis Caching App Configurations System Configurations
  • 9. Where does this flow? Ariel / Real-Time Operational monitoring Business alerts Dashboarding Data Warehouse Funnel analysis Ad-hoc batch analysis Reporting Behavior analysis Elasticsearch Ad-hoc realtime analysis Fraud detection Top-K summaries Exploration Ad-Hoc Forwarding Data integration with partners Game-specific systems
  • 11. Kinesis • Distributed, sharded streams. Akin to Kafka. • Get an iterator over the stream— and checkpoint with current stream pointer occasionally. • Workers coordinate shard leases and checkpoints in DynamoDB (via KCL) Shard 0 Shard 1 Shard 2
  • 12. Shard 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 Checkpointing Checkpoint for Shard 0: 10 Given: Worker checkpoints every 5 Worker A 🔥 Worker B
  • 13. Auxiliary Idempotence • Idempotence keys at each stage • Redis sets of idempotence keys by time window • Gives resilience against various types of failures
  • 15. Auxiliary Idempotence • Gotcha: Set expiry is O(N) • Broke up into small sets, partitioned by first 2 bytes of md5 of idempotence key
  • 17. 1. Deserialize event batch 2. Apply changes to application properties 3. Get current device and application properties 4. Get known facts about sending device 5. Emit each enriched event to Kinesis Collection Kinesis Enrichment
  • 18. Kinesis SQS Failover Kinesis S3 Elasticsearch ? S3 Backups to HDFS Enricher Data Warehouse Forwarder Idempotence Ariel Realtime Idempotence Aggregation Idempotence
  • 19. Now we have a stream of well-described, denormalized event facts.
  • 20. Pipeline to HDFS • Partitioned by event name and game, buffered in-memory and written to S3 • Picked up every hour by Spark job • Converts to Parquet, loaded to HDFS
  • 21. A closer look at Ariel
  • 23. Ariel Goals • Low time-to-visibility • Easy configuration • Low cost per configured metric
  • 25. Live Metrics (Ariel) • Enriched Event Data: (name: game_end, time: 2015-07-15 10:00:00.000 UTC, _devices_per_turn: 1.0, event_id: 12345, device_token: AAAA, user_id: 100); (name: game_end, time: 2015-07-15 10:01:00.000 UTC, _devices_per_turn: 14.1, event_id: 12346, device_token: BBBB, user_id: 100); (name: game_end, time: 2015-07-15 10:01:00.000 UTC, _devices_per_turn: 14.1, event_id: 12347, device_token: BBBB, user_id: 100) • Configured Metrics: (id: 1, name: Cheating Games, predicate: _devices_per_turn > 1.5, target: event_id, type: DISTINCT); (id: 2, name: Cheating Players, predicate: _devices_per_turn > 1.5, target: user_id, type: DISTINCT) • Collector: PFADD /m/1/2015-07-15-10-00 12346; PFADD /m/1/2015-07-15-10-00 12347; PFADD /m/2/2015-07-15-10-00 BBBB; PFADD /m/2/2015-07-15-10-00 BBBB; PFCOUNT /m/1/2015-07-15-10-00 returns 2; PFCOUNT /m/2/2015-07-15-10-00 returns 1
  • 27. HyperLogLog • High-level algorithm (four bullet-point version stolen from my colleague, Cristian) • b bits of the hashed value are used as a register index (Redis uses b = 14, i.e. m = 16384 registers) • The rest of the hash is inspected for the length of its initial run of zeroes (N) • The register at that index is replaced with max(currentValue, N + 1) • An estimator function is used to calculate the approximate cardinality http://content.research.neustar.biz/blog/hll.html
  • 28. Live Metrics (Ariel) • We can count different things • Enriched Event Data: (name: game_end, time: 2015-07-15 10:00:00.000 UTC, _devices_per_turn: 1.0, event_id: 12345, device_token: AAAA, user_id: 100); (name: game_end, time: 2015-07-15 10:01:00.000 UTC, _devices_per_turn: 14.1, event_id: 12346, device_token: BBBB, user_id: 100); (name: game_end, time: 2015-07-15 10:01:00.000 UTC, _devices_per_turn: 14.1, event_id: 12347, device_token: BBBB, user_id: 100) • Configured Metrics: (id: 1, name: Cheating Games, predicate: _devices_per_turn > 1.5, target: event_id, type: DISTINCT); (id: 2, name: Cheating Players, predicate: _devices_per_turn > 1.5, target: user_id, type: DISTINCT) • Collector: PFADD /m/1/2015-07-15-10-00 12346; PFADD /m/1/2015-07-15-10-00 12347; PFADD /m/2/2015-07-15-10-00 BBBB; PFADD /m/2/2015-07-15-10-00 BBBB; PFCOUNT /m/1/2015-07-15-10-00 returns 2; PFCOUNT /m/2/2015-07-15-10-00 returns 1
  • 30. Pipeline Delay • Pipelines back up • Dashboards get outdated • Alarms fire!
  • 31. Alarm Clocks • Push timestamp of current events to per-game pub/sub channel • Worker takes 99th percentile age of last N events per title as delay • Use that time for alarm calculations • Overlay delays on dashboards
  • 32. Ariel, now with clocks Event ClockKinesis Aggregation PFCOUNT Are installs anomalous? Collector Idempotence PFADD Web Workers
  • 33. Ariel 1.0 • ~30K metrics configured • Aggregation into 30-minute buckets • 12 kilobytes per HLL set (plus overhead)
  • 34. Challenges • Dataset size. RedisLabs non-cluster max = 100GB • Packet/s limits: 250K in EC2-Classic • Alarm granularity
  • 35. Hybrid Datastore: Requirements • Need to keep HLL sets to count distinct • Redis is relatively finite • HLL outside of Redis is messy
  • 36. Hybrid Datastore: Plan • Move older HLL sets to DynamoDB • They’re just strings! • Cache reports aggressively • Fetch backing HLL data from DynamoDB as needed on web layer, merge using on-instance Redis
  • 37. Ariel, now with hybrid datastore DynamoDB Report Caches Old Data Migration Event Clock Kinesis Aggregation PFCOUNT Are installs anomalous? Collector Idempotence PFADD Web Workers Merge Scratchpad
  • 39. Redis Roles • Idempotence • Configuration Caching • Aggregation • Clock • Scratchpad for merges • Cache of reports • Staging of DWH extracts
  • 40. Other Considerations • Multitenancy. We run parallel stacks and give games an assigned affinity, to insulate from pipeline delays • Backfill. System is forward-looking only; can replay Kinesis backups to backfill, or backfill from warehouse
  • 41. Why Not _____? • Druid • Flink • InfluxDB • RethinkDB
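
As a rough illustration of the auxiliary idempotence slides (13 and 15), the sketch below picks a small set from the time window and the first 2 bytes of the MD5 of the idempotence key, then uses set insertion to detect repeats. The key layout (`idem:<stage>:<window>:<partition>`), the one-hour window, the function names, and the in-memory `InMemorySets` stand-in for a Redis client are all assumptions made for illustration; the talk does not show its actual implementation.

```python
import hashlib


def idempotence_set_key(stage: str, idem_key: str, event_ts: int,
                        window_s: int = 3600) -> str:
    """Return the name of the small set that should hold idem_key.

    The naming scheme is hypothetical; only the partitioning idea
    (time window + first 2 bytes of MD5) comes from the slides.
    """
    window_start = event_ts - (event_ts % window_s)
    # 2 bytes of digest = 4 hex characters = up to 65,536 partitions per window.
    partition = hashlib.md5(idem_key.encode("utf-8")).digest()[:2].hex()
    return f"idem:{stage}:{window_start}:{partition}"


class InMemorySets:
    """Stand-in for a Redis client; sadd mirrors the SADD return value."""

    def __init__(self) -> None:
        self._sets: dict[str, set[str]] = {}

    def sadd(self, key: str, member: str) -> int:
        members = self._sets.setdefault(key, set())
        if member in members:
            return 0  # already present, nothing added
        members.add(member)
        return 1  # newly added


def first_time_seen(store, stage: str, idem_key: str, event_ts: int) -> bool:
    """True if this idempotence key has not been recorded for this stage."""
    key = idempotence_set_key(stage, idem_key, event_ts)
    return store.sadd(key, idem_key) == 1


if __name__ == "__main__":
    store = InMemorySets()
    for record in ["rec-10", "rec-11", "rec-12", "rec-11", "rec-12"]:
        status = "process" if first_time_seen(store, "enricher", record, 7200) else "skip"
        print(record, status)
```

Against a real server, redis-py's `sadd` reports the number of newly added members in the same way, and putting a TTL on each small set with `expire` lets old windows age out in small pieces instead of as one very large set.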

Editor's Notes

  1. We also expect this to grow with the growth of our userbase, the launch of new titles, and of course with every addition of new, useful functionality.
  2. We’re just looking at one simple transformation of a stream, and the consumption of that stream by a variety of consumers. Since we’re using Kinesis, we can read the same stream in parallel from multiple applications safely. We’ll consider major challenges moving from left to right across this architecture.
  3. Primary collection is intended to be at-least-once. We currently support SQS and HTTP, and all batches carry idempotence information to allow deduplication. At this stage we have minimal logic: we are focused on letting game servers and clients successfully unload their batches of user events so they can be durably stored in our systems. System configuration lives in DynamoDB; we use Netflix Archaius. App configuration lives in DynamoDB; we cache it in memory on instances and in Redis. Goals of SQS: register and receive events asynchronously; provide elasticity when senders spike; reduce CPU burn for senders.
  4. Autoscaling group containing a simple Java service, deployed as a golden AMI provisioned with Packer and Ansible, using CloudFormation. We make lots of these; we call them our satellites. Usually we name them after moons. The little orange symbol means we’re using Amazon’s KCL, so the fleet negotiates workers’ shard control using a lease table in DynamoDB. Monitoring is New Relic and lots of StatsD sent to Datadog. So every time we see a gray square, assume we’re talking about 1-50 EC2 instances across several availability zones in one AWS region.
  5. But first an aside on Kinesis.
  6. Checkpointing and auxiliary idempotence: the data in our stream has monotonically increasing pointers (huge, huge numbers!), shown in our example as 1-22 and beyond. Worker A appears on this shard and checkpoints every 5 successfully processed records, but it dies after processing record 12. When Worker B appears, it sees the checkpoint at 10 and picks up processing the shard at 11. But this means we’ll reprocess 11 and 12! Similar issues can occur with out-of-order processing of data.
  7. Expensive. Bloom filters may be a viable option some day
  8. Expensive. Bloom filters may be a viable option some day
  9. This stage is the latency-sensitive one.
  10. This lets all downstream systems act on data without needing to hit any more systems.
  11. We have considered a streaming ingest, but this has proven easier to reason about and has sufficient liveness at the moment.
  12. Introduced in Redis 2.8.9 (http://antirez.com/news/75). But I don’t really want to get into this too much…
  13. The first complete implementation of this had three major components: collector, web and workers.
  14. Caveat: not all metrics were HLL; we also support sums, which take only several bytes. But only the sparsest of distinct metrics would require less than 12 KB for a time window.
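
To make the dataset-size challenge concrete, here is a back-of-the-envelope sketch using the figures from the Ariel 1.0 and Challenges slides: about 30K metrics, 30-minute buckets, 12 kilobytes per HLL set, and a 100GB limit. It assumes, purely for illustration, that every metric is a dense HLL, that "~30K" means exactly 30,000, and that kilobytes and gigabytes are binary units; it ignores per-key overhead. Since the notes say some metrics are sums and sparse sets are smaller, treat the result as an upper bound, not a measured number.

```python
KIB = 1024
GIB = KIB ** 3


def hll_bytes_per_day(metrics: int, bucket_minutes: int,
                      bytes_per_hll: int = 12 * KIB) -> int:
    """Bytes written per day if every metric keeps one dense HLL per bucket."""
    buckets_per_day = (24 * 60) // bucket_minutes
    return metrics * buckets_per_day * bytes_per_hll


def days_to_fill(limit_bytes: int, bytes_per_day: int) -> float:
    """How many days of buckets fit under the given dataset limit."""
    return limit_bytes / bytes_per_day


if __name__ == "__main__":
    # Slide figures; exact values are assumptions (see the lead-in).
    per_day = hll_bytes_per_day(metrics=30_000, bucket_minutes=30)
    print(f"{per_day / GIB:.1f} GiB of HLL data per day")
    print(f"{days_to_fill(100 * GIB, per_day):.1f} days fit in 100 GiB")
```

Under these assumptions the store grows by roughly 16.5 GiB per day and reaches the limit in about six days, which is consistent with the plan to move older HLL sets out of Redis.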