3. What kind of data?
• App opened
• Killed a walker
• Bought something
• Heartbeat
• Memory usage report
• App error
• Declined a review prompt
• Finished the tutorial
• Clicked on that button
• Lost a battle
• Found a treasure chest
• Received a push message
• Finished a turn
• Sent an invite
• Scored a Yahtzee
• Spent 100 silver coins
• Anything else any game designer or developer wants to learn about
9. Where does this flow?
• Ariel / Real-Time: operational monitoring, business alerts, dashboarding
• Data Warehouse: funnel analysis, ad-hoc batch analysis, reporting, behavior analysis
• Elasticsearch: ad-hoc realtime analysis, fraud detection, top-K summaries, exploration
• Ad-Hoc Forwarding: data integration with partners, game-specific systems
11. Kinesis
• Distributed, sharded streams. Akin to Kafka.
• Get an iterator over the stream, and checkpoint with the current stream pointer occasionally.
• Workers coordinate shard leases and checkpoints in DynamoDB (via the KCL)
(Diagram: the stream split into Shard 0, Shard 1, Shard 2)
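A minimal sketch of that consume-and-checkpoint loop, using boto3 directly against one hypothetical shard (in production the KCL drives this and stores the checkpoints in its DynamoDB lease table):

    import boto3

    kinesis = boto3.client("kinesis")
    STREAM = "event-stream"                  # hypothetical names
    SHARD = "shardId-000000000000"
    checkpoints = {}                         # the KCL keeps these in DynamoDB

    def consume(process):
        last = checkpoints.get(SHARD)
        kwargs = ({"ShardIteratorType": "AFTER_SEQUENCE_NUMBER",
                   "StartingSequenceNumber": last}
                  if last else {"ShardIteratorType": "TRIM_HORIZON"})
        it = kinesis.get_shard_iterator(StreamName=STREAM, ShardId=SHARD,
                                        **kwargs)["ShardIterator"]
        seen = 0
        while it:
            out = kinesis.get_records(ShardIterator=it, Limit=100)
            for record in out["Records"]:
                process(record["Data"])
                seen += 1
                if seen % 5 == 0:
                    # checkpoint occasionally; a crash replays whatever came
                    # after the last checkpoint (see the notes further down)
                    checkpoints[SHARD] = record["SequenceNumber"]
            it = out["NextShardIterator"]

Checkpointing every record would be safer but slower; checkpointing occasionally is what makes the reprocessing scenario described in the notes possible, hence the auxiliary idempotence on the next slide.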
13. Auxiliary Idempotence
• Idempotence keys at each stage
• Redis sets of idempotence keys by time window
• Gives resilience against various types of failures
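A minimal sketch of what those idempotence sets could look like in Redis (the key prefix and window length are illustrative, not the production values):

    import redis

    r = redis.Redis()
    WINDOW = 3600  # seconds per idempotence window

    def already_processed(idempotence_key: str, event_time: int) -> bool:
        # One Redis set per time window; membership tells us whether this
        # batch was handled before, e.g. after a worker died between
        # processing a record and checkpointing it.
        bucket = event_time // WINDOW
        key = f"idem:{bucket}"
        is_new = r.sadd(key, idempotence_key)   # 1 if new, 0 if seen before
        r.expire(key, WINDOW * 2)               # windows age out on their own
        return is_new == 0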
17. 1. Deserialize event batch
2. Apply changes to application properties
3. Get current device and application properties
4. Get known facts about sending device
5. Emit each enriched event to Kinesis (sketched below)
(Diagram: Collection → Kinesis → Enrichment)
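A rough sketch of that worker loop; the batch shape and the property/device-store interfaces are hypothetical stand-ins for the real configuration and fact systems:

    import json

    def enrich_batch(raw_batch, property_store, device_store, kinesis, stream):
        batch = json.loads(raw_batch)                              # 1. deserialize the event batch
        property_store.apply(batch.get("property_changes", []))   # 2. apply changes to application properties
        props = property_store.current(batch["app_id"],
                                       batch["device_id"])        # 3. current device and application properties
        facts = device_store.facts(batch["device_id"])            # 4. known facts about the sending device
        for event in batch["events"]:                             # 5. emit each enriched event to Kinesis
            enriched = {**event, **props, **facts}
            kinesis.put_record(StreamName=stream,
                               Data=json.dumps(enriched),
                               PartitionKey=batch["device_id"])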
19. Now we have a stream of well-described, denormalized event facts.
20. Pipeline to HDFS
• Partitioned by event name and game, buffered in-memory and written to S3
• Picked up every hour by a Spark job
• Converted to Parquet and loaded into HDFS (see the sketch below)
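A minimal PySpark sketch of the hourly job, with hypothetical S3 and HDFS paths:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hourly-parquet-load").getOrCreate()

    # Buffered JSON written by the collectors for the previous hour
    events = spark.read.json("s3://event-buffer/dt=2015-06-01-13/")

    # Columnar Parquet in HDFS, partitioned by game and event name
    (events.write
           .partitionBy("game", "event_name")
           .mode("append")
           .parquet("hdfs:///warehouse/events/"))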
27. HyperLogLog
• High-level algorithm (four-bullet version stolen from my colleague, Cristian)
• b bits of the hash are used as an index pointer (Redis uses b = 14, i.e. m = 16384 registers)
• The rest of the hash is inspected for the longest run of zeroes we can encounter (N)
• The register pointed to by the index is replaced with max(currentValue, N + 1)
• An estimator function is used to calculate the approximated cardinality (a toy version of the update step follows below)
http://content.research.neustar.biz/blog/hll.html
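A toy Python version of the update step described above (the estimator that turns registers into a cardinality is omitted; Redis hides all of this behind PFADD/PFCOUNT):

    import hashlib

    B = 14                # index bits; Redis uses b = 14, so m = 16384 registers
    M = 1 << B
    registers = [0] * M

    def toy_pfadd(value: str) -> None:
        h = int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")
        idx = h & (M - 1)          # low b bits pick a register
        rest = h >> B              # remaining bits: find the run of zeroes (N)
        n = 0
        while n < 64 - B and not (rest >> n) & 1:
            n += 1
        registers[idx] = max(registers[idx], n + 1)   # max(currentValue, N + 1)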
31. Alarm Clocks
• Push the timestamp of current events to a per-game pub/sub channel
• A worker takes the 99th-percentile age of the last N events per title as the delay (sketched below)
• Use that time for alarm calculations
• Overlay delays on dashboards
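A minimal sketch of the delay calculation for a single title; N and the in-memory structure are illustrative:

    import time
    from collections import deque

    N = 1000
    recent = deque(maxlen=N)     # timestamps of the last N events for one title

    def on_event(event_ts: float) -> None:
        recent.append(event_ts)

    def pipeline_delay() -> float:
        # Age of each recent event; the 99th percentile becomes the delay
        # used for alarm calculations and overlaid on dashboards.
        now = time.time()
        ages = sorted(now - ts for ts in recent)
        return ages[int(0.99 * (len(ages) - 1))] if ages else 0.0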
32. Ariel, now with clocks
(Architecture diagram, components: Kinesis, Collector, Workers, Idempotence, PFADD, Aggregation, Event Clock, Web, PFCOUNT, "Are installs anomalous?")
33. Ariel 1.0
• ~30K metrics configured
• Aggregation into 30-minute buckets
• 12 kilobytes per HLL set (plus overhead)
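Back-of-envelope, assuming (pessimistically) that every metric were a full HLL: 30,000 metrics × 12 KB ≈ 360 MB per 30-minute bucket, on the order of 17 GB of register data per day, which is the kind of memory pressure that motivates the hybrid datastore plan below.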
36. Hybrid Datastore: Plan
• Move older HLL sets to DynamoDB
• They’re just strings!
• Cache reports aggressively
• Fetch backing HLL data from DynamoDB as needed on the web layer, merge using on-instance Redis (sketched below)
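A rough sketch of the merge path, with hypothetical table and key names. Because the archived values are just the raw Redis HLL strings, they can be restored with SET and folded together with PFMERGE on the web layer's on-instance Redis:

    import boto3
    import redis

    archive = boto3.resource("dynamodb").Table("ariel-hll-archive")  # hypothetical table
    r = redis.Redis()   # on-instance Redis used as a merge scratchpad

    def count_distinct(metric, buckets):
        scratch = f"merge:{metric}"
        r.delete(scratch)
        for bucket in buckets:
            item = archive.get_item(Key={"metric": metric, "bucket": bucket})["Item"]
            r.set("tmp:hll", item["hll"].value)    # the stored HLL is just a string
            r.pfmerge(scratch, "tmp:hll")          # fold it into the scratchpad
        return r.pfcount(scratch)                  # merged distinct-count estimate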
37. Ariel, now with hybrid datastore
(Architecture diagram: the components from slide 32 plus DynamoDB, Report Caches, Old Data Migration, and a Merge Scratchpad)
39. Redis Roles
• Idempotence
• Configuration Caching
• Aggregation
• Clock
• Scratchpad for merges
• Cache of reports
• Staging of DWH extracts
40. Other Considerations
• Multitenancy. We run parallel stacks and give games an assigned affinity, to insulate them from pipeline delays
• Backfill. The system is forward-looking only; we can replay Kinesis backups to backfill, or backfill from the warehouse
We also expect this to grow along with our user base, the launch of new titles, and, of course, every addition of new, useful functionality.
We’re just looking at one simple transformation of a stream, and the consumption of that stream by a variety of consumers. Since we’re using Kinesis, we can read the same stream in parallel from multiple applications safely.
We’ll consider major challenges moving from left to right across this architecture.
Primary collection is intended to be at-least-once; we currently support SQS and HTTP; all batches carry idempotence information to allow deduplication.
At this stage, we have minimal logic— we are focused on letting game servers and clients successfully unload their batches of user events, so they can be durably stored in our systems.
System configuration lives in DynamoDB; we use Netflix Archaius
App configuration lives in DynamoDB; we cache in-memory on instances and in Redis
Goals of SQS:
• Register and receive events asynchronously
• Provide elasticity when senders spike
• Reduce CPU burn for senders
Autoscaling group containing a simple Java service, deployed as a golden AMI provisioned with Packer and Ansible, using CloudFormation. We make lots of these — we call them our satellites. Usually we name them after moons.
The little orange symbol means we’re using Amazon’s KCL, so the fleet negotiates workers’ shard control using a lease table in DynamoDB. Monitoring is New Relic and lots of StatsD sent to Datadog.
So every time we see a gray square, assume we’re talking about 1-50 EC2 instances across several availability zones in one AWS region.
But first an aside on Kinesis.
Checkpointing and auxiliary idempotence
The data in our stream has monotonically increasing pointers (huge, huge numbers!). In our case, 1-22 and beyond.
A worker on this shard appears and checkpoints every 5 successfully processed records. But it dies after processing record 12.
When Worker B appears, it sees the checkpoint at 10 and picks up processing the shard at 11. But this means we’ll reprocess 11 and 12!
Similar issues can occur with out-of-order processing of data.
Expensive. Bloom filters may be a viable option some day
This stage is the latency-sensitive one.
This lets all downstream systems act on data without needing to hit any more systems.
We have considered a streaming ingest, but this has proven easier to reason about and has sufficient liveness at the moment.
Introduced in Redis 2.8.9 (http://antirez.com/news/75)
But I don’t want to really get into this too much…
The first complete implementation of this had three major components: collector, web and workers.
Caveat— not all metrics were HLLs; we also support sums, which take only several bytes. But only the sparsest of distinct metrics would require less than 12 KB for a time window.