At the StampedeCon 2015 Big Data Conference: Riot Games’ mission statement is to become the most player-focused company in the world. With over 67 million players battling on the fields of justice every month, League of Legends generates more than 45 terabytes of data every day. From game events to store transactions, data comes in from thousands of sources around the world. The big data engineering team at Riot Games is responsible for collecting this data and exposing it through a variety of tools to assist in delivering value to the players. This talk will span the past, present, and future of our data ecosystem, covering the reasons behind the decisions we made and the lessons we learned along the way.
StampedeCon 2015 - Building a Player Focused Data Pipeline
1. #StampedeCon 2015 - Riot Games
BUILDING A PLAYER
FOCUSED DATA
PIPELINE
RYAN TABORA
@ryantabora
SEAN MALONEY
@sean_seannery
2. SEAN MALONEY
ENGINEER
WHO WE ARE
SMALONEY@RIOTGAMES.COM
@SEAN_SEANNERY
WORKING ON RIOT’S ETL TOOLS
FAVORITE ACTIVITY:
ATTEMPTING TO GROW FACIAL HAIR
BUT FAILING MISERABLY
3. RYAN TABORA
ENGINEER
WHO WE ARE
WORKING ON RIOT’S INGESTION PIPELINE.
FAVORITE ACTIVITY:
EATING MAC + CHEESE WHILE
LISTENING TO DEATH METAL.
RTABORA@RIOTGAMES.COM
@RYANTABORA
4. OUR DATA PLATFORM (THEN)
5 THINGS YOU NEED
RIOT GAMES SCALE
AGENDA
3 THINGS WE STILL NEED (AND YOU MAY WANT ALSO)
OUR DATA PLATFORM (NOW)
8. LEAGUE OF LEGENDS STATS
MORE THAN 7.5 MILLION PEAK CONCURRENT PLAYERS
MORE THAN 67 MILLION MONTHLY ACTIVE PLAYERS
MORE THAN 27 MILLION DAILY ACTIVE PLAYERS
STATS RELEASED JANUARY 2014
14. Auditing
ETLs can use queries with custom injected data.
Ad-Hoc Data Requests
Extend with new connection types and custom ETLs easily
Self-Service Architecture
The big data team is small. We can’t manage all the ETLs
ourselves.
Support Multiple Datacenters
One task will execute on different database servers around the
world.
A.K.A.
5 THINGS
WE DIDN’T HAVE
Multiple Data Access Patterns
Extend with new connection types and custom ETLs easily
15. 5 THINGS YOU NEED
SELF-SERVICE ARCHITECTURE
1
2
3
4
5
19. User Documentation
No one likes doing it, but it helps a lot.
Onboard training
Get new coworkers in-the-know
Familiar Protocols
Use REST or RPC so developers are on the same page
Focus on UX
Your tools need to be easy for non-technical people to use.
SELF
SERVICE
HOW?
20. 5 THINGS YOU NEED
A PLAN FOR MULTIPLE DATACENTERS
1
2
3
4
5
24. Templating
ETLs can use queries with custom injected data.
Scale Horizontally
As the data grows, the tool should be able to handle it.
Empower Users
The big data team is small. We can’t manage all the ETLs
ourselves.
Support One ETL - Many Sources
One task will execute on different database servers around the
world.
YOUR ETL
TOOL
SHOULD...
25. Distributed ETL Software written in
Ruby.
Candidate for Riot open sourcing
Same ETL applied to multiple regions
/ datacenters
Self-Service UI with SQL query
templating.
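To make "SQL query templating" and "same ETL applied to multiple regions" concrete, here is a minimal sketch in Python. It is not Riot's tool (which the slides describe as written in Ruby); the region names, host names, table name, and the `render_tasks` helper are all hypothetical, chosen only to illustrate one templated query fanning out into one task per datacenter.

```python
from string import Template

# Hypothetical region -> database host map; these names are illustrative only.
REGIONS = {
    "na": "db-na.example.internal",
    "euw": "db-euw.example.internal",
    "kr": "db-kr.example.internal",
}

# One ETL definition: a query whose placeholders are filled in per run.
# Plain string substitution is used here for brevity; values that come from
# users should be validated or passed as bound parameters instead.
QUERY_TEMPLATE = Template(
    "SELECT * FROM $table WHERE event_date = '$run_date' AND region = '$region'"
)


def render_tasks(table, run_date, regions=REGIONS):
    """Expand one templated ETL into one concrete task per region."""
    tasks = []
    for region, host in sorted(regions.items()):
        sql = QUERY_TEMPLATE.substitute(
            table=table, run_date=run_date, region=region
        )
        tasks.append({"region": region, "host": host, "sql": sql})
    return tasks


if __name__ == "__main__":
    for task in render_tasks("store_transactions", "2015-07-16"):
        print(task["host"], "->", task["sql"])
```

Keeping the query as data rather than code is what lets a self-service UI expose it: a user edits the template once, and the tool decides where it runs.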
39. REST micro-service built with Java
and Docker.
Reports and visualizations we can
use to find problems.
Source and target comparison.
Warehouse
Auditing
Service
Platform
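One way to picture "source and target comparison" is a per-partition row-count check. The sketch below is an assumption about what such an audit might compute, not the service's actual logic (the slides describe a Java REST service); the `compare_counts` helper and the partition keys are hypothetical.

```python
def compare_counts(source_counts, target_counts):
    """Return the partitions whose row counts differ between source and target.

    Both arguments map a partition key (for example a date string) to a row
    count. A partition absent from one side is treated as having zero rows.
    """
    mismatches = {}
    for key in sorted(set(source_counts) | set(target_counts)):
        src = source_counts.get(key, 0)
        tgt = target_counts.get(key, 0)
        if src != tgt:
            mismatches[key] = {
                "source": src,
                "target": tgt,
                "missing": src - tgt,  # positive: rows lost on the way in
            }
    return mismatches


if __name__ == "__main__":
    source = {"2015-07-15": 1000, "2015-07-16": 1200}
    target = {"2015-07-15": 1000, "2015-07-16": 1150}
    print(compare_counts(source, target))
```

A report built on output like this only needs to list the mismatched partitions, which is usually enough to point at the ETL run that dropped or duplicated rows.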
58. 5 THINGS YOU NEED
AD-HOC DATA CRUNCHING
1
2
3
4
5
59. Easily scale our resources
Both vertically (metastore) and horizontally (clusters)
Support intensive ad-hoc tasks.
We can spin up temporary dedicated clusters for big projects.
We own our infrastructure
Before, the game servers team got all the love.
Can now join our data!
One task will execute on different database servers around the
world.
TO THE
CLOUD!