eBay has one of the largest data warehouses in the world! See how the BI Platform team at eBay had to rethink and rebuild their system architecture and processes in order to support the ever-growing data volume and scalability needs of their developers and users.
8. EBAY MOBILE VOLUME IS STAGGERING
Every 10 min. a car or truck is bought via mobile
Every 1 min. a tablet is bought via mobile
Every 15 sec. a ladies handbag is bought via mobile
19. INTELLIGENT CUBE PROCESSING
PROBLEM
•Intelligent cube processing taking too
long and missing SLAs
SOLUTION
•Upgraded to a 10Gbps connection to
backend network
•Added additional server to cluster
•Balanced jobs across nodes
[Chart: Cube Processing Times (in seconds), Before vs. After, for Cube 01 through Cube 05]
SCALING MICROSTRATEGY AT EBAY 19
22. PROBLEM
•Users need to be able to gauge data
freshness at a glance
CUBE MONITOR
23. PROBLEM
•Users need to be able to gauge data
freshness at a glance
SOLUTION
•Custom Cube Monitor application
CUBE MONITOR
24. PROBLEM
•No way to measure end-to-end user
experience
PAGE LOAD TIMES
25. PROBLEM
•No way to measure end-to-end user
experience
SOLUTION
•Plugin to measure client-side
performance
PAGE LOAD TIMES
26. PROBLEM
•No way to measure end-to-end user
experience
SOLUTION
•Plugin to measure client-side
performance
•Email Alerts
PAGE LOAD TIMES
27. PROBLEM
•Need a low friction way to log anything
MEASURE EVERYTHING
28. PROBLEM
•Need a low friction way to log anything
SOLUTION
•HTTP Log Service
MEASURE EVERYTHING
http://api.bix.corp.ebay.com/LogService/Log
?app=TEST
&function=TestFunction
&user=tcase
&executiontime=9999
&error=FALSE
&note=This%20is%20a%20sample
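A minimal sketch of how a client might assemble such a log request. The endpoint and parameter names come from the sample URL above; the helper name `build_log_url` and the `TRUE` spelling for the error flag are assumptions, since the slide only shows `FALSE`. The sketch builds the URL and does not send it.

```python
from urllib.parse import urlencode, quote

# Endpoint and parameter names are taken from the sample URL on the slide.
LOG_SERVICE_URL = "http://api.bix.corp.ebay.com/LogService/Log"


def build_log_url(app, function, user, execution_time_ms, error, note):
    """Return the GET URL for one log entry (hypothetical helper)."""
    params = {
        "app": app,
        "function": function,
        "user": user,
        "executiontime": execution_time_ms,
        # Assumption: the service accepts TRUE/FALSE; only FALSE is shown.
        "error": "TRUE" if error else "FALSE",
        "note": note,
    }
    # quote_via=quote encodes spaces as %20, matching the sample URL.
    return LOG_SERVICE_URL + "?" + urlencode(params, quote_via=quote)


url = build_log_url("TEST", "TestFunction", "tcase", 9999, False,
                    "This is a sample")
print(url)
```

Because every field travels in the query string, any tool that can issue an HTTP GET can log an event, which is what keeps the approach low friction.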
29. PROBLEM
•Difficult to monitor multiple applications
•No single point-of-view
CROSS-PLATFORM INSIGHT
30. PROBLEM
•Difficult to monitor multiple applications
•No single point-of-view
SOLUTION
CROSS-PLATFORM INSIGHT
43. METRICS EXPLORER
•Integrated with internal DataHub
•Leverages MicroStrategy Visual Insight
•Access to Intelligent Cubes for quick analysis
•Warehouse reports also available for deeper analysis
•Can save and share results
48. ELEMENT CACHING SERVICE
•Improve prompt performance
– Critical for Metrics Explorer
•Daily and weekly schedules
•Developer self-service
53. WEB BASED COMMAND MANAGER (WBCM)
•No need to install Command Manager
•Ability to trigger events via REST service
•Simple workflows supported
•Simplified user management
•Empowers developers
58. SELF SERVICE MIGRATION TOOL (SSMT)
•Developer-focused tool for object management
•Enables software development lifecycle (SDLC)
•Built-in automated testing
•Improved visibility into object migration
•Leverages System Manager
59. SELF-SERVICE MIGRATION TOOL
1. Get Object Manager access
2. Generate package using Object Manager
3. Upload package
4. Import package
o Update Schema
o Purge Object Cache
o Purge Element Cache
o Email Notifications
60. OData
•Industry-standard Open Data Protocol
•Get data in XML or JSON format
•Consume MicroStrategy data from your tool of choice:
– Excel
– Tableau
– Visualization libraries (D3.js, Highcharts)
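As an illustration of what consuming an OData feed looks like from a script, here is a small sketch that builds a query URL using the standard OData system query options `$format`, `$top` and `$filter`. The service root, the entity set name `DailySales` and the `Site` field are hypothetical; the slides do not give the actual endpoint or schema.

```python
from urllib.parse import urlencode, quote

# Hypothetical service root; the slides do not name the real endpoint.
SERVICE_ROOT = "https://odata.example.com/bi"


def build_odata_url(entity_set, top=None, filter_expr=None, fmt="json"):
    """Build an OData query URL from standard system query options."""
    options = {"$format": fmt}
    if top is not None:
        options["$top"] = top
    if filter_expr:
        options["$filter"] = filter_expr
    # Keep "$" literal in option names; encode spaces and quotes in values.
    query = urlencode(options, quote_via=quote, safe="$")
    return f"{SERVICE_ROOT}/{entity_set}?{query}"


# Hypothetical entity set and field names, for illustration only.
odata_url = build_odata_url("DailySales", top=10, filter_expr="Site eq 'US'")
print(odata_url)
```

The same URL could be pasted into Excel, Tableau or a JavaScript charting page, which is the appeal of exposing the data through a standard protocol.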
Today’s eBay isn’t what it used to be. Many people think of us only as an auction site. But that perception hasn’t kept up with reality. The reality is that more than 70% of what is sold on eBay is new merchandise, available for purchase immediately.
At eBay we believe that commerce will change more in the next 3 years than it has in the past 20. It’s consumer driven, and technology enabled. And it’s being led by mobile.
First, consumers are using tablets and smartphones as a mission control device:
• Research
• Inspiration
• Coupons or other discounts
• Chatting
• Weather
And then they’re buying, straight from their mobile devices.
It’s clear, retail and commerce are fundamentally changing – and technology is the driving force. We expect this will cause the $10 Trillion commerce market to be turned on its head in the years to come.
In 2013, our mobile business continued to accelerate. There were 120 million mobile application downloads; our iPhone and iPad apps are now available in 8 languages and in 190 countries. eBay generated $13 billion in mobile GMV in 2012 and $22 billion in mobile GMV in 2013.
When you put all of this together, our business is actually pretty simple – eBay is about connecting people to the things they need and love.
Start off with some data statistics:
• More than 12,000 named users
• More than 55,000 chains of logic
• More than 150,000 data elements
• Millions of queries run on our platforms every day
• More than 40 terabytes of backup per hour
• More than 3.5 trillion rows in our largest table
• 100 TB of new data every day
• 100 PB a day of physical I/O
• 98 node; hundreds of small Oracle databases on an hourly basis. An older 128-node system is used as its DR. It is managed for high availability. Software releases are adopted after enough time has been given to shake out bugs. Hardware components are all enterprise class for performance and reliability.
• Singularity: 256-node Teradata system with 36 PB of spinning disks. User behavior data; A/B testing. Software releases have been adopted very aggressively to benefit from new features earlier in the process. Since the workload is not quite as diverse as the EDW, the exposure to bugs is not as significant. The previous production generation serves as a DR system.
• 20 PB Hadoop system: structured and unstructured data. Pattern detection and behavioral data.
• Started in Access
• A couple of years later moved to Oracle, Informatica and Business Objects
• First Teradata system around 2002
• Migrated to MicroStrategy in 2003
• SAS added to the capabilities in 2004
• VDMs in 2006
• Tableau in 2009
• Hadoop gained momentum a couple of years ago
Close to 800GB of cubes
Heavy user of cubes due to Teradata performance
At the beginning of 2012 we had 256GB of RAM
Some cube processing was taking 18+ hours
Loading cubes from shared storage was taking too long
Backend network connects to Teradata and carries NFS traffic
Also exploring use for vanity URLs
External process built with SDK
Runs every 5 minutes and writes to database table
Output formatted using Freeform SQL and HTML tags
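To make the idea of gauging freshness at a glance concrete, here is a rough sketch of the kind of check such a monitor could apply to each cube's last refresh time. It is an illustration in Python, not the eBay implementation (which formatted its output with Freeform SQL); the thresholds, colors and function names are all assumptions.

```python
from datetime import datetime, timedelta
from html import escape

# Hypothetical thresholds; the talk does not say how freshness was graded.
FRESH = timedelta(hours=24)
STALE = timedelta(hours=48)


def freshness_status(last_refresh, now):
    """Map the age of a cube's last refresh to a traffic-light color."""
    age = now - last_refresh
    if age <= FRESH:
        return "green"
    if age <= STALE:
        return "yellow"
    return "red"


def to_html_row(cube_name, last_refresh, now):
    """Render one cube as an HTML table row colored by freshness."""
    status = freshness_status(last_refresh, now)
    return (f"<tr><td>{escape(cube_name)}</td>"
            f"<td>{last_refresh:%Y-%m-%d %H:%M}</td>"
            f'<td style="background:{status}">{status}</td></tr>')


now = datetime(2014, 1, 2, 12, 0)
print(to_html_row("Cube 01", datetime(2014, 1, 2, 6, 0), now))
```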
EM only provides visibility into server-side processing
JavaScript plugin to measure client-side performance:
• Time
• Browser
• Web server
• Page rendering time
• Report/dashboard name
Combine with internal user data to determine location
Includes log agent for log4j
Used within plugins and custom widgets:
• Application
• Function
• User
• Execution time (ms)
• Was it an error?
• Message (note)
Web and Mobile on Windows due to SSO requirements
Couldn’t achieve performance requirements using NAS or SAN
Local SSD performed better
All flash storage bubble on 10Gbps network
2 billion row limit for Intelligent Cubes (240GB cube)
Linear scalability
Reduce dependency
Sub-5-second response time
Same datasets available for MicroStrategy and Tableau
Can also be leveraged in other analytic applications like SAS and R
No 2 billion row limit