2. About Google
Google Products
Background
Distributed Parallel Computing
Layered Architecture & Abstraction
Google Architecture
Computing Infrastructure
Software Infrastructure
App Engine: Google platform for your
Enterprise
Agenda
3. Google Mission: to organize the world’s information
and make it universally accessible and useful
Established on September 4, 1998
Today Google runs over one million servers in datacenters
around the world
Processes over one billion search requests
and twenty petabytes (1 PB = 10^15 B) of user-generated data
every day
Google at a glance
4. Some of the products Google provides
Google Search
Gmail
Maps
YouTube
Google Docs
Google Calendar
App Engine
And many more
Most of their products are web based
They serve millions of people and they store users’ data in the “cloud”
How do they do that? What is under the hood?
Google products
5. About Google
Google Products
Background
Distributed Parallel Computing
Layered Architecture & Abstraction
Google Architecture
Computing Infrastructure
Software Infrastructure
App Engine: Google platform for your
Enterprise
Agenda
6. One “smart” computer doing the task of summing the cells of an array
sequentially.
Background: (Distributed Parallel Computing)
[Figure: a 4×4 array with rows (1 2 0 3), (3 1 2 2), (5 1 3 3), (4 5 3 6). A single computer computes the row sums 6, 8, 12, 18 and the total, 44.]
7. Five “dumb” distributed computers doing the same task in parallel.
Background: (Distributed Parallel Computing)
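[Figure: the same 4×4 array with rows (1 2 0 3), (3 1 2 2), (5 1 3 3), (4 5 3 6). A master distributes the work to worker servers, which compute the row sums 6, 8, 12, 18; the results are combined into the total, 44. Computation power and memory are distributed.]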
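The two slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads inside one process stand in for the worker servers, one worker is assigned per row, and the function names `sum_sequential` and `sum_distributed` are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# The 4x4 array shown on the slides.
GRID = [
    [1, 2, 0, 3],
    [3, 1, 2, 2],
    [5, 1, 3, 3],
    [4, 5, 3, 6],
]


def sum_sequential(grid):
    """One "smart" computer: visit every cell in order."""
    total = 0
    for row in grid:
        for cell in row:
            total += cell
    return total


def sum_distributed(grid):
    """One master plus one worker per row.

    Threads stand in for worker servers (an assumption of this sketch);
    a real cluster would send each row to a different machine.
    """
    with ThreadPoolExecutor(max_workers=len(grid)) as workers:
        # Distribute: each worker sums a single row.
        row_sums = list(workers.map(sum, grid))
    # The master combines the partial results.
    return row_sums, sum(row_sums)


if __name__ == "__main__":
    print(sum_sequential(GRID))
    print(sum_distributed(GRID))
```

Both functions return the same total; the distributed version also exposes the per-row partial sums that the figure shows next to each row.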
8. Separation of concerns
Structure the system in layers such that each layer has a set of
problems, tasks and processes decoupled from the other layers.
Abstraction
Each layer abstracts a set of functions
and concerns away from the layer above it
Flexibility
Replace an implementation while maintaining
the interface
Background: (Layers & Abstraction)
The trouble with layers of computer
software is that sooner or later you lose
touch with reality.
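The "Flexibility" point on this slide (replace an implementation while maintaining the interface) can be illustrated with a small Python sketch. All class names here (`Storage`, `InMemoryStorage`, `LoggingStorage`, `DocumentService`) are hypothetical and exist only for this example.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """The interface the upper layer depends on (hypothetical)."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        ...

    @abstractmethod
    def get(self, key: str) -> bytes:
        ...


class InMemoryStorage(Storage):
    """First implementation of the lower layer: a plain dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class LoggingStorage(Storage):
    """Replacement implementation: same interface, records every call."""

    def __init__(self):
        self._data = {}
        self.log = []

    def put(self, key, value):
        self.log.append(("put", key))
        self._data[key] = value

    def get(self, key):
        self.log.append(("get", key))
        return self._data[key]


class DocumentService:
    """Upper layer: knows only the Storage interface, not the implementation."""

    def __init__(self, storage: Storage):
        self._storage = storage

    def save(self, name: str, text: str) -> None:
        self._storage.put(name, text.encode("utf-8"))

    def load(self, name: str) -> str:
        return self._storage.get(name).decode("utf-8")
```

`DocumentService` works unchanged with either backend, which is the property the slide describes: the layer above sees only the interface.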
9. About Google
Google Products
Background
Distributed Parallel Computing
Layered Architecture & Abstraction
Google Architecture
Computing Infrastructure
Software Infrastructure
App Engine: Google platform for your
Enterprise
Agenda
10. Architecture (General Overview)
Computing Platform:
- Cost Efficiency
- Server Design
- Networking
- Datacenter Technologies
System Infrastructure:
- Google File System (GFS)
- MapReduce
- BigTable
Google Services
Computing Platform:
Clusters of thousands of
commodity-class PCs
- Reliable (fault tolerance)
- Scalable
- Cost Efficient (low-end
servers)
System Infrastructure:
A layer of software that abstracts the
hardware complexity away from the
developers. It provides features
such as:
- Scheduling
- File access
- Fault management
- And many more
Google Services:
The set of services provided for
the users:
- Usability/User friendliness
- Simplicity
- Performance
- Innovation & solving people’s
problems
11. Google datacenters evolved over time…
google.stanford.edu (circa 1997)
Larry & Sergey (Google founders)
volunteered to receive shipments of
machines that other research groups had ordered,
and held on to them for some time.
Computing Platform
13. Computing Platform
Google’s software architecture arises from two
basic insights *:
o Reliability in software rather than in server-class hardware (thus
they can use commodity PCs)
o Tailor the design for best aggregate request throughput, not
peak server response time (manage response time by
parallelizing individual requests)
* “Web Search for a Planet: The Google Cluster
Architecture” by Luiz André Barroso, Jeffrey Dean & Urs Hölzle
14. Computing Platform
[Figure: a Google server with its parts labeled: dual SATA disks, RAM, 12 VDC sealed lead-acid battery, dual CPUs, power supply]
Google’s custom-made
servers use consumer
products to get the best
performance per unit
of cost.
15. Computing Platform
The servers are placed in racks in
a shipping container (modular
design)
Plug & play (or serve)
The servers interconnect via a 100-
Mbps Ethernet switch that has one or
two gigabit uplinks to a core gigabit
switch that connects all racks
together.
Each shipping container can hold up
to 1,160 servers
“power above, water below,”
Modular design
The Google facility features
a “container hangar” filled
with 45 containers.
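The figures on this slide allow a quick back-of-the-envelope calculation. The sketch below is only arithmetic on the slide's numbers; the function names are made up, and the 40 and 80 servers-per-switch values are an assumption borrowed from the rack sizes mentioned elsewhere in the deck, not something this slide states.

```python
CONTAINERS = 45               # containers in the facility (from the slide)
SERVERS_PER_CONTAINER = 1160  # upper bound per container (from the slide)


def max_servers(containers=CONTAINERS, per_container=SERVERS_PER_CONTAINER):
    """Upper bound on the number of servers in the container hangar."""
    return containers * per_container


def oversubscription(servers_per_switch, uplinks, port_mbps=100, uplink_mbps=1000):
    """Ratio of server-facing bandwidth to uplink bandwidth on one rack switch.

    servers_per_switch is an assumption; the slide gives only the port
    speed (100 Mbps) and the number of gigabit uplinks (one or two).
    """
    return (servers_per_switch * port_mbps) / (uplinks * uplink_mbps)


if __name__ == "__main__":
    print(max_servers())            # 45 * 1160 = 52200
    print(oversubscription(40, 2))  # 4000 / 2000 = 2.0
    print(oversubscription(80, 1))  # 8000 / 1000 = 8.0
```

Under these assumptions the uplinks are oversubscribed by a factor between 2 and 8, which is consistent with a design tuned for aggregate throughput rather than per-server peak bandwidth.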
16. Computing Platform
Some key challenges with datacenter
design:
Powering:
(Google has a backup battery for each
server as opposed to a centralized UPS)
Cooling
(Low-end PCs generate more heat, thus
the datacenter requires more aggressive
cooling)
Cabling and modularity
(Low-end PCs are more prone to failure and
their life span is shorter; thus, those
machines need to be easy to replace)
And much more...
17. Computing Platform
What could go wrong? Many things*...
Overheating (power down most machines)
PDU failure (machines suddenly
disappear)
Rack-move (plenty of warnings)
Rack-failures (40-80 machines instantly
disappear)
Racks go wonky (40-80 machines see
50% packet loss)
Network maintenance (~30 min random
connectivity loss)
Individual machine failures
Thousands of hard drive failures
And much more (slow disks, bad memory,
misconfigured machines, etc.)
Thousands of low-end
machines clustered
together is a
maintenance nightmare!
*Google Seattle Conference on Scalability
18. Computing Platform
Google datacenters are more like a single upgradable machine:
Warehouse-Scale Machines (WSM).
19. Computing Platform
“Cloud” computing or back to mainframe computing?
1960s mainframe machines
serving thin clients
2005 Google datacenters hosting
web applications and serving thin
clients
20. Software Platform
A software layer on top of the computing platform
If one thinks of a Google datacenter as one single machine
(WSM) composed of thousands of individual machines, then
the software platform managing those machines can be
thought of as an operating system for this machine
Some of the main custom tools created by Google:
Google File System (GFS)
MapReduce
BigTable
21. Software Platform (GFS)
Google File System (GFS)
It is designed to provide efficient, reliable access to data
using large clusters of commodity hardware. (from Wikipedia)
Abstracts the storage on distributed, unreliable hardware
Master machines deal with metadata (filenames, mapping from
filename to chunk locations)
64 MB chunks (compared with the 8 KB file system blocks an operating system typically uses on disk)
Every chunk is replicated 3 times on different racks
Responsible for managing failures (if a machine dies,
the data is replicated on another machine)
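The bullets above (metadata on a master, 64 MB chunks, three replicas on different racks, re-replication after a failure) can be mimicked with a toy model. This is a sketch under stated assumptions, not GFS's actual design or API: `ToyMaster` and its methods are invented for illustration, placement always picks the first eligible server in name order, and no file data is actually stored or copied.

```python
import itertools

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as on the slide
REPLICAS = 3                   # every chunk is stored three times


class ToyMaster:
    """Holds metadata only: filename -> chunk ids -> replica locations."""

    def __init__(self, servers):
        self.rack_of = dict(servers)  # server name -> rack name
        self.chunks_of = {}           # filename -> list of chunk ids
        self.locations = {}           # chunk id -> set of server names
        self._ids = itertools.count()

    def _place(self, chunk_id, needed):
        """Add `needed` replicas, each on a rack that holds no replica yet."""
        holders = self.locations.setdefault(chunk_id, set())
        used_racks = {self.rack_of[s] for s in holders}
        for server in sorted(self.rack_of):  # naive: no load balancing
            if needed == 0:
                break
            rack = self.rack_of[server]
            if rack not in used_racks:
                holders.add(server)
                used_racks.add(rack)
                needed -= 1
        if needed:
            raise RuntimeError("not enough racks for the requested replicas")

    def create(self, filename, size_bytes):
        """Split a file into 64 MB chunks and place three replicas of each."""
        n_chunks = max(1, -(-size_bytes // CHUNK_SIZE))  # ceiling division
        ids = [next(self._ids) for _ in range(n_chunks)]
        self.chunks_of[filename] = ids
        for chunk_id in ids:
            self._place(chunk_id, REPLICAS)
        return ids

    def server_died(self, server):
        """Forget a dead server and re-replicate every chunk it held."""
        del self.rack_of[server]
        for chunk_id, holders in self.locations.items():
            if server in holders:
                holders.discard(server)
                self._place(chunk_id, 1)


if __name__ == "__main__":
    master = ToyMaster({"a1": "rackA", "a2": "rackA", "b1": "rackB",
                        "b2": "rackB", "c1": "rackC", "c2": "rackC"})
    chunk_ids = master.create("crawl.log", 150 * 1024 * 1024)
    print(chunk_ids, master.locations)
    master.server_died("b1")
    print(master.locations)
```

A 150 MB file becomes three chunks, each held by one server on each of the three racks; after `b1` dies, its replicas reappear on `b2`, so every chunk again has three copies on three different racks.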
22. Software Platform (MapReduce)
MapReduce
Introduced by Google to support distributed computing on
large data sets on clusters of computers. (from Wikipedia)
Abstracts the computation on distributed, unreliable hardware
The user has to write two functions (Map & Reduce) and the library will take
care of all the hardware-related issues (assigning tasks to machines,
managing machine failures, etc.)
The library will try to make the computation faster by moving the logic
closer to where the chunk data is located
Deals with scalability
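The division of labor described above (the user writes Map and Reduce, the library does the rest) can be sketched with the classic word-count example. This is a single-process stand-in, not Google's library: `run_mapreduce`, `map_words` and `reduce_counts` are hypothetical names, and everything the real library handles (task assignment, machine failures, data locality) is left out.

```python
from collections import defaultdict


def map_words(document):
    """User-written Map: emit (word, 1) for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """User-written Reduce: combine all values emitted for one key."""
    return sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Stand-in for the library (hypothetical name, single process).

    A real implementation would also assign tasks to machines, retry
    failed ones, and run map tasks near the chunks holding their input.
    """
    groups = defaultdict(list)
    for item in inputs:                    # map phase
        for key, value in map_fn(item):
            groups[key].append(value)      # group intermediate values by key
    # reduce phase: one call per distinct key
    return {key: reduce_fn(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    docs = ["the cat sat", "the dog sat"]
    print(run_mapreduce(docs, map_words, reduce_counts))
```

The two user functions contain no mention of machines or failures, which is the abstraction the slide describes.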
23. Software Platform (MapReduce)
Split the data set into N parts (mapping),
where N is equal to the number of
available workers
Wait until all the workers finish their
tasks (some processing is done on
intermediate results)
Compute the final result (reduce
function)
24. Software Platform (BigTable)
BigTable
A compressed, high-performance,
proprietary database system built on Google File
System (GFS), Chubby Lock Service, and a few
other Google programs (from Wikipedia)
Non-relational distributed database created by Google
Built on top of GFS and provides a higher level of abstraction
Implements a subset of the features of a typical DBMS (database management system)
Used by Google Analytics, Google Earth, Personalized Search, App Engine and many more...
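To make "non-relational" concrete: the slide does not describe the data model, but the published Bigtable paper describes it as a sparse, sorted map indexed by row key, column key and timestamp. The sketch below mimics that shape in memory. `ToyTable` and its methods are invented for illustration and are not BigTable's API; there is no distribution, persistence or compression here.

```python
import bisect


class ToyTable:
    """In-memory map: (row key, column key, timestamp) -> value."""

    def __init__(self):
        self._cells = {}  # (row, column) -> sorted list of (timestamp, value)
        self._rows = []   # row keys kept sorted, so range scans are cheap

    def put(self, row, column, value, timestamp):
        """Store one version of a cell; absent cells take no space."""
        versions = self._cells.setdefault((row, column), [])
        bisect.insort(versions, (timestamp, value))
        i = bisect.bisect_left(self._rows, row)
        if i == len(self._rows) or self._rows[i] != row:
            self._rows.insert(i, row)

    def get(self, row, column):
        """Return the most recent value of a cell, or None if it is absent."""
        versions = self._cells.get((row, column))
        return versions[-1][1] if versions else None

    def scan(self, start_row, end_row):
        """Return the row keys in [start_row, end_row), in sorted order."""
        lo = bisect.bisect_left(self._rows, start_row)
        hi = bisect.bisect_left(self._rows, end_row)
        return self._rows[lo:hi]


if __name__ == "__main__":
    table = ToyTable()
    table.put("com.example.b", "contents", "<html>v1</html>", timestamp=1)
    table.put("com.example.b", "contents", "<html>v2</html>", timestamp=2)
    table.put("com.example.a", "language", "en", timestamp=1)
    table.put("org.example", "language", "de", timestamp=1)
    print(table.get("com.example.b", "contents"))
    print(table.scan("com.", "org."))
```

Unlike a relational table, rows need not share columns, each cell can keep several timestamped versions, and lookups go through the row key rather than through joins or SQL.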
25. About Google
Google Products
Background
Distributed Parallel Computing
Layered Architecture & Abstraction
Google Architecture
Computing Infrastructure
Software Infrastructure
App Engine: Google platform for your
Enterprise
Agenda