Evolution of Supermicro GPU Server Solution

TAIPEI | SEP. 21-22, 2016
Benedict Khoo (FAE Manager, APAC Region), September 21st 2016
We Keep IT Green™
"Earth-friendly" Solutions

ABOUT THE COMPANY…
Our DNA…
INNOVATION
• Server Building Blocks methodology
• Application optimized
• First to market
OPEN PLATFORM
• True open platform
• Commoditization with innovation
ENERGY EFFICIENCY
• Excellence in thermal / cooling design
• Titanium power supply
• Perfection in green computing

Supermicro® (NASDAQ:SMCI) is a global leader in high-performance, high-efficiency server technology and innovation. We develop and provide end-to-end green computing solutions to the datacenter, cloud computing, enterprise IT, big data, HPC, and embedded markets. Our solutions range from complete servers, storage, blades, and workstations to full racks, networking devices, server management software, and technology support and services.
We perform the majority of our R&D efforts in-house, which increases communication and collaboration between design teams, streamlines the development process, and reduces time-to-market. We have developed a set of design principles that allow us to aggregate individual industry-standard components and materials to develop truly optimized server boards, chassis, power supplies, networking, and storage devices. This building block approach allows us to provide a broad range of SKUs and enables us to build and deliver application-optimized solutions based upon customers' requirements.

KEY ADVANTAGES AND FEATURES
• Widest range of supported solutions (up to 7U)…
• Highest-density solutions, supporting up to 10x GPGPUs per node
• Designs that maximize performance per watt, per sq. ft., and per dollar…
• Unique green computing architecture features…
• Full-bandwidth capable for optimal I/O performance…
• 50+ SKUs offered…
http://www.supermicro.com.tw/white_paper/white_paper_1U_4GPU_Server.pdf

GPU Solutions – HPC/Grid Optimized
“The most comprehensive product line in the Industry”
2008 – GPGPU: Where it started…
• Tesla S1070, PCI-E x16, 1U 4-GPU standalone server
2009 – Hybrid Computing Pioneer
• 1U Twin™
• Integrated GPU server
• The fastest 1U server in the world
2011 – GPU Blades
• GPU server & workstation
• 7U 10-blade 20-GPUs
2013 – GPU FatTwin™
2015 – 1U 4-GPU Optimized
✓ High efficiency power supplies at full capacity
✓ Excellent thermal design
✓ Non-blocking air-flow
✓ Greatest performance layout
✓ No re-driver required; no latency
2016 – Next Gen GPU Innovation – Latency and Performance Optimized

SUPERMICRO SUPERBLADE REVIEW
HTTP://WWW.SERVETHEHOME.COM/SUPERMICRO-SUPERBLADE-GPU-SYSTEM-REVIEW-SBE-710Q-R90-CHASSIS/
Using the Supermicro GPU SuperBlade platform we quickly saw the benefits in terms of: higher density, higher power supply efficiency, easier maintenance, significantly reduced cabling, and easier upgrades/expansion.
We were impressed by how easy it was to use and manage the system.

3rd Party Server Reviews
Rating: 97%
“As we have said before, the case used for the 7048GR-TR Workstation is
simply the best in quality, craftsmanship, and features. … it is our go-to case
every time. The 7048GR-TR Workstation is designed for maximum uptime with
hot-swappable drives and cooling fans, and includes dual redundant power
supplies.”
— TweakTown
Rating: 9.7
“Overall, for those looking to cram four GPU’s into a small 1U form factor for dense
compute or even VDI applications, the Supermicro 1028GQ-TRT is an excellent
solution. With 10Gbase-T networking, the server is easy to integrate into existing
datacenter infrastructure so long as the rack is able to handle higher-power rated
gear.”
“… we find the 4028GR-TR is a well designed system that has the ability to
handle high performance work loads. Moving to a large 4U server allows larger
capacity cooling systems to be installed that keep the system cool while running
extreme work loads. This is a trade off vs smaller 1U systems which have higher
density but operate at close to maximum heat load capacities.”
— ServeTheHome

STAC-A2 Benchmarks
The STAC-A2 Benchmark suite is the industry
standard for testing technology stacks used for
compute-intensive analytic workloads involved
in pricing and risk management. In all, the
STAC-A2 specifications deliver nearly 200 test
results related to performance, scaling,
efficiency, and quality, which are detailed in this
report.
Test System: Supermicro SYS-1028GR-TR server
World Record Results
Fastest warm time to date in the baseline end-to-end Greeks benchmark (GREEKS.TIME.WARM): 1.27x the speed of the next fastest system (SUT ID: INTC150811).

Supermicro 1U GPU Solution at GSIC Center
- Ranked 1st on the World's Green500 List of Computer Systems
https://www.supermicro.com.tw/products/nfo/Green500.cfm

Optimized Portfolio with Highest Rack-level GPU Density
Best-in-class technology designed for highly parallel applications to deliver ultimate performance, flexibility, and scalability
1018GR – Cost Effective
• Single Haswell/Broadwell CPU
• 8 DDR4 DIMMs
• 6x 2.5” HS HDD bays
• 2 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
• 1x 1400W Platinum PWS
1028GR – Mainstream
• Dual Haswell/Broadwell CPUs
• 16 DDR4 DIMMs
• 3 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
1028GQ – Parallel Optimized
• 16 DDR4 DIMMs
• 4 Double-Width GPUs
• Active/Passive GPUs
• 2 x8 PCIe 3.0 slots

Optimized Portfolio with Highest Node-level GPU Density
Best-in-class technology designed for highly parallel applications to deliver ultimate performance, flexibility, and scalability
7048GR – Mission Critical
• 4U Chassis
• Dual Haswell/Broadwell CPUs w/ IPMI
• 16 DDR4 DIMMs
• 4 Double-Width GPUs
• x16/x8/x4 slots – 4/2/1**
• 2x 2000W Titanium PWS
2028GR – Mainstream
• 2U Chassis
• 16 DDR4 DIMMs
• 6 Double-Width GPUs
• 1 x8 PCIe 3.0 slot
4028GR – Parallel Optimized
• 4U Chassis
• 24 DDR4 DIMMs
• 8 Double-Width GPUs
• 2 x8 PCIe 3.0 slots; 1 x4 PCIe 2.0 slot

Widest Portfolios
RACK / TOWER / MULTI-NODE
RATIO: GPU:CPU (form factor)
*Supports max. 2x double-width GPUs
From GPU ENABLED to GPU OPTIMIZED: HIGHER DENSITY
6:2 (2U)
4:2)
3:2 (1U)
2:1 (1U)
3:2 (4U / 4Node)
4:2 (2U)
1:2 (1U)
1:2 (2U / 2Node)
3*:2 (WS)
6:2 (4U / 2Node)
8:2 (4U)
4:2 (1U)
2:2 (7U / 10Node)
3:2 (2U)
3*:2 (4U)
1:1 (WS)

THE LEADING SOLUTIONS (NEW)
GPU Optimized Server Portfolio
New Generation High Performance Optimal Solutions…

CUSTOMER PAIN POINTS
PROBLEM
Machine Learning / AI applications have large datasets that extend well beyond the capacity of a single GPU.
SOLUTION
Aggregate GPU resources to tackle large dataset computation, in conjunction with high speed connectivity to minimize latency.
Generic ARCHITECTURE
Diagram labels: QPI, PCIe, PCIe
Latency is a major bottleneck in many 8x GPU designs. With constant communication, the QPI + PCIe path is a major constraint. Symmetric PCIe design is NOT efficient for Machine Learning applications.
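The QPI constraint described above can be illustrated with a toy topology model. The sketch below assumes an 8-GPU node laid out either symmetrically (four GPUs behind each of two CPU roots) or with all GPUs under a single CPU root, and counts how many GPU-to-GPU pairs must cross the inter-CPU link. The function name `cross_root_pairs` and the 4+4 split are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations

def cross_root_pairs(gpu_roots):
    """Count GPU pairs whose traffic must cross between CPU roots.

    gpu_roots: list where entry i is the CPU root that GPU i hangs off.
    Returns (pairs crossing roots, total pairs).
    """
    pairs = list(combinations(range(len(gpu_roots)), 2))
    crossing = sum(1 for a, b in pairs if gpu_roots[a] != gpu_roots[b])
    return crossing, len(pairs)

# Hypothetical symmetric dual-root layout: GPUs 0-3 on CPU 0, GPUs 4-7 on CPU 1.
dual_root = [0, 0, 0, 0, 1, 1, 1, 1]
# Hypothetical single-root layout: all eight GPUs behind CPU 0.
single_root = [0] * 8

for name, layout in (("dual root", dual_root), ("single root", single_root)):
    crossing, total = cross_root_pairs(layout)
    print(f"{name}: {crossing} of {total} GPU pairs cross the inter-CPU link")
```

The model only counts paths; it says nothing about actual latency or bandwidth, which depend on the specific platform.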

Highest Density NVIDIA GPU Solution
MAXWELL/PASCAL READY
• Active/Passive GPU support
• Supports the latest Maxwell/Pascal GPUs
• Supports a 10-GPU configuration
X10 SUPERMICRO ADVANTAGE
● PERFORMANCE: GPUs under single CPU Root
● FLEXIBILITY: Supports up to 10x Active/Passive GPUs
● GPU RDMA: Direct Internode GPU Interconnect
● EFFICIENCY: Titanium-rated Power Supply
● DESIGN: No GPU preheating
ADVANTAGES
• GPU compute units under one root can train twice as fast and explore networks twice as large.
• Distributed training across eight GPUs allows the size and speed of the networks to scale by another factor of two.
The most flexible parallel computing solution in the market. Optimized for GPU peering, this architecture enables faster Machine Learning training by up to […] with […] GPUs under a single CPU root!
Single Root Complex Design for World Class Latency Optimized Solution
Super High Computing Capability
Highest Performance/Watt Capabilities

Optimized Solution With NVIDIA Pascal GPU Architecture
PASCAL GPU READY
• Performance – 10 TFLOPs FP32
• NVLink Advance Technology
• 3D Memory - 2x Memory Bandwidth
X10 SUPERMICRO ADVANTAGE
● PERFORMANCE: 8x PASCAL with GPUs IN 1U/ 4U
● NVLINK: 80GB/s High Bandwidth GPU Interconnect
● RDMA FABRIC: 4x Direct Low Latency Data Access
● EFFICIENCY: Titanium-rated Power Supply
● DESIGN: No GPU preheating
ADVANTAGES
• All GPUs capable of Peer-to-Peer direct access to all other GPUs’ memory as well as
direct transfer (memcpy) operations via NVLink at high Bandwidth
• High performance for collective communications
• PCIe bandwidth fully available for host and/or NIC communication during inter-GPU
communication
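To put the 80GB/s NVLink figure quoted above in perspective, a back-of-the-envelope transfer-time estimate can be sketched as follows. The 80 GB/s value comes from the slide; the PCIe 3.0 x16 figure (about 15.75 GB/s per direction, theoretical) and the 1 GB buffer size are assumptions added for comparison, and real transfers will see lower effective bandwidth.

```python
def transfer_time_ms(num_bytes, bandwidth_gb_per_s):
    """Ideal time in milliseconds to move num_bytes at a given bandwidth.

    Bandwidth is in GB/s with 1 GB = 1e9 bytes; protocol overhead is ignored.
    """
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3

NVLINK_GB_S = 80.0       # figure quoted on the slide
PCIE3_X16_GB_S = 15.75   # assumption: theoretical PCIe 3.0 x16, one direction
BUFFER_BYTES = 10**9     # assumption: a 1 GB buffer copied between two GPUs

for name, bw in (("NVLink", NVLINK_GB_S), ("PCIe 3.0 x16", PCIE3_X16_GB_S)):
    ms = transfer_time_ms(BUFFER_BYTES, bw)
    print(f"{name}: {ms:.1f} ms for {BUFFER_BYTES / 1e9:.0f} GB")
```

This is an idealized estimate only; it does not model latency, link topology, or contention.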
Unparalleled 1U platform for the most highly parallel applications. No one else can do so much in a 1U! Up to […] NVIDIA GPUs with Pascal Architecture in […], supporting optimized GPU RDMA

THANK YOU
For more information, please talk to our representatives
WWW.SUPERMICRO.COM/GPU
