1
2
THE EXPANDING UNIVERSE OF HPC
JENSEN HUANG | SC19
3
AT THE INTERSECTION OF GRAPHICS, SIMULATION, AI
4
5
COMPUTING FOR THE DA VINCIS OF OUR TIME
FIRST AI SUPERCOMPUTERS
FIRST EXASCALE SCIENCE
42 NEW TOP 500 SYSTEMS
ABCI
SUMMIT
CLIMATE
LBNL | NVIDIA
GENOMICS
ORNL
NUCLEAR WASTE REMEDIATION
LBNL | PNNL | Brown U. | NVIDIA
CANCER DETECTION
ORNL | Stony Brook U.
6
FULL STACK
SPEED-UP
CUDA-X
CUDA
AI | DRIVE | METRO | ISAAC | CLARA | RAPIDS | AERIAL | CG
CUDA 10.2
cuTENSOR 1.0
cuSOLVER 10.3
cuBLAS 10.2
cuDNN 7.6
TensorRT 6.0
DALI 0.15
NCCL 2.5
IndeX 2.1
OptiX 7.0
RAPIDS 0.10
Spark XGBoost
3x in 2 Years
Time to Solution
2017: 27 Hours
2018: 20 Hours
2019: 10 Hours
Amber
Chroma
GROMACS
GTC
LAMMPS
MILC
NAMD
QE
SPECFEM3D
TensorFlow
VASP
Benchmark Application: Amber [PME-Cellulose_NVE], Chroma [szscl21_24_128], GROMACS [ADH Dodec: Dev Prototype], GTC
[moi#proc.in], LAMMPS [LJ 2.5], MILC [Apex Medium], NAMD [stmv_nve_cuda], Quantum Espresso [AUSURF112-jR], SPECFEM3D
[four_material_simple_model], TensorFlow [ResNet-50], VASP [Si-Huge]; GPU node: dual-socket CPUs with 4x V100 GPUs.
7
THE EXPANDING UNIVERSE OF HPC
NETWORK
EDGE ANALYTICS
SIMULATION
AI
Edge
Cloud
Arm
Data
Analytics
Extreme
IO
EXTREME IO
8
INCREDIBLE ADVANCES IN AI
WRITING
DIALOG
TRANSLATION
SUMMARIZATION
Q&A
CLASSIFICATION
2012 2019
BERT
TRANSFORMER
ALEXNET
CNN
3D POSE
DENOISING
SEGMENTATION
OBJECT RECOGNITION
CLASSIFICATION
IMAGE GENERATION
9
GPU COMPUTING POWERS AI ADVANCES
#1 MLPERF — AI TRAINING + AI INFERENCE HPC COMPUTING CHALLENGE
Two Distinct Eras of AI Training
Doubling: 2 Years
Doubling: 3.4 Months
Super Moore’s Law: From 600 to 2 Hours in 5 Years
Time to Train (ResNet-50)
K80 SERVER: 600 Hours
DGX: 2 Hours
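As a rough check on the "From 600 to 2 Hours in 5 Years" figure, the sketch below computes the speed-up and the performance doubling period it implies. It assumes steady exponential improvement over exactly 60 months, which the slide does not state, and the function name is illustrative only.

```python
import math

def implied_doubling_months(t_start_hours, t_end_hours, elapsed_months):
    """Doubling period implied by a drop in time-to-train.

    Assumes steady exponential improvement over the whole interval;
    that is an assumption of this sketch, not a claim from the slide.
    """
    speedup = t_start_hours / t_end_hours
    return elapsed_months / math.log2(speedup)

# Figures from the slide: 600 hours down to 2 hours in 5 years.
speedup = 600 / 2
doubling = implied_doubling_months(600, 2, elapsed_months=5 * 12)
# For comparison: the gain from doubling every 2 years, sustained for 5 years.
two_year_gain = 2 ** (5 / 2)

print(f"speed-up: {speedup:.0f}x")
print(f"implied doubling period: {doubling:.1f} months")
print(f"doubling every 2 years for 5 years: {two_year_gain:.1f}x")
```

Under these assumptions the ResNet-50 result corresponds to a 300x speed-up, compared with under 6x from a two-year doubling cadence over the same period.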
10
NVIDIA AI END-TO-END PLATFORM
TRAINING: DGX | CLOUD: HGX | EDGE AI: EGX | AUTONOMOUS MACHINES: AGX
11
AI FOR SCIENCE
EXPERIMENTATION
DATA
SIMULATION
DATA
NEURAL ESTIMATION
Real-time Steering | Fast Approximation
Design Space Exploration
ICF + MERLIN — Fusion
Inverse Problems
LIGO — Gravitational Waves
Faster Prediction
ANI + MD – Chemistry
Real-time Steering
ITER – Fusion Energy
12
13
1x
Data Transfer
100x
Data Collected
STREAMING AI
SOFTWARE-DEFINED
SENSORS
BUILD MODELS
STREAMING AI PROCESSING
ECMWF: 287 TB/day
LSST: 20 TB/day
SKA: 16 TB/sec
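The three instrument data rates on this slide are given in different units. A small sketch that puts them on a common footing, assuming decimal units (1 TB = 1000 GB) and sustained rates over a full day; the helper name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Rates as printed on the slide, normalized to TB/day.
# Decimal units (1 TB = 1000 GB) are an assumption of this sketch.
rates_tb_per_day = {
    "ECMWF": 287.0,
    "LSST": 20.0,
    "SKA": 16.0 * SECONDS_PER_DAY,  # the slide gives SKA as 16 TB/sec
}

def tb_per_day_to_gb_per_sec(tb_per_day):
    """Convert a sustained TB/day rate to GB/s."""
    return tb_per_day * 1000 / SECONDS_PER_DAY

for name, tb_day in rates_tb_per_day.items():
    gb_s = tb_per_day_to_gb_per_sec(tb_day)
    print(f"{name}: {tb_day:,.0f} TB/day = {gb_s:,.2f} GB/s")
```

On these assumptions the SKA figure is several thousand times the ECMWF figure, which is the gap between data collected and data transferred that the slide points at.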
14
NVIDIA EGX STACK
NGC
Kubernetes Networking Storage Security
CUDA-X
Third-Party ISVs
METROPOLIS
IMAGE
PROCESSING
DECODE DNN GRAPHICS ENCODE
DEEPSTREAM
Powered by NVIDIA CUDA Tensor Core GPU | Secured Boot Root of Trust
Cryptographic Acceleration for IPsec and TLS | NVMe-oF over TCP and RDMA
Industrial-strength Cloud Native and AI Stack
NVIDIA EGX EDGE SUPERCOMPUTING PLATFORM
15
VERTICAL INDUSTRY FRAMEWORKS
Clara Metropolis
Isaac Omniverse Aerial
DRIVE
WORLD’S LARGEST DELIVERY SERVICE
ADOPTS NVIDIA AI
PUTTING AI TO WORK
16
NVIDIA EGX
Edge Supercomputing Platform
17
SUPERCOMPUTING CLOUD
CPU Instance 48 Hours, $152
Amber, Chroma,
GROMACS, GTC, LAMMPS
MILC, NAMD, QE, SPECFEM3D,
TensorFlow, VASP
SUPER COMPUTING IS HARD — CLOUD HPC IS EXPENSIVE
18
SUPERCOMPUTING CLOUD
8x GPU Instance
1x GPU Instance
CPU Instance 48 Hours, $152
Amber, Chroma,
GROMACS, GTC, LAMMPS
MILC, NAMD, QE, SPECFEM3D,
TensorFlow, VASP
SUPER COMPUTING IS HARD —
GPU CLOUD 1/7TH COST OF CPU CLOUD
48x Faster, 1/7th the Cost
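To make the cost claim concrete, the sketch below works through the arithmetic. It assumes that "48x Faster, 1/7th the Cost" is measured against the "CPU Instance 48 Hours, $152" figure on the same slide; the hourly prices are derived here and are not stated in the deck.

```python
# Figures from the slide.
cpu_hours, cpu_cost_usd = 48.0, 152.0   # "CPU Instance 48 Hours, $152"
speedup, cost_fraction = 48.0, 1 / 7    # "48x Faster, 1/7th the Cost"

# Implied GPU-instance run time and total cost for the same job.
gpu_hours = cpu_hours / speedup
gpu_cost_usd = cpu_cost_usd * cost_fraction

# Implied hourly prices (derived, not stated on the slide).
cpu_rate = cpu_cost_usd / cpu_hours
gpu_rate = gpu_cost_usd / gpu_hours

print(f"GPU run: {gpu_hours:.1f} h, ${gpu_cost_usd:.2f} total")
print(f"hourly: CPU ${cpu_rate:.2f}/h, GPU ${gpu_rate:.2f}/h")
print(f"GPU hourly price is {gpu_rate / cpu_rate:.2f}x the CPU price")
```

Under that reading, the GPU instance can cost close to seven times more per hour and still finish the job at one seventh of the total cost, because it runs for 1 hour instead of 48.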
19
ICECUBE OBSERVATORY
DETECTING NEUTRINOS
50K NVIDIA GPUs IN THE CLOUD
350 PF OF SIMULATION FOR 2 HOURS
PRODUCED 5% OF ANNUAL SIMULATION DATA
AWS, MICROSOFT AZURE, GOOGLE CLOUD PLATFORM
DISTRIBUTED ACROSS U.S., EUROPE, APAC
Frank Würthwein, Ph.D.
Executive Director, Open Science Grid
Igor Sfiligoi
Lead Developer and Researcher
MULTIPLE GENERATIONS, ONE APPLICATION
[Chart: Events Processed Per GPU Type; V100, M60, K80, T4, P40, P100]
THE LARGEST CLOUD SIMULATION IN HISTORY
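A back-of-the-envelope reading of the IceCube figures on this slide. It assumes the 350 PF is the aggregate across all 50K GPUs and that simulation output scales linearly with run time; because the run mixed several GPU generations, the per-GPU number is only an average.

```python
# Figures from the slide.
total_pflops = 350.0    # "350 PF OF SIMULATION"
gpu_count = 50_000      # "50K NVIDIA GPUs IN THE CLOUD"
run_hours = 2.0         # "FOR 2 HOURS"
annual_share = 0.05     # "PRODUCED 5% OF ANNUAL SIMULATION DATA"

# Average throughput per GPU across the mixed fleet (1 PF = 1000 TF).
avg_tflops_per_gpu = total_pflops * 1000 / gpu_count

# Hours at this scale needed for a full year of simulation data,
# assuming output scales linearly with run time (an assumption).
hours_for_full_year = run_hours / annual_share

print(f"average per GPU: {avg_tflops_per_gpu:.1f} TFLOPS")
print(f"full year at this scale: {hours_for_full_year:.0f} hours")
```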
20
Up to 800 V100 GPUs Connected via Mellanox InfiniBand
ANNOUNCING
WORLD’S LARGEST ON-DEMAND SUPERCOMPUTER
21
DIVERSE ARM ARCHITECTURES
AMPERE COMPUTING eMAG
Hyperscale and Storage
AMAZON GRAVITON
Hyperscale and SmartNIC
MARVELL THUNDERX2
Hyperscale, Storage and HPC
FUJITSU A64FX
Supercomputing
HUAWEI KUNPENG 920
Big Data and Edge
22
NVIDIA CUDA ON ARM AT OAK RIDGE NATIONAL LAB
Benchmark Application [Dataset]: GROMACS [ADH Dodec: Dev Prototype], LAMMPS [LJ 2.5], MILC [Apex Small],
NAMD [apoa1_npt_cuda], Quantum Espresso [AUSURF112-jR], Relion [Plasmodium Ribosome], SPECFEM3D
[four_material_simple_model], TensorFlow [ResNet50: Batch:256]; CPU node: 2x ThunderX2 9975; GPU node:
same CPU node + 2x V100 32GB PCIe; *1x V100 for GROMACS, MILC, and TensorFlow
23
ANNOUNCING
NVIDIA HPC FOR ARM
HPC Server Reference Platform | 8 V100 Tensor Core GPUs with NVLink
4x 100 Gbps Mellanox InfiniBand | Systems Ranging from Supercomputer, Hyperscale, to Edge
CUDA on Arm Beta Available Now
NIC PCIe Switch PCIe Switch NIC
CPU CPU
GPU
GPU
GPU
GPU
24
25
ANNOUNCING
NVIDIA HPC FOR ARM
HPC Server Reference Platform | 8 V100 Tensor Core GPUs with NVLink
4x 100 Gbps Mellanox InfiniBand | Systems Ranging from Supercomputer, Hyperscale, to Edge
CUDA on Arm Beta Available Now
APPLICATIONS
PROGRAMMING MODELS
C++
CUDA
FORTRAN
COMET
DCA++
GROMACS
INDEX
LAMMPS
LSMS
MATLAB
MILC
NAMD
OPTIX
RELION
TENSORFLOW
PARAVIEW
OPENACC
PYTHON
ARM ALLINEA STUDIO
BRIGHT COMPUTING
CMAKE
CUDA-GDB
CUPTI
GCC
LLVM
NVCC
PAPI
SINGULARITY
SLURM
TAU
GAMERA
SDKS
QUANTUM ESPRESSO
PERFORCE TOTALVIEW
PGI
SCORE-P
VMD
26
27
EXTREME COMPUTE NEEDS EXTREME IO
TRADITIONAL RDMA
[Diagram: NODE A and NODE B, each with PCIe Switch, CPU, System Memory, GPU, and NIC; labeled 50 GB/s]
28
EXTREME COMPUTE NEEDS EXTREME IO
GPUDIRECT RDMA
[Diagram: NODE A and NODE B, each with PCIe Switch, CPU, System Memory, GPU, and NIC; labeled 100 GB/s]
29
EXTREME COMPUTE NEEDS EXTREME IO
TRADITIONAL STORAGE
[Diagram: PCIe Switch, CPU, System Memory, GPU; labeled 50 GB/s]
GPUDIRECT RDMA
[Diagram: NODE A and NODE B, each with PCIe Switch, CPU, System Memory, GPU, and NIC; labeled 100 GB/s]
30
EXTREME COMPUTE NEEDS EXTREME IO
GPUDIRECT STORAGE
[Diagram: PCIe Switch, CPU, System Memory, GPU, Storage; labeled 100 GB/s]
GPUDIRECT RDMA
[Diagram: NODE A and NODE B, each with PCIe Switch, CPU, System Memory, GPU, and NIC; labeled 100 GB/s]
31
ANNOUNCING NVIDIA MAGNUM IO
Acceleration Libraries for Large-scale HPC and IO
High-bandwidth, Low-latency, Massive Storage Access with Lower CPU Utilization
GPUDIRECT STORAGE
[Diagram: PCIe Switch, CPU, System Memory, GPU, Storage; labeled 100 GB/s]
GPUDIRECT RDMA
[Diagram: NODE A and NODE B, each with PCIe Switch, CPU, System Memory, GPU, and NIC; labeled 100 GB/s]
32
PYTHON
CUDA
APACHE ARROW
CUDF CUGRAPH
RAPIDS
CUML
PANDAS | SCIKIT-LEARN / XGBOOST
CUDNN
DEEP LEARNING
FRAMEWORKS
DASK
NVIDIA RAPIDS DATA SCIENCE
Open Source | Multi-GPU and Multi-Node | Up to 100x Speed-Up | 150K Downloads in 1 Year
Data Load and Processing Times from Hours to Minutes | Used by NERSC, ORNL, NASA, SDSC
33
NVIDIA MAGNUM IO BOOSTS RAPIDS DATA ANALYTICS
20x ON TPC-H
STRUCTURAL BIOLOGY: 3x VMD
NEW PANGEO XARRAY ZARR READER FOR CLIMATE
Q4 TPC-H Benchmark Work Breakdown: With Repeated Query
[Chart: Latency (msec), axis 0 to 1,200,000; WITHOUT GDS vs. WITH GDS; stages: CUDA Startup, GPU and CPU Allocation, Data Preload, Warmup Query, Repeat Query, Clean Up, Driver Close]
34
ANNOUNCING
WORLD’S LARGEST INTERACTIVE VOLUME VISUALIZATION
Simulating Mars Lander with FUN3D | Interactively Visualizing 150 TB; Unstructured Mesh
4 NVIDIA DGX-2 Streaming 400 GB/s | NVIDIA Magnum IO | NVIDIA IndeX
35
ANNOUNCING
NVIDIA DGX-2 AS SUPERCOMPUTING ANALYTICS INSTRUMENT
16 V100 GPUs - 2 PF Tensor Core | 512 GB HBM2 - 16 TB/s | 8 MLNX CX5 - 800 Gbps
30 TB NVMe - 53 GB/s with Magnum IO | Fabric Storage - 100 GB/s with Magnum IO
2.3x Faster Than Current IO500 10-node Leader
Powered by NVIDIA Magnum IO
EXTREME WEATHER AI INFERENCE | NVIDIA TENSORRT
3D VOLUME ANALYTICS | PANGEO XARRAY
VMD COMPUTATIONAL MICROSCOPE | NVIDIA OPTIX
3D INTERACTIVE VOLUME RENDERING | NVIDIA INDEX
TPC-H RECORD 10 TB JOIN | NVIDIA RAPIDS
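The aggregate DGX-2 figures on this slide can be sanity-checked by dividing them back down to per-device values. The sketch below uses only the totals printed on the slide; the one outside assumption is the usual 8 bits per byte conversion, ignoring protocol overhead.

```python
# Totals as printed on the slide.
gpus = 16               # "16 V100 GPUs"
tensor_pflops = 2.0     # "2 PF Tensor Core"
hbm2_gb = 512           # "512 GB HBM2"
hbm2_tb_per_s = 16.0    # "16 TB/s"
nics = 8                # "8 MLNX CX5"
nic_total_gbps = 800    # "800 Gbps"

# Per-device values implied by the totals (1 PF = 1000 TF).
per_gpu_tflops = tensor_pflops * 1000 / gpus
per_gpu_hbm_gb = hbm2_gb / gpus
per_gpu_hbm_tb_s = hbm2_tb_per_s / gpus
per_nic_gbps = nic_total_gbps / nics

# Network total in bytes: 8 bits per byte, no protocol overhead (assumption).
nic_total_gb_per_s = nic_total_gbps / 8

print(f"per GPU: {per_gpu_tflops:.0f} TFLOPS, {per_gpu_hbm_gb:.0f} GB HBM2, "
      f"{per_gpu_hbm_tb_s:.1f} TB/s")
print(f"per NIC: {per_nic_gbps:.0f} Gbps; all NICs: {nic_total_gb_per_s:.0f} GB/s")
```

The 800 Gbps network total works out to 100 GB/s, which lines up with the "Fabric Storage - 100 GB/s with Magnum IO" figure on the same slide.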
36
THE EXPANDING UNIVERSE OF HPC
NETWORK
EDGE ANALYTICS
SIMULATION
EXTREME IO
NVIDIA HPC for ARM
NVIDIA EGX Edge Supercomputing Platform
NVIDIA DGX-2 Supercomputing Analytics Instrument
NVIDIA Magnum IO
NGC
Azure
37

Weitere ähnliche Inhalte

Was ist angesagt?

Hot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUHot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUAMD
 
NVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October SummaryNVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October SummaryNVIDIA
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD
 
Tilera tile64 by Ibrahem Batta
Tilera tile64  by Ibrahem BattaTilera tile64  by Ibrahem Batta
Tilera tile64 by Ibrahem BattaIbrahem Batta
 
Parallel computing with Gpu
Parallel computing with GpuParallel computing with Gpu
Parallel computing with GpuRohit Khatana
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLinside-BigData.com
 
10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)Akhila Dakshina
 
High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...Pradeep Redddy Raamana
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practicesLior Sidi
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
Introduction to FPGA acceleration
Introduction to FPGA accelerationIntroduction to FPGA acceleration
Introduction to FPGA accelerationMarco77328
 
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...Red Bend Software
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPUKoan-Sin Tan
 
Green computing
Green computingGreen computing
Green computingsubtlejaya
 
Introduction to Deep Learning (NVIDIA)
Introduction to Deep Learning (NVIDIA)Introduction to Deep Learning (NVIDIA)
Introduction to Deep Learning (NVIDIA)Rakuten Group, Inc.
 

Was ist angesagt? (20)

Hot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APUHot Chips: AMD Next Gen 7nm Ryzen 4000 APU
Hot Chips: AMD Next Gen 7nm Ryzen 4000 APU
 
NVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October SummaryNVIDIA GTC 2020 October Summary
NVIDIA GTC 2020 October Summary
 
AMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor ArchitectureAMD EPYC™ Microprocessor Architecture
AMD EPYC™ Microprocessor Architecture
 
HPC in higher education
HPC in higher educationHPC in higher education
HPC in higher education
 
Tilera tile64 by Ibrahem Batta
Tilera tile64  by Ibrahem BattaTilera tile64  by Ibrahem Batta
Tilera tile64 by Ibrahem Batta
 
Parallel computing with Gpu
Parallel computing with GpuParallel computing with Gpu
Parallel computing with Gpu
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
Gpu
GpuGpu
Gpu
 
10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)10. GPU - Video Card (Display, Graphics, VGA)
10. GPU - Video Card (Display, Graphics, VGA)
 
Blue gene
Blue geneBlue gene
Blue gene
 
High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...High performance computing tutorial, with checklist and tips to optimize clus...
High performance computing tutorial, with checklist and tips to optimize clus...
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
GPU Programming
GPU ProgrammingGPU Programming
GPU Programming
 
Introduction to FPGA acceleration
Introduction to FPGA accelerationIntroduction to FPGA acceleration
Introduction to FPGA acceleration
 
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...
Red Bend Software: Separation Using Type-1 Virtualization in Vehicles and Aut...
 
A Peek into Google's Edge TPU
A Peek into Google's Edge TPUA Peek into Google's Edge TPU
A Peek into Google's Edge TPU
 
Green computing
Green computingGreen computing
Green computing
 
Introduction to Deep Learning (NVIDIA)
Introduction to Deep Learning (NVIDIA)Introduction to Deep Learning (NVIDIA)
Introduction to Deep Learning (NVIDIA)
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 

Ähnlich wie NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019

組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステムShinnosuke Furuya
 
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfNVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfMuhammadAbdullah311866
 
NVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa
NVIDIA GPUs Power HPC & AI Workloads in Cloud with UnivaNVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa
NVIDIA GPUs Power HPC & AI Workloads in Cloud with Univainside-BigData.com
 
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...Jürgen Ambrosi
 
Talk on commercialising space data
Talk on commercialising space data Talk on commercialising space data
Talk on commercialising space data Alison B. Lowndes
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platforminside-BigData.com
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA Taiwan
 
Application Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsApplication Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsGanesan Narayanasamy
 
PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018NVIDIA
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUsiguazio
 
GTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyGTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyNVIDIA
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020John Zedlewski
 

Ähnlich wie NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019 (20)

組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム
 
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdfNVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
NVIDIA DGX User Group 1st Meet Up_30 Apr 2021.pdf
 
NVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa
NVIDIA GPUs Power HPC & AI Workloads in Cloud with UnivaNVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa
NVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa
 
Latest HPC News from NVIDIA
Latest HPC News from NVIDIALatest HPC News from NVIDIA
Latest HPC News from NVIDIA
 
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...
2 Sessione - Macchine virtuali per la scalabilità di calcolo per velocizzare ...
 
Talk on commercialising space data
Talk on commercialising space data Talk on commercialising space data
Talk on commercialising space data
 
Advances in GPU Computing
Advances in GPU ComputingAdvances in GPU Computing
Advances in GPU Computing
 
Nvidia at SEMICon, Munich
Nvidia at SEMICon, MunichNvidia at SEMICon, Munich
Nvidia at SEMICon, Munich
 
Nvidia tesla-k80-overview
Nvidia tesla-k80-overviewNvidia tesla-k80-overview
Nvidia tesla-k80-overview
 
Tesla Accelerated Computing Platform
Tesla Accelerated Computing PlatformTesla Accelerated Computing Platform
Tesla Accelerated Computing Platform
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
 
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習NVIDIA DGX-1 超級電腦與人工智慧及深度學習
NVIDIA DGX-1 超級電腦與人工智慧及深度學習
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
 
Application Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systemsApplication Optimisation using OpenPOWER and Power 9 systems
Application Optimisation using OpenPOWER and Power 9 systems
 
PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018
 
Accelerating Data Science With GPUs
Accelerating Data Science With GPUsAccelerating Data Science With GPUs
Accelerating Data Science With GPUs
 
GTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon ValleyGTC 2019 Keynote in Silicon Valley
GTC 2019 Keynote in Silicon Valley
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020
 
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020GPU Accelerated Data Science with RAPIDS - ODSC West 2020
GPU Accelerated Data Science with RAPIDS - ODSC West 2020
 

Mehr von NVIDIA

NVIDIA Story 2023.pdf
NVIDIA Story 2023.pdfNVIDIA Story 2023.pdf
NVIDIA Story 2023.pdfNVIDIA
 
NVIDIA GTC2022 Spring Highlights
NVIDIA GTC2022 Spring HighlightsNVIDIA GTC2022 Spring Highlights
NVIDIA GTC2022 Spring HighlightsNVIDIA
 
NVIDIA Brochure 2021 Company Overview
NVIDIA Brochure 2021 Company OverviewNVIDIA Brochure 2021 Company Overview
NVIDIA Brochure 2021 Company OverviewNVIDIA
 
The Best of AI and HPC in Healthcare and Life Sciences
The Best of AI and HPC in Healthcare and Life SciencesThe Best of AI and HPC in Healthcare and Life Sciences
The Best of AI and HPC in Healthcare and Life SciencesNVIDIA
 
NLP for Biomedical Applications
NLP for Biomedical ApplicationsNLP for Biomedical Applications
NLP for Biomedical ApplicationsNVIDIA
 
Top 5 Deep Learning and AI Stories - August 30, 2019
Top 5 Deep Learning and AI Stories - August 30, 2019Top 5 Deep Learning and AI Stories - August 30, 2019
Top 5 Deep Learning and AI Stories - August 30, 2019NVIDIA
 
Seven Ways to Boost Artificial Intelligence Research
Seven Ways to Boost Artificial Intelligence ResearchSeven Ways to Boost Artificial Intelligence Research
Seven Ways to Boost Artificial Intelligence ResearchNVIDIA
 
NVIDIA Developer Program Overview
NVIDIA Developer Program OverviewNVIDIA Developer Program Overview
NVIDIA Developer Program OverviewNVIDIA
 
NVIDIA at Computex 2019
NVIDIA at Computex 2019 NVIDIA at Computex 2019
NVIDIA at Computex 2019 NVIDIA
 
Top 5 DGX Sessions From GTC 2019
Top 5 DGX Sessions From GTC 2019Top 5 DGX Sessions From GTC 2019
Top 5 DGX Sessions From GTC 2019NVIDIA
 
DGX POD Top 4 Sessions From GTC 2019
DGX POD Top 4 Sessions From GTC 2019DGX POD Top 4 Sessions From GTC 2019
DGX POD Top 4 Sessions From GTC 2019NVIDIA
 
Top 5 Data Science Sessions from GTC 2019
Top 5 Data Science Sessions from GTC 2019Top 5 Data Science Sessions from GTC 2019
Top 5 Data Science Sessions from GTC 2019NVIDIA
 
This Week in Data Science - Top 5 News - April 26, 2019
This Week in Data Science - Top 5 News - April 26, 2019This Week in Data Science - Top 5 News - April 26, 2019
This Week in Data Science - Top 5 News - April 26, 2019NVIDIA
 
CUDA DLI Training Courses at GTC 2019
CUDA DLI Training Courses at GTC 2019CUDA DLI Training Courses at GTC 2019
CUDA DLI Training Courses at GTC 2019NVIDIA
 
DGX Sessions You Won't Want to Miss at GTC 2019
DGX Sessions You Won't Want to Miss at GTC 2019DGX Sessions You Won't Want to Miss at GTC 2019
DGX Sessions You Won't Want to Miss at GTC 2019NVIDIA
 
Transforming Healthcare at GTC Silicon Valley
Transforming Healthcare at GTC Silicon ValleyTransforming Healthcare at GTC Silicon Valley
Transforming Healthcare at GTC Silicon ValleyNVIDIA
 
OpenACC Monthly Highlights February 2019
OpenACC Monthly Highlights February 2019OpenACC Monthly Highlights February 2019
OpenACC Monthly Highlights February 2019NVIDIA
 
CUDA Sessions You Won't Want to Miss at GTC 2019
CUDA Sessions You Won't Want to Miss at GTC 2019CUDA Sessions You Won't Want to Miss at GTC 2019
CUDA Sessions You Won't Want to Miss at GTC 2019NVIDIA
 
Empowering Radiology with AI
Empowering Radiology with AIEmpowering Radiology with AI
Empowering Radiology with AINVIDIA
 
Top 5 Deep Learning and AI Stories - November 30, 2018
Top 5 Deep Learning and AI Stories - November 30, 2018Top 5 Deep Learning and AI Stories - November 30, 2018
Top 5 Deep Learning and AI Stories - November 30, 2018NVIDIA
 

Mehr von NVIDIA (20)

NVIDIA Story 2023.pdf
NVIDIA Story 2023.pdfNVIDIA Story 2023.pdf
NVIDIA Story 2023.pdf
 
NVIDIA GTC2022 Spring Highlights
NVIDIA GTC2022 Spring HighlightsNVIDIA GTC2022 Spring Highlights
NVIDIA GTC2022 Spring Highlights
 
NVIDIA Brochure 2021 Company Overview
NVIDIA Brochure 2021 Company OverviewNVIDIA Brochure 2021 Company Overview
NVIDIA Brochure 2021 Company Overview
 
The Best of AI and HPC in Healthcare and Life Sciences
The Best of AI and HPC in Healthcare and Life SciencesThe Best of AI and HPC in Healthcare and Life Sciences
The Best of AI and HPC in Healthcare and Life Sciences
 
NLP for Biomedical Applications
NLP for Biomedical ApplicationsNLP for Biomedical Applications
NLP for Biomedical Applications
 
Top 5 Deep Learning and AI Stories - August 30, 2019
Top 5 Deep Learning and AI Stories - August 30, 2019Top 5 Deep Learning and AI Stories - August 30, 2019
Top 5 Deep Learning and AI Stories - August 30, 2019
 
Seven Ways to Boost Artificial Intelligence Research
Seven Ways to Boost Artificial Intelligence ResearchSeven Ways to Boost Artificial Intelligence Research
Seven Ways to Boost Artificial Intelligence Research
 
NVIDIA Developer Program Overview
NVIDIA Developer Program OverviewNVIDIA Developer Program Overview
NVIDIA Developer Program Overview
 
NVIDIA at Computex 2019
NVIDIA at Computex 2019 NVIDIA at Computex 2019
NVIDIA at Computex 2019
 
Top 5 DGX Sessions From GTC 2019
Top 5 DGX Sessions From GTC 2019Top 5 DGX Sessions From GTC 2019
Top 5 DGX Sessions From GTC 2019
 
DGX POD Top 4 Sessions From GTC 2019
DGX POD Top 4 Sessions From GTC 2019DGX POD Top 4 Sessions From GTC 2019
DGX POD Top 4 Sessions From GTC 2019
 
Top 5 Data Science Sessions from GTC 2019
Top 5 Data Science Sessions from GTC 2019Top 5 Data Science Sessions from GTC 2019
Top 5 Data Science Sessions from GTC 2019
 
This Week in Data Science - Top 5 News - April 26, 2019
This Week in Data Science - Top 5 News - April 26, 2019This Week in Data Science - Top 5 News - April 26, 2019
This Week in Data Science - Top 5 News - April 26, 2019
 
CUDA DLI Training Courses at GTC 2019
CUDA DLI Training Courses at GTC 2019CUDA DLI Training Courses at GTC 2019
CUDA DLI Training Courses at GTC 2019
 
DGX Sessions You Won't Want to Miss at GTC 2019
DGX Sessions You Won't Want to Miss at GTC 2019DGX Sessions You Won't Want to Miss at GTC 2019
DGX Sessions You Won't Want to Miss at GTC 2019
 
Transforming Healthcare at GTC Silicon Valley
Transforming Healthcare at GTC Silicon ValleyTransforming Healthcare at GTC Silicon Valley
Transforming Healthcare at GTC Silicon Valley
 
OpenACC Monthly Highlights February 2019
OpenACC Monthly Highlights February 2019OpenACC Monthly Highlights February 2019
OpenACC Monthly Highlights February 2019
 
CUDA Sessions You Won't Want to Miss at GTC 2019
CUDA Sessions You Won't Want to Miss at GTC 2019CUDA Sessions You Won't Want to Miss at GTC 2019
CUDA Sessions You Won't Want to Miss at GTC 2019
 
Empowering Radiology with AI
Empowering Radiology with AIEmpowering Radiology with AI
Empowering Radiology with AI
 
Top 5 Deep Learning and AI Stories - November 30, 2018
Top 5 Deep Learning and AI Stories - November 30, 2018Top 5 Deep Learning and AI Stories - November 30, 2018
Top 5 Deep Learning and AI Stories - November 30, 2018
 

Kürzlich hochgeladen

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 


NVIDIA CEO Jensen Huang Presentation at Supercomputing 2019

  • 1. 1
  • 2. 2 THE EXPANDING UNIVERSE OF HPC JENSEN HUANG | SC19
  • 3. 3 AT THE INTERSECTION OF GRAPHICS, SIMULATION, AI
  • 4. 4
  • 5. 5 COMPUTING FOR THE DA VINCIS OF OUR TIME FIRST AI SUPERCOMPUTERS FIRST EXASCALE SCIENCE 42 NEW TOP 500 SYSTEMS ABCI SUMMIT CLIMATE LBNL | NVIDIA GENOMICS ORNL NUCLEAR WASTE REMEDIATION LBNL | PNNL Brown U. NVIDIA CANCER DETECTION ORNL | Stony Brook U.
  • 6. 6 FULL STACK SPEED-UP CUDA-X CUDA AI DRIVE METRO ISAAC CLARA RAPIDS AERIAL CG CUDA 10.2 cuTENSOR 1.0 cuSOLVER 10.3 cuBLAS 10.2 cuDNN 7.6 TensorRT 6.0 DALI 0.15 NCCL 2.5 IndeX 2.1 OptiX 7.0 RAPIDS 0.10 Spark XGBoost 3x in 2 Years 2017 2019 2018 Time to Solution 27 Hours 20 Hours 10 Hours Amber Chroma GROMACS GTC LAMMPS MILC NAMD QE SPECFEM3D TensorFlow VASP Benchmark Application: Amber [PME-Cellulose_NVE], Chroma [szscl21_24_128], GROMACS [ADH Dodec: Dev Prototype], GTC [moi#proc.in], LAMMPS [LJ 2.5], MILC [Apex Medium], NAMD [stmv_nve_cuda], Quantum Espresso [AUSURF112-jR], SPECFEM3D [four_material_simple_model]; TensorFlow [ResNet-50], VASP [Si-Huge]; GPU node: with dual-socket CPUs with 4x V100 GPU.
  • 7. 7 THE EXPANDING UNIVERSE OF HPC NETWORK EDGE ANALYTICS SIMULATION AI Edge Cloud Arm Data Analytics Extreme IO EXTREME IO
  • 8. 8 INCREDIBLE ADVANCES IN AI WRITING DIALOG TRANSLATION SUMMARIZATION Q&A CLASSIFICATION 2012 2019 BERT TRANSFORMER ALEXNET CNN 3D POSE DENOISING SEGMENTATION OBJECT RECOGNITION CLASSIFICATION IMAGE GENERATION
  • 9. 9 GPU COMPUTING POWERS AI ADVANCES #1 MLPERF: AI TRAINING + AI INFERENCE HPC COMPUTING CHALLENGE Doubling 2 Years Doubling 3.4 Months Two Distinct Eras of AI Training Super Moore’s Law: From 600 to 2 Hours in 5 Years K80 SERVER DGX 2 Hours 600 Hours Time to Train (ResNet-50)
  • 10. 10 NVIDIA AI END-TO-END PLATFORM TRAINING AUTONOMOUS MACHINES DGX HGX EGX AGX EDGE AI CLOUD
  • 11. 11 AI FOR SCIENCE EXPERIMENTATION DATA SIMULATION DATA NEURAL ESTIMATION Real-time Steering Fast Approximation Design Space Exploration ICF + MERLIN — Fusion Inverse Problems LIGO — Gravitational Waves Faster Prediction ANI + MD – Chemistry Real-time Steering ITER – Fusion Energy
  • 12. 12
  • 13. 13 1x Data Transfer 100x Data Collected STREAMING AI SOFTWARE-DEFINED SENSORS BUILD MODELS STREAMING AI PROCESSING ECMWF: 287 TB/day LSST: 20 TB/day SKA: 16 TB/sec
  • 14. 14 NVIDIA EGX STACK NGC Kubernetes Networking Storage Security CUDA-X Third-Party ISVs METROPOLIS IMAGE PROCESSING DECODE DNN GRAPHICS ENCODE DEEPSTREAM Powered by NVIDIA CUDA Tensor Core GPU | Secured Boot Root of Trust Cryptographic Acceleration for IPsec and TLS | NVMe-oF over TCP and RDMA Industrial-strength Cloud Native and AI Stack NVIDIA EGX EDGE SUPERCOMPUTING PLATFORM
  • 15. 15 VERTICAL INDUSTRY FRAMEWORKS Clara Metropolis Isaac Omniverse Aerial DRIVE WORLD’S LARGEST DELIVERY SERVICE ADOPTS NVIDIA AI PUTTING AI TO WORK
  • 17. 17 SUPERCOMPUTING CLOUD Benchmark Application: Amber [PME-Cellulose_NVE], Chroma [szscl21_24_128], GROMACS [ADH Dodec: Dev Prototype], GTC [moi#proc.in], LAMMPS [LJ 2.5], MILC [Apex Medium], NAMD [stmv_nve_cuda], Quantum Espresso [AUSURF112-jR], SPECFEM3D [four_material_simple_model]; TensorFlow [ResNet-50], VASP [Si-Huge]; GPU node: with dual-socket CPUs with 4x V100 GPU. CPU Instance 48 Hours, $152 Amber, Chroma, GROMACS, GTC, LAMMPS MILC, NAMD, QE, SPECFEM3D, TensorFlow, VASP SUPER COMPUTING IS HARD — CLOUD HPC IS EXPENSIVE
  • 18. 18 SUPERCOMPUTING CLOUD 8x GPU Instance 1x GPU Instance CPU Instance 48 Hours, $152 Amber, Chroma, GROMACS, GTC, LAMMPS MILC, NAMD, QE, SPECFEM3D, TensorFlow, VASP Benchmark Application: Amber [PME-Cellulose_NVE], Chroma [szscl21_24_128], GROMACS [ADH Dodec: Dev Prototype], GTC [moi#proc.in], LAMMPS [LJ 2.5], MILC [Apex Medium], NAMD [stmv_nve_cuda], Quantum Espresso [AUSURF112-jR], SPECFEM3D [four_material_simple_model]; TensorFlow [ResNet-50], VASP [Si-Huge]; GPU node: with dual-socket CPUs with 4x V100 GPU. SUPER COMPUTING IS HARD — GPU CLOUD 1/7TH COST OF CPU CLOUD 48x Faster, 1/7th the Cost
  • 19. 19 ICECUBE OBSERVATORY DETECTING NEUTRINOS 50K NVIDIA GPUs IN THE CLOUD 350 PF OF SIMULATION FOR 2 HOURS PRODUCED 5% OF ANNUAL SIMULATION DATA AWS, MICROSOFT AZURE, GOOGLE CLOUD PLATFORM DISTRIBUTED ACROSS U.S., EUROPE, APAC Frank Würthwein, Ph.D. Executive Director, Open Science Grid Igor Sfiligoi Lead Developer and Researcher MULTIPLE GENERATIONS, ONE APPLICATION Events Processed Per GPU Type V100 M60 K80 T4 P40 P100 THE LARGEST CLOUD SIMULATION IN HISTORY
  • 20. 20 Up to 800 V100 GPUs Connected via Mellanox InfiniBand ANNOUNCING WORLD’S LARGEST ON-DEMAND SUPERCOMPUTER
  • 21. 21 DIVERSE ARM ARCHITECTURES AMPERE COMPUTING eMAG Hyperscale and Storage AMAZON GRAVITON Hyperscale and SmartNIC MARVELL THUNDERX2 Hyperscale, Storage and HPC FUJITSU A64FX Supercomputing HUAWEI KUNPENG 920 Big Data and Edge
  • 22. 22 NVIDIA CUDA ON ARM AT OAK RIDGE NATIONAL LAB Benchmark Application [Dataset]: GROMACS [ADH Dodec- Dev prototype], LAMMPS [LJ 2.5], MILC [Apex Small], NAMD [apoa1_npt_cuda], Quantum Espresso [AUSURF112-jR], Relion [Plasmodium Ribosome], SPECFEM3D [four_material_simple_model], TensorFlow [ResNet50: Batch:256]; CPU node: 2x ThunderX2 9975; GPU node: Same CPU node + 2x V100 32GB PCIe; *1xV100 for GROMACS, MILC, and TensorFlow
  • 23. 23 ANNOUNCING NVIDIA HPC FOR ARM HPC Server Reference Platform | 8 V100 Tensor Core GPUs with NVLink 4 100 Gbps Mellanox InfiniBand | Systems Ranging from Supercomputer, Hyperscale, to Edge CUDA on Arm Beta Available Now NIC PCIe Switch PCIe Switch NIC CPU CPU GPU GPU GPU GPU
  • 24. 24 ANNOUNCING NVIDIA HPC FOR ARM HPC Server Reference Platform | 8 V100 Tensor Core GPUs with NVLink 4 100 Gbps Mellanox InfiniBand | Systems Ranging from Supercomputer, Hyperscale, to Edge CUDA on Arm Beta Available Now NIC PCIe Switch PCIe Switch NIC CPU CPU GPU GPU GPU GPU
  • 25. 25 ANNOUNCING NVIDIA HPC FOR ARM HPC Server Reference Platform | 8 V100 Tensor Core GPUs with NVLink 4 100 Gbps Mellanox InfiniBand | Systems Ranging from Supercomputer, Hyperscale, to Edge CUDA on Arm Beta Available Now APPLICATIONS PROGRAMMING MODELS C++ CUDA FORTRAN COMET DCA++ GROMACS INDEX LAMMPS LSMS MATLAB MILC NAMD OPTIX RELION TENSORFLOW PARAVIEW OPENACC PYTHON ARM ALLINEA STUDIO BRIGHT COMPUTING CMAKE CUDA-GDB CUPTI GCC LLVM NVCC PAPI SINGULARITY SLURM TAU GAMERA SDKS QUANTUM ESPRESSO PERFORCE TOTALVIEW PGI SCORE-P VMD
  • 26. 26
  • 27. 27 50 GB/s 50 GB/s EXTREME COMPUTE NEEDS EXTREME IO TRADITIONAL RDMA NODE A NODE B PCIe Switch CPU System Memory GPU NIC PCIe Switch CPU System Memory GPU NIC
  • 28. 28 EXTREME COMPUTE NEEDS EXTREME IO GPUDIRECT RDMA NODE A NODE B PCIe Switch CPU System Memory GPU 100 GB/s NIC PCIe Switch CPU System Memory GPU NIC
  • 29. 29 EXTREME COMPUTE NEEDS EXTREME IO TRADITIONAL STORAGE PCIe Switch CPU System Memory GPU GPUDIRECT RDMA NODE A NODE B NIC PCIe Switch CPU System Memory GPU 100 GB/s NIC PCIe Switch CPU System Memory GPU NIC 50 GB/s
  • 30. 30 EXTREME COMPUTE NEEDS EXTREME IO GPUDIRECT STORAGE PCIe Switch CPU System Memory GPU Storage 100 GB/s GPUDIRECT RDMA NODE A NODE B NIC PCIe Switch CPU System Memory GPU 100 GB/s NIC PCIe Switch CPU System Memory GPU NIC
  • 31. 31 ANNOUNCING NVIDIA MAGNUM IO Acceleration Libraries for Large-scale HPC and IO High-bandwidth, Low-latency, Massive Storage Access with Lower CPU Utilization GPUDIRECT STORAGE PCIe Switch CPU System Memory GPU Storage 100 GB/s GPUDIRECT RDMA NODE A NODE B NIC PCIe Switch CPU System Memory GPU 100 GB/s NIC PCIe Switch CPU System Memory GPU NIC
  • 32. 32 PYTHON CUDA APACHE ARROW CUDF CUGRAPH RAPIDS CUML PANDAS SCI-KL / XGBOOST CUDNN DEEP LEARNING FRAMEWORKS DASK NVIDIA RAPIDS DATA SCIENCE Open Source | Multi-GPU and Multi-Node | Up to 100x Speed-Up | 150K Downloads in 1 Year Data Load and Processing Times from Hours to Minutes | Used by NERSC, ORNL, NASA, SDSC
  • 33. 33 NVIDIA MAGNUM IO BOOSTS RAPIDS DATA ANALYTICS 20x ON TPC-H STRUCTURAL BIOLOGY: 3x VMD NEW PANGEO XARRAY ZARR READER FOR CLIMATE Q4 TPC-H Benchmark Work Breakdown: With Repeated Query 0 400,000 800,000 1,200,000 WITHOUT GDS WITH GDS Latency (msec) CUDA Startup GPU and CPU Allocation Data Preload Warmup Query Repeat Query Clean Up Driver Close
  • 34. 34 ANNOUNCING WORLD’S LARGEST INTERACTIVE VOLUME VISUALIZATION Simulating Mars Lander with FUN3D | Interactively Visualizing 150 TB; Unstructured Mesh 4 NVIDIA DGX-2 Streaming 400 GB/s | NVIDIA Magnum IO | NVIDIA IndeX
  • 35. 35 ANNOUNCING NVIDIA DGX-2 AS SUPERCOMPUTING ANALYTICS INSTRUMENT 16 V100 GPUs - 2 PF Tensor Core | 512 GB HBM2 - 16 TB/s | 8 MLNX CX5 - 800 Gbps 30 TB NVMe - 53 GB/s with Magnum IO | Fabric Storage - 100 GB/s with Magnum IO 2.3x Faster Than Current IO500 10-node Leader Powered by NVIDIA Magnum IO EXTREME WEATHER AI INFERENCE NVIDIA TENSOR RT 3D VOLUME ANALYTICS PANGEO XARRAY VMD COMPUTATIONAL MICROSCOPE NVIDIA OPTIX 3D INTERACTIVE VOLUME RENDERING NVIDIA INDEX TPC-H RECORD 10 TB JOIN NVIDIA RAPIDS
  • 36. 36 THE EXPANDING UNIVERSE OF HPC NETWORK EDGE ANALYTICS SIMULATION EXTREME IO NVIDIA HPC for ARM NVIDIA EGX Edge Supercomputing Platform NVIDIA DGX-2 Supercomputing Analytics Instrument NVIDIA Magnum IO NGC Azure
  • 37. 37
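
Several of the slides above quote headline numbers that can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original deck. It uses only the figures printed on slides 9, 18, and 35. The variable names are illustrative, and the even split of the DGX-2 totals across GPUs and network adapters is an assumption, not something the slides state.

```python
import math

# Slide 9: ResNet-50 time to train went from 600 hours to 2 hours in 5 years.
speedup = 600 / 2                                  # overall speed-up factor
doublings = math.log2(speedup)                     # number of 2x steps in that factor
implied_doubling_months = 5 * 12 / doublings       # average months per 2x step

# Slide 18: the CPU instance takes 48 hours and costs $152;
# the slide claims the GPU cloud is 48x faster at 1/7th the cost.
gpu_hours = 48 / 48
gpu_cost_usd = 152 / 7

# Slide 35: DGX-2 totals, ASSUMING they divide evenly across components.
hbm2_per_gpu_gb = 512 / 16                         # 512 GB HBM2 over 16 V100 GPUs
gbps_per_adapter = 800 / 8                         # 800 Gbps over 8 MLNX CX5 adapters
network_gbytes_per_s = 800 / 8                     # 800 Gbps at 8 bits per byte

print(f"speed-up: {speedup:.0f}x, doubling every {implied_doubling_months:.1f} months")
print(f"GPU cloud run: {gpu_hours:.0f} hour, about ${gpu_cost_usd:.2f}")
print(f"DGX-2: {hbm2_per_gpu_gb:.0f} GB per GPU, {gbps_per_adapter:.0f} Gbps per adapter, "
      f"{network_gbytes_per_s:.0f} GB/s aggregate network")
```

The doubling period here is derived from the two endpoints on slide 9 only, so it is an average over the five years and not a measured trend. The aggregate network figure of 100 GB/s lines up with the "Fabric Storage - 100 GB/s" entry on slide 35.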