High Performance Computing

        Adam DeConinck
       R Systems NA, Inc.




Development of models begins at small scale.

Working on your laptop is convenient, simple.

Actual analysis, however, is slow.





“Scaling up” typically means a small server or
fast multi-core desktop.

There is a speedup, but for very large models it is
not significant.

Single machines don't scale up forever.


For the largest models, a different approach is required.


High-Performance Computing involves many
  distinct computer processors working together on
  the same calculation.

Large problems are divided into smaller parts and
  distributed among the many computers.

Usually clusters of quasi-independent computers
  which are coordinated by a central scheduler.


Typical HPC Cluster

[Diagram labels: external connection, login node, scheduler, file server, compute nodes, Ethernet network, high-speed network (10GigE / InfiniBand)]

Performance gains

[Plot: duration (s) versus number of cores, with a high-end workstation marked for reference]

     Performance test: stochastic finance model on R Systems cluster

     High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes
      
          Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster

     No more performance gain after ~500 cores: why? Some operations can't be parallelized.

     Additional cores? Run multiple models simultaneously
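
The plateau described above is the behaviour Amdahl's law predicts: once part of a job cannot be parallelized, that serial part caps the achievable speedup. The sketch below is illustrative only. The function name `amdahl_speedup` and the parallel fraction of 0.95 are assumptions chosen for the example, not values measured in the performance test.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical model in which 95% of the work can run in parallel.
    for cores in (1, 8, 64, 500, 5000):
        print(f"{cores:>5} cores: {amdahl_speedup(0.95, cores):.1f}x")
```

With these assumed numbers the speedup can never exceed 1 / 0.05 = 20x, so going from 500 to 5000 cores adds very little. Spare cores are better spent running several models at once.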


Performance comes at a price: complexity.



    New paradigm: real-time analysis vs batch jobs.

    Applications must be written specifically to take
    advantage of distributed computing.

    Performance characteristics of applications change.

    Debugging becomes more of a challenge.



New paradigm: real-time analysis vs batch jobs.




Most small analyses are done in real time:

    “At-your-desk” analysis
    Small models only
    Fast iterations
    No waiting for resources

Large jobs are typically done in a batch model:

    Submit job to a queue
    Much larger models
    Slow iterations
    May need to wait
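
To make the "may need to wait" point concrete, here is a toy first-in, first-out queue. It is a sketch under strong assumptions: each job occupies the whole cluster, jobs run strictly in submission order, and the job names and runtimes are invented. Real schedulers also weigh core counts, priorities and reservations.

```python
def fifo_schedule(jobs):
    """Run jobs one after another in submission order.

    jobs: list of (name, runtime_hours) tuples, all hypothetical.
    Returns {name: (wait_hours, finish_hours)}.
    """
    clock = 0.0
    schedule = {}
    for name, runtime_hours in jobs:
        wait = clock            # time spent queued behind earlier jobs
        clock += runtime_hours  # the cluster is busy until this job ends
        schedule[name] = (wait, clock)
    return schedule


if __name__ == "__main__":
    queue = [("pricing_run", 2.0), ("reserve_model", 6.0), ("stress_test", 1.5)]
    for name, (wait, finish) in fifo_schedule(queue).items():
        print(f"{name}: waited {wait:.1f} h, finished at {finish:.1f} h")
```

The short job submitted last waits for everything ahead of it, which is why batch iterations feel slow even when the cluster itself is fast.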



Applications must be written specifically to
  take advantage of distributed
  computing.

    Explicitly split your problem into smaller
    “chunks”

    “Message passing” between processes

    Entire computation can be slowed by one
    or two slow chunks

    Exception: “embarrassingly parallel”
    problems

    Easy-to-split, independent chunks of
    computation

    Thankfully, many useful models fall under
    this heading (e.g. stochastic models).

    “Embarrassingly parallel” = no inter-process communication
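
An embarrassingly parallel stochastic model can be sketched as a set of independent chunks, each with its own random seed, whose partial results are combined only at the end. Everything in the example is an assumption made for illustration: the function names, the toy growth model and its parameters. A thread pool is used only to keep the example portable. On a cluster each chunk would typically be a separate process or job.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_chunk(seed, n_paths):
    """Simulate one independent chunk and return the sum of its outcomes."""
    rng = random.Random(seed)  # private generator: no state shared between chunks
    total = 0.0
    for _ in range(n_paths):
        # Hypothetical toy model: 100 units grown by one normally distributed return.
        total += 100.0 * (1.0 + rng.gauss(0.05, 0.2))
    return total


def run_model(seeds, n_paths, workers=4):
    """Farm the chunks out to a pool, then combine the partial sums."""
    seeds = list(seeds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(simulate_chunk, seeds, [n_paths] * len(seeds)))
    return sum(partial_sums) / (n_paths * len(seeds))


if __name__ == "__main__":
    estimate = run_model(seeds=range(8), n_paths=10_000)
    print(f"mean simulated outcome: {estimate:.2f}")
```

The chunks never exchange messages, so the only coordination is the final sum. The run still finishes only when the slowest chunk does, which is the "one or two slow chunks" effect noted above.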

Performance characteristics of applications change.


On a single machine:

    CPU speed (compute)
    Cache
    Memory
    Disk

On a cluster:

    Single-machine metrics
    Network
    File server
    Scheduler contention
    Results from other nodes



Debugging becomes more of a challenge.


    More complexity = more pieces that can fail

    Race conditions: sequence of events no longer deterministic

    Single nodes can “stall” and slow the entire computation

    Scheduler, file server, login server all have their own challenges
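
The race-condition point can be illustrated on a single machine with threads. This is a minimal sketch, not taken from the talk: `run_counter` and its parameters are hypothetical names, and memory shared between threads stands in for whatever state cluster processes would share. Without the lock, updates can be lost and the final count may differ from run to run.

```python
import threading


def run_counter(n_threads, n_increments, use_lock):
    """Increment a shared counter from several threads and return its final value."""
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(n_increments):
            if use_lock:
                with lock:  # only one thread at a time may update the counter
                    counter["value"] += 1
            else:
                # Unprotected read-modify-write: another thread may update the
                # counter between these two statements, and that update is lost.
                current = counter["value"]
                counter["value"] = current + 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print("with lock:   ", run_counter(4, 100_000, use_lock=True))
    print("without lock:", run_counter(4, 100_000, use_lock=False))
```

The locked version always returns the full count. The unlocked version can return less, and because the shortfall depends on timing, the bug may not show up on every run.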




External resources

    One solution to handling complexity: outsource it!

    Historical HPC facilities: universities, national labs
    
        Often have the most absolute compute capacity, and will sell
        excess capacity
    
        Competition with academic projects, typically do not include
        SLA or high-level support

    Dedicated commercial HPC facilities providing “on-demand”
    compute power.



External HPC

    Outsource HPC sysadmin
    No hardware investment
    Pay-as-you-go
    Easy to migrate to new tech

Internal HPC

    Requires in-house expertise
    Major investment in hardware
    Possible idle time
    Upgrades require new hardware




Internal HPC

    No external contention
    All internal: easy security
    Full control over configuration
    Simpler licensing control

External HPC

    No guaranteed access
    Security arrangements complex
    Limited control of configuration
    Some licensing complex





“The Cloud”

    “Cloud computing”: virtual machines, dynamic allocation of resources in
    an external resource

    Lower performance (virtualization), higher flexibility

    Usually no contracts necessary: pay with your credit card, get 16 nodes

    Often have to do all your own sysadmin

    Low support, high control



CASE STUDY: Windows cluster for Actuarial Application

Global insurance company


    Needed 500-1000 cores on a temporary basis

    Preferred a utility, “pay-as-you-go” model

    Experimenting with external resources for “burst”
    capacity during high-activity periods

    Commercially licensed and supported application

    Requested a proof of concept

Cluster configuration

    Application embarrassingly parallel, small-to-medium data files,
    computationally and memory-intensive
    
        Prioritize computation (processors), access to fileserver over
        inter-node communication, large storage
    
        Upgraded memory in compute nodes to 2 GB/core

    128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for
    1024 cores total

    Windows HPC Server 2008 R2 operating system

    Application and fileserver on login node
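
The headline figures for this configuration follow from simple arithmetic on the numbers above. The variable names are illustrative, and the per-node and total memory figures are derived here rather than stated on the slide.

```python
nodes = 128
cores_per_node = 8
memory_per_core_gb = 2

total_cores = nodes * cores_per_node                      # 1024 cores, as stated
memory_per_node_gb = cores_per_node * memory_per_core_gb  # 16 GB per node (derived)
total_memory_gb = total_cores * memory_per_core_gb        # 2048 GB overall (derived)

print(f"{total_cores} cores, {memory_per_node_gb} GB per node, "
      f"{total_memory_gb} GB total")
```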


Stumbling blocks

    Application optimization
    Customer had a wide variety of models which generated different usage
    patterns (I/O-, compute-, and memory-intensive jobs). Required dynamic
    reconfiguration for different conditions.

    Technical issue
    Iterative testing process. Application turned out to be generating massive
    fileserver contention. Had to make changes to both software and hardware.

    Human processes
    Users were accustomed to internal access model. Required changes both
    for providers (increase ease-of-use) and users (change workflow)

    Security
    Customer had never worked with an external provider before. Complex
    internal security policy had to be reconciled with remote access.
Lessons learned:


    Security was the biggest delaying factor. The initial security setup took over
    3 months from the first expression of interest, even though cluster setup
    was done in less than a week.
    
        Only mattered the first time though: subsequent runs started much
        more smoothly.

    A low-cost proof-of-concept run was important to demonstrate feasibility,
    and for working the bugs out.

    A good relationship with the application vendor was extremely important
    to solving problems and properly optimizing the model for performance.



Recent developments: GPUs




Graphics processing units




    CPU: complex, general-purpose processor

    GPU: highly specialized parallel processor, optimized for the operations used in
    common graphics routines

    Highly specialized → many more “cores” for same cost and space
     
         Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core
     
         NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core

    Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU

    Same operations can be adapted for non-graphics applications: “GPGPU”
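
The per-core prices quoted above can be reproduced directly. The helper name `cost_per_core` is illustrative, and the prices are the ones given on the slide. A GPU "core" is far simpler than a CPU core, so this compares price per core, not delivered performance.

```python
def cost_per_core(price_usd, cores):
    """Price divided by core count, in US dollars per core."""
    return price_usd / cores


cpu_cost = cost_per_core(300, 4)     # Intel Core i7 figures from the slide
gpu_cost = cost_per_core(4500, 448)  # NVIDIA Tesla M2070 figures from the slide

print(f"CPU: ${cpu_cost:.2f}/core, GPU: ${gpu_cost:.2f}/core")
```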

                      Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/
HPC/Actuarial using GPUs

    Random-number generation
    Finite-difference modeling
    Image processing

    Numerical Algorithms Group: GPU random-number generator
    MATLAB: operations on large arrays/matrices
    Wolfram Mathematica: symbolic math analysis

Data from http://www.nvidia.com/object/computational_finance.html

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Kürzlich hochgeladen (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

High Performance Computing: an Introduction for the Society of Actuaries

  • 1. High Performance Computing. Adam DeConinck, R Systems NA, Inc.
  • 2. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow.
  • 3. Development of models begins at small scale. Working on your laptop is convenient, simple. Actual analysis, however, is slow. “Scaling up” typically means a small server or fast multi-core desktop. Speedup exists, but for very large models, not significant. Single machines don't scale up forever.
  • 4. For the largest models, a different approach is required.
  • 5. High-Performance Computing involves many distinct computer processors working together on the same calculation. Large problems are divided into smaller parts and distributed among the many computers. Usually clusters of quasi-independent computers which are coordinated by a central scheduler.
  • 6. Typical HPC Cluster [Diagram labels: external connection, login node, Ethernet network, scheduler, compute nodes, file server, high-speed network (10GigE / InfiniBand)]
  • 7. Performance gains [Plot: duration (s) versus number of cores, with a high-end workstation marked for comparison]
      • Performance test: stochastic finance model on R Systems cluster
      • High-end workstation: 8 cores. Maximum speedup of 20x: 4.5 hrs → 14 minutes
          • Scale-up heavily model-dependent: 5x – 100x in our tests, can be faster
      • No more performance gain after ~500 cores: why? Some operations can't be parallelized.
      • Additional cores? Run multiple models simultaneously
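The plateau described on slide 7 is the behaviour predicted by Amdahl's law, which the slides do not name: if a fixed share of the work cannot be parallelized, adding cores gives diminishing returns. The sketch below is illustrative only. The parallel fraction of 0.99 is a hypothetical value, not a measurement from the R Systems test, and `amdahl_speedup` is a name chosen here.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work can be split."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical value for illustration only; the slides do not report it.
PARALLEL_FRACTION = 0.99

# Speedup implied by the timings on the slide: 4.5 hours down to 14 minutes.
measured_speedup = (4.5 * 60) / 14

if __name__ == "__main__":
    for cores in (8, 64, 500, 1000, 4000):
        print(f"{cores:>5} cores: {amdahl_speedup(PARALLEL_FRACTION, cores):6.1f}x")
    print(f"measured on the slide: {measured_speedup:.1f}x")
```

Under this assumed fraction, doubling from 500 to 1000 cores raises the ideal speedup by less than a tenth, and no core count can exceed 100x. Once a single model stops scaling, running several models side by side is the better use of spare cores, as the slide suggests.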
  • 8. Performance comes at a price: complexity.
      • New paradigm: real-time analysis vs batch jobs.
      • Applications must be written specifically to take advantage of distributed computing.
      • Performance characteristics of applications change.
      • Debugging becomes more of a challenge.
  • 9. New paradigm: real-time analysis vs batch jobs.
      • Most small analyses are done in real time:
          • “At-your-desk” analysis
          • Small models only
          • Fast iterations
          • No waiting for resources
      • Large jobs are typically done in a batch model:
          • Submit job to a queue
          • Much larger models
          • Slow iterations
          • May need to wait
  • 10. Applications must be written specifically to take advantage of distributed computing.
      • Explicitly split your problem into smaller “chunks”
      • “Message passing” between processes
      • Entire computation can be slowed by one or two slow chunks
      • Exception: “embarrassingly parallel” problems
          • Easy-to-split, independent chunks of computation
          • “Embarrassingly parallel” = no inter-process communication
          • Thankfully, many useful models fall under this heading (e.g. stochastic models)
  • 11. Performance characteristics of applications change.
      • On a single machine:
          • CPU speed (compute)
          • Cache
          • Memory
          • Disk
      • On a cluster:
          • Single-machine metrics
          • Network
          • File server
          • Scheduler contention
          • Results from other nodes
  • 12. Debugging becomes more of a challenge.
      • More complexity = more pieces that can fail
      • Race conditions: sequence of events no longer deterministic
      • Single nodes can “stall” and slow the entire computation
      • Scheduler, file server, login server all have their own challenges
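The effect of a single stalled node can be shown with simple arithmetic. This sketch assumes one chunk per node, all started together, so the job finishes only when the slowest chunk does. The durations are made-up numbers and the function names are chosen here for illustration.

```python
def wall_clock(chunk_seconds: list[float]) -> float:
    """With one chunk per node, the job ends when the slowest chunk ends."""
    return max(chunk_seconds)


def efficiency(chunk_seconds: list[float]) -> float:
    """Share of reserved node-time spent computing rather than waiting."""
    reserved = len(chunk_seconds) * wall_clock(chunk_seconds)
    return sum(chunk_seconds) / reserved


# Hypothetical durations: 16 nodes, one of which stalls.
healthy = [60.0] * 16
stalled = [60.0] * 15 + [300.0]

if __name__ == "__main__":
    print(wall_clock(healthy), efficiency(healthy))
    print(wall_clock(stalled), efficiency(stalled))
```

In the stalled case, fifteen nodes sit idle for four of the five minutes, so the whole job runs five times longer even though only one node misbehaved.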
  • 13. External resources
      • One solution to handling complexity: outsource it!
      • Historical HPC facilities: universities, national labs
          • Often have the most absolute compute capacity, and will sell excess capacity
          • Competition with academic projects, typically do not include SLA or high-level support
      • Dedicated commercial HPC facilities providing “on-demand” compute power.
  • 14. External HPC vs internal HPC
      • External HPC:
          • Outsource HPC sysadmin
          • No hardware investment
          • Pay-as-you-go
          • Easy to migrate to new tech
      • Internal HPC:
          • Requires in-house expertise
          • Major investment in hardware
          • Possible idle time
          • Upgrades require new hardware
  • 15. Internal HPC vs external HPC
      • Internal HPC:
          • No external contention
          • All internal: easy security
          • Full control over configuration
          • Simpler licensing control
          • Requires in-house expertise
          • Major investment in hardware
          • Possible idle time
          • Upgrades require new hardware
      • External HPC:
          • No guaranteed access
          • Security arrangements complex
          • Limited control of configuration
          • Some licensing complex
          • Outsource HPC sysadmin
          • No hardware investment
          • Pay-as-you-go
          • Easy to migrate to new tech
  • 16. “The Cloud”
      • “Cloud computing”: virtual machines, dynamic allocation of resources in an external resource
      • Lower performance (virtualization), higher flexibility
      • Usually no contracts necessary: pay with your credit card, get 16 nodes
      • Often have to do all your own sysadmin
      • Low support, high control
  • 17. CASE STUDY: Windows cluster for Actuarial Application
  • 18. Global insurance company
      • Needed 500-1000 cores on a temporary basis
      • Preferred a utility, “pay-as-you-go” model
      • Experimenting with external resources for “burst” capacity during high-activity periods
      • Commercially licensed and supported application
      • Requested a proof of concept
  • 19. Cluster configuration
      • Application embarrassingly parallel, small-to-medium data files, computationally and memory-intensive
      • Prioritize computation (processors) and access to fileserver over inter-node communication and large storage
      • Upgraded memory in compute nodes to 2 GB/core
      • 128-node cluster: 3.0 GHz Intel Xeon processors, 8 cores per node for 1024 cores total
      • Windows HPC Server 2008 R2 operating system
      • Application and fileserver on login node
  • 20. Stumbling blocks
      • Application optimization: Customer had a wide variety of models which generated different usage patterns (IO, compute, memory-intensive jobs). Required dynamic reconfiguration for different conditions.
      • Technical issue: Iterative testing process. Application turned out to be generating massive fileserver contention. Had to make changes to both software and hardware.
      • Human processes: Users were accustomed to internal access model. Required changes both for providers (increase ease-of-use) and users (change workflow).
      • Security: Customer had never worked with an external provider before. Complex internal security policy had to be reconciled with remote access.
  • 21. Lessons learned:
      • Security was the biggest delaying factor. The initial security setup took over 3 months from the first expression of interest, even though cluster setup was done in less than a week.
          • Only mattered the first time though: subsequent runs started much more smoothly.
      • A low-cost proof-of-concept run was important to demonstrate feasibility, and for working the bugs out.
      • A good relationship with the application vendor was extremely important to solving problems and properly optimizing the model for performance.
  • 23. Graphics processing units
      • CPU: complex, general-purpose processor
      • GPU: highly-specialized parallel processor, optimized for performing operations for common graphics routines
      • Highly specialized → many more “cores” for same cost and space
          • Intel Core i7: 4 cores @ 3.4 GHz: $300 = $75/core
          • NVIDIA Tesla M2070: 448 cores @ 575 MHz: $4500 = $10/core
      • Also higher bandwidth: 100+ GB/s for GPU vs 10-30 GB/s for CPU
      • Same operations can be adapted for non-graphics applications: “GPGPU”
      • Image from http://blogs.nvidia.com/2009/12/whats-the-difference-between-a-cpu-and-a-gpu/
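The per-core figures on slide 23 can be reproduced directly from the quoted prices and core counts. The variable names are chosen here for illustration, and the prices are the ones printed on the slide, not current ones.

```python
# Figures as quoted on the slide.
cpu_price, cpu_cores = 300.0, 4       # Intel Core i7, 3.4 GHz
gpu_price, gpu_cores = 4500.0, 448    # NVIDIA Tesla M2070, 575 MHz

cpu_cost_per_core = cpu_price / cpu_cores
gpu_cost_per_core = gpu_price / gpu_cores
ratio = cpu_cost_per_core / gpu_cost_per_core

if __name__ == "__main__":
    print(f"CPU: ${cpu_cost_per_core:.2f} per core")
    print(f"GPU: ${gpu_cost_per_core:.2f} per core")
    print(f"CPU cores cost about {ratio:.1f}x more each")
```

A GPU core is much simpler and slower-clocked than a CPU core, so cost per core is a rough guide to density, not a direct measure of performance per dollar.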
  • 24. HPC/Actuarial using GPUs
      • Random-number generation
      • Finite-difference modeling
      • Image processing
      • Numerical Algorithms Group: GPU random-number generator
      • MATLAB: operations on large arrays/matrices
      • Wolfram Mathematica: symbolic math analysis
      • Data from http://www.nvidia.com/object/computational_finance.html