Big Data for Healthcare:
Usage, Architecture and Technologies
Presenters

Pete Stiglich – Sr. Technical Architect
       Over 20 years IT experience

       Enterprise Data Architecture, Data Management, Data Modeling, Data Quality, DW/BI,
        MDM, Metadata Management, Database Administration (DBA)

       President of DAMA Phoenix, writer, speaker, former editor Real World Decision Support,
        listed expert for SearchDataManagement – Data Warehousing and Data Modeling

       Certified Data Management Professional (CDMP) and Certified Business Intelligence
        Professional (CBIP), both at master level



    Email: Pete.Stiglich@Perficient.com

    Phone: 602-284-0992

    Twitter: @pstiglich

    Blog: http://blogs.perficient.com/healthcare/blog/author/pstiglich/
Presenters

Hari Rajagopal – Sr. Solution Architect
   •   Over 15 years IT experience

   •   SOA solutions, Enterprise Service Bus technologies, Data Architecture, Algorithms

   •   Presenter at conferences, Author and Blogger

   •   IBM certified SOA solutions designer



   Email: Hari.Rajagopal@Perficient.com

   Phone: 303-517-9634
Key Takeaway Points


•   Big Data technologies represent a major paradigm shift – and are
    here to stay!

•   Big Data enables “all” the data to be leveraged for new insight–
    clinical notes, medical literature, OR videos, X-rays, consultation
    recordings, streaming medical device data, etc.

•   More intelligent enterprise – more efficient and prevalent
    advanced analytics (predictive data mining, text mining, etc.)

•   Big Data will affect application development and data
    management
Agenda


•   What is Big Data?

•   How Big Data can enable better healthcare

•   Types of Big Data processing

•   Key technologies

•   Impacts of Big Data on:

     •   Application Development

     •   Data Management

•   Q&A
What is Big Data?
What is “Big Data”


•   Datasets which are too large, grow too rapidly, or are too
    varied to handle using traditional techniques

•   Volume, Velocity, Variety

•   Volume – 100’s of TB’s, petabytes, and beyond

•   Velocity – e.g., machine generated data, medical devices,
               sensors

•   Variety – unstructured data, many formats, varying
              semantics



•   Not every data problem is a “Big Data” problem!!
MPP enables Big Data


•   SMP – Symmetric Multiprocessing: “Shared Everything” – CPU, memory,
    disk (SAN, NAS) are shared

•   MPP – Massively Parallel Processing: “Shared Nothing” – nodes do not
    share CPU, memory, disk (DAS)

•   MPP scalability – 100’s, 1,000’s of nodes, as a Cluster (homogeneous)
    or Grid (heterogeneous)
Cost Factor


•   Cost of storing and analyzing Big Data can be driven down
    by:

     •   Low cost commodity hardware

     •   Open source software

     •   Public Cloud? Yes, but for really massive amounts of data with many
         accesses, it may be cost prohibitive

     •   Learning curve? You bet!
Hadoop / MapReduce


•   Hadoop and MapReduce – key Big Data technologies. MapReduce originated
    at Google; Hadoop is the open source Apache implementation of those ideas

•   “Divide and conquer” approach

•   Highly fault tolerant – nodes are expected to fail

•   Every data block (by default) replicated on 3 nodes
    (is also rack aware)

•   MapReduce – component of Hadoop, programming
    framework for distributed processing

•   Not the only Big Data technology…
NoSQL


•   Stands for “Not only SQL” – really should be “Not only Relational”

•   New(ish) paradigms for storing and retrieving data

•   Many Big Data platforms don’t use an RDBMS

     •   Might take too long to set up / change

     •   Problems with certain types of queries (e.g., social media, ragged
         hierarchies)

•   Key Types of NoSQL Data Stores
          •   Key-Value Pair
          •   Wide Column
          •   Graph
          •   Document
          •   Object
          •   XML
How can “Big Data” improve Healthcare?
Healthcare “Big Data” opportunities


•   Examples of Big Data opportunities
     •   Patient Monitoring – inpatient, ICU, ER, home health

     •   Personalized Medicine

     •   Population health management / ACO

     •   Epidemiology

     •   Keeping abreast of medical literature

     •   Research

     •   Many more…
Healthcare “Big Data” opportunities


•   Patient Monitoring

     •   Big Data can enable Complex Event Processing (CEP) – dealing with
         multiple, large streams of data in real-time from medical devices,
         sensors, RFID, etc.

     •   Proactively address risk, improve quality, improve processes, etc.

     •   Data might not be persisted – Big Data can be used for distributed
         processing with the data located only in memory

     •   Example – an HL7 A01 message (admit a patient) received for an
         inpatient visit – but no PV1 Assigned Patient Location received within X
         hours. Is the patient on a gurney in a hallway somewhere?

     •   Example – home health sensor in a bed indicates patient hasn’t gotten
         out of bed for X number of hours
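
The admit-without-location example can be sketched as a simple rule over an event stream. This is only an illustration: the event tuples, the "ADMIT" / "LOCATION" labels and the function name are assumptions standing in for parsed HL7 messages, and a real CEP engine would evaluate the rule continuously rather than over a list.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified events: (timestamp, patient_id, kind).
# "ADMIT" stands in for an HL7 A01; "LOCATION" for a message carrying
# a PV1 Assigned Patient Location. Real HL7 parsing is out of scope.

def unplaced_patients(events, now, max_wait):
    """Return patient ids admitted more than max_wait ago with no location yet."""
    admitted = {}  # patient_id -> admit time, for patients still unplaced
    for ts, patient_id, kind in sorted(events):
        if kind == "ADMIT":
            admitted[patient_id] = ts
        elif kind == "LOCATION":
            admitted.pop(patient_id, None)  # location known, stop watching
    return sorted(p for p, ts in admitted.items() if now - ts > max_wait)

t0 = datetime(2012, 6, 1, 8, 0)
events = [
    (t0, "P1", "ADMIT"),
    (t0 + timedelta(minutes=30), "P1", "LOCATION"),
    (t0 + timedelta(minutes=10), "P2", "ADMIT"),
    (t0 + timedelta(hours=3), "P3", "ADMIT"),
]
alerts = unplaced_patients(
    events, now=t0 + timedelta(hours=4), max_wait=timedelta(hours=2)
)
print(alerts)
```

Here max_wait plays the role of the "X hours" threshold; only the patient who has waited longer than that without a location is flagged.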
Healthcare “Big Data” opportunities


•   Personalized Medicine
     •   Genomic, proteomic, and metabolic data is large, complex, and varied

     •   Can have gigabytes of data for a single patient

     •   Use case examples - protein footprints, gene expression

     •   Difficult to use with a relational database, XML performance problematic

     •   Use wide-column stores, graphs, key-value stores (or combinations) for better
         scalability and performance




[Image source: Wikipedia]
Healthcare “Big Data” opportunities


•   Population Management
     •   Preventative care for ACO – micro-segmentation of patients

          •   Identify most at risk patients – allocate resources wisely to help these
              patients (e.g., 1% of 100,000 patients had 30% of the costs)*

          •   Reduce admits/re-admits, ER visits, etc.

     •   Identify potential causes for infections, readmissions (e.g., which two
         materials when used together are correlated with high rates of infection)

     •   Even with structured data, data mining can be time consuming – distributed
         processing can speed up data mining




* http://nyr.kr/L8o1Ag (New Yorker article)
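
A first step toward finding the most at-risk patients is measuring how concentrated costs are. The sketch below uses made-up cost figures chosen to mirror the 1% / 30% example; the function name and data are assumptions, not figures from the cited article.

```python
def top_share(costs, fraction):
    """Share of total cost carried by the most expensive `fraction` of patients."""
    ranked = sorted(costs, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # always include at least one patient
    return sum(ranked[:k]) / sum(ranked)

# Made-up annual costs for 100 patients: one very expensive, 99 typical.
costs = [30_000] + [707] * 99
share = top_share(costs, 0.01)
print(f"Top 1% of patients carry {share:.0%} of the cost")
```

On a real population the same calculation would run over millions of claims, which is where distributed processing helps.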
Healthcare “Big Data” opportunities


•   Epidemiology
     •   Analysis of patterns and trends in health issues across a geography

     •   Tracking of the spread of disease based on streaming data

     •   Visualization of global outbreaks enabling the determination of ‘source’ of infection




Healthcare “Big Data” opportunities


•   Unstructured data analysis
     •   Most data (80%) resides in unstructured or semi-structured sources – and a wealth
         of information might be gleaned

     •   One company allows dermatology patients to upload pictures on a regular basis to
         analyze moles in an automated fashion to check for melanoma based on redness,
         asymmetry, thickness, etc.

     •   A lot of information contained in clinical notes, but hard to extract

     •   Providers can’t keep abreast of medical literature – even specialists! Use Big Data
         and Semantic Web technologies to identify highly relevant literature

     •   Sentiment analysis – using surveys, social media

     •   Etc…
Poll


•   What Healthcare Big Data use case do you see as being most
    important for your organization?



    •   Patient Monitoring
    •   Personalized Medicine
    •   Population Management (e.g., for ACO)
    •   Epidemiology
    •   More effective use of medical literature
    •   Medical research
    •   Unstructured data analysis
    •   Quality Improvement
    •   Other




Types of Big Data processing
Analytics


•   Big Data ideal for experimental / discovery analytics

•   Faster setup, data quality not as critical

•   Enables Data Scientists to formulate and investigate
    hypotheses more rapidly, with less expense

•   May discover useful knowledge . . . or not

•   Fail faster – so as to move on to the next hypothesis!
Unstructured Data Mining


•   Big Data can make mining unstructured sources (text, audio,
    video, image) more prevalent - more cost effective, with better
    performance

•   E.g., extract structured information, categorize documents,
    analyze shapes, coloration, how long was a video viewed, etc.

•   Text Mining capabilities
     •   Entity Extraction – extracting names, locations, dates, products, diseases, Rx,
         conditions, etc., from text

     •   Topic Tracking – track information of interest to a user

     •   Categorization – categorize a document based on word counts, synonyms, etc.

     •   Clustering – grouping similar documents

     •   Concept Linking – related documents based on shared concepts

     •   Question Answering – try to find best answer based on user’s environment
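
As a rough illustration of the Categorization capability, a document can be scored against per-category term lists by word count. The categories and vocabularies below are hypothetical; a production system would more likely draw on a curated medical terminology and a trained model.

```python
import re
from collections import Counter

# Hypothetical category vocabularies (synonyms and related terms per category).
CATEGORY_TERMS = {
    "cardiology": {"heart", "cardiac", "myocardial", "arrhythmia"},
    "dermatology": {"skin", "mole", "melanoma", "rash"},
}

def categorize(text):
    """Return the category whose terms occur most often, or None if none occur."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        category: sum(words[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

note = "Patient reports a changing mole; skin exam suggests possible melanoma."
print(categorize(note))
```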
Data Mining

•   Can enable much faster data mining

•   Can bypass some setup and modeling effort

•   Data Mining is “the automatic or semi-automatic
    analysis of large quantities of data to extract
    previously unknown interesting patterns” (Wikipedia)

•   Examples of data mining:

     •   Association analysis - e.g., which 2 or 3
         materials when used together are correlated
         with a high degree of infection

     •   Cluster analysis – e.g., patient micro-segmentation

     •   Anomaly / Outlier Detection – e.g., network
         breaches

[Diagram labels: Text, Text Mining, Entity Extraction, Other use cases, Structured Data, Data Mining, Something Interesting?]
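
The association analysis example (materials used together versus infection) might start from a simple co-occurrence count like the one below. The records and material names are invented for illustration, and a high rate here shows correlation only, not cause.

```python
from collections import defaultdict
from itertools import combinations

def pair_infection_rates(procedures, min_count=2):
    """Infection rate for each pair of materials used together.

    procedures: iterable of (set_of_materials, infected_bool).
    Pairs seen fewer than min_count times are dropped as too sparse.
    """
    seen = defaultdict(int)
    infected = defaultdict(int)
    for materials, was_infected in procedures:
        for pair in combinations(sorted(materials), 2):
            seen[pair] += 1
            infected[pair] += int(was_infected)
    return {pair: infected[pair] / n for pair, n in seen.items() if n >= min_count}

# Made-up procedure records with hypothetical material names.
procedures = [
    ({"suture_A", "mesh_B"}, True),
    ({"suture_A", "mesh_B"}, True),
    ({"suture_A", "mesh_C"}, False),
    ({"suture_A", "mesh_C"}, False),
    ({"suture_D", "mesh_B"}, False),
]
rates = pair_infection_rates(procedures)
print(rates)
```

Each procedure can be counted independently, so the counting step spreads naturally across many nodes before the totals are combined.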
Transaction Processing


•   Some Big Data platforms can be used for some types of
    transaction processing

•   Where performance is more important than consistency, e.g.,
    a Facebook user updating his/her status

•   More on this later…
Poll


•   What type of Big Data use case would be most beneficial for
    your client?

     •   Complex Event Processing (using massive/numerous
         streams of real-time data)

     •   Unstructured Data Analysis

     •   Predictive Data Mining

     •   Transaction Processing (where performance more
         important than consistency)




Big Data Architecture and Key Technologies
Big Data Stack
Hadoop

•   Used for batch processing – inserts/appends only – no updates

•   Single master – works across many nodes, but only a single data
    center

•   Key components

     •   HDFS – Hadoop Distributed File System

     •   MapReduce – Distributes data in key value pairs across nodes, parallel
         processing, summarize results

     •   HBase – database built on top of Hadoop (with interactive capabilities)

     •   Hive – SQL-like query tool (converts to MapReduce)

     •   Pig – Higher level execution language (vs. having to use Java, Python) –
         converts to MapReduce
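
The MapReduce flow described above (key-value pairs, parallel processing, summarized results) can be imitated in a few lines of plain Python. This is a single-process sketch of the idea, not Hadoop's API; the visit records and diagnosis codes are made-up sample values.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs; here one (diagnosis_code, 1) per visit record."""
    yield record["diagnosis"], 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Summarize all values for one key."""
    return key, sum(values)

# Made-up visit records; on a cluster each node would map its own data blocks.
visits = [{"diagnosis": "J45"}, {"diagnosis": "E11"}, {"diagnosis": "J45"}]
pairs = [kv for record in visits for kv in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

Because map_phase looks at one record at a time and reduce_phase at one key at a time, both steps can be spread over many nodes without coordination.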




Cassandra


•   Used for real-time processing / transaction processing

•   Multiple masters – works across many nodes and many data
    centers

•   Key components

     •   CFS – Cassandra File System

     •   CQL – Cassandra Query Language (SQL-like)

•   Tunable consistency for writes or reads, e.g., option to ensure a write
    succeeds on each replica in all data centers before returning control to the
    program, or can be much less restrictive
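
The trade-off behind tunable consistency can be shown with the usual replica-overlap rule of thumb: a read is guaranteed to see the latest acknowledged write when the replicas read plus the replicas written exceed the replication factor. The helper names below are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

def read_sees_latest_write(replication_factor, write_acks, read_replicas):
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_acks > replication_factor

rf = 3
for write_acks, read_replicas in [(1, 1), (quorum(rf), quorum(rf)), (rf, 1)]:
    ok = read_sees_latest_write(rf, write_acks, read_replicas)
    print(f"W={write_acks} R={read_replicas} RF={rf} -> overlap guaranteed: {ok}")
```

Writing to one replica and reading from one is fastest but may return stale data; quorum reads and writes, or writing to all replicas, trade latency for consistency.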




In memory processing


•   To support real-time operations, an IMDB (In-Memory Database)
    may be used

     •   Solo – or in conjunction with a disk based DBMS

•   I/O most expensive part of computing – using in memory database /cache
    reduces bottlenecks

•   Can be distributed (e.g., memcached, Terracotta, Kx)

•   Relational or non-relational

     •   E.g., for a DW, current values might reside in an IMDB, historical data on disk




MPP RDBMS


•   Have been around for 15+ years

•   Used for large scale Data Warehousing

•   Ideal where lots of joins are needed on massive amounts of data

•   Many NoSQL DB’s rely on 100% denormalization. Many do not
    support join operations (e.g., wide column stores) or updates




Semantic Web


•   Semantic Web – web of data, not documents

•   Machine inferencing (automated reasoning) can be enabled via Semantic Web
    technologies. May use a graph database/triplestore (e.g.,
    Neo4j, AllegroGraph, Meronymy)

•   Bridge the semantic divide (varying vocabularies) with
    ontologies – helps address the “Variety” aspect of Big Data

•   Encapsulate data values, metadata, joins, logic, business rules,
    ontologies, access methods in the data via common logical model
    (e.g., RDF triples) – very powerful for automation, federated
    queries




Semantic Web
Find Jane Doe’s relatives (with machine inferencing)

[Figure: RDF graph spanning System X (x:DebDoe), System Y (y:JohnDoe) and System Z (z:JaneDoe), plus a:JoeDoe. Edge labels: :hasBrother and :marriedTo (original data), :isInLaw (inferred data).]
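
Machine inferencing over triples can be sketched with a single hand-written rule. The exact edges of the slide's figure are not recoverable from the text, so the triples below are an assumed reading of it, and the rule (A hasBrother B, B marriedTo C, therefore A isInLaw C) is an illustrative simplification of what an ontology reasoner would do.

```python
def infer_in_laws(triples):
    """Forward-chain one rule: A hasBrother B and B marriedTo C => A isInLaw C."""
    married = {(s, o) for s, p, o in triples if p == "marriedTo"}
    inferred = set()
    for s, p, o in triples:
        if p == "hasBrother":
            for spouse_a, spouse_b in married:
                if spouse_a == o:
                    inferred.add((s, "isInLaw", spouse_b))
    return inferred

# Assumed original data, each triple notionally held by a different system.
triples = {
    ("x:DebDoe", "hasBrother", "y:JohnDoe"),
    ("a:JoeDoe", "hasBrother", "y:JohnDoe"),
    ("y:JohnDoe", "marriedTo", "z:JaneDoe"),
}
inferred = infer_in_laws(triples)
for triple in sorted(inferred):
    print(triple)
```

None of the source systems stores an isInLaw fact; it only appears once their triples are combined under a shared vocabulary.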
No One Size Fits All


•   Many types of solutions will require multiple data
    paradigms

•   E.g. Facebook uses MySQL (relational), Hadoop, Cassandra,
    Hive, etc., for the different types of processing required

•   Be sure to have a solid use case before deciding to use Big
    Data / NoSQL technology

•   Provide solid business and technical justification
What type of data store to use??
Big Data impact on Application Development
           and Data Management
ACID / CAP / BASE


•   If your transaction processing application must be ACID compliant, you will
    typically need an RDBMS (or ODBMS)

•   ACID – Atomic, Consistent, Isolated, Durable

     •   Atomic – All tasks in a transaction succeed – or none do
     •   Consistent – Adheres to db rules, no partially completed transactions
     •   Isolated – Transactions can’t see data from other uncommitted transactions
     •   Durable – Committed transaction persists even if system fails

•   Not all transactions require ACID – eventual consistency may be adequate

                                       vs.
ACID / CAP / BASE


•   Brewer’s CAP theorem for distributed databases

     •   Consistency, Availability, Partition Tolerance - Pick 2!

•   For Big Data, BASE is an alternative to ACID

     •   Basically Available – data will be available for requests, might not be consistent

     •   Soft state – due to eventual consistency, the system might be continually changing

     •   Eventually consistent – the system will eventually be consistent when input stops

•   Example: in HBase every transaction will execute, but only the most recent for a
    key will persist (last write wins) – no locking
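
The "most recent write for a key persists, no locking" behavior can be illustrated with a timestamped store. This is a toy model of the idea, not HBase's API; the key format and timestamps are made up.

```python
def put(store, key, value, timestamp):
    """Keep a write only if it is at least as new as what is stored; no locks."""
    current = store.get(key)
    if current is None or timestamp >= current[0]:
        store[key] = (timestamp, value)

store = {}
put(store, "patient:42:status", "admitted", timestamp=100)
put(store, "patient:42:status", "discharged", timestamp=300)
put(store, "patient:42:status", "in surgery", timestamp=200)  # arrives late
print(store["patient:42:status"])
```

Every write is accepted without waiting on a lock, but the late-arriving older value is silently superseded, which is acceptable only when the application can tolerate it.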
Data Management


•   Security not as mature with NoSQL – might use OS level encryption (e.g., IBM
    Guardium Encryption Expert, Gazzang) - encrypt/decrypt at IO level

•   Data Governance needs to oversee Big Data – new knowledge uncovered can
    lead to risks - privacy, intellectual property, regulatory compliance, etc.

•   Physical Data Modeling less important – due to “schema-less” nature of NoSQL

     •   Conceptual Modeling still important for understanding business objects and
         relationships
     •   Semantic modeling – inform ontologies which enable inferencing
     •   Logical Data Modeling still useful for reasoning and communicating about how
         data will be organized

•   Due to schema-less nature of NoSQL – metadata management will be more
    important!
      • E.g., wide-column store with billions of records and millions of variable columns
        – useless unless you have the metadata to understand the data
Getting started


•   Data Scientist is a key role in Big Data – requires statistics, data modeling, and
    programming skills. Not many around and expect to pay $$$’s.

•   Big Data technologies represent a significant paradigm shift. Be sure to allow budget
    for training, sandbox environment, etc.

•   Start small with Big Data. Start with a single use case – allocate significant
    amount of time for learning curve, and environment setup, testing, tuning,
    management.

•   Working with open source software can present challenges. Investigate purchase of
    value added software for simplification. Tools such as IBM BigInsights and EMC
    Greenplum UAP (Unified Analytics Platform) add analytical, administration, workflow,
    security, and other functionality.




Summary
Summary


•   Big Data presents significant opportunities

•   Big Data is distinguished by volume, velocity, and variety

•   Big Data is not just Hadoop / MapReduce and not just NoSQL

•   Key enabler for Big Data is Massively Parallel Processing (MPP)

•   Using commodity hardware and open source software are options to drive
    down cost of Big Data

•   Big Data and NoSQL technologies require a learning curve, and will continue to
    mature
Resources


•   Perficient Healthcare: http://healthcare.perficient.com

•   Perficient Healthcare IT blog: http://blogs.perficient.com/healthcare/

•   Perficient Healthcare Twitter: @Perficient_HC

•   Apache – download and learn more about Hadoop, Cassandra, etc.

     •   http://hadoop.apache.org/

     •   http://cassandra.apache.org/

•   Comprehensive list with description of NoSQL databases:
    http://nosql-database.org/links.html

•   Translational Medicine Ontology (TMO) - applying Semantic Web for
    personalized medicine: http://www.w3.org/wiki/HCLSIG/PharmaOntology
Q&A
About Perficient




Perficient is a leading information technology consulting firm serving
clients throughout North America.

We help clients implement business-driven technology solutions that
integrate business processes, improve worker productivity, increase
customer loyalty and create a more agile enterprise to better respond
to new business opportunities.
PRFT Profile
•   Founded in 1997

•   Public, NASDAQ: PRFT

•   2011 Revenue of $260 million

•   20 major market locations throughout North America:
    Atlanta, Austin, Charlotte, Chicago, Cincinnati, Cleveland,
    Columbus, Dallas, Denver, Detroit, Fairfax, Houston,
    Indianapolis, Minneapolis, New Orleans, Philadelphia, San
    Francisco, San Jose, St. Louis and Toronto

•   1,800+ colleagues

•   Dedicated solution practices

•   600+ enterprise clients (2011) and 85% repeat business
    rate

•   Alliance partnerships with major technology vendors

•   Multiple vendor/industry technology and growth awards
Our Solutions Expertise & Services

Business-Driven Solutions
• Enterprise Portals
• SOA and Business Process Management
• Business Intelligence
• User-Centered Custom Applications
• CRM Solutions
• Enterprise Performance Management
• Customer Self-Service
• eCommerce & Product Information Management
• Enterprise Content Management
• Industry-Specific Solutions
• Mobile Technology
• Security Assessments

Perficient Services
• End-to-End Solution Delivery
• IT Strategic Consulting
• IT Architecture Planning
• Business Process & Workflow Consulting
• Usability and UI Consulting
• Custom Application Development
• Offshore Development
• Package Selection, Implementation and Integration
• Architecture & Application Migrations
• Education

Perficient brings deep solutions expertise and offers
a complete set of flexible services to help clients
implement business-driven IT solutions

More Related Content

What's hot

Big Data Characteristics And Process PowerPoint Presentation Slides
Big Data Characteristics And Process PowerPoint Presentation SlidesBig Data Characteristics And Process PowerPoint Presentation Slides
Big Data Characteristics And Process PowerPoint Presentation SlidesSlideTeam
 
Big Data Analytics for Healthcare
Big Data Analytics for HealthcareBig Data Analytics for Healthcare
Big Data Analytics for HealthcareChandan Reddy
 
5 Reasons Why Healthcare Data is Unique and Difficult to Measure
5 Reasons Why Healthcare Data is Unique and Difficult to Measure5 Reasons Why Healthcare Data is Unique and Difficult to Measure
5 Reasons Why Healthcare Data is Unique and Difficult to MeasureHealth Catalyst
 
Healthcare analytics
Healthcare analytics Healthcare analytics
Healthcare analytics Arun K
 
BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareSkillspeed
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcareDeZyre
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data ScienceSpotle.ai
 
Ppt for Application of big data
Ppt for Application of big dataPpt for Application of big data
Ppt for Application of big dataPrashant Sharma
 
Big Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data ScientistsBig Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data ScientistsWay-Yen Lin
 
Top 10 digital transformation trends for healthcare in 2022
Top 10 digital transformation trends for healthcare in 2022Top 10 digital transformation trends for healthcare in 2022
Top 10 digital transformation trends for healthcare in 2022IndusNetMarketing
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MININGAshish Salve
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingHealth Catalyst
 
Top Healthcare Trends 2022
Top Healthcare Trends 2022Top Healthcare Trends 2022
Top Healthcare Trends 2022Capgemini
 
HL7 & Health Information Exchange in Thailand
HL7 & Health Information Exchange in ThailandHL7 & Health Information Exchange in Thailand
HL7 & Health Information Exchange in ThailandNawanan Theera-Ampornpunt
 
eBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareeBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareNextGen Healthcare
 

What's hot (20)

Big Data Characteristics And Process PowerPoint Presentation Slides
Big Data Characteristics And Process PowerPoint Presentation SlidesBig Data Characteristics And Process PowerPoint Presentation Slides
Big Data Characteristics And Process PowerPoint Presentation Slides
 
Big Data Analytics for Healthcare
Big Data Analytics for HealthcareBig Data Analytics for Healthcare
Big Data Analytics for Healthcare
 
5 Reasons Why Healthcare Data is Unique and Difficult to Measure
5 Reasons Why Healthcare Data is Unique and Difficult to Measure5 Reasons Why Healthcare Data is Unique and Difficult to Measure
5 Reasons Why Healthcare Data is Unique and Difficult to Measure
 
Big Data
Big DataBig Data
Big Data
 
Healthcare analytics
Healthcare analytics Healthcare analytics
Healthcare analytics
 
BIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in HealthcareBIG Data & Hadoop Applications in Healthcare
BIG Data & Hadoop Applications in Healthcare
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
Data Analytics Life Cycle
Data Analytics Life CycleData Analytics Life Cycle
Data Analytics Life Cycle
 
Ppt for Application of big data
Ppt for Application of big dataPpt for Application of big data
Ppt for Application of big data
 
Big Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data ScientistsBig Data, Big Deal: For Future Big Data Scientists
Big Data, Big Deal: For Future Big Data Scientists
 
Top 10 digital transformation trends for healthcare in 2022
Top 10 digital transformation trends for healthcare in 2022Top 10 digital transformation trends for healthcare in 2022
Top 10 digital transformation trends for healthcare in 2022
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MINING
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
 
Top Healthcare Trends 2022
Top Healthcare Trends 2022Top Healthcare Trends 2022
Top Healthcare Trends 2022
 
HL7 & Health Information Exchange in Thailand
HL7 & Health Information Exchange in ThailandHL7 & Health Information Exchange in Thailand
HL7 & Health Information Exchange in Thailand
 
Big data
Big dataBig data
Big data
 
Big data
Big dataBig data
Big data
 
Big data architecture
Big data architectureBig data architecture
Big data architecture
 
eBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareeBook - Data Analytics in Healthcare
eBook - Data Analytics in Healthcare
 

Viewers also liked

How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014How Big Data is Transforming Medical Information Insights - DIA 2014
How Big Data is Transforming Medical Information Insights - DIA 2014CREATION
 
Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?Matthieu Schapranow
 
Big Data
Big DataBig Data
Big DataNGDATA
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedInAmy W. Tang
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedInData Infrastructure at LinkedIn
Data Infrastructure at LinkedInAmy W. Tang
 
Personal branding playbook
Personal branding playbookPersonal branding playbook
Personal branding playbookOnline Business
 
LinkedIn Segmentation & Targeting Platform: A Big Data Application
LinkedIn Segmentation & Targeting Platform: A Big Data ApplicationLinkedIn Segmentation & Targeting Platform: A Big Data Application
LinkedIn Segmentation & Targeting Platform: A Big Data ApplicationAmy W. Tang
 
Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4Resume- William Myers FD2016.1.4
Resume- William Myers FD2016.1.4William Myers
 
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop
Strata SG 2015: LinkedIn Self Serve Reporting Platform on Hadoop Shirshanka Das
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Shirshanka Das
 
Unlocking the Experts
Unlocking the ExpertsUnlocking the Experts
Unlocking the ExpertsLinkedIn
 
Participatory Design: Bringing Users Into Your Process
Participatory Design: Bringing Users Into Your ProcessParticipatory Design: Bringing Users Into Your Process
Participatory Design: Bringing Users Into Your ProcessDavid Sherwin
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...
Introduction To TensorFlow | Deep Learning Using TensorFlow | TensorFlow Tuto...Edureka!
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShareSlideShare
 
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017
Making Great User Experiences, Pittsburgh Scrum MeetUp, Oct 17, 2017Carol Smith
 
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...
Big Data Tutorial For Beginners | What Is Big Data | Big Data Tutorial | Hado...Edureka!
 

Using Big Data for Improved Healthcare Operations and Analytics

  • 1. Big Data for Healthcare: Usage, Architecture and Technologies
  • 2. Presenters Pete Stiglich – Sr. Technical Architect • Over 20 years IT experience • Enterprise Data Architecture, Data Management, Data Modeling, Data Quality, DW/BI, MDM, Metadata Management, Database Administration (DBA) • President of DAMA Phoenix, writer, speaker, former editor of Real World Decision Support, listed expert for SearchDataManagement – Data Warehousing and Data Modeling • Certified Data Management Professional (CDMP) and Certified Business Intelligence Professional (CBIP), both at master level Email: Pete.Stiglich@Perficient.com Phone: 602-284-0992 Twitter: @pstiglich Blog: http://blogs.perficient.com/healthcare/blog/author/pstiglich/
  • 3. Presenters Hari Rajagopal – Sr. Solution Architect • Over 15 years IT experience • SOA solutions, Enterprise Service Bus technologies, Data Architecture, Algorithms • Presenter at conferences, Author and Blogger • IBM certified SOA solutions designer Email: Hari.Rajagopal@Perficient.com Phone: 303-517-9634
  • 4. Key Takeaway Points • Big Data technologies represent a major paradigm shift – and are here to stay! • Big Data enables “all” the data to be leveraged for new insight – clinical notes, medical literature, OR videos, X-rays, consultation recordings, streaming medical device data, etc. • More intelligent enterprise – more efficient and prevalent advanced analytics (predictive data mining, text mining, etc.) • Big Data will affect application development and data management
  • 5. Agenda • What is Big Data? • How Big Data can enable better healthcare • Types of Big Data processing • Key technologies • Impacts of Big Data on: Application Development, Data Management • Q&A
  • 6. What is Big Data?
  • 7. What is “Big Data” • Datasets which are too large, grow too rapidly, or are too varied to handle using traditional techniques • Volume, Velocity, Variety • Volume – hundreds of terabytes, petabytes, and beyond • Velocity – e.g., machine generated data, medical devices, sensors • Variety – unstructured data, many formats, varying semantics • Not every data problem is a “Big Data” problem!
  • 8. MPP enables Big Data • SMP – Symmetric Multiprocessing: “Shared Everything” – CPU, memory, disk (SAN, NAS) • MPP – Massively Parallel Processing: “Shared Nothing” – nodes do not share CPU, memory, disk (DAS) • MPP scales to 100’s, 1,000’s of nodes • Cluster (homogeneous) or Grid (heterogeneous)
  • 9. Cost Factor • Cost of storing and analyzing Big Data can be driven down by: • Low cost commodity hardware • Open source software • Public Cloud? Yes, but for really massive amounts of data with many accesses, it may be cost prohibitive • Learning curve? You bet!
  • 10. Hadoop / MapReduce • Hadoop and MapReduce – key Big Data technologies; the MapReduce model was developed at Google, and Hadoop is an open source implementation inspired by Google’s published MapReduce and Google File System designs • “Divide and conquer” approach • Highly fault tolerant – nodes are expected to fail • Every data block is (by default) replicated on 3 nodes, and placement is rack aware • MapReduce – component of Hadoop, programming framework for distributed processing • Not the only Big Data technology…
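The “divide and conquer” idea behind MapReduce can be sketched in a few lines of plain Python. This is only a single-process illustration, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample notes are assumptions made up for this example. In a real cluster the map and reduce calls would run on many nodes and the framework would perform the shuffle.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for each word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Made-up sample input standing in for lines of clinical text.
notes = ["patient reports chest pain", "chest x-ray ordered", "patient stable"]
counts = run_job(notes)
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread across nodes without shared state, which is what makes the approach fit a “shared nothing” architecture.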
  • 11. NoSQL • Stands for “Not only SQL” – really should be “Not only Relational” • New(ish) paradigms for storing and retrieving data • Many Big Data platforms don’t use an RDBMS • Might take too long to set up / change • Problems with certain types of queries (e.g., social media, ragged hierarchies) • Key Types of NoSQL Data Stores • Key-Value Pair • Wide Column • Graph • Document • Object • XML
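To make the store types listed above more concrete, here is a rough sketch of how the same information might be shaped in four of them, using only built-in Python structures. Everything in it is an assumption for illustration: the patient, the keys, the column names, and the `neighbors` helper are invented, and no real NoSQL product is being modeled.

```python
import json

# Key-value: the store sees only an opaque value behind a single key.
kv_store = {"patient:123": '{"name": "Jane Doe", "allergies": ["penicillin"]}'}
record = json.loads(kv_store["patient:123"])

# Document: a nested structure whose fields can be inspected individually.
doc_store = [
    {
        "_id": 123,
        "name": "Jane Doe",
        "allergies": ["penicillin"],
        "visits": [{"date": "2012-05-01", "dx": "asthma"}],
    }
]

# Wide column: row key -> column family -> sparse, dynamically named columns.
wide_column = {
    "123": {
        "demographics": {"name": "Jane Doe"},
        "labs": {"2012-05-01:glucose": 98},
    }
}

# Graph: typed edges between nodes, suited to relationship-heavy queries.
edges = [
    ("patient:123", "TREATED_BY", "provider:9"),
    ("provider:9", "REFERRED_TO", "provider:14"),
]


def neighbors(node, relation):
    """Return the nodes reachable from `node` over edges of type `relation`."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


treating = neighbors("patient:123", "TREATED_BY")
```

The graph form hints at why ragged hierarchies and social-network style queries are awkward in a relational schema: following an arbitrary chain of edges is a simple traversal here, but tends to require repeated self-joins in SQL.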
  • 12. How can “Big Data” improve Healthcare?
  • 13. Healthcare “Big Data” opportunities • Examples of Big Data opportunities • Patient Monitoring – inpatient, ICU, ER, home health • Personalized Medicine • Population health management / ACO • Epidemiology • Keeping abreast of medical literature • Research • Many more…
  • 14. Healthcare “Big Data” opportunities • Patient Monitoring • Big Data can enable Complex Event Processing (CEP) – dealing with multiple, large streams of data in real-time from medical devices, sensors, RFID, etc. • Proactively address risk, improve quality, improve processes, etc. • Data might not be persisted – Big Data can be used for distributed processing with the data located only in memory • Example – an HL7 A01 message (admit a patient) is received for an inpatient visit, but no PV1 Assigned Patient Location (PV1-3) is received within X hours. Is the patient on a gurney in a hallway somewhere? • Example – a home health sensor in a bed indicates the patient hasn’t gotten out of bed for X hours
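The admit-without-location example above is essentially a “missing follow-up event” rule. A minimal sketch of such a rule is shown below, assuming events have already been parsed into simple dictionaries ordered by time. The event types `ADMIT` and `LOCATION_ASSIGNED`, the field names, and the function `patients_without_location` are hypothetical stand-ins, not HL7 parsing and not the API of any CEP engine.

```python
from datetime import datetime, timedelta


def patients_without_location(events, now, max_wait):
    """Return visit ids admitted more than max_wait ago with no location event.

    Assumes `events` is ordered by time and kept only in memory.
    """
    waiting = {}
    for event in events:
        if event["type"] == "ADMIT":
            waiting[event["visit_id"]] = event["time"]
        elif event["type"] == "LOCATION_ASSIGNED":
            # The follow-up arrived, so this visit no longer needs an alert.
            waiting.pop(event["visit_id"], None)
    return sorted(v for v, admitted in waiting.items() if now - admitted > max_wait)


# Made-up event stream for three visits.
now = datetime(2012, 6, 1, 12, 0)
events = [
    {"type": "ADMIT", "visit_id": "V1", "time": datetime(2012, 6, 1, 6, 0)},
    {"type": "ADMIT", "visit_id": "V2", "time": datetime(2012, 6, 1, 7, 0)},
    {"type": "LOCATION_ASSIGNED", "visit_id": "V2", "time": datetime(2012, 6, 1, 7, 30)},
    {"type": "ADMIT", "visit_id": "V3", "time": datetime(2012, 6, 1, 11, 30)},
]
overdue = patients_without_location(events, now, timedelta(hours=4))
```

Here only V1 is flagged: V2 received a location, and V3 has not yet exceeded the four-hour window. A production CEP system would evaluate the rule continuously over the live stream rather than over a finished list.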
  • 15. Healthcare “Big Data” opportunities • Personalized Medicine • Genomic, proteomic, and metabolic data are large, complex, and varied • Can have gigabytes of data for a single patient • Use case examples – protein footprints, gene expression • Difficult to use with a relational database; XML performance is problematic • Use wide-column stores, graphs, key-value stores (or combinations) for better scalability and performance
  • 16. Healthcare “Big Data” opportunities • Population Management • Preventative care for ACO – micro-segmentation of patients • Identify the most at-risk patients – allocate resources wisely to help these patients (e.g., 1% of 100,000 patients had 30% of the costs)* • Reduce admits/re-admits, ER visits, etc. • Identify potential causes for infections, readmissions (e.g., which two materials when used together are correlated with high rates of infection) • Even with structured data, data mining can be time consuming – distributed processing can speed up data mining * http://nyr.kr/L8o1Ag (New Yorker article)
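The “1% of patients had 30% of the costs” observation is a cost-concentration measure, which can be computed as sketched below. The function name `top_share` and the cost figures are invented for illustration and are not taken from the cited article.

```python
def top_share(costs, fraction):
    """Share of total cost attributable to the costliest `fraction` of patients."""
    ranked = sorted(costs, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # at least one patient
    return sum(ranked[:k]) / sum(ranked)


# Made-up annual costs: 99 patients at 100 each, one patient at 4,900.
costs = [100] * 99 + [4900]
share = top_share(costs, 0.01)  # share of cost from the top 1% of patients
```

With these invented numbers the single costliest patient accounts for roughly a third of total spend, the kind of segment an ACO might target first for preventative care.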
  • 17. Healthcare “Big Data” opportunities • Epidemiology • Analysis of patterns and trends in health issues across a geography • Tracking of the spread of disease based on streaming data • Visualization of global outbreaks, enabling the determination of the ‘source’ of infection
  • 18. Healthcare “Big Data” opportunities • Unstructured data analysis • Most data (80%) resides in unstructured or semi-structured sources – and a wealth of information might be gleaned • One company allows dermatology patients to upload pictures on a regular basis to analyze moles in an automated fashion to check for melanoma based on redness, asymmetry, thickness, etc. • A lot of information is contained in clinical notes, but it is hard to extract • Providers can’t keep abreast of medical literature – even specialists! Use Big Data and Semantic Web technologies to identify highly relevant literature • Sentiment analysis – using surveys, social media • Etc…
  • 19. Poll • What Healthcare Big Data use case do you see as being most important for your organization? • Patient Monitoring • Personalized Medicine • Population Management (e.g., for ACO) • Epidemiology • More effective use of medical literature • Medical research • Unstructured data analysis • Quality Improvement • Other
  • 20. Types of Big Data processing
  • 21. Analytics • Big Data is ideal for experimental / discovery analytics • Faster setup, data quality not as critical • Enables Data Scientists to formulate and investigate hypotheses more rapidly, with less expense • May discover useful knowledge . . . or not • Fail faster – so as to move on to the next hypothesis!
  • 22. Unstructured Data Mining • Big Data can make mining unstructured sources (text, audio, video, image) more prevalent – more cost effective, with better performance • E.g., extract structured information, categorize documents, analyze shapes and coloration, measure how long a video was viewed, etc. • Text Mining capabilities • Entity Extraction – extracting names, locations, dates, products, diseases, Rx, conditions, etc., from text • Topic Tracking – track information of interest to a user • Categorization – categorize a document based on word counts, synonyms, etc. • Clustering – grouping similar documents • Concept Linking – relate documents based on shared concepts • Question Answering – try to find the best answer based on the user’s environment
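Categorization by word counts can be sketched in a few lines. This is an illustration only: the category names and term lists below are invented, and real text mining tools add stemming, synonym handling and statistical weighting that this sketch omits.

```python
import re
from collections import Counter

# Hypothetical category vocabularies, for illustration only.
CATEGORY_TERMS = {
    "cardiology": {"chest", "ecg", "arrhythmia"},
    "dermatology": {"mole", "rash", "melanoma"},
}

def categorize(text, category_terms=CATEGORY_TERMS):
    """Pick the category whose terms occur most often in text, or None if none occur."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {cat: sum(words[t] for t in terms) for cat, terms in category_terms.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

note = "Patient reports chest pain; ECG shows arrhythmia. Chest pain resolved."
label = categorize(note)
```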
  • 23. Data Mining • Can enable much faster data mining • Can bypass some setup and modeling effort • Data Mining is “the automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns” (Wikipedia) • Examples of data mining: • Association analysis – e.g., which 2 or 3 materials when used together are correlated with a high degree of infection • Cluster analysis – e.g., patient micro-segmentation • Anomaly / Outlier Detection – e.g., network breaches • [Diagram labels: Text, Text Mining, Entity Extraction, Other use cases, Structured Data, Data Mining, Something Interesting?]
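The association-analysis example (which materials, used together, go with high infection rates) can be approximated with a simple co-occurrence count. This is a rough sketch under stated assumptions: the record layout and material names are hypothetical, and it computes raw infection rates per pair rather than a full association-rule algorithm such as Apriori.

```python
from collections import defaultdict
from itertools import combinations

def pair_infection_rates(procedures, min_count=2):
    """Infection rate for each pair of materials used together.

    procedures: list of (materials, infected) tuples, where materials is a
    collection of material names and infected is a bool (hypothetical layout).
    Pairs seen fewer than min_count times are dropped as too sparse.
    """
    totals = defaultdict(int)
    infected = defaultdict(int)
    for materials, was_infected in procedures:
        for pair in combinations(sorted(set(materials)), 2):
            totals[pair] += 1
            if was_infected:
                infected[pair] += 1
    rates = {pair: infected[pair] / n for pair, n in totals.items() if n >= min_count}
    # Highest rate first; ties broken alphabetically for stable output.
    return sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))

procedures = [
    ({"glue", "mesh"}, True),
    ({"glue", "mesh"}, True),
    ({"glue", "suture"}, False),
    ({"glue", "suture"}, False),
    ({"mesh", "suture"}, False),
    ({"mesh", "suture"}, True),
]
ranked = pair_infection_rates(procedures)
```

A high rate for a pair is a correlation worth investigating, not evidence of cause.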
  • 24. Transaction Processing • Some Big Data platforms can be used for some types of transaction processing • Where performance is more important than consistency e.g., a Facebook user updating his/her status • More on this later…
  • 25. Poll • What type of Big Data use case would be most beneficial for your client? • Complex Event Processing (using massive/numerous streams of real-time data) • Unstructured Data Analysis • Predictive Data Mining • Transaction Processing (where performance is more important than consistency)
  • 26. Big Data Architecture and Key Technologies
  • 28. Hadoop • Used for batch processing – inserts/appends only – no updates • Single master – works across many nodes, but only a single data center • Key components • HDFS – Hadoop Distributed File System • MapReduce – distributes data in key-value pairs across nodes, processes it in parallel, and summarizes the results • HBase – database built on top of Hadoop (with interactive capabilities) • Hive – SQL-like query tool (converts to MapReduce) • Pig – higher-level execution language (vs. having to use Java, Python) – converts to MapReduce
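The MapReduce flow (emit key-value pairs, group by key, summarize) can be shown in a single process. This sketch does not use Hadoop at all; the function names are illustrative, and the point is only the shape of the map, shuffle and reduce steps that Hadoop would spread across nodes.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (key, 1) pair for every word in one input record."""
    for word in record.split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key (Hadoop does this across nodes)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: summarize the values collected for one key."""
    return key, sum(values)

def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = run_job(["fever cough", "cough fatigue", "cough"])
```

Because each record is mapped independently and each key is reduced independently, both phases can run in parallel on many machines.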
  • 29. Cassandra • Used for real-time processing / transaction processing • Multiple masters – works across many nodes and many data centers • Key components • CFS – Cassandra File System • CQL – Cassandra Query Language (SQL-like) • Tunable consistency for writes or reads. E.g., option to ensure a write succeeds on each replica in all data centers before returning control to the program … or can be much less restrictive
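The trade-off behind tunable consistency can be expressed as simple arithmetic: with N replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N. The sketch below only illustrates that rule; the function names are hypothetical and it does not call any Cassandra API.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1

def read_overlaps_write(replicas, write_acks, read_acks):
    """True when every read set must share at least one replica with every write set."""
    return read_acks + write_acks > replicas

n = 3
strict = read_overlaps_write(n, quorum(n), quorum(n))   # quorum writes and quorum reads
relaxed = read_overlaps_write(n, 1, 1)                  # fastest, weakest setting
```

Requiring every replica to acknowledge a write is the most restrictive end of the range; acknowledging after one replica is the least.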
  • 30. In-memory processing • To support real-time operations, an IMDB (In-Memory Database) may be used • Solo – or in conjunction with a disk-based DBMS • I/O is typically the most expensive part of computing – using an in-memory database / cache reduces bottlenecks • Can be distributed (e.g., memcached, Terracotta, Kx) • Relational or non-relational • E.g., for a DW, current values might reside in an IMDB, historical data on disk
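Pairing an in-memory layer with a disk-based store is often done as a read-through cache. A minimal sketch, assuming a caller-supplied loader function that stands in for the slow disk read (the class name and the dictionary used as "disk" are hypothetical, and eviction is left out):

```python
class ReadThroughCache:
    """Serve repeated reads from memory; fall back to the slow store on a miss."""

    def __init__(self, load_from_disk):
        self._load = load_from_disk   # stand-in for a disk-based DBMS lookup
        self._memory = {}
        self.disk_reads = 0           # counts how often the slow path was taken

    def get(self, key):
        if key not in self._memory:
            self.disk_reads += 1
            self._memory[key] = self._load(key)
        return self._memory[key]

disk = {"p1": "ICU bed 4"}            # pretend this lookup is expensive
cache = ReadThroughCache(disk.__getitem__)
first = cache.get("p1")
second = cache.get("p1")
```

The second read never touches the slow store, which is the bottleneck reduction the slide refers to.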
  • 31. MPP RDBMS • Have been around for 15+ years • Used for large-scale Data Warehousing • Ideal where lots of joins are needed on massive amounts of data • Many NoSQL DBs rely on 100% denormalization. Many do not support join operations (e.g., wide-column stores) or updates
  • 32. Semantic Web • Semantic Web – web of data, not documents • Machine learning (inferencing) can be enabled via Semantic Web technologies. May use a graph database/triplestore (e.g., Neo4j, AllegroGraph, Meronymy) • Bridge the semantic divide (varying vocabularies) with ontologies – helps address the “Variety” aspect of Big Data • Encapsulate data values, metadata, joins, logic, business rules, ontologies, and access methods in the data via a common logical model (e.g., RDF triples) – very powerful for automation, federated queries
  • 33. Semantic Web • Find Jane Doe’s relatives (with machine inferencing) • [Diagram: a graph spanning System X, System Y and System Z with nodes a:JoeDoe, x:DebDoe, y:JohnDoe and z:JaneDoe, linked by :hasBrother, :marriedTo and :isInLaw edges; legend distinguishes original data from inferred data]
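Inferring new relationships from existing triples can be sketched with a small forward-chaining loop. Everything specific here is an assumption made for illustration: the starting triples, the two rules and the function name are invented, since the slide's diagram does not survive in the transcript, and a real deployment would use an RDF store with a reasoner rather than hand-written Python.

```python
def infer(triples):
    """Apply two illustrative rules until no new triples appear; return only the inferred ones.

    Rule 1 (assumed): s :hasBrother b and b :hasBrother c  =>  s :hasBrother c
    Rule 2 (assumed): s :hasBrother b and w :marriedTo b   =>  s :isInLaw w
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p != ":hasBrother":
                continue
            for s2, p2, o2 in facts:
                if p2 == ":hasBrother" and s2 == o and o2 != s:
                    new.add((s, ":hasBrother", o2))
                if p2 == ":marriedTo" and o2 == o:
                    new.add((s, ":isInLaw", s2))
        if new <= facts:              # fixed point reached
            return facts - set(triples)
        facts |= new

# Hypothetical original data held in three different systems.
original = [
    ("z:JaneDoe", ":hasBrother", "y:JohnDoe"),
    ("y:JohnDoe", ":hasBrother", "a:JoeDoe"),
    ("x:DebDoe", ":marriedTo", "y:JohnDoe"),
]
inferred = infer(original)
```

With these assumed inputs, the loop derives that Jane has a second brother and an in-law, neither of which any single system recorded.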
  • 34. No One Size Fits All • Many types of solutions will require multiple data paradigms • E.g., Facebook uses MySQL (relational), Hadoop, Cassandra, Hive, etc., for the different types of processing required • Be sure to have a solid use case before deciding to use Big Data / NoSQL technology • Provide solid business and technical justification
  • 35. What type of data store to use??
  • 36. Big Data impact on Application Development and Data Management
  • 37. ACID / CAP / BASE • If your transaction processing application must be ACID compliant, you must use an RDBMS (or ODBMS) • ACID – Atomic, Consistent, Isolated, Durable • Atomic – all tasks in a transaction succeed – or none do • Consistent – adheres to db rules, no partially completed transactions • Isolated – transactions can’t see data from other uncommitted transactions • Durable – a committed transaction persists even if the system fails • Not all transactions require ACID – eventual consistency may be adequate
  • 38. ACID / CAP / BASE • Brewer’s CAP theorem for distributed databases • Consistency, Availability, Partition Tolerance – pick 2! • For Big Data, BASE is an alternative to ACID • Basically Available – data will be available for requests, might not be consistent • Soft state – due to eventual consistency, the system might be continually changing • Eventually consistent – the system will eventually be consistent when input stops • Example: in HBase every transaction will execute, but only the most recent for a key will persist (last write wins) – no locking
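The "only the most recent write for a key persists, with no locking" behavior can be sketched as a timestamp comparison. This is an illustration of the general last-write-wins idea, not HBase code; the class and method names are hypothetical.

```python
class LastWriteWinsStore:
    """Keep only the value with the newest timestamp for each key; no locks taken."""

    def __init__(self):
        self._data = {}  # key -> (timestamp, value)

    def put(self, key, value, timestamp):
        current = self._data.get(key)
        # Every write is accepted, but an older write never replaces a newer one.
        if current is None or timestamp >= current[0]:
            self._data[key] = (timestamp, value)

    def get(self, key):
        entry = self._data.get(key)
        return None if entry is None else entry[1]

store = LastWriteWinsStore()
store.put("bed:12", "patient A", timestamp=1)
store.put("bed:12", "patient B", timestamp=3)
store.put("bed:12", "patient C", timestamp=2)  # arrives late, loses to timestamp 3
winner = store.get("bed:12")
```

The late-arriving write is silently discarded, which is acceptable for a status update but not for a transaction that must be ACID compliant.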
  • 39. Data Management • Security is not as mature with NoSQL – might use OS-level encryption (e.g., IBM Guardium Encryption Expert, Gazzang) – encrypt/decrypt at the I/O level • Data Governance needs to oversee Big Data – new knowledge uncovered can lead to risks – privacy, intellectual property, regulatory compliance, etc. • Physical Data Modeling is less important – due to the “schema-less” nature of NoSQL • Conceptual Modeling is still important for understanding business objects and relationships • Semantic modeling – informs ontologies, which enable inferencing • Logical Data Modeling is still useful for reasoning and communicating about how data will be organized • Due to the schema-less nature of NoSQL – metadata management will be more important! • E.g., a wide-column store with billions of records and millions of variable columns is useless unless you have the metadata to understand the data
  • 40. Getting started • Data Scientist is a key role in Big Data – requires statistics, data modeling, and programming skills. Not many are around, so expect to pay $$$. • Big Data technologies represent a significant paradigm shift. Be sure to allow budget for training, a sandbox environment, etc. • Start small with Big Data. Start with a single use case – allocate a significant amount of time for the learning curve and for environment setup, testing, tuning, and management. • Working with open source software can present challenges. Investigate the purchase of value-added software for simplification. Tools such as IBM BigInsights and EMC Greenplum UAP (Unified Analytics Platform) add analytical, administration, workflow, security, and other functionality.
  • 42. Summary • Big Data presents significant opportunities • Big Data is distinguished by volume, velocity, and variety • Big Data is not just Hadoop / MapReduce and not just NoSQL • Key enabler for Big Data is Massively Parallel Processing (MPP) • Using commodity hardware and open source software are options to drive down the cost of Big Data • Big Data and NoSQL technologies require a learning curve, and will continue to mature
  • 43. Resources • Perficient Healthcare: http://healthcare.perficient.com • Perficient Healthcare IT blog: http://blogs.perficient.com/healthcare/ • Perficient Healthcare Twitter: @Perficient_HC • Apache – download and learn more about Hadoop, Cassandra, etc. • http://hadoop.apache.org/ • http://cassandra.apache.org/ • Comprehensive list with descriptions of NoSQL databases: http://nosql-database.org/links.html • Translational Medicine Ontology (TMO) – applying Semantic Web for personalized medicine: http://www.w3.org/wiki/HCLSIG/PharmaOntology
  • 44. Q&A
  • 45. About Perficient Perficient is a leading information technology consulting firm serving clients throughout North America. We help clients implement business-driven technology solutions that integrate business processes, improve worker productivity, increase customer loyalty and create a more agile enterprise to better respond to new business opportunities.
  • 46. PRFT Profile • Founded in 1997 • Public, NASDAQ: PRFT • 2011 Revenue of $260 million • 20 major market locations throughout North America: Atlanta, Austin, Charlotte, Chicago, Cincinnati, Cleveland, Columbus, Dallas, Denver, Detroit, Fairfax, Houston, Indianapolis, Minneapolis, New Orleans, Philadelphia, San Francisco, San Jose, St. Louis and Toronto • 1,800+ colleagues • Dedicated solution practices • 600+ enterprise clients (2011) and 85% repeat business rate • Alliance partnerships with major technology vendors • Multiple vendor/industry technology and growth awards
  • 47. Our Solutions Expertise & Services • Business-Driven Solutions • Enterprise Portals • SOA and Business Process Management • Business Intelligence • User-Centered Custom Applications • CRM Solutions • Enterprise Performance Management • Customer Self-Service • eCommerce & Product Information Management • Enterprise Content Management • Industry-Specific Solutions • Mobile Technology • Security Assessments • Perficient Services • End-to-End Solution Delivery • IT Strategic Consulting • IT Architecture Planning • Business Process & Workflow Consulting • Usability and UI Consulting • Custom Application Development • Offshore Development • Package Selection, Implementation and Integration • Architecture & Application Migrations • Education • Perficient brings deep solutions expertise and offers a complete set of flexible services to help clients implement business-driven IT solutions

Editor's Notes

  1. Avro – data serialization (keeps schema (JSON) with data). Kafka – real-time streaming, coordination via ZooKeeper. HCatalog – metadata for all the data stored in Hadoop; read data from Pig, Hive or HBase. Oozie – scheduling system (Azkaban – not Apache – is a more graphical scheduler). Flume – log aggregation – ship to Hadoop. Whirr – Hadoop on Cloud – Whirr helps to automate. Sqoop – transfers data from RDBMS to Hadoop. MRUnit – unit testing. Mahout – machine learning on Hadoop. BigTop – integrates Hadoop-based software so it all works together. Crunch – library on top of Java. Giraph – large-scale distributed graph.
  2. In this case the properties would have to be associated with rules to describe entailments (i.e., the inferences that can be drawn). These could be encoded using SWRL (Semantic Web Rule Language), which also uses RDF.