Integrate Big Data into Your Organization
with Lower Total Costs
About Perficient
Perficient is a leading information technology consulting firm serving clients throughout North America. We help clients implement business-driven technology solutions that integrate business processes, improve worker productivity, increase customer loyalty, and create a more agile enterprise to better respond to new business opportunities.
Perficient Profile
• Founded in 1997
• Public, NASDAQ: PRFT
• 2012 revenue of $327 million
• Major market locations throughout North America: Atlanta, Austin, Boston, Charlotte, Chicago, Cincinnati, Cleveland, Columbus, Dallas, Denver, Detroit, Fairfax, Houston, Indianapolis, Minneapolis, New Orleans, New York, Northern California, Philadelphia, Southern California, St. Louis, Toronto, and Washington, D.C.
• Global delivery centers in China, Europe, and India
• ~2,000 colleagues
• Dedicated solution practices
• ~85% repeat business rate
• Alliance partnerships with major technology vendors
• Multiple vendor/industry technology and growth awards
Our Solutions Expertise
Business Solutions
• Business Intelligence
• Business Process Management
• Customer Experience and CRM
• Enterprise Performance Management
• Enterprise Resource Planning
• Experience Design (XD)
• Management Consulting
Technology Solutions
• Business Integration/SOA
• Cloud Services
• Commerce
• Content Management
• Custom Application Development
• Education
• Information Management
• Mobile Platforms
• Platform Integration
• Portal & Social
Speakers
Randall Gayle
• Data Management Director for Perficient
• 30+ years of data management experience
• Helps companies develop solutions around master data
management, data quality, data governance and data
integration.
• Provides data management expertise to industries including
oil and gas, financial services, banking, healthcare,
government, retail and manufacturing.
John Haddad
• Senior Director of Big Data Product Marketing for Informatica
• 25+ years of experience developing and marketing
enterprise applications.
• Advises organizations on Big Data best practices from a
management and technology perspective.
Interesting Facts about Big Data
1. It took from the dawn of civilization to the year 2003 for the world to generate 1.8 zettabytes (one zettabyte is 10^12 gigabytes) of data. In 2011 it took two days on average to generate the same amount of data.
2. If you stacked a pile of CD-ROMs on top of one another until you'd reached the current global storage capacity for digital information – about 295 exabytes – it would stretch 80,000 km beyond the moon.
3. Every hour, enough information is consumed by internet traffic to fill 7 million DVDs. Side by side, they'd scale Mount Everest 95 times.
4. 247 billion e-mail messages are sent each day… up to 80% of them are spam.
5. 48 hours of video are uploaded to YouTube every minute, resulting in 8 years' worth of digital content each day.
6. The world's data doubles every two years.
7. There are nearly as many bits of information in the digital universe as there are stars in our actual universe.
8. There are 30 billion pieces of content shared on Facebook every day and 750 million photos uploaded every two days.
Agenda
• Innovation vs. Cost
• PowerCenter Big Data Edition
• What else does Informatica offer for Big Data?
• What are customers doing with Informatica and Big Data?
• Next Steps
• Q&A
How do you balance innovation and cost?
• Business (CEO and VP/Director of Sales & Marketing, Customer Service, Product Development) pushes for INNOVATION.
• IT (CIO and VP/Director of Information Management, BI / Data Warehousing, Enterprise Architecture) is accountable for COST.

Business is connecting innovation to Big Data
• Financial Services: Risk & Portfolio Analysis, Investment Recommendations, Fraud Detection
• Retail & Telco: Proactive Customer Engagement, Location-Based Services, Customer Cross-/Up-Sell
• Media & Entertainment: Online & In-Game Behavior
• Manufacturing: Connected Vehicle, Predictive Maintenance
• Healthcare & Pharma: Predicting Patient Outcomes, Total Cost of Care, Drug Discovery
• Public Sector: Health Insurance Exchanges, Public Safety, Tax Optimization
IT is struggling with the cost of Big Data
• Growing data volume is quickly consuming capacity
• Need to onboard, store, and process new types of data
• High expense and lack of big data skills
Big Data Projects
Big Data projects split into analysis work and integration & quality work — and 80% of the work in Big Data projects is data integration and data quality.
PowerCenter Big Data Edition
Without PowerCenter Big Data Edition, most project time is spent on data preparation (parse, profile, cleanse, transform, match), leaving little time for data analysis. With PowerCenter Big Data Edition, preparation time shrinks and the time available for analysis grows.
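To make the preparation steps concrete, here is a minimal, illustrative Python sketch of a parse → cleanse → match pass over delimited records. The record format, field names, and matching key are hypothetical; this is not Informatica's implementation, just the shape of the work the slide describes.

```python
import re

# Hypothetical raw records: one pipe-delimited line per customer row.
RAW = [
    "42|Alice Smith | alice@example.com ",
    "43|BOB JONES|bob@example.com",
    "42|Alice  Smith|alice@example.com",   # near-duplicate of the first row
]

def parse(line):
    """Parse a delimited line into a structured record."""
    rec_id, name, email = line.split("|")
    return {"id": rec_id.strip(), "name": name.strip(), "email": email.strip()}

def cleanse(rec):
    """Normalize whitespace and casing."""
    rec["name"] = re.sub(r"\s+", " ", rec["name"]).title()
    rec["email"] = rec["email"].lower()
    return rec

def match(records):
    """Collapse duplicates on a simple key (id + email), keeping one per entity."""
    seen = {}
    for rec in records:
        seen.setdefault((rec["id"], rec["email"]), rec)
    return list(seen.values())

prepared = match([cleanse(parse(line)) for line in RAW])
```

After the pass, the three raw lines collapse to two cleansed entities; real pipelines add profiling and richer transform/match rules on top of this skeleton.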
Informatica + Hadoop: PowerCenter Developers are Now Hadoop Developers
Sources — transactions, OLTP, OLAP; social media and web logs; machine, device, and scientific data; documents and emails — are processed on Hadoop (archive, profile, parse, ETL, cleanse, match) and delivered to analytics & operational dashboards, mobile apps, and real-time alerts.
The Vibe Virtual Data Machine
• Transformation Library — defines logic
• Optimizer — deploys most efficiently based on data, logic, and execution environment
• Executor — run-time physical execution
• Connectors — connectivity to data sources
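The core idea — logic is defined once and handed to different executors — can be illustrated with a small sketch. The executor classes and the mapping representation here are hypothetical stand-ins for the concept, not Informatica APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# The mapping is defined once, independently of where it runs.
mapping = [str.strip, str.lower]

def apply_mapping(value, mapping):
    """Apply each transformation step in order to one value."""
    for step in mapping:
        value = step(value)
    return value

class LocalExecutor:
    """Run the mapping record-by-record in-process (a single server)."""
    def run(self, mapping, data):
        return [apply_mapping(v, mapping) for v in data]

class ParallelExecutor:
    """Run the same mapping across workers (standing in for a grid/Hadoop)."""
    def run(self, mapping, data):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda v: apply_mapping(v, mapping), data))

data = ["  Alice ", " BOB"]
# Both executors produce identical results from the same mapping.
assert LocalExecutor().run(mapping, data) == ParallelExecutor().run(mapping, data)
```

The design point is that the mapping never mentions its execution environment, so an optimizer is free to pick whichever executor is cheapest for the data at hand.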
Vibe Virtual Data Machine: Map Once. Deploy Anywhere.
Information solutions and data services — Data Integration, Data Quality, Master Data Management, Information Lifecycle, Information Exchange, Data Integration Hub, and 3rd-party solutions — are built on the Vibe virtual data machine, with shared infrastructure services and role-based tools.
Deploy anywhere: Desktop, Server, Hadoop, Cloud, Data Virtualization, and DQ embedded in apps.
PowerCenter Big Data Edition
The Safe On-Ramp to Big Data

Big Transaction Data
• Online Transaction Processing (OLTP): Oracle, DB2, Ingres, Informix, Sybase, SQL Server, …
• Online Analytical Processing (OLAP) & DW Appliances: Teradata, Redbrick, Essbase, Sybase IQ, Netezza, Exadata, HANA, Greenplum, DATAllegro, Aster Data, Vertica, ParAccel, …
• Cloud: Salesforce.com, Concur, Google App Engine, Amazon, …

Big Interaction Data
• Social Media & Web Data: Facebook, Twitter, LinkedIn, YouTube, web applications, blogs, discussion forums, communities, partner portals, …
• Other Interaction Data: clickstream, image/text, scientific, genomic/pharma, medical devices, sensors/meters, RFID tags, CDR/mobile, …

Big Data Processing spans both categories.
PowerCenter Big Data Edition — powered by the Vibe™ virtual data machine:
• Universal Data Access
• High-Speed Data Ingestion and Extraction
• ETL on Hadoop
• Profiling on Hadoop
• Complex Data Parsing on Hadoop
• Entity Extraction and Data Classification on Hadoop
• No-Code Productivity
• Business-IT Collaboration
• Unified Administration
PowerCenter Big Data Edition: Lower Costs
Sources — transactions, OLTP, OLAP; social media and web logs; machine, device, and scientific data; documents and emails — feed the EDW and data marts on a traditional grid.
• Optimize processing on low-cost hardware
• Increase productivity up to 5X

PowerCenter Big Data Edition: Minimize Risk
• Deploy on-premise or in the cloud
• Quickly staff projects with trained experts
• Map Once. Deploy Anywhere™

PowerCenter Big Data Edition: Innovate Faster
Sources — transactions, OLTP, OLAP; social media and web logs; machine, device, and scientific data; documents and emails — flow through to analytics & operational dashboards, mobile apps, and real-time alerts.
• Onboard and analyze any type of data to gain big data insights
• Discover insights faster through rapid development and collaboration
• Operationalize big data insights to generate new revenue streams
Poll Question #1
What are your plans for Hadoop? (select one)
• Currently using Hadoop
• Plan to implement Hadoop in 3-6 months
• Plan to implement Hadoop in 6-12 months
• No plans for Hadoop
What Else Does Informatica
Offer for Big Data?
Lower Data Management Costs
Over time, inactive data accumulates in the enterprise data warehouse (fed by transactions, OLTP, and OLAP), database size grows, and performance degrades.
• Identify dormant data
• Archive inactive data to low-cost storage
After archiving, only active data remains in the warehouse; inactive data moves to low-cost archive storage.
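Identifying dormant data typically starts from access metadata. A minimal sketch, assuming a hypothetical table catalog with last-accessed dates and a one-year dormancy threshold (both the fields and the cutoff are illustrative):

```python
from datetime import date, timedelta

# Hypothetical catalog: table name plus the last date it was queried.
tables = [
    {"name": "orders_2024", "last_accessed": date(2025, 6, 1)},
    {"name": "orders_2009", "last_accessed": date(2010, 2, 3)},
    {"name": "clickstream_2011", "last_accessed": date(2011, 12, 9)},
]

def partition_by_activity(tables, today, dormant_after=timedelta(days=365)):
    """Split tables into active ones and dormant archive candidates."""
    active, dormant = [], []
    for t in tables:
        (dormant if today - t["last_accessed"] > dormant_after else active).append(t)
    return active, dormant

active, dormant = partition_by_activity(tables, today=date(2025, 7, 1))
archive_candidates = [t["name"] for t in dormant]
```

In practice the threshold would be tuned per retention policy, and the dormant list would drive the archive jobs that move data to low-cost storage.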
Minimize Risk
Rather than proliferating data-mart copies around the EDW, ODS, MDM, and BI reports/dashboards:
• Avoid copies of data and augment the data warehouse using data virtualization
• Role-based, fine-grained secure access
Minimize Risk
Dynamic Data Masking and Data Virtualization sit between production systems (ERP, CRM, EDW, custom) and the BI reports/dashboards, development, test, and training environments that consume them.
• Mask sensitive data in non-production systems
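Two common masking moves are format-preserving redaction and stable pseudonymization. This is an illustrative sketch (the field names and token scheme are hypothetical, not a specific product's behavior):

```python
import hashlib

# Hypothetical production rows being copied into a test environment.
rows = [
    {"customer": "Alice Smith", "ssn": "123-45-6789", "balance": 1200},
    {"customer": "Bob Jones", "ssn": "987-65-4321", "balance": 640},
]

def mask_ssn(ssn):
    """Keep the last four digits, mask the rest (format-preserving)."""
    return "XXX-XX-" + ssn[-4:]

def pseudonymize(name):
    """Replace a name with a stable, irreversible token so joins still work."""
    return "cust_" + hashlib.sha256(name.encode()).hexdigest()[:8]

masked = [
    {**row, "customer": pseudonymize(row["customer"]), "ssn": mask_ssn(row["ssn"])}
    for row in rows
]
```

Because the pseudonym is derived deterministically, the same customer masks to the same token across tables, so test queries and joins still behave like production without exposing real identities.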
Data Governance
The data governance cycle: Discover → Define → Apply → Measure and Monitor.
• Data discovery
• Data profiling
• Data inventories
• Process inventories
• CRUD analysis
• Capabilities assessment
Define
• Business glossary creation
• Data classifications
• Data relationships
• Reference data
• Business rules
• Data governance policies
• Other dependent policies
Measure and Monitor
• Proactive monitoring
• Operational dashboards
• Reactive operational DQ audits
• Dashboard monitoring/audits
• Data lineage analysis
• Program performance
• Business value/ROI
Apply
• Automated rules
• Manual rules
• End-to-end workflows
• Business/IT collaboration
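The data profiling named under Discover can be sketched with two first-pass metrics, null counts and distinct counts per column. The records and columns here are hypothetical:

```python
# Hypothetical records to profile during the Discover phase.
records = [
    {"id": 1, "state": "TX", "email": "a@example.com"},
    {"id": 2, "state": "tx", "email": None},
    {"id": 3, "state": "NY", "email": "c@example.com"},
]

def profile(records):
    """Per-column null and distinct counts -- typical first-pass profile metrics."""
    stats = {}
    for col in records[0].keys():
        values = [r[col] for r in records]
        stats[col] = {
            "nulls": sum(v is None for v in values),
            "distinct": len({v for v in values if v is not None}),
        }
    return stats

stats = profile(records)
# 'state' shows 3 distinct raw values ('TX', 'tx', 'NY') -- a hint that a
# standardization rule belongs in the Define phase.
```

Profile output like this is what feeds the Define phase: inconsistent codes become data classifications and business rules, and null-heavy columns become quality targets.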
Innovate Faster With Big Data
• Enrich master data to proactively engage customers and improve products and services
• Analyze data in real time using event-based processing and proactive monitoring
Example: business rules match a customer's transaction data, geo-location data, and social data to trigger an alert carrying merchant offers.
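A single rule from that example can be sketched as a small event-evaluation function. The event fields, thresholds, and categories below are hypothetical, just to show the rule-on-event shape:

```python
# Hypothetical event: a card transaction enriched with geo-location data.
event = {
    "customer_id": "c42",
    "amount": 87.50,
    "merchant_category": "coffee",
    "distance_to_partner_store_m": 120,
}

def evaluate(event, max_distance_m=250, categories=("coffee", "retail")):
    """Business rule: nearby customer + matching category -> merchant offer."""
    if (event["merchant_category"] in categories
            and event["distance_to_partner_store_m"] <= max_distance_m):
        return {"alert": "merchant_offer", "customer_id": event["customer_id"]}
    return None

alert = evaluate(event)
```

A production complex-event-processing engine evaluates many such rules continuously over streams; the point here is only that each rule reduces to a predicate over an enriched event plus an action.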
Poll Question #2
What other data management technologies are you considering within the next 12 months? (check all that apply)
• Data archiving
• Data masking
• Data virtualization
• Data quality
• Data discovery
• MDM
• Real-time event-based processing
What Are Customers Doing with
Informatica and Big Data?
Large Government Agency
The Challenge: Data volumes growing at 3-5 times over the next 2-3 years.
The Solution:
• Manage data integration and load of 10+ billion records from multiple disparate data sources
• Flexible data integration architecture to support changing business requirements in a heterogeneous data management environment
The Result: A flexible architecture that supports rapid change — mainframe, RDBMS, and unstructured data sources feed the EDW and business reports through data virtualization on a traditional grid.
Large Global Financial Institution
The Challenge: Data warehouse exploding with over 200 TB of data; user activity generating up to 5 million queries a day, impacting query performance.
The Solution: Archive inactive EDW and interaction data (from ERP, CRM, custom applications, and web logs) to low-cost archive storage, loading in near real time on a traditional grid.
The Result:
• Saved $20M, plus $2-3M ongoing, through archiving and optimization
• Reduced project timeline from 6 months to 2 weeks
• Improved performance by 25%
• Return on investment in less than 6 months
Large Global Financial Institution
The Challenge: Increasing demand for faster data-driven decision making and analytics as data volumes and processing loads rapidly increase.
The Solution: Standardize on one data integration platform feeding the data warehouse and data marts from RDBMS sources.
The Result:
• Cost-effectively scale performance
• Lower hardware costs
• Increased agility by standardizing on one data integration platform
• Leverage new data sources for faster innovation
Large Global Automotive Manufacturer
The Challenge: Collect data in real time from all cars by the end of the year for the "Connected Car" program.
The Solution: Complex event processing feeds the Connected Vehicle Program, the EDW, and business reports.
The Result: Helps enable the goals of the connected vehicle program:
• Embedding mobile technologies to enhance the customer experience
• Predictive maintenance and improved fuel efficiency
• On-call roadside assistance and automatic service scheduling
Next Steps
What should you be doing?
• Tomorrow
– Identify a business goal where data can have a significant impact
– Identify the skills you need to build a big data analytics team
• 3 months
– Identify and prioritize the data you need to achieve your business
goals
– Put a business plan and reference architecture together to optimize
your enterprise information management infrastructure
– Execute a quick win big data project with measurable ROI
• 1 year
– Extend data governance to include more data and more types of data that impact the business
– Consider a shared-services model to promote best practices and
further lower infrastructure and labor costs
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

SharePoint Online: New & Improved
SharePoint Online: New & ImprovedSharePoint Online: New & Improved
SharePoint Online: New & ImprovedPerficient, Inc.
 
10 Steps for Taking Control of Your Organization's Digital Debris
10 Steps for Taking Control of Your Organization's Digital Debris 10 Steps for Taking Control of Your Organization's Digital Debris
10 Steps for Taking Control of Your Organization's Digital Debris Perficient, Inc.
 
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008DataValueTalk
 
How PIH Is Using Office 365 to Improve Global Collaboration
How PIH Is Using Office 365 to Improve Global CollaborationHow PIH Is Using Office 365 to Improve Global Collaboration
How PIH Is Using Office 365 to Improve Global CollaborationPerficient, Inc.
 
Driving In-Store Traffic in the Digital Age
Driving In-Store Traffic in the Digital AgeDriving In-Store Traffic in the Digital Age
Driving In-Store Traffic in the Digital AgePerficient, Inc.
 
Console Power Productive Agents and Happy Customers
Console Power Productive Agents and Happy CustomersConsole Power Productive Agents and Happy Customers
Console Power Productive Agents and Happy CustomersPerficient, Inc.
 
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud Solutions
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud SolutionsLower Cost and Complexity with Azure and StorSimple Hybrid Cloud Solutions
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud SolutionsPerficient, Inc.
 
Five Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyFive Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyPerficient, Inc.
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyDataWorks Summit
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...Dr. Wilfred Lin (Ph.D.)
 
SharePoint 2013 Search & Social - What You Need to Know!
SharePoint 2013 Search & Social - What You Need to Know!SharePoint 2013 Search & Social - What You Need to Know!
SharePoint 2013 Search & Social - What You Need to Know!Perficient, Inc.
 
Transform IT Service Delivery Helion
Transform IT Service Delivery Helion Transform IT Service Delivery Helion
Transform IT Service Delivery Helion Andrey Karpov
 
Agile BI: How to Deliver More Value in Less Time
Agile BI: How to Deliver More Value in Less TimeAgile BI: How to Deliver More Value in Less Time
Agile BI: How to Deliver More Value in Less TimePerficient, Inc.
 
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...Datameer
 
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured Data
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured DataIBM Watson Content Analytics: Discover Hidden Value in Your Unstructured Data
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured DataPerficient, Inc.
 
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data DictionaryRWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data DictionaryDATAVERSITY
 
2013 Data Governance Information Quality (DGIQ) Conference session
2013 Data Governance Information Quality (DGIQ) Conference session2013 Data Governance Information Quality (DGIQ) Conference session
2013 Data Governance Information Quality (DGIQ) Conference sessionDeepak Bhaskar, MBA, BSEE
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationDATAVERSITY
 
Implementing Digital Signatures in an FDA-Regulated Environment
Implementing Digital Signatures in an FDA-Regulated EnvironmentImplementing Digital Signatures in an FDA-Regulated Environment
Implementing Digital Signatures in an FDA-Regulated EnvironmentPerficient, Inc.
 

Was ist angesagt? (20)

SharePoint Online: New & Improved
SharePoint Online: New & ImprovedSharePoint Online: New & Improved
SharePoint Online: New & Improved
 
10 Steps for Taking Control of Your Organization's Digital Debris
10 Steps for Taking Control of Your Organization's Digital Debris 10 Steps for Taking Control of Your Organization's Digital Debris
10 Steps for Taking Control of Your Organization's Digital Debris
 
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008
Robert Winter - Enterprise Wide Information Logistics - Data Quality Summit 2008
 
How PIH Is Using Office 365 to Improve Global Collaboration
How PIH Is Using Office 365 to Improve Global CollaborationHow PIH Is Using Office 365 to Improve Global Collaboration
How PIH Is Using Office 365 to Improve Global Collaboration
 
Driving In-Store Traffic in the Digital Age
Driving In-Store Traffic in the Digital AgeDriving In-Store Traffic in the Digital Age
Driving In-Store Traffic in the Digital Age
 
Console Power Productive Agents and Happy Customers
Console Power Productive Agents and Happy CustomersConsole Power Productive Agents and Happy Customers
Console Power Productive Agents and Happy Customers
 
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud Solutions
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud SolutionsLower Cost and Complexity with Azure and StorSimple Hybrid Cloud Solutions
Lower Cost and Complexity with Azure and StorSimple Hybrid Cloud Solutions
 
Five Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyFive Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data Strategy
 
Hadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata CompanyHadoop 2015: what we larned -Think Big, A Teradata Company
Hadoop 2015: what we larned -Think Big, A Teradata Company
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...
 
SharePoint 2013 Search & Social - What You Need to Know!
SharePoint 2013 Search & Social - What You Need to Know!SharePoint 2013 Search & Social - What You Need to Know!
SharePoint 2013 Search & Social - What You Need to Know!
 
Transform IT Service Delivery Helion
Transform IT Service Delivery Helion Transform IT Service Delivery Helion
Transform IT Service Delivery Helion
 
Agile BI: How to Deliver More Value in Less Time
Agile BI: How to Deliver More Value in Less TimeAgile BI: How to Deliver More Value in Less Time
Agile BI: How to Deliver More Value in Less Time
 
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
The State of Big Data Adoption: A Glance at Top Industries Adopting Big Data ...
 
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured Data
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured DataIBM Watson Content Analytics: Discover Hidden Value in Your Unstructured Data
IBM Watson Content Analytics: Discover Hidden Value in Your Unstructured Data
 
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data DictionaryRWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary
RWDG Slides: Governing Your Data Catalog, Business Glossary, and Data Dictionary
 
Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?
 
2013 Data Governance Information Quality (DGIQ) Conference session
2013 Data Governance Information Quality (DGIQ) Conference session2013 Data Governance Information Quality (DGIQ) Conference session
2013 Data Governance Information Quality (DGIQ) Conference session
 
Data Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital TransformationData Architecture Strategies: Data Architecture for Digital Transformation
Data Architecture Strategies: Data Architecture for Digital Transformation
 
Implementing Digital Signatures in an FDA-Regulated Environment
Implementing Digital Signatures in an FDA-Regulated EnvironmentImplementing Digital Signatures in an FDA-Regulated Environment
Implementing Digital Signatures in an FDA-Regulated Environment
 

Andere mochten auch

Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Giip kb-hadoop sizing
Giip kb-hadoop sizingGiip kb-hadoop sizing
Giip kb-hadoop sizingLowy Shin
 
Informatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan UlrychInformatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan UlrychProfinit
 
Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7mmathipra
 
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)Victor Asanza
 
Hadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldHadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldUwe Printz
 
Solving Performance Problems on Hadoop
Solving Performance Problems on HadoopSolving Performance Problems on Hadoop
Solving Performance Problems on HadoopTyler Mitchell
 
Apache Cassandra for Timeseries- and Graph-Data
Apache Cassandra for Timeseries- and Graph-DataApache Cassandra for Timeseries- and Graph-Data
Apache Cassandra for Timeseries- and Graph-DataGuido Schmutz
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Hortonworks
 
Pragmatic approach to Microservice Architecture: Role of Middleware
Pragmatic approach to Microservice Architecture: Role of MiddlewarePragmatic approach to Microservice Architecture: Role of Middleware
Pragmatic approach to Microservice Architecture: Role of MiddlewareAsanka Abeysinghe
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseDataWorks Summit
 
(BDT305) Lessons Learned and Best Practices for Running Hadoop on AWS | AWS r...
(BDT305) Lessons Learned and Best Practices for Running Hadoop on AWS | AWS r...(BDT305) Lessons Learned and Best Practices for Running Hadoop on AWS | AWS r...
(BDT305) Lessons Learned and Best Practices for Running Hadoop on AWS | AWS r...Amazon Web Services
 
SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?SQL/NoSQL How to choose ?
SQL/NoSQL How to choose ?Venu Anuganti
 
Intro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of TwingoIntro to Apache Spark by CTO of Twingo
Intro to Apache Spark by CTO of TwingoMapR Technologies
 
Visualizing big data in the browser using spark
Visualizing big data in the browser using sparkVisualizing big data in the browser using spark
Visualizing big data in the browser using sparkDatabricks
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicDataWorks Summit
 
Best Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop Professionals
Best Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop ProfessionalsBest Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop Professionals
Best Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop ProfessionalsCloudera, Inc.
 

Andere mochten auch (20)

Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Giip kb-hadoop sizing
Giip kb-hadoop sizingGiip kb-hadoop sizing
Giip kb-hadoop sizing
 
Informatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan UlrychInformatica Big Data Edition - Profinit - Jan Ulrych
Informatica Big Data Edition - Profinit - Jan Ulrych
 
Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7Meet the experts dwo bde vds v7
Meet the experts dwo bde vds v7
 
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)
⭐⭐⭐⭐⭐ Examen Sistemas Digitales SD+MSA (2do Parcial)
 

Integrate Big Data into Your Organization with Lower Total Costs

  • 1. Integrate Big Data into Your Organization with Lower Total Costs
  • 2. About Perficient: Perficient is a leading information technology consulting firm serving clients throughout North America. We help clients implement business-driven technology solutions that integrate business processes, improve worker productivity, increase customer loyalty, and create a more agile enterprise to better respond to new business opportunities.
  • 3. Perficient Profile
    • Founded in 1997
    • Public, NASDAQ: PRFT
    • 2012 revenue of $327 million
    • Major market locations throughout North America: Atlanta, Austin, Boston, Charlotte, Chicago, Cincinnati, Cleveland, Columbus, Dallas, Denver, Detroit, Fairfax, Houston, Indianapolis, Minneapolis, New Orleans, New York, Northern California, Philadelphia, Southern California, St. Louis, Toronto, and Washington, D.C.
    • Global delivery centers in China, Europe and India
    • ~2,000 colleagues
    • Dedicated solution practices
    • ~85% repeat business rate
    • Alliance partnerships with major technology vendors
    • Multiple vendor/industry technology and growth awards
  • 4. Our Solutions Expertise
    Business Solutions: Business Intelligence; Business Process Management; Customer Experience and CRM; Enterprise Performance Management; Enterprise Resource Planning; Experience Design (XD); Management Consulting
    Technology Solutions: Business Integration/SOA; Cloud Services; Commerce; Content Management; Custom Application Development; Education; Information Management; Mobile Platforms; Platform Integration; Portal & Social
  • 5. Speakers
    Randall Gayle: Data Management Director for Perficient. 30+ years of data management experience. Helps companies develop solutions around master data management, data quality, data governance and data integration. Provides data management expertise to industries including oil and gas, financial services, banking, healthcare, government, retail and manufacturing.
    John Haddad: Senior Director of Big Data Product Marketing for Informatica. 25+ years of experience developing and marketing enterprise applications. Advises organizations on Big Data best practices from a management and technology perspective.
  • 6. Interesting Facts about BIG Data
    1. It took from the dawn of civilization to the year 2003 for the world to generate 1.8 zettabytes (1.8 × 10^12 gigabytes) of data. In 2011 it took two days on average to generate the same amount of data.
    2. If you stacked a pile of CD-ROMs on top of one another until you reached the current global storage capacity for digital information (about 295 exabytes), it would stretch 80,000 km beyond the moon.
    3. Every hour, enough information is consumed by internet traffic to fill 7 million DVDs. Side by side, they'd scale Mount Everest 95 times.
    4. 247 billion e-mail messages are sent each day; up to 80% of them are spam.
    5. 48 hours of video are uploaded to YouTube every minute, resulting in 8 years' worth of digital content each day.
    6. The world's data doubles every two years.
    7. There are nearly as many bits of information in the digital universe as there are stars in our actual universe.
    8. There are 30 billion pieces of content shared on Facebook every day and 750 million photos uploaded every two days.
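The "data doubles every two years" figure in fact 6 compounds faster than intuition suggests. As a rough illustration only (our own extrapolation, not part of the deck: we treat the slide's 1.8 ZB figure as a 2011 baseline and assume simple exponential doubling):

```python
def projected_data_zb(start_zb: float, start_year: int, year: int,
                      doubling_period_years: float = 2.0) -> float:
    """Project total data volume, assuming it doubles every
    `doubling_period_years` (the rule of thumb quoted on slide 6)."""
    periods = (year - start_year) / doubling_period_years
    return start_zb * (2 ** periods)

# Extrapolate from a 1.8 ZB (2011) starting point: the volume
# grows 32x over a single decade under this rule of thumb.
for year in (2011, 2013, 2015, 2017, 2019, 2021):
    print(year, round(projected_data_zb(1.8, 2011, year), 1), "ZB")
```

Under this model, ten years of doubling every two years multiplies the starting volume by 2^5 = 32, which is why capacity planning dominates the cost discussion later in the deck.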
  • 7. Agenda • Innovation vs. Cost • PowerCenter Big Data Edition • What else does Informatica offer for Big Data? • What Are Customers Doing with Informatica and Big Data? • Next Steps • Q&A 7
  • 8. How do you balance innovation and cost?
  • 9. Business: CEO and VP/Director of Sales & Marketing, Customer Service, Product Development (INNOVATION). How do you balance innovation and cost?
  • 10. IT: CIO and VP/Director of Information Management, BI/Data Warehousing, Enterprise Architecture (COST). Business: CEO and VP/Director of Sales & Marketing, Customer Service, Product Development (INNOVATION). How do you balance innovation and cost?
  • 11. Financial Services, Retail & Telco, Media & Entertainment, Public Sector, Manufacturing, Healthcare & Pharma. Business is connecting innovation to Big Data
  • 12. Financial Services: Risk & Portfolio Analysis, Investment Recommendations. Retail & Telco: Proactive Customer Engagement, Location-Based Services. Media & Entertainment: Online & In-Game Behavior, Customer Cross/Up-Sell. Public Sector, Manufacturing, Healthcare & Pharma. Business is connecting innovation to Big Data
  • 13. Financial Services: Risk & Portfolio Analysis, Investment Recommendations. Manufacturing: Connected Vehicle, Predictive Maintenance. Public Sector: Health Insurance Exchanges, Public Safety, Tax Optimization, Fraud Detection. Healthcare & Pharma: Predicting Patient Outcomes, Total Cost of Care, Drug Discovery. Retail & Telco: Proactive Customer Engagement, Location-Based Services. Media & Entertainment: Online & In-Game Behavior, Customer Cross/Up-Sell. Business is connecting innovation to Big Data
  • 14. IT is struggling with the cost of Big Data • Growing data volume is quickly consuming capacity
  • 15. • Growing data volume is quickly consuming capacity • Need to onboard, store, & process new types of data IT is struggling with the cost of Big Data
  • 16. • Growing data volume is quickly consuming capacity • Need to onboard, store, & process new types of data • High expense and lack of big data skills IT is struggling with the cost of Big Data
  • 17. Big Data Analysis Big Data Integration & Quality Big Data Projects
  • 18. Big Data Analysis Big Data Integration & Quality 80% of the work in Big Data projects is data integration and data quality Big Data Projects
  • 20. Time available for data analysis vs. time spent on data preparation (parse, profile, cleanse, transform, match): Without PowerCenter Big Data Edition
  • 21. Time available for data analysis vs. time spent on data preparation (parse, profile, cleanse, transform, match): Without PowerCenter Big Data Edition vs. With PowerCenter Big Data Edition
  • 22. Informatica + Hadoop PowerCenter Developers are Now Hadoop Developers Transactions, OLTP, OLAP Social Media, Web Logs Machine Device, Scientific Documents and Emails Analytics & Op Dashboards Mobile Apps Real-Time Alerts Archive Profile Parse ETL Cleanse Match
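The slide's point is that data flows (profile, parse, ETL, cleanse, match) are built visually instead of hand-coded. As a purely illustrative sketch of the kind of hand-written parse-and-cleanse step such tooling replaces (the log format and field names below are invented for the example, not an Informatica API):

```python
# Conceptual sketch only: a hand-coded parse/cleanse step of the kind a
# no-code tool generates for you. The clickstream layout is hypothetical.
import csv
import io

def parse_clickstream(line):
    """Parse one comma-separated web-log record into a dict, or None if malformed."""
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) != 3:
        return None  # route malformed rows to a reject stream
    user_id, url, ts = (f.strip() for f in fields)
    if not user_id or not ts.isdigit():
        return None
    return {"user_id": user_id.lower(), "url": url, "ts": int(ts)}

def etl(lines):
    """Map-style pass: parse, cleanse, and keep only valid records."""
    return [rec for rec in map(parse_clickstream, lines) if rec is not None]

raw = ["U42, http://example.com/home, 1370000000",
       "bad row",
       "U7, http://example.com/cart, 1370000050"]
print(etl(raw))
```

On a real cluster the same per-record function would run as the map phase of a distributed job; the point of the slide is that the developer never writes it by hand.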
  • 23. The Vibe Virtual Data Machine. Transformation Library: defines logic. Optimizer: deploys most efficiently based on data, logic, and execution environment. Executor: run-time physical execution. Connectors: connectivity to data sources.
  • 24. Vibe Virtual Data Machine: Map Once. Deploy Anywhere. Information solutions and data services: Data Integration, Data Quality, Data Integration Hub, Information Exchange, Master Data Management, Information Lifecycle, 3rd Party Solutions, with infrastructure services and role-based tools. Deploy anywhere: Desktop, Server, Cloud, Hadoop, Data Virtualization, Embedded DQ in apps.
  • 25. PowerCenter Big Data Edition: The Safe On-Ramp to Big Data. Big Transaction Data: Online Transaction Processing (OLTP) such as Oracle, DB2, Ingres, Informix, Sybase, SQL Server…; Online Analytical Processing (OLAP) & DW appliances such as Teradata, Redbrick, Essbase, Sybase IQ, Netezza, Exadata, HANA, Greenplum, DATAllegro, Aster Data, Vertica, ParAccel…; Cloud such as Salesforce.com, Concur, Google App Engine, Amazon… Big Interaction Data: Social Media & Web Data such as Facebook, Twitter, LinkedIn, YouTube, web applications, blogs, discussion forums, communities, partner portals…; Other Interaction Data such as clickstream, image/text, scientific, genomic/pharma, medical, medical device, sensors/meters, RFID tags, CDR/mobile… Big Data Processing sits between them.
  • 26. PowerCenter Big Data Edition: The Safe On-Ramp to Big Data. Big Transaction Data: Online Transaction Processing (OLTP) such as Oracle, DB2, Ingres, Informix, Sybase, SQL Server…; Online Analytical Processing (OLAP) & DW appliances such as Teradata, Redbrick, Essbase, Sybase IQ, Netezza, Exadata, HANA, Greenplum, DATAllegro, Aster Data, Vertica, ParAccel…; Cloud such as Salesforce.com, Concur, Google App Engine, Amazon… Big Interaction Data: Social Media & Web Data such as Facebook, Twitter, LinkedIn, YouTube, web applications, blogs, discussion forums, communities, partner portals…; Other Interaction Data such as clickstream, image/text, scientific, genomic/pharma, medical, medical device, sensors/meters, RFID tags, CDR/mobile… Powered by the Vibe™ virtual data machine, PowerCenter Big Data Edition adds: Universal Data Access; High-Speed Data Ingestion and Extraction; ETL on Hadoop; Profiling on Hadoop; Complex Data Parsing on Hadoop; Entity Extraction and Data Classification on Hadoop; No-Code Productivity; Business-IT Collaboration; Unified Administration.
  • 27. PowerCenter Big Data Edition Lower Costs Transactions, OLTP, OLAP Social Media, Web Logs Machine Device, Scientific Documents and Emails EDW Data Mart Data Mart Optimize processing on low cost hardware Increase productivity up to 5X Traditional Grid
  • 28. PowerCenter Big Data Edition: Minimize Risk. Traditional Grid. Deploy on-premise or in the cloud. Quickly staff projects with trained experts. Map Once. Deploy Anywhere™.
  • 29. PowerCenter Big Data Edition Innovate Faster Transactions, OLTP, OLAP Social Media, Web Logs Machine Device, Scientific Documents and Emails Analytics & Op Dashboards Mobile Apps Real-Time Alerts Onboard and analyze any type of data to gain big data insights Discover insights faster through rapid development and collaboration Operationalize big data insights to generate new revenue streams
  • 30. • Currently using Hadoop? • Plan to implement Hadoop in 3-6 months • Plan to implement Hadoop in 6-12 months • No plans for Hadoop 30 What are your plans for Hadoop? (select one) Poll Question #1
  • 31. What Else Does Informatica Offer for Big Data?
  • 32. Lower Data Management Costs. Enterprise Data Warehouse (Transactions, OLTP, OLAP): database size grows over time while performance degrades, with inactive data far outweighing active data. • Identify dormant data • Archive inactive data to low-cost storage
  • 33. Lower Data Management Costs. Enterprise Data Warehouse (Transactions, OLTP, OLAP) with inactive data moved to a low-cost storage archive, keeping active database size flat over time. • Identify dormant data • Archive inactive data to low-cost storage
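The archiving idea on these slides can be sketched in a few lines. This is a conceptual illustration only; the record layout and the two-year dormancy cutoff are assumptions for the example, not product behavior:

```python
# Illustrative sketch: identify dormant records by a last-accessed timestamp
# and separate them for low-cost storage. Fields and cutoff are assumptions.
from datetime import date, timedelta

def split_active_inactive(rows, today, dormant_after=timedelta(days=730)):
    """Return (active, archive) lists based on each row's last_accessed date."""
    active, archive = [], []
    for row in rows:
        if today - row["last_accessed"] > dormant_after:
            archive.append(row)   # candidate for low-cost storage
        else:
            active.append(row)    # stays in the warehouse
    return active, archive

rows = [
    {"id": 1, "last_accessed": date(2013, 5, 1)},
    {"id": 2, "last_accessed": date(2010, 1, 15)},
]
active, archive = split_active_inactive(rows, today=date(2013, 6, 1))
print([r["id"] for r in active], [r["id"] for r in archive])
```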
  • 34. Data Mart Data Mart Data Mart Data Mart Data Mart Data Mart Data Mart Data Mart Data Mart EDW BI Reports / Dashboards ODS MDM • Avoid copies of data and augment the data warehouse using data virtualization • Role-based fine-grained secure access Minimize Risk
  • 35. EDW BI Reports / Dashboards ODS MDM • Avoid copies of data and augment the data warehouse using data virtualization • Role-based fine-grained secure access Minimize Risk Dynamic Data Masking Data Virtualization
  • 36. Production (ERP, CRM, EDW, Custom) BI Reports / Dashboards Development Test • Mask sensitive data in non-production systems Minimize Risk Training
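To make the masking idea concrete, here is a toy sketch of deterministic pseudonymization, the principle behind masking non-production copies: sensitive values are replaced with consistent surrogates so joins and tests still work. This is not Informatica's masking product; the fields and key below are invented:

```python
# Toy illustration of masking sensitive fields before data reaches dev/test.
# Deterministic hashing keeps referential integrity: same input, same token.
import hashlib

SECRET = b"rotate-me"  # placeholder key for the example

def mask(value):
    """Deterministically pseudonymize a value (same input -> same token)."""
    digest = hashlib.sha256(SECRET + value.encode("utf-8")).hexdigest()
    return "MASK-" + digest[:8]

record = {"name": "Jane Doe", "ssn": "123-45-6789", "balance": 1042.50}
masked = {k: (mask(v) if k in ("name", "ssn") else v) for k, v in record.items()}
print(masked["balance"], masked["ssn"])
```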
  • 37. Innovate Faster With Big Data: Apply Data Governance (Discover, Define, Apply, Measure and Monitor). Discover: data discovery, data profiling, data inventories, process inventories, CRUD analysis, capabilities assessment. Define: business glossary creation, data classifications, data relationships, reference data, business rules, data governance policies, other dependent policies. Apply: automated rules, manual rules, end-to-end workflows, business/IT collaboration. Measure and Monitor: proactive monitoring, operational dashboards, reactive operational DQ audits, dashboard monitoring/audits, data lineage analysis, program performance, business value/ROI.
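The "Discover" activities above center on data profiling. A minimal sketch of what a column profile computes (null counts, distinct counts, value ranges) might look like this; real profiling tools add pattern, domain, and dependency analysis, and the columns here are invented:

```python
# Minimal data-profiling sketch: per-column null counts, distinct counts,
# and value ranges over a list of dict records.
def profile(rows, columns):
    stats = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        non_null = [v for v in values if v is not None]
        stats[col] = {
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
            "min": min(non_null) if non_null else None,
            "max": max(non_null) if non_null else None,
        }
    return stats

rows = [{"age": 34, "state": "TX"}, {"age": None, "state": "TX"}, {"age": 51, "state": "NY"}]
print(profile(rows, ["age", "state"]))
```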
  • 38. • Enrich master data to proactively engage customers & improve products and services Innovate Faster With Big Data
  • 39. Innovate Faster With Big Data: • Analyze data in real-time using event-based processing and proactive monitoring (customer business rules combining social data, geo-location data, and transaction data to trigger alerts and merchant offers)
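The event-processing flow on this slide can be sketched as a simple rule match: a geo-location event is joined with customer history, and a merchant offer fires when a business rule matches. The rule shape and field names are hypothetical:

```python
# Conceptual sketch of event-based processing: evaluate business rules
# against an incoming event plus customer context. Fields are invented.
def offer_for_event(event, purchase_history, rules):
    """Return the first merchant offer whose rule matches the event."""
    for rule in rules:
        near = event["store"] in rule["stores"]
        loyal = purchase_history.get(rule["category"], 0) >= rule["min_purchases"]
        if near and loyal:
            return rule["offer"]
    return None

rules = [{"stores": {"CoffeeCo #12"}, "category": "coffee",
          "min_purchases": 3, "offer": "10% off espresso"}]
event = {"customer": "U42", "store": "CoffeeCo #12"}
history = {"coffee": 5}
print(offer_for_event(event, history, rules))  # -> 10% off espresso
```

A production complex-event-processing engine evaluates such rules continuously over streams rather than one event at a time.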
  • 40. • Data archiving • Data masking • Data virtualization • Data quality • Data discovery • MDM • Real-time event based processing 40 What other data management technologies are you considering within the next 12 months? (check all that apply) Poll Question #2
  • 41. What Are Customers Doing with Informatica and Big Data?
  • 42. Large Government Agency. The Challenge: data volumes growing at 3-5 times over the next 2-3 years. The Solution: • Manage data integration and load of 10+ billion records from multiple disparate data sources • Flexible data integration architecture to support changing business requirements in a heterogeneous data management environment (EDW, mainframe, RDBMS, and unstructured data integrated via data virtualization and a traditional grid into business reports). The Result: a flexible architecture to support rapid changes.
  • 43. Large Global Financial Institution: lower costs of Big Data projects. The Challenge: data warehouse exploding with over 200TB of data; user activity generating up to 5 million queries a day, impacting query performance. The Solution: archive inactive ERP, CRM, custom, and interaction data out of the EDW into archived data stores feeding business reports. The Result: • Saved $20M plus $2-3M ongoing through archiving and optimization • Reduced project timeline from 6 months to 2 weeks • Improved performance by 25% • Return on investment in less than 6 months.
  • 44. Web Logs Traditional Grid Near Real-Time The Challenge. Increasing demand for faster data driven decision making and analytics as data volumes and processing loads rapidly increase The Solution The Result • Cost-effectively scale performance • Lower hardware costs • Increased agility by standardizing on one data integration platform • Leverage new data sources for faster innovation Lower costs and minimize risk Datamarts Data Warehouse RDBMS RDBMS Large Global Financial Institution
  • 45. Large Global Automotive Manufacturer: create innovative products and services. The Challenge: collect data in real-time from all cars by the end of the year for the “Connected Car” program. The Solution: connected vehicle program data feeding an EDW and complex event processing into business reports. The Result: helps enable the goals of the connected vehicle program: • Embedding mobile technologies to enhance the customer experience • Predictive maintenance and improved fuel efficiency • On-call roadside assistance and automatic service scheduling.
  • 47. What should you be doing? • Tomorrow – Identify a business goal where data can have a significant impact – Identify the skills you need to build a big data analytics team • 3 months – Identify and prioritize the data you need to achieve your business goals – Put a business plan and reference architecture together to optimize your enterprise information management infrastructure – Execute a quick win big data project with measurable ROI • 1 year – Extend data governance to include more data and more types of data that impacts the business – Consider a shared-services model to promote best practices and further lower infrastructure and labor costs

Editor's Notes

  1. Cost saving / control of the growing data environment; data management cost optimization; business-specific big data analytics; big data integration to support analytics and new data products and services
  4. Challenges and problems customers are facing with Big Data: growing data volumes and expensive data warehouse upgrades; variety of data and onboarding new types of data; lack of big data skills; building the business case for a big data project; not knowing where to begin; regulatory compliance and security (e.g. data privacy, data sharing). Speaker notes: There are several challenges related to big data analytics. As data volumes continue to grow, how can you continue to meet your SLAs for existing projects while controlling costs? It's estimated that big transaction data alone is growing at 50-60% per year. Application databases are growing to the point where not only are hardware and software costs rising, but application performance is adversely affected. Data warehouses are also growing too fast, using up the capacity of current infrastructure investments. And with big interaction data exploding, who can afford to store all this information in their enterprise data warehouse? In fact, one financial institution estimated that it costs $180K to manage 1 TB of data in their data warehouse over a three-year period. As more and more users demand information, organizations also experience a proliferation of datamarts that further increases hardware and database costs; a large healthcare insurance provider had over 30,000 datamarts and spreadmarts across the company. With data volumes growing exponentially, it is becoming difficult to process all the data required for the data warehouse during the nightly batch windows. If you continue to just throw expensive hardware and database licenses at the big data problem, your costs will spiral out of control. More and more organizations would like to leverage the massive amounts of interaction data, such as social media and machine device data, to attract and retain customers, improve business operations, and sharpen their competitive advantage. But because so much of this data is multi-structured and generated at a rate akin to drinking through a fire hose, they find accessing, storing, and processing interaction data can be extremely difficult. Another challenge with big data is that, because there is so much new data being generated and stored, it is difficult for organizations to find, understand, and trust the data.
  7. You don't want expensive data scientists ($300K FTE) doing this work. At JPMC, hand coding took 3 weeks while Informatica took 3 days. In a recent InformationWeek article, "Meet The Elusive Data Scientist", Catalin Ciobanu, a physicist who spent ten years at Fermi National Accelerator Laboratory (Fermilab) and is now senior manager-BI at Carlson Wagonlit Travel, said "70% of my value is an ability to pull the data, 20% of my value is using data-science methods and asking the right questions, and 10% of my value is knowing the tools". DJ Patil, Data Scientist in Residence at Greylock Partners (formerly Chief Data Scientist at LinkedIn), states in his book "Data Jujitsu" that "80% of the work in any data project is in cleaning the data." In a recent study that surveyed 35 data scientists across 25 companies (Kandel et al., Enterprise Data Analysis and Visualization: An Interview Study, IEEE Visual Analytics Science and Technology (VAST), 2012), data scientists expressed their frustration in preparing data for analysis: "I spend more than half my time integrating, cleansing, and transforming data without doing any actual analysis. Most of the time I'm lucky if I get to do any 'analysis' at all." Another data scientist noted that "most of the time once you transform the data … the insights can be scarily obvious." 44% of big data projects are cancelled, versus 25% for IT projects in general, and many more fail to achieve project objectives, according to the Infochimps/SSWUG Enterprise Big Data Survey 2012 and the Synamic Markets Enterprise IT Survey 2008 (http://www.slideshare.net/infochimps/top-strategies-for-successful-big-data-projects). Why do projects fail? Business: inaccurate scope (not enough time, deadlines busted); non-cooperation between departments; lacking the right talent or expertise. Technical: technical or roll-out roadblocks such as gathering data from different sources; finding and understanding tools, platforms, and technologies.
  9. Use the two-slide version of this. Lower costs: lower HW/SW costs; optimized end-to-end performance; rich pre-built connectors and a library of transforms for ETL, data quality, parsing, and profiling. Increased productivity: up to 5x productivity gains with a no-code visual development environment; no need for Hadoop expertise for data integration. Proven path to innovation: 5,000+ customers, 500+ partners, 100,000+ trained Informatica developers; enterprise scalability, security, and support.
  12. The Vibe VDM works by receiving a set of instructions that describe the data source(s) from which it will extract data, the rules and flow by which that data will be transformed, analyzed, masked, archived, matched, or cleansed, and ultimately where that data will be loaded when the processing is finished. Vibe consists of a number of fundamental components (see Figure 2). Transformation Library: a collection of useful, prebuilt transformations that the engine calls to combine, transform, cleanse, match, and mask data. For those familiar with PowerCenter or Informatica Data Quality, this library is represented by the icons that the developer can drag and drop onto the canvas to perform actions on data. Optimizer: compiles data processing logic into an internal representation to ensure effective resource usage and efficient run time based on data characteristics and execution environment configurations. Executor: a run-time execution engine that orchestrates the data logic using the appropriate transformations. The engine reads/writes data from an adapter or directly streams the data from an application. Connectors: Informatica's connectivity extensions provide data access from various data sources. This is what allows Informatica Platform users to connect to almost any data source or application for use by a variety of data movement technologies and modes, including batch, request/response, and publish/subscribe.
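The "map once, deploy anywhere" idea described here can be illustrated with a toy analogue: logic is declared once against a transformation library, then handed to an executor that chooses the run-time environment. This mimics the Vibe pattern conceptually and is in no way the actual Vibe implementation:

```python
# Toy analogue of a virtual data machine: declare the flow once, pick the
# engine at execution time. Step names and engines are invented.
from multiprocessing import Pool

# Transformation library: reusable, engine-agnostic steps.
LIBRARY = {
    "trim":  lambda s: s.strip(),
    "upper": lambda s: s.upper(),
}

class Flow:
    """Logic declared once as a list of transformation names ('map once')."""
    def __init__(self, step_names):
        self.step_names = list(step_names)
    def __call__(self, record):
        for name in self.step_names:
            record = LIBRARY[name](record)
        return record

def execute(flow, data, engine="local"):
    """'Executor': the same declared logic, run on a chosen engine."""
    if engine == "local":                 # single-process run
        return [flow(x) for x in data]
    if engine == "parallel":              # stand-in for a cluster engine
        with Pool(2) as pool:
            return pool.map(flow, data)
    raise ValueError("unknown engine: " + engine)

flow = Flow(["trim", "upper"])
print(execute(flow, ["  informatica ", " vibe "], engine="local"))
```

The design point mirrored here is that the flow definition never mentions the engine, so redeploying from one environment to another requires no recoding.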
  13. The Vibe virtual data machine, although critical, is not sufficient by itself to solve the wide spectrum of data integration challenges. Vibe lets you master complexity and change, and it makes all data accessible. But in many places where data lives, especially some of the emerging data sources, the data is unfiltered, unstandardized, uncleansed, and unrelated; some of it is even unnecessary. It takes a considerable amount of work and expertise to understand how to transform raw data into information that provides insight and value. So in addition to the enabling capabilities that Vibe delivers, you also need to layer on the data services and information solutions from a fully integrated information platform that ensures that data is: Complete: insight comes from a complete picture, not from fragments. You have to integrate the data fragments so you are looking at a whole (a whole person, a whole account, a whole product, a whole business process, a whole organization, a whole nation) rather than pieces or parts. Timely: different consumers and different use cases require data at different times and frequencies. You want one platform that accelerates the delivery of data when, where, and how it is needed, whether via messaging, bulk delivery, or a virtual view. Trusted: if data is incomplete, inaccurate, or unrelated, it's not of much use. You need data quality services that let you diagnose problems and then cleanse the data in a sustainable, efficient way. Authoritative: you also need master data management services to master the data and relationships that constitute the "whole" for your key business entities, even as the data fragments feeding into the "whole" continually change. Actionable: ultimately, data needs to serve a user, whether human or machine. The platform needs to help the user understand when it needs to pay attention to an event, investigate an issue, or act. Secure: with the exponential rise in combinations of people accessing data across different systems, the potential for a security breach also rises exponentially. You must be able to secure data consistently and universally, no matter where it resides or how it is used. But it is not sufficient for an information platform to merely have a long checklist of information services. Only an information platform powered by a VDM provides the interoperability required to easily combine services on the fly to meet your specific business requirements. Only an information platform powered by a VDM can provide the right tools and capabilities for the simplest entry-level uses to the most complex cross-enterprise initiatives, allowing you to share work across that entire span without recoding. And only an information platform powered by a VDM has the flexibility to be deployed stand-alone in the data center, as a cloud service, or embedded into applications, middleware infrastructure, and devices.
  14. Informatica announced the launch of the PowerCenter Big Data Edition at Hadoop World, with general availability in December. The PowerCenter Big Data Edition provides a proven path to innovation that lowers data management costs, with benefits that include:
• Bringing innovative products and services to market faster and improving business operations
• Reducing big data management costs while handling growing data volumes and complexity
• Realizing performance and cost benefits by expanding adoption of Hadoop across projects
• Minimizing risk by investing in proven data integration software that hides the complexity of emerging technologies
PowerCenter Big Data Edition key features include:
Universal Data Access
Your IT team can access all types of big transaction data, including RDBMS, OLTP, OLAP, ERP, CRM, mainframe, cloud, and others. You can also access all types of big interaction data, including social media data, log files, machine sensor data, websites, blogs, documents, emails, and other unstructured or multi-structured data.
High-Speed Data Ingestion and Extraction
You can access, load, replicate, transform, and extract big data between source and target systems, or directly into Hadoop or your data warehouse. High-performance connectivity through native APIs to source and target systems, with parallel processing, ensures high-speed data ingestion and extraction.
No-Code Productivity
The visual Informatica development environment removes hand-coding within Hadoop. Develop and scale data flows with no specialized hand-coding in order to maximize reuse. Users can build once and deploy anywhere.
Unlimited Scalability
Your IT organization can process all types of data at any scale, from terabytes to petabytes, with no specialized coding on distributed computing platforms such as Hadoop.
Optimized Performance for Lowest Cost
Based on data volumes, data types, latency requirements, and available hardware, PowerCenter Big Data Edition deploys big data processing on the highest-performance, most cost-effective data processing platforms. You get the most out of your current investments and capacity whether you deploy data processing on SMP machines, traditional grid clusters, distributed computing platforms like Hadoop, or data warehouse appliances.
ETL on Hadoop
This edition provides an extensive library of prebuilt transformation capabilities on Hadoop, including data type conversions and string manipulations, high-performance cache-enabled lookups, joiners, sorters, routers, aggregations, and many more. Your IT team can rapidly develop data flows on Hadoop using a codeless graphical development environment that increases productivity and promotes reuse.
Profiling on Hadoop
Data on Hadoop can be profiled through the Informatica developer tool and a browser-based analyst tool. This makes it easy for developers, analysts, and data scientists to understand the data, identify data quality issues earlier, collaborate on data flow specifications, and validate mapping transformations and rules logic.
Design Once and Deploy Anywhere
ETL developers can focus on the data and transformation logic without having to worry about where the ETL process is deployed, on Hadoop or on traditional data processing platforms. Developers can design once, without any specialized knowledge of Hadoop concepts and languages, and easily deploy data flows on Hadoop or traditional systems.
Complex Data Parsing on Hadoop
This edition makes it easy to access and parse complex, multi-structured, unstructured, and industry-standard data such as web logs, JSON, XML, and machine device data. Prebuilt parsers for market data and industry standards such as FIX, SWIFT, ACORD, HL7, HIPAA, and EDI are also available and licensed separately.
Entity Extraction and Data Classification on Hadoop
Using a list of keywords or phrases, entities related to your customers and products can easily be extracted and classified from unstructured data such as emails, social media data, and documents. You can enrich master data with insights into customer behavior or product information such as competitive pricing.
Mixed Workflows
Your IT team can easily coordinate, schedule, monitor, and manage all interrelated processes and workflows across your traditional and Hadoop environments to simplify operations and meet your SLAs. You can also drill down into individual Hadoop jobs.
High Availability
This edition provides 24x7 high availability with seamless failover, flexible recovery, and connection resilience. When it comes time to develop new products and services using big data insights, you can rest assured that they will scale and be available 24x7 for mission-critical operations.
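The keyword-driven entity extraction and classification described above can be sketched in a few lines of plain Python. This is a toy stand-in for the Hadoop-scale feature, not Informatica's implementation; the keyword lists and category names are invented for illustration:

```python
# Toy keyword-based entity extraction/classification.
# Keyword lists and categories below are illustrative only.
KEYWORDS = {
    "product": ["laptop", "tablet", "phone"],
    "sentiment": ["love", "hate", "broken"],
}

def classify(text):
    """Return {category: [matched keywords]} for one document."""
    tokens = text.lower().split()
    hits = {}
    for category, words in KEYWORDS.items():
        matched = [w for w in words if w in tokens]
        if matched:
            hits[category] = matched
    return hits
```

At scale, the same per-document function would run inside a map task over millions of emails or social posts, which is why a parallel platform like Hadoop matters here.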
  16. ETL, parsing, data quality, profiling, NLP: with Talend and Pentaho you have to hand-code these, and now MapReduce as well. PowerCenter Big Data Edition reduces big data costs. Your IT team can manage twice the data volume with your existing analytics environment, offloading data from your warehouse and source systems and offloading processing to low-cost commodity hardware.
High-Speed Data Ingestion and Extraction
Load, process, and extract big data across heterogeneous environments to optimize the end-to-end flow of data between Hadoop and traditional data management infrastructure.
Near-Universal Data Access and Comprehensive ETL on Hadoop
Reliably access a variety of types and sources of data using a rich library of prebuilt ETL transforms, for both transaction and interaction data, that run on Hadoop or traditional grid infrastructure.
  17. By moving from hand-coding to proven data integration productivity tools, you triple your productivity; you no longer need an army of developers. This edition provides unified administration for all data integration projects. You can build once and deploy anywhere, which keeps costs down by optimizing data processing utilization across both existing data platforms and emerging technologies like Hadoop.
No-Code Development Environment
Removes hand-coding within Hadoop through a visual development environment. Develop and scale data flows with no specialized hand-coding in order to maximize reuse.
Virtual Data Machine
Build transformation logic once, and deploy at any scale on Hadoop or traditional ETL grid infrastructure.
At a recent TDWI Big Data Summit last summer, eHarmony presented its Informatica Hadoop implementation. A question from the audience asked, "How many new resources did you need to hire to implement this on Hadoop?" The Director of IT at eHarmony answered, "None."
  18. Informatica® PowerCenter® Big Data Edition is the safe on-ramp to big data that works with both emerging technologies and traditional data management infrastructure. With this edition, your IT organization can rapidly create innovative products and services by integrating and analyzing new types and sources of data. It provides a proven path of innovation while lowering big data management costs and minimizing risk.
With big data, you don't always know what you are looking for. Instead of being given requirements from the business for a report, you are tasked with a business goal, such as increasing customer acquisition and retention or improving fraud detection. With this goal in mind, and a wealth of big transaction data, big interaction data, and big data processing technologies, how can you achieve the goal cost-effectively?
Let's consider an online retailer with several big data projects at various stages of implementation, aiming to increase customer acquisition and retention, increase profitability, and improve fraud detection. Since we don't necessarily have a well-defined set of requirements, we need to create a sandbox environment where data science teams can play and experiment with big data. A team of data scientists, analysts, developers, architects, and LOB stakeholders collaborates within the sandbox to discover insights that will achieve the goals of each project.
This requires us to access and ingest, in this case, customer transaction information from the ERP and CRM systems, web logs from our online store, social data from Twitter, Facebook, and blogs, and geo-location data from mobile devices. The data science team goes through an iterative process of accessing, preparing, profiling, and analyzing data sets to discover patterns and insights that could help achieve the business goals of the project.
However, what many people fail to acknowledge is that 70-80% of this work is accessing and preparing the data sets for analysis. This includes parsing, transforming, and integrating a variety of disparate data sets coming from different platforms, in different formats, and at different latencies. DJ Patil, Data Scientist in Residence at Greylock (a VC firm), stated in his book: "Good data scientists understand, in a deep way, that the heavy lifting of cleanup and preparation isn't something that gets in the way of solving the problem: it is the problem."
The data science team may discover a few insights, which they need to test, validate, and measure for business impact. They might apply techniques such as A/B testing to determine which algorithms and data flows produce the best results for a stated goal, such as increasing customer share of wallet with next-best-offer recommendations, increasing profitability through pricing optimization, or identifying trends and reducing false positives in fraud detection. Once organizations overcome the hurdles of accessing, preparing, and integrating data sets to discover these insights, they then face the challenge of operationalizing the insights.
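The A/B testing step mentioned above, deciding which algorithm or data flow performs better against a stated goal, usually comes down to comparing conversion rates. A minimal sketch using a two-proportion z-test; the conversion counts are invented for illustration:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Z statistic for the difference in conversion rates between two variants."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Illustrative numbers: recommender B converts 120/1000 visitors vs. A's 100/1000.
z = two_proportion_z(100, 1000, 120, 1000)
```

A z value above roughly 1.96 would indicate significance at the 5% level; the example above (z ≈ 1.43) would not yet justify switching recommenders, which is why teams iterate with larger samples.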
This is where organizations seem to struggle in turning insights into real business value. To turn insights into business value, the insight needs to be delivered reliably to the point of use, whether that is a report, an enterprise or web app, or part of an automated workflow. For example:
• The fraud department needs to be notified in real time if fraud is suspected or if a particular region is seeing an upward trend of fraud
• Customers shopping on an eCommerce website need to see next best offers in real time as they click through the site
• The customer service rep needs to know immediately whether a customer is likely to churn when that customer calls or files an online complaint
• Pricing optimization needs to be delivered directly to the sales rep via a CRM mobile app, based on customer location, purchase history, demographics, and so on
Too many organizations end up rebuilding and hand-coding the data flows created during design and analysis when it comes time to deploy to production. Informatica is a metadata-driven development environment that provides near-universal data access and hundreds of prebuilt parsers, transformations, and data quality rules; the data flows created during design and analysis can easily be extended and deployed for production use. Another benefit is that data sets and data flows can easily be shared across projects. This helps an organization be agile and rapidly innovate with big data.
The data sets used in design and analysis may not have been optimized for a production data flow. PowerCenter Big Data Edition provides highly scalable, high-throughput data integration to handle all volumes of data processing with high performance. It separates design from deployment and, as we have seen, enables organizations to have a consistent and repeatable process; reuse data flows to maximize productivity; scale performance using high-performance distributed computing platforms; ensure 24x7 operations with high availability; retain the flexibility to change as data volumes grow and new data sources are added; deliver and analyze data at various latencies; and stay easier to support and maintain than hand-coding, all while controlling costs and optimizing processing cost-effectively.
  19. Subset production data and mask sensitive data in non-production systems
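Subsetting and masking, as on the slide above, can be illustrated with a small Python sketch. This is a conceptual toy, not Informatica's test data management product; the field names, masking scheme, and sampling rule are all invented:

```python
import hashlib

# Fields treated as sensitive in this toy example.
SENSITIVE = {"email", "ssn"}

def mask_row(row):
    """Replace sensitive field values with deterministic, irreversible tokens."""
    masked = {}
    for field, value in row.items():
        if field in SENSITIVE:
            digest = hashlib.sha256(value.encode()).hexdigest()[:10]
            masked[field] = "MASKED_" + digest
        else:
            masked[field] = value
    return masked

def subset(rows, fraction=0.1):
    """Keep a deterministic subset of rows for a non-production environment."""
    step = round(1 / fraction)
    return [row for i, row in enumerate(rows) if i % step == 0]
```

Deterministic hashing keeps masked values consistent across tables, so joins in the test environment still work even though the real values are hidden.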
  20. But once you discover these treasures of data, can you trust their origins and the transformations that have been applied? On this island we have found a treasure of valuable data mixed with the hazardous waste of bad data. How can we extract the gold and get rid of the waste? Do you know where the treasure of data came from? Is it authentic? In order to trust the data, you need to know where it came from and what was done to it. It is also unfortunate that data management teams end up recreating data sets over and over that have already been normalized, cleansed, and curated, using up a lot of storage and resources.
<CLICK> I'd like to recommend that you commit to data governance to improve your business processes, decisions, and interactions. We talk about managing data as an asset, but what does that really mean? You need a process to effectively govern your data so you can deliver trusted and reliable data. First you need to determine the cost of bad data; for example, the cost of having bad customer addresses or duplicate parts could run to millions.
<CLICK> The process starts iteratively with the discovery and definition of data, so that you know what you have in terms of data definitions, domains, relationships, business rules, and so on. For example, one company had people showing up to meetings with different numbers related to claims payments. The problem wasn't in the data; the problem was that the people came from three different departments with three different definitions of the data. Therefore the business and IT require a process and tools to efficiently collaborate and automate steps that continuously improve the quality of data over time. Managing data governance effectively and supporting continuous improvement requires KPI dashboards, proactive monitoring, and a clear ROI.
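The "cost of bad data" examples above (bad customer addresses, duplicate parts) suggest simple, measurable KPIs that a governance dashboard can track. A minimal sketch of such data-quality metrics in Python; the field names and record shapes are illustrative, not any vendor's API:

```python
def quality_metrics(rows, key_field, required_fields):
    """Basic data-quality KPIs: duplicate-key rate and incompleteness rate."""
    seen, dupes, incomplete = set(), 0, 0
    for row in rows:
        key = row.get(key_field)
        if key in seen:
            dupes += 1          # same business key appears again
        seen.add(key)
        if any(not row.get(f) for f in required_fields):
            incomplete += 1     # a required field is missing or empty
    n = len(rows)
    return {"duplicate_rate": dupes / n, "incomplete_rate": incomplete / n}
```

Tracking these rates over time is what turns "commit to data governance" from a slogan into a KPI that can show continuous improvement.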
  21. Enrich master data with customer behavior insights, relationships, and key influencers so you can proactively engage with customers and increase upsell/cross-sell opportunities
  22. Identify unused and unnecessary data to drive data retention policies
Assess data usage and performance metrics to focus optimization
Archived 13 TB of data in the first 2 months and continue to retire data monthly
Phase 2: Offload data and processing to Hadoop
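The retention-policy idea above, flagging data that nobody has touched within the retention window, can be sketched in a few lines of Python. The metadata shape (table name plus last-access date) is invented for illustration:

```python
from datetime import date, timedelta

def retention_candidates(tables, today, max_idle_days=365):
    """Return names of tables not accessed within the retention window."""
    cutoff = today - timedelta(days=max_idle_days)
    return [t["name"] for t in tables if t["last_access"] < cutoff]

# Illustrative catalog metadata; real usage stats come from the database/warehouse.
tables = [
    {"name": "orders_2010", "last_access": date(2012, 1, 5)},
    {"name": "orders_cur", "last_access": date(2013, 6, 1)},
]
stale = retention_candidates(tables, today=date(2013, 7, 1))
```

In practice the last-access data comes from warehouse query logs or catalog statistics; the scan itself is this simple once that metadata is collected.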
  23. Why Informatica?
Ease of use for developers and administrators. Easy to scale performance, at a comparatively lower cost, using PowerCenter grid with commodity hardware. Could standardize on one data integration platform for all data movement use cases and requirements. A comprehensive data integration platform for big data processing, batch and real-time data movement, metadata management, profiling, test data management, and protecting sensitive data.
The Challenge: The company is growing fast, with data volumes and processing loads increasing, and ever-growing demand for data-driven decision making, analytics, and reporting. It was unable to scale legacy systems due to cost as well as time factors (not being plug and play), and needed to standardize on a single platform and vendor to meet its various data-related needs: ETL, metadata, masking, subsetting, and real time.
The Result: Ability to scale easily by adding incremental nodes in comparatively short time periods. Reduction in hardware cost due to the commodity hardware stack.
Phase 1: PowerCenter Grid and HA implementation
• Several site-facing OLTP Oracle DBs
• Several Oracle data marts and a petabyte-scale Teradata EDW
• Transactional data, behavioral data, web logs
• Process a few terabytes of incremental data every day through PowerCenter Grid
• Implemented a single-domain, dual data center PowerCenter Grid (primary vs. DR); currently active/passive, it will eventually become active/active and expand with further node additions
• Commodity Linux machines with 64 GB of memory, with a shared NFS file system mounted across all nodes within a data center
• Multiple Integration Services assigned to the grid, with the repository DB running on a dedicated DB
Grid requirements
• Highly available data movement/data integration environment
• Ability to scale horizontally without having to extensively re-architect application design
• Ability to load balance automatically
• Ability to recover automatically in case of system errors
Phase 2: Grow the PowerCenter grid to increase processing capacity to meet growing data volumes and reduced processing times.
Current benefits: Ability to scale easily by adding incremental nodes in comparatively short time periods. Reduction in hardware cost due to the commodity hardware stack.
Future benefits: Expect to reduce the time to perform impact/lineage analysis when the metadata solution is implemented. Expect to reuse profiling information when the profiling solution is implemented. Expect to perform more comprehensive testing much faster when masking/subsetting is implemented. Expect to reduce batch loads from 30 minutes to a few seconds for fraud detection when Ultra Messaging is implemented. Participate in the PowerCenter on Hadoop beta testing program.
Today the company uses Perl scripts to process web logs and move the results into Teradata. It is currently looking at utilizing Hadoop for various text and log data mining and analysis capabilities, for things such as risk monitoring, behavior tracking, and various marketing-related activities. We believe that using Hadoop for low-cost big data analysis and processing, alongside the capabilities of the Informatica grid to deliver mission-critical data to our data marts, would be complementary, while allowing us to maintain metadata and other operational capabilities within a single integrated platform.
  24. Continuously collect all data from all cars
By the end of the year, all cars will transmit data to a central Teradata data warehouse
Real-time data integration using PowerCenter, CDC, and CEP
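PowerCenter CDC captures changes from database transaction logs rather than by comparing snapshots; purely as a conceptual illustration of what a stream of change events looks like, here is a naive snapshot diff in Python (the data shapes are invented, and log-based CDC is far more efficient than this):

```python
def diff_snapshots(old, new):
    """Toy change-data-capture: diff two keyed snapshots into change events."""
    events = []
    for key, row in new.items():
        if key not in old:
            events.append(("insert", key, row))
        elif old[key] != row:
            events.append(("update", key, row))
    for key, row in old.items():
        if key not in new:
            events.append(("delete", key, row))
    return events
```

Whatever the capture mechanism, the downstream contract is the same: an ordered stream of insert/update/delete events that real-time targets such as a CEP engine or the Teradata warehouse can consume.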