© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Craig Stires
Head of Analytics, Big Data, AI
Asia-Pacific
Modern Data Architectures for
Business Insights at Scale
Today's workshop
2:00pm - 2:15pm Overview on using modern data architectures on AWS
2:15pm - 3:40pm Modern data architectures for business insights at scale
(Includes Live Demos)
3:40pm – 4:00pm Break
4:00pm - 5:15pm Modern data architectures for real-time analytics and
engagement
(Includes Live Demos)
Overview on using modern data architectures on AWS
What is driving the requests for information?
- What information is needed?
- Where does the source data live?
- Freshness - how real-time?
What kind of persona are you serving?
- Measurable business outcome?
- Speed to access / urgency
- UI - interactive vs file vs embedded
- On-demand vs published
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
[Chart: data volume, 1990 to 2020, showing the gap between generated data and data available for analysis]
Should we collect "all the data" and see what's in it?
Starting by amassing "all your data" and dumping
into a large repository for the data gurus to start
finding "insights" is like trying to win the lottery
Three big indicators of individual behavior
Purchases Movement Influence
A platform to build business outcomes from data
[Diagram: inputs (Purchases, Movement, Influence) feed a platform with four stages (Ingest/Collect, Store, Process/analyze, Consume/visualize) that supports business outcomes: Revenue lift, Market acquisition, Customer delight, Brand advocacy, Inventory optimization, Supply chain efficiency, ...]
The AWS Cloud helps remove constraints
How have some IT teams stopped being the "VILLAIN"?
Starting small is powerful, when you can scale
up fast
Scaling up your analytics systems | With AWS | Traditional IT *
get a new BI server | 20 minutes | 3 months
upgrade your analytics server to the newest Intel processors and add 16GB memory | 15 minutes | 2 months
add 500TB of storage | instant | 2 months
grow a DWH cluster from 8GB to 1PB | 1 hour | 8 months
build a 1024-node Hadoop cluster | 30 minutes | unlikely
roll out multi-region production environment | hours | months
* actual provisioning times in a well-organized IT division
Big Data:
• Potentially massive datasets
• Iterative, experimental style of data manipulation and analysis
• Frequently not a steady-state workload; peaks and valleys
• Data is a combination of structured and unstructured data in many formats
AWS Cloud:
• Virtually unlimited capacity
• Iterative, experimental usage made cost-effective through on-demand infrastructure
• Fully scalable infrastructure for highly variable workloads
• Tools & services for managing structured, unstructured, and stream data
Let’s talk business outcomes of data analytics!
Outcome 1 : Modernize and consolidate
• Insights to enhance business applications and
create new digital services
Outcome 2 : Innovate for new revenues
• Personalization, demand forecasting, risk analysis
Outcome 3 : Real-time engagement
• Interactive customer experience, event-driven
automation, fraud detection
Outcome 4 : Automate for expansive reach
• Automation of business processes and physical
infrastructure
Driving Business Outcomes via Data Analytics
Modern data architectures for business insights at scale
Insights to enhance business applications, new digital services
Technology: Backend system integration, on-prem data center extension, business application
integration, BI provisioning, data lakes, external APIs, access control and logging
Common initiatives
Insights: 360 view of the business
• Legacy data systems migration to enable self-service for business analysts
• Integration of all customer data, from orders, payments, interactions
• Supplier performance for inventory and vendor management
Digitization: Web-service that gives on-demand insights
• Delivery of digital content, with behavior tracking, and upsell (or ads)
• Ordering system for enterprise customers or consumers
Data monetization: Enrich, aggregate, and sell business data
• External data enrichment API, including digital marketing platforms
• Purchasable data sets of anonymized, domain-enriched insights
Outcome 1 : Modernize and Consolidate
Modernize and consolidate
Insights to enhance business applications, new digital services
[Architecture diagram: Data sources, Ingest, Speed (Real-time) layer, Scale (Batch) layer, Serving]
Enhancing business applications and creating new digital services takes a few
steps. Business goals often include being an agile, well-run organization
and no longer missing opportunities because people make decisions
without accurate insights. These initiatives focus on giving important
personas fast and secure access to business-relevant insights.
1. Define personas and use case requirements (including UI)
[Architecture diagram, new in this build: Business users, External buyers, Data analysts]
2. Locate the data sources that have the information to extract
[Architecture diagram, new in this build: Transactions, Web logs / cookies, ERP]
Fluentd: Open Source Log Collection
https://github.com/fluent/fluentd/
• Fluentd is an open source
data collector to unify data
collection and consumption
• Integration into many data
sources (App Logs, Syslogs,
Twitter etc.)
• Direct integration into AWS
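<source>
  # Tail the Apache access log and parse each line with the apache2 format
  @type tail
  format apache2
  path /var/log/apache2/access_log
  tag s3.apache.access
</source>
<match s3.*.*>
  # Ship every event whose tag matches s3.*.* to the S3 bucket below
  @type s3
  s3_bucket myweblogs
  path logs/
</match>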
3. Ingest data through incremental or full loads, across secure connections
[Architecture diagram, new in this build: AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data]
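The difference between a full load and an incremental load can be sketched in a few lines of Python. This is only an illustration: it assumes each source row carries an `updated_at` timestamp that can act as a watermark, and the row layout and function names are hypothetical rather than part of any AWS service API.

```python
from datetime import datetime

def full_load(source_rows):
    """Copy every row from the source, regardless of age."""
    return list(source_rows)

def incremental_load(source_rows, last_watermark):
    """Copy only rows changed after the previous load's watermark."""
    changed = [r for r in source_rows if r["updated_at"] > last_watermark]
    # The new watermark is the latest change seen; keep the old one if nothing changed.
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark

rows = [
    {"id": 1, "updated_at": datetime(2017, 1, 1)},
    {"id": 2, "updated_at": datetime(2017, 1, 5)},
    {"id": 3, "updated_at": datetime(2017, 1, 9)},
]
changed, watermark = incremental_load(rows, datetime(2017, 1, 3))
print([r["id"] for r in changed], watermark.date())
```

Persisting the returned watermark between runs is what lets the next load pick up only the changed data.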
A single, large system may perform a single task
well, but is often too difficult to adapt and scale
A system that is decoupled can adapt to a fast
moving business, and can scale up and down with
significantly lower barriers
Decouple Storage and Compute
Traditionally analytical workloads
required large databases or data
warehouses, with storage and
compute close to each other
Big Data often benefits from
decoupling storage and compute
Amazon S3 offers virtually unlimited
storage at a per GB/month rate
Amazon
S3
Highly available object storage
99.999999999% data durability
Replicated across 3 facilities
Virtually unlimited scale
Pay only for usage, no pre-provisioning
Event notifications to trigger actions
Amazon
EMR
Fully managed Hadoop
Optimized with S3
Autoscaling for elasticity
Transient and long running clusters
Integration with AWS Spot Market
1 instance x 100 hours = 100 instances x 1 hour
(and with Spot Pricing not only faster but also cheaper)
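The arithmetic behind this claim can be checked with a short Python sketch. It assumes a perfectly parallel job, and the hourly prices are made-up figures in cents; real on-demand and Spot prices vary by instance type, region, and time.

```python
def cluster_cost_cents(instances, hours, cents_per_instance_hour):
    """Total cost = instance-hours consumed x hourly price."""
    return instances * hours * cents_per_instance_hour

ON_DEMAND_CENTS = 50  # hypothetical on-demand price per instance-hour
SPOT_CENTS = 15       # hypothetical Spot price per instance-hour

small_slow = cluster_cost_cents(1, 100, ON_DEMAND_CENTS)    # 1 instance for 100 hours
big_fast = cluster_cost_cents(100, 1, ON_DEMAND_CENTS)      # 100 instances for 1 hour
big_fast_spot = cluster_cost_cents(100, 1, SPOT_CENTS)      # same cluster on Spot
print(small_slow, big_fast, big_fast_spot)
```

Under these assumptions the two on-demand runs cost the same, the larger cluster finishes 100 times sooner, and the Spot run is cheaper again.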
Amazon EMR
• Amazon EMR supports all common
Hadoop Frameworks such as:
• Spark, Pig, Hive, Hue, Oozie …
• HBase, Presto, Impala …
• Decouples storage from compute
• Allows independent scaling
• Direct Integration with DynamoDB
and S3
[Diagram: Amazon S3, Amazon DynamoDB, Amazon EMR]
AWS
Glue
Managed Transform Engine
Job Scheduler
Data Catalog
Built on Apache Spark
Integrated with S3, RDS, Redshift & any
JDBC-compliant data store
4. Use Hadoop for large scale ETL, data quality, and preparation [*EMRFS]
[Architecture diagram, new in this build: AWS Glue, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Clean Data)]
5. Stage all data into centralized, highly available, durable storage for further access
[Architecture diagram, new in this build: Amazon S3 (Staged Data / Data Lake)]
Fully managed
MPP SQL database - fully relational
Optimised for analytics
Gigabytes to Petabytes
Less than 1/10th the cost of traditional
Amazon
Redshift
6. Load semi-structured into Hadoop, structured into the DWH, and application data into managed legacy application databases
[Architecture diagram, new in this build: Amazon EMR (Semi-structured), Amazon Redshift (Data Warehouse), Amazon RDS (Legacy Apps)]
7. Data is protected through identity and access management and logging
[Architecture diagram, new in this build: AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]
Fast, cloud-powered, BI service
Visualizations and ad-hoc analysis
Connectors for AWS and 3rd party sources
In-memory calculation engine (SPICE)
$9 per user per month
Amazon
QuickSight
AWS Marketplace
• Pre-Configured machine images
ready to be launched into virtual
server instances
• Launch applications with 1-Click
• Pay software licenses by the
hour or bring your own license
(BYOL)
8. Data analysts use BI tools of choice to access all serving services
[Architecture diagram, new in this build: Amazon QuickSight]
9. Business users have enterprise applications enhanced by analytics
[Architecture diagram: same components as the previous build]
10. External parties can buy services or data in a governed, secure way
[Architecture diagram, new in this build: Amazon API Gateway]
[Architecture diagram, all components: Transactions, Web logs / cookies, ERP; AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data; AWS Glue, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Clean Data), Amazon S3 (Staged Data / Data Lake); Amazon EMR (Semi-structured), Amazon Redshift (Data Warehouse), Amazon RDS (Legacy Apps), Amazon QuickSight, Amazon API Gateway; Business users, External buyers, Data analysts; AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]
Personalization, demand forecasting, risk analysis
Technology: Advanced analytics, customer segmentations, high volume transactional data, un/semi-
structured data, design of experiment, A/B & hypothesis testing, machine learning
Common initiatives
Personalization: Refine market approaches based on optimal segments
• Offer products to new customers based on clusters of similar individuals
• Launch share of wallet initiatives, understanding likely total spend
• Targeted marketing to capture interests and increase conversion rates
Predict demand: Guide business owners to select the best scenarios
• Launch items or promotions at the optimal time to maximize response
• Modeling for store assortment, product selection, and merchandizing
• New product design, based on known market propensities
Risk measurement: Create freedom to act by quantifying exposures
• Scenario simulation to encourage investments and new offerings
• Supply chain analytics allows for faster confirmation of goods to customers
Outcome 2 : Innovate for new revenues
Innovate for new revenues
Personalization, demand forecasting, risk analysis
[Architecture diagram: Data sources, Ingest, Speed (Real-time) layer, Scale (Batch) layer, Serving]
Net new revenue is driven by business teams that have access to
skilled analysts, using platforms that can scale up and out without IT
bottlenecks. Organizations start operating based on what they know about
their customers, and can approach new ventures in terms of confidence
levels. Product launches, campaigns, supply chain management, packaged
services, and customized offerings are designed and executed based on
predictive models.
1. Personas involved in generating new revenues are data scientists, data analysts (often embedded), business users, and customers/suppliers
[Architecture diagram, shown in this build: Data analysts, Data scientists, Business users, Engagement platforms; AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]
2. Advanced analytics are built from a base of traditional data processing
[Architecture diagram, new in this build: Transactions, ERP, AWS Direct Connect, AWS Glue, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Clean Data), Amazon S3 (Staged Data / Data Lake), Amazon EMR, Amazon Redshift, Amazon RDS]
3. On-premises storage and databases are connected and converted
[Architecture diagram, new in this build: AWS Database Migration Service, AWS Storage Gateway]
4. Internet-native data sources, like web and mobile, are captured
[Architecture diagram, new in this build: Internet Interfaces, Web logs / cookies]
Stream in Real Time: Amazon Kinesis
• Real-Time Data Processing over
large distributed streams
• Elastic capacity that scales to
millions of events per second
• React in real time to incoming
stream events
• Reliable stream storage
replicated across 3 facilities
Amazon Kinesis
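As background on how a stream spreads load, Kinesis assigns each record to a shard by taking the MD5 hash of its partition key and finding the shard whose hash key range contains that 128-bit value. The Python sketch below illustrates the idea under one simplifying assumption: that the hash space is split into equal ranges, one per shard. The function name and the example keys are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value

def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index, assuming equal hash-key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE

events = ["device-17", "device-42", "device-17"]
print([shard_for(key, 4) for key in events])
```

Records with the same partition key always land on the same shard, which preserves per-key ordering while different keys spread across shards.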
5. Streaming un/semi-structured data feeds, like social and devices, are captured
[Architecture diagram, new in this build: Amazon Kinesis, Connected devices, Social media]
6. Log files and other schemaless data converted to Parquet and staged
[Architecture diagram, new in this build: Amazon S3 (Schemaless)]
Interactive query service to analyze data
in Amazon S3 directly using standard SQL
No need to move data
No infrastructure to setup & manage
Fast -- results within seconds
Pay for only the queries you run
Amazon
Athena
7. Data analysts explore and visualize un/semi-structured data
[Architecture diagram, new in this build: Amazon Elasticsearch Service, Amazon Athena]
Amazon Machine Learning
• Easy to use, managed machine
learning service built for developers
• Machine learning technology based
on Amazon’s internal systems
• Create models using data stored in
Amazon S3, Amazon RDS or Amazon
Redshift
• Request predictions in batch or in real time
Amazon Machine
Learning
8. Simple analytical models are built against Amazon Machine Learning
[Architecture diagram, new in this build: Amazon Machine Learning]
Apache Spark
• In-memory analytics cluster using RDD
(Resilient Distributed Dataset) for fast
processing
• Spark MLlib offers machine learning out of the box
• Apache Spark can read directly from Amazon S3
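from numpy import array
from pyspark.mllib.clustering import KMeans, KMeansModel
# sc is the SparkContext, as provided by the pyspark shell
data = sc.textFile("s3://...")
parsedData = data.map(lambda line: array([float(x) for x in line.split(' ')]))
# Train k-means with k=2 clusters, then save and reload the model
model = KMeans.train(parsedData, 2, maxIterations=10, initializationMode="random")
model.save(sc, "MyModel")
sameModel = KMeansModel.load(sc, "MyModel")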
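To make the clustering step less of a black box, here is a minimal single-machine sketch of k-means (Lloyd's algorithm) in plain Python. It is not Spark's implementation: MLlib runs distributed and offers other initialization modes, whereas this sketch simply takes the first k points as starting centers. The function name and sample points are illustrative.

```python
def kmeans(points, k, max_iterations=10):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    centers = [tuple(p) for p in points[:k]]  # simple deterministic initialization
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers

points = [(0.0, 0.0), (9.0, 9.0), (0.0, 1.0), (1.0, 0.0), (9.0, 8.0), (8.0, 9.0)]
print(sorted(kmeans(points, 2)))
```

With these sample points the two centers settle near the two visible groups, one close to the origin and one close to (9, 9).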
Intel® Processor Technologies
Intel® AVX – Dramatically increases performance for highly parallel HPC workloads
such as life science engineering, data mining, financial analysis, media processing
Intel® AES-NI – Enhances security with new encryption instructions that reduce the
performance penalty associated with encrypting/decrypting data
Intel® Turbo Boost Technology – Increases computing power with performance that
adapts to spikes in workloads
Intel® Transactional Synchronization Extensions (TSX) – Enables execution of
transactions that are independent to accelerate throughput
P state & C state control – provides granular performance tuning for cores and sleep
states to improve overall application performance
New X1 Instance - Tons of Memory
• Designed for large-scale, in-memory
applications in the cloud
• Ideal for in-memory databases like SAP
HANA and big data processing apps like
Spark and Presto
• Powered by Intel® Xeon® E7 8880 v3
Haswell processors
• Features up to 2TB of memory and up to
128 vCPUs per instance
• 8X the memory offered by any other Amazon EC2
instance
Machine Learning Algorithms
• Classification
• Sentiment analysis – Do people like my new product?
• Linear Regression
• Trend prediction – How much revenue next month?
• Clustering
• Recommendation - Other people bought this!
• Association
• Market basket analysis – Bundled products
• Neural Networks
• Pattern recognition - Speech recognition
Amazon Machine
Learning
Amazon EMR +
Spark MLlib
GPU Optimized
EC2 Instance
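As a small illustration of the trend-prediction case ("How much revenue next month?"), the following Python sketch fits a straight line by ordinary least squares and extrapolates one month ahead. The revenue figures are made up and perfectly linear, and the function name is illustrative; a real forecast would need seasonality, more data, and validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

months = [1, 2, 3, 4, 5]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]  # made-up figures
slope, intercept = fit_line(months, revenue)
forecast = slope * 6 + intercept  # extrapolate to month 6
print(forecast)
```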
9. Complex analytical models are built against EMR (Spark) clusters
[Architecture diagram, new in this build: Amazon EMR (MLlib)]
10. Deep learning models are built against MXNet clusters
[Architecture diagram, new in this build: Deep Learning]
11. Predictive models and scored datasets are published to data staging
[Architecture diagram: same components as the previous build]
12. Analysts use DWH, EMR, ES to find patterns & measure performance
[Architecture diagram: same components as the previous build]
13. Risk models evaluated to create new products and assess customers
[Architecture diagram: same components as the previous build]
14. Demand forecasts loaded into supply chain management systems
[Architecture diagram: same components as the previous build]
Amazon SNS & Amazon Pinpoint
• Amazon SNS is a fully managed, cross-platform mobile push intermediary service
• Fully scalable to millions of devices
• Amazon Pinpoint lets you create targeted campaigns and measure engagement and results
[Diagram: Amazon SNS delivers push notifications through Apple APNS (Apple iPhones and iPads), Google GCM (Android phones and tablets), Amazon ADM (Kindle Fire devices), Windows WNS and MPNS (Windows Phone devices), and Baidu CP (Android phones and tablets in China)]
15. Personalized offers are broadcast out over notification channels
[Architecture diagram, new in this build: Amazon SNS, Amazon Pinpoint]
[Architecture diagram, all components: Transactions, Connected devices, Web logs / cookies, Social media, ERP; AWS Database Migration Service, AWS Direct Connect, Internet Interfaces, Amazon Kinesis, AWS Storage Gateway; AWS Glue, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Clean Data), Amazon S3 (Schemaless), Amazon S3 (Staged Data / Data Lake); Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Amazon Machine Learning, Amazon EMR (MLlib), Deep Learning, Amazon SNS, Amazon Pinpoint; Data analysts, Data scientists, Business users, Engagement platforms; AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]
See it live in action!
Athena & QuickSight Demo
[Diagram: Amazon S3, Amazon Athena, Amazon QuickSight]
Analyze past flight performance data stored in S3
Bureau of Transportation Flight Data Statistics
www.transtats.bts.gov
Create visualizations from S3 with Athena & QuickSight
BREAK
Next up: Real-Time Analytics and Engagement
Modern data architectures for real-time analytics and engagement
Interactive customer experience, event-driven automation, fraud detection
Technology: Clickstream/mobile apps/sensor/video (computer vision)/audio (intent comprehension), event
detection and pipelining, in-line scoring, serverless compute, computer vision, deep learning
Common initiatives
Interactive CX: Natural customer journeys with adaptive interfaces
• Behavior-based recommendations, improving personalization along the journey
• Seamless session transfer across UI, from browser to mobile to physical location
• Voice-driven commands, and use of gestures and other natural interfaces
Event-driven automation: Full execution of business process driven by an action
• Order fulfillment, with real-time update notifications to customer
• Fast response to customer complaints/comments over direct or social channels
Fraud detection: Protect customer and business w/ real-time anomaly detection
• Purchase and payment verification, using behavioral models and location assessment
• Application and account opening validation
Outcome 3 : Real-time Engagement
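One very simple form of the real-time anomaly detection mentioned under fraud detection is a z-score check against a customer's own purchase history. The Python sketch below is only an illustration of that idea: the threshold, the sample amounts, and the function name are assumptions, and production fraud models combine many more behavioral and location signals.

```python
from statistics import mean, stdev

def is_anomalous(history, amount, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations from the customer's mean."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    spread = stdev(history)
    if spread == 0:
        return amount != history[0]  # identical history: any different amount stands out
    return abs(amount - mean(history)) / spread > threshold

past_purchases = [20.0, 25.0, 22.0, 19.0, 24.0]  # made-up amounts
print(is_anomalous(past_purchases, 23.0), is_anomalous(past_purchases, 400.0))
```

Because the check only needs a running mean and spread per customer, it can be evaluated in line as each payment event arrives.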
Personalized content
- Account access
- Track spending
- Check balances
- Pay bills
- Prevent fraud
The Power of Speech: Alexa
Alexa, the voice service that powers
Echo, provides capabilities, or skills,
that enable customers to interact with
devices using voice
Alexa Skills Kit (ASK) allows everyone
to build and publish their own skills
Skills can be powered by AWS
Lambda
Automated Speech Recognition (ASR)
Natural Language Processing (NLP)
Alexa Skills Kit (ASK)
Over 80 services, including Core, Security, Database,
Artificial Intelligence, Analytics, Mobile Development
See it live in action!
Build your own Alexa Skill!
[Diagram: Amazon Echo, Alexa Skills Kit, AWS Lambda, Facebook Page]
Real-time engagement
Interactive customer experience, event-driven automation, fraud detection
[Architecture diagram: Data sources, Ingest, Speed (Real-time) layer, Scale (Batch) layer, Serving]
Provide superior customer service by responding to opportunities in real
time. Fulfill requests for products or services in an automated fashion to
create a strong competitive advantage over those that are unable to.
Assurance becomes a different challenge, when speeds increase, and fraud
prevention must be adaptive and fast. Adding another layer of opportunity and
complexity is the use of vast streams of data from devices that are
measuring location, video, behaviors, environmental conditions, and more.
1. Real-time engagement requires personas that develop the analytics, and platforms for engaging and automating processes
[Architecture diagram, shown in this build: Data analysts, Data scientists, Business users, Engagement platforms, Automation / events; AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS]
2. Real-time systems are built from a base of advanced data processing
[Architecture diagram, new in this build: Transactions, Connected devices, Web logs / cookies, Social media, ERP; AWS Database Migration Service, AWS Direct Connect, Internet Interfaces, Amazon Kinesis, AWS Storage Gateway; AWS Glue, Amazon S3 (Raw Data), Amazon EMR (ETL), Amazon S3 (Clean Data), Amazon S3 (Schemaless), Amazon S3 (Staged Data / Data Lake); Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Amazon Machine Learning, Amazon EMR (MLlib), Deep Learning]
[Architecture diagram, step 3: same components as the previous step, with a second Amazon Kinesis element added.]
3. Events are pipelined through Kinesis, into multiple streams, at scale
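Step 3 depends on a stream being split into shards so that many producers and consumers can work in parallel. Kinesis maps each record's partition key to a shard by taking an MD5 hash of the key and matching it against the shards' hash key ranges. The sketch below imitates that idea with evenly split ranges. The function names and shard count are hypothetical, and in the real service the ranges are assigned per shard rather than computed like this.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit integer


def shard_for_key(partition_key, shard_count):
    """Return the index of the (evenly split) hash range that holds the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


def spread(keys, shard_count):
    """Group keys by the shard they would land on."""
    shards = {i: [] for i in range(shard_count)}
    for key in keys:
        shards[shard_for_key(key, shard_count)].append(key)
    return shards
```

Because the mapping depends only on the key, all events for one customer or device stay in order on one shard, while different keys spread across shards.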
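Also possible with Spark Streaming!
Amazon Kinesis -> EMR with Spark Streaming
Counting tweets on a sliding window:
// Simplified pseudocode: the real KinesisUtils.createStream also takes a
// StreamingContext, endpoint, region, etc., and countByWindow a slide interval.
KinesisUtils.createStream("twitter-stream")
  .filter(_.getText.contains("Big Data"))
  .countByWindow(Seconds(5))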
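To make the sliding-window idea concrete without a Spark cluster, here is a minimal plain-Python sketch of the same computation: keep the timestamps of matching events and discard those that have fallen out of the window. This is not Spark Streaming code; the class name and the five-second window are assumptions for illustration.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events containing `keyword` seen within the last `window_seconds`."""

    def __init__(self, window_seconds, keyword):
        self.window_seconds = window_seconds
        self.keyword = keyword
        self._timestamps = deque()

    def add(self, timestamp, text):
        """Record an event; timestamps are assumed to arrive in order."""
        if self.keyword in text:
            self._timestamps.append(timestamp)

    def count(self, now):
        """Drop events older than the window, then return how many remain."""
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)
```

A streaming engine does the same bookkeeping per partition and merges the partial counts, which is what lets the approach scale beyond one process.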
[Architecture diagram, step 4: same components as the previous step, with Amazon S3 (Stream Data) and a further Amazon EMR element added.]
4. Event data is given context and structure in EMR and pushed for batch
Amazon Kinesis Firehose
• Fully managed data streaming service to ingest and
capture data into your storage or data warehouse
• Ability to batch load, compress or encrypt streaming
data
• Elastic to scale to any throughput (no more sharding)
• Charged only per GB processed ($0.035 per GB)
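The batching and compression behaviour in the bullets above can be pictured as a buffer that flushes once it reaches a size threshold. This is a local, plain-Python sketch of the idea, not the Firehose API: `BatchBuffer`, the threshold, and the `sink` callback are hypothetical. The cost helper simply applies the $0.035 per GB figure quoted on the slide.

```python
import gzip


class BatchBuffer:
    """Collect records and hand them to `sink` as one gzip-compressed batch."""

    def __init__(self, sink, max_bytes=1024):
        self.sink = sink
        self.max_bytes = max_bytes
        self._records = []
        self._size = 0

    def put(self, record):
        """Buffer one record (bytes) and flush once the threshold is reached."""
        self._records.append(record)
        self._size += len(record)
        if self._size >= self.max_bytes:
            self.flush()

    def flush(self):
        """Compress buffered records (newline-delimited) and deliver them."""
        if not self._records:
            return
        payload = b"\n".join(self._records) + b"\n"
        self.sink(gzip.compress(payload))
        self._records, self._size = [], 0


def ingestion_cost_usd(gigabytes, price_per_gb=0.035):
    """Ingestion cost at the per-GB price quoted on the slide."""
    return gigabytes * price_per_gb
```

For example, at that price 1,000 GB of ingested data would cost about $35.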
[Architecture diagram, step 5: same components as the previous step, with Amazon Kinesis Firehose added.]
5. Kinesis Firehose pumps events into a DWH for near real-time analysis
AWS Lambda
• Use AWS Lambda to clean and massage incoming data
• Write code to load data from sources (S3, DynamoDB) automatically into your
data warehouse (e.g. Amazon Redshift)
• React in real-time to incoming events in Amazon Kinesis
[Diagram: Amazon Kinesis, AWS Lambda, Amazon Redshift]
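As a rough illustration of reacting to Kinesis events and cleaning data in AWS Lambda, the sketch below decodes a batch of stream records and normalises them. It assumes the standard Kinesis event shape that Lambda passes in, where each record carries a base64-encoded payload under `Records[].kinesis.data`. The payload fields `user_id` and `action` are hypothetical.

```python
import base64
import json


def clean(record):
    """Normalise one decoded event; the field names are hypothetical."""
    return {
        "user_id": str(record.get("user_id", "")).strip(),
        "action": str(record.get("action", "")).strip().lower(),
    }


def lambda_handler(event, context):
    """Decode each Kinesis record, clean it, and return the valid rows."""
    rows = []
    for item in event.get("Records", []):
        raw = base64.b64decode(item["kinesis"]["data"])
        try:
            record = json.loads(raw)
        except ValueError:
            continue  # skip payloads that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        row = clean(record)
        if row["user_id"]:
            rows.append(row)
    # A real function would now write `rows` to the warehouse or another stream.
    return rows
```

Skipping bad records instead of raising keeps one malformed event from blocking the rest of the batch; whether that is the right trade-off depends on the workload.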
[Architecture diagram, step 6: same components as the previous step, with Event Scoring and AWS Lambda added.]
6. The event is streamed to a scoring server for processing
Artificial Intelligence
Amazon Polly
Turn text into lifelike speech using deep learning technologies to synthesize
speech that sounds like a human voice
• Unlimited replays
• Returns an MP3 or audio stream
• Lightning-fast response
• Fully managed and low cost
Amazon Polly: Text In, Life-like Speech Out
Text in: “The temperature in WA is 75°F”
Speech out: “The temperature in Washington is 75 degrees Fahrenheit”
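The example above shows text normalization: abbreviations and symbols are expanded into words before speech is synthesized. The sketch below imitates that single step with a lookup table and a regular expression. It is an illustration only, not how Polly is implemented; the `STATE_NAMES` table and the `normalize` function are hypothetical.

```python
import re

STATE_NAMES = {"WA": "Washington", "CA": "California"}  # hypothetical, tiny table
UNIT_NAMES = {"F": "Fahrenheit", "C": "Celsius"}


def normalize(text):
    """Expand known state abbreviations and temperature symbols into words."""

    def expand_state(match):
        # Leave unknown two-letter words untouched.
        return STATE_NAMES.get(match.group(0), match.group(0))

    def expand_temperature(match):
        return f"{match.group(1)} degrees {UNIT_NAMES[match.group(2)]}"

    text = re.sub(r"\b[A-Z]{2}\b", expand_state, text)
    return re.sub(r"(-?\d+(?:\.\d+)?)\s*°\s*([FC])\b", expand_temperature, text)
```

A production system needs context to decide, for instance, whether "WA" is a state or part of a product code, which is where learned models come in.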
Amazon Lex
Conversational interfaces for your applications, powered by the same
Natural Language Understanding (NLU) & Automatic Speech Recognition
(ASR) models as Alexa
• Integrated development in AWS console
• Trigger AWS Lambda functions
• Multi-step conversations
• Continually improving ASR & NLU models
• Enterprise connectors
• Fully managed
Example: BookHotel
Intents: A particular goal that the user wants to achieve
Utterances: Spoken or typed phrases that invoke your intent
Slots: Data the user must provide to fulfill the intent
Prompts: Questions that ask the user to input data
Fulfillment: The business logic required to fulfill the user’s intent
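The five concepts above fit together as a small data structure plus a loop that keeps prompting until every slot is filled. The following plain-Python sketch models a BookHotel intent that way. It is not the Amazon Lex API; the class names, sample utterances, slot names, and prompts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    name: str
    prompt: str  # question asked while the slot is still empty


@dataclass
class Intent:
    name: str
    utterances: list
    slots: list
    values: dict = field(default_factory=dict)

    def matches(self, phrase):
        """True if the phrase equals a sample utterance (exact match only)."""
        wanted = phrase.strip().lower()
        return any(wanted == u.lower() for u in self.utterances)

    def next_prompt(self):
        """Return the prompt for the first unfilled slot, or None when complete."""
        for slot in self.slots:
            if slot.name not in self.values:
                return slot.prompt
        return None

    def fulfill(self):
        """Stand-in for the business logic that runs once every slot is filled."""
        if self.next_prompt() is not None:
            raise ValueError("slots are still missing")
        return f"Booked {self.values['Nights']} nights in {self.values['Location']}"


book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a hotel", "I want to make a hotel reservation"],
    slots=[
        Slot("Location", "What city will you be staying in?"),
        Slot("Nights", "How many nights will you be staying?"),
    ],
)
```

A real conversational service replaces the exact-match check with NLU, so that phrasings it has never seen can still resolve to the intent.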
Amazon Rekognition
Image recognition and analysis powered by deep learning, which
lets you search, verify, and organize millions of images
• Easy to use
• Batch analysis
• Real-time analysis
• Continually improving
• Low cost
Example labels detected in an image: Maple, Villa, Plant, Garden, Water, Swimming Pool, Tree, Potted Plant, Backyard
Example facial analysis output: Demographic Data, Facial Landmarks, Sentiment Expressed, Image Quality (Brightness: 25.84, Sharpness: 160), General Attributes
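Label detection returns each label with a confidence score, and an application usually keeps only the labels it trusts. The sketch below filters a response shaped like the `Labels` list (entries with `Name` and `Confidence`) returned by Rekognition label detection. The sample response is hand-written and its confidence values are invented for illustration.

```python
def labels_above(response, min_confidence=80.0):
    """Return label names at or above the threshold, highest confidence first."""
    labels = [
        label for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    labels.sort(key=lambda label: label["Confidence"], reverse=True)
    return [label["Name"] for label in labels]


# Hand-written sample; the confidence values are invented for illustration.
sample_response = {
    "Labels": [
        {"Name": "Villa", "Confidence": 83.0},
        {"Name": "Plant", "Confidence": 96.1},
        {"Name": "Maple", "Confidence": 62.5},
        {"Name": "Swimming Pool", "Confidence": 91.4},
    ]
}
```

The threshold is a business decision: a photo-tagging feature can tolerate more false positives than an identity check can.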
[Architecture diagram, step 7: same components as the previous step, with Amazon AI added.]
7. Language, intent, and image processing are run and sent for scoring
See it live in action!
Serverless Rekognition Demo
Serverless website that uses Rekognition to identify faces and classify pictures
[Demo diagram: Mobile, Amazon API Gateway, AWS Lambda, Amazon S3, Amazon DynamoDB, Amazon Rekognition]
CodeFor.Cloud/image
[Architecture diagram, step 8: same components as the previous step.]
8. Simple analytical models are checked on-demand against Amazon ML
[Architecture diagram, step 9: same components as the previous step.]
9. Complex analytical models are scored against coded models (PMML)
[Architecture diagram, step 10: same components as the previous step.]
10. Deep learning models are scored against imported models (e.g. JSON)
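The scoring steps share one pattern: a model is trained offline, exported in a portable form, and then applied to each incoming event. Here is a minimal sketch of that pattern, assuming a logistic-regression model exported as JSON. The JSON layout, feature names, weights, and threshold are hypothetical; real PMML or deep learning models would need their own runtimes.

```python
import json
import math

# Hypothetical exported model; the layout is an assumption of this sketch.
MODEL_JSON = """
{
  "intercept": -1.0,
  "weights": {"amount_usd": 0.002, "new_device": 1.5},
  "threshold": 0.5
}
"""


def load_model(text):
    """Parse an exported model from its JSON text."""
    return json.loads(text)


def score(model, event):
    """Return the logistic score of an event; missing features count as 0."""
    z = model["intercept"]
    for feature, weight in model["weights"].items():
        z += weight * float(event.get(feature, 0.0))
    return 1.0 / (1.0 + math.exp(-z))


def is_flagged(model, event):
    """True when the score reaches the model's decision threshold."""
    return score(model, event) >= model["threshold"]
```

Because the model is just data, a new version can be swapped in without redeploying the scoring code.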
[Architecture diagram, step 11: same components as the previous step, with a second AWS Lambda element added.]
11. Scored response to the event is processed to be pushed for action
[Architecture diagram, step 12: same components as the previous step, with Amazon DynamoDB added.]
12. Recommendations are pushed to DynamoDB for low latency serving
[Architecture diagram, step 13: same components as the previous step, with Amazon SQS added.]
13. Actions are pushed to RDS and SQS for business process automation
[Architecture diagram: the complete real-time engagement architecture, with all components from the preceding steps.]
See it live in action!
Demo: Live Twitter Feed Analysis
[Demo diagram: Twitter Stream, Amazon Kinesis, AWS Lambda, Amazon Elasticsearch Service]
Twitter Blog* - On a typical day (in 2013):
• More than 500 million Tweets sent
• Average 5,700 TPS
* https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
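The two Twitter figures are consistent with each other, and the same arithmetic gives a first estimate of stream capacity. The sketch below derives the average rate from the daily volume and then estimates how many Kinesis shards that rate would need, using the per-shard write limits of 1,000 records per second and 1 MB per second. The 2 KB average record size is a hypothetical assumption, and peak traffic can be far above the average.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60
SHARD_RECORDS_PER_SEC = 1000        # per-shard write limit (records)
SHARD_BYTES_PER_SEC = 1024 * 1024   # per-shard write limit (1 MB per second)


def average_tps(events_per_day):
    """Average events per second over one day."""
    return events_per_day / SECONDS_PER_DAY


def shards_needed(tps, avg_record_bytes):
    """Shards required to satisfy both the record-rate and byte-rate limits."""
    by_records = math.ceil(tps / SHARD_RECORDS_PER_SEC)
    by_bytes = math.ceil(tps * avg_record_bytes / SHARD_BYTES_PER_SEC)
    return max(by_records, by_bytes)


tps = average_tps(500_000_000)       # about 5,787 events per second
shards = shards_needed(tps, 2048)    # 2 KB per record is an assumption
```

With 2 KB records the byte limit dominates, so payload size matters as much as event count when sizing a stream.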
Outcome 4: Automate for expansive reach
Automation of self-service, deployment, policy, and quality assurance
Technology: Self-service, on-demand provisioning, DevOps, spot pricing, AWS CloudFormation, security
automation, performance monitoring (CW&XR), global rollouts
Common initiatives
Self-service:
• Application catalog or portal for all employees, availability determined by role
• Service provisioning backed by automation of policy and governance
Agile development: Use of DevOps to allow very few resources to deploy globally
• CI/CD for software release, build/test, and deployment automation
• Templated infrastructure provisioning, and configuration management
• Business rules and policies are "gold coded" to be used for all deployments
• Use of Security by Design (SbD) to codify network, O/S, and encryption
Comprehensive monitoring: Assurance of SLA and issue remediation
• Logging and monitoring of all API calls and executions to ensure SLAs are met
• Analysis of performance variance for faster root cause analysis
[Architecture diagram: Automate for expansive reach. Same components as the real-time engagement architecture, with AWS DevOps added.]
Next steps for you to be the Big Data hero
Sharpen your skills (Singapore)
Attend the official AWS Training courses organized by the AWS authorized local
training partner, Bespoke Training Services (www.bespoketraining.com).
Join the AWS Jumpstart (2 hr) session and hear from our customers and
partners on how they enabled their teams and successfully deployed on
AWS. You also stand a chance to win a free seat in these courses.
Point of contact: Gilbert Cheo - gilbert@bespoketraining.com
Courses                     Date
Architecting on AWS         28 Feb-2 Mar / 14-16 March
System Operations on AWS    22-24 Feb
Developing on AWS           4-6 April
Big Data on AWS             4-6 April
Date Venue
AWS Singapore, Church Street, Capital Square,
#10-01, Singapore 049481
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
http://bit.ly/summitsg
April 11 – Marina Bay Sands - Singapore
Register Now!
You, the Big Data hero!
Start with the persona
De-couple to scale
Experiment and iterate
Deploy with automation
Thank You!
Next up: Q&A

Weitere ähnliche Inhalte

Was ist angesagt?

Real-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisReal-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisAmazon Web Services
 
Structured, Unstructured and Streaming Big Data on the AWS
Structured, Unstructured and Streaming Big Data on the AWSStructured, Unstructured and Streaming Big Data on the AWS
Structured, Unstructured and Streaming Big Data on the AWSAmazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
Driving Business Outcomes with a Modern Data Architecture - Level 100
Driving Business Outcomes with a Modern Data Architecture - Level 100Driving Business Outcomes with a Modern Data Architecture - Level 100
Driving Business Outcomes with a Modern Data Architecture - Level 100Amazon Web Services
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Web Services
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewAmazon Web Services
 
BDA309 Building Your Data Lake on AWS
BDA309 Building Your Data Lake on AWSBDA309 Building Your Data Lake on AWS
BDA309 Building Your Data Lake on AWSAmazon Web Services
 
16h00 globant - aws globant-big-data_summit2012
16h00   globant - aws globant-big-data_summit201216h00   globant - aws globant-big-data_summit2012
16h00 globant - aws globant-big-data_summit2012infolive
 
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPT
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPTHow EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPT
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPTAmazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924Amazon Web Services
 
Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Amazon Web Services
 
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleAmazon Web Services
 
Running Lean Architectures: How to Optimize for Cost Efficiency
Running Lean Architectures: How to Optimize for Cost Efficiency Running Lean Architectures: How to Optimize for Cost Efficiency
Running Lean Architectures: How to Optimize for Cost Efficiency Amazon Web Services
 
使用 Amazon Lex 在應用程式中建立對話式機器人
使用 Amazon Lex 在應用程式中建立對話式機器人 使用 Amazon Lex 在應用程式中建立對話式機器人
使用 Amazon Lex 在應用程式中建立對話式機器人 Amazon Web Services
 
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...Amazon Web Services
 
Building a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSBuilding a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSAmazon Web Services
 

Was ist angesagt? (20)

Real-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon KinesisReal-Time Streaming: Intro to Amazon Kinesis
Real-Time Streaming: Intro to Amazon Kinesis
 
Structured, Unstructured and Streaming Big Data on the AWS
Structured, Unstructured and Streaming Big Data on the AWSStructured, Unstructured and Streaming Big Data on the AWS
Structured, Unstructured and Streaming Big Data on the AWS
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
Driving Business Outcomes with a Modern Data Architecture - Level 100
Driving Business Outcomes with a Modern Data Architecture - Level 100Driving Business Outcomes with a Modern Data Architecture - Level 100
Driving Business Outcomes with a Modern Data Architecture - Level 100
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
 
Welcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution OverviewWelcome & AWS Big Data Solution Overview
Welcome & AWS Big Data Solution Overview
 
AWS Big Data Solution Days
AWS Big Data Solution DaysAWS Big Data Solution Days
AWS Big Data Solution Days
 
BDA309 Building Your Data Lake on AWS
BDA309 Building Your Data Lake on AWSBDA309 Building Your Data Lake on AWS
BDA309 Building Your Data Lake on AWS
 
16h00 globant - aws globant-big-data_summit2012
16h00   globant - aws globant-big-data_summit201216h00   globant - aws globant-big-data_summit2012
16h00 globant - aws globant-big-data_summit2012
 
2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days2016 AWS Big Data Solution Days
2016 AWS Big Data Solution Days
 
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPT
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPTHow EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPT
How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud PPT
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017 Full Stack Analytics on AWS - AWS Summit Cape Town 2017
Full Stack Analytics on AWS - AWS Summit Cape Town 2017
 
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
 
Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017Building Serverless Web Applications - DevDay Los Angeles 2017
Building Serverless Web Applications - DevDay Los Angeles 2017
 
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You ScaleENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
ENT316 Keeping Pace With The Cloud: Managing and Optimizing as You Scale
 
Running Lean Architectures: How to Optimize for Cost Efficiency
Running Lean Architectures: How to Optimize for Cost Efficiency Running Lean Architectures: How to Optimize for Cost Efficiency
Running Lean Architectures: How to Optimize for Cost Efficiency
 
使用 Amazon Lex 在應用程式中建立對話式機器人
使用 Amazon Lex 在應用程式中建立對話式機器人 使用 Amazon Lex 在應用程式中建立對話式機器人
使用 Amazon Lex 在應用程式中建立對話式機器人
 
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
AWS re:Invent 2016: Leveraging Amazon Machine Learning, Amazon Redshift, and ...
 
Building a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWSBuilding a Data Processing Pipeline on AWS
Building a Data Processing Pipeline on AWS
 

Andere mochten auch

Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsAmazon Web Services
 
使用Amazon Machine Learning 創建智能應用程式
使用Amazon Machine Learning 創建智能應用程式使用Amazon Machine Learning 創建智能應用程式
使用Amazon Machine Learning 創建智能應用程式Amazon Web Services
 
Migrating Large Scale Data Sets to the Cloud
Migrating Large Scale Data Sets to the CloudMigrating Large Scale Data Sets to the Cloud
Migrating Large Scale Data Sets to the CloudAmazon Web Services
 
Introduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web ServicesIntroduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web ServicesAmazon Web Services
 
Building A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWSBuilding A Modern Data Analytics Architecture on AWS
Building A Modern Data Analytics Architecture on AWSAmazon Web Services
 
Tracxn Research - Mobile Advertising Landscape, February 2017
Tracxn Research - Mobile Advertising Landscape, February 2017Tracxn Research - Mobile Advertising Landscape, February 2017
Tracxn Research - Mobile Advertising Landscape, February 2017Tracxn
 
Strategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud StorageStrategic Uses for Cost Efficient Long-Term Cloud Storage
Strategic Uses for Cost Efficient Long-Term Cloud StorageAmazon Web Services
 
Build an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million UsersBuild an App on AWS for Your First 10 Million Users
Build an App on AWS for Your First 10 Million UsersAmazon Web Services
 
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFSSimple, Scalable and Highly Durable NAS in the Cloud – Amazon EFS
Simple, Scalable and Highly Durable NAS in the Cloud – Amazon EFSAmazon Web Services
 
Supercharging the Value of Your Data with Amazon S3
Supercharging the Value of Your Data with Amazon S3Supercharging the Value of Your Data with Amazon S3
Supercharging the Value of Your Data with Amazon S3Amazon Web Services
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017Tracxn Research - Insurance Tech Landscape, February 2017
Tracxn Research - Insurance Tech Landscape, February 2017Tracxn
 
Introduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web ServicesIntroduction to Cloud Computing with Amazon Web Services
Introduction to Cloud Computing with Amazon Web ServicesAmazon Web Services
 
Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017Tracxn Research - Healthcare Analytics Landscape, February 2017
Tracxn Research - Healthcare Analytics Landscape, February 2017Tracxn
 
Webinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your BusinessWebinar: 10-Step Guide to Creating a Single View of your Business
Webinar: 10-Step Guide to Creating a Single View of your BusinessMongoDB
 
2017 iosco research report on financial technologies (fintech)
2017 iosco research report on  financial technologies (fintech)2017 iosco research report on  financial technologies (fintech)
2017 iosco research report on financial technologies (fintech)Ian Beckett
 
2015 Internet Trends Report
2015 Internet Trends Report2015 Internet Trends Report
2015 Internet Trends ReportIQbal KHan
 
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...
Introducing Amazon Lex – A Service for Building Voice or Text Chatbots - Marc...Amazon Web Services
 

Andere mochten auch (20)

Optimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics WorkloadsOptimizing Storage for Big Data Analytics Workloads
Optimizing Storage for Big Data Analytics Workloads
 
使用Amazon Machine Learning 創建智能應用程式
使用Amazon Machine Learning 創建智能應用程式使用Amazon Machine Learning 創建智能應用程式
使用Amazon Machine Learning 創建智能應用程式
 
Migrating Large Scale Data Sets to the Cloud
Migrating Large Scale Data Sets to the CloudMigrating Large Scale Data Sets to the Cloud
Modern Data Architectures for Business Outcomes

  • 1. Modern Data Architectures for Business Insights at Scale
      Craig Stires, Head of Analytics, Big Data, AI, Asia-Pacific
      © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Today's workshop
      2:00pm - 2:15pm: Overview on using modern data architectures on AWS
      2:15pm - 3:40pm: Modern data architectures for business insights at scale (includes live demos)
      3:40pm - 4:00pm: Break
      4:00pm - 5:15pm: Modern data architectures for real-time analytics and engagement (includes live demos)
  • 3. Overview on using modern data architectures on AWS
  • 4. What is driving the requests for information? - What information is needed? - Where does the source data live? - Freshness - how real-time? What kind of persona are you serving? - Measurable business outcome? - Speed to access / urgency - UI - interactive vs file vs embedded - On-demand vs published
  • 5. Should we collect "all the data" and see what's in it? [Chart: Data volume, 1990 to 2020, showing the gap between generated data and data available for analysis. Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011; IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares.]
  • 6. Starting by amassing "all your data" and dumping into a large repository for the data gurus to start finding "insights" is like trying to win the lottery
  • 7. Three big indicators of individual behavior Purchases Movement Influence
  • 8. A platform to build business outcomes from data. Behavior indicators: Purchases, Movement, Influence. Platform stages: Ingest/Collect, Store, Process/Analyze, Consume/Visualize. Business outcomes: Revenue lift, Market acquisition, Customer delight, Brand advocacy, Inventory optimization, Supply chain efficiency, ...
  • 9. The AWS Cloud helps remove constraints
  • 10. How have some IT teams stopped being the "VILLAIN"?
  • 11.
  • 12. Starting small is powerful, when you can scale up fast
       Scaling up your analytics systems | With AWS | Traditional IT *
       get a new BI server | 20 minutes | 3 months
       upgrade your analytics server to the newest Intel processors and add 16GB memory | 15 minutes | 2 months
       add 500TB of storage | instant | 2 months
       grow a DWH cluster from 8GB to 1PB | 1 hour | 8 months
       build a 1024-node Hadoop cluster | 30 minutes | unlikely
       roll out multi-region production environment | hours | months
       * actual provisioning times in a well-organized IT division
  • 13. Big Data: • Potentially massive datasets • Iterative, experimental style of data manipulation and analysis • Frequently not a steady-state workload; peaks and valleys • Data is a combination of structured and unstructured data in many formats AWS Cloud: • Virtually unlimited capacity • Iterative, experimental usage cost through on-demand infrastructure • Fully scalable infrastructure for highly variable workloads • Tools & Services for managing structured, unstructured and stream data
  • 14. Let’s talk business outcomes of data analytics!
  • 15. Outcome 1 : Modernize and consolidate • Insights to enhance business applications and create new digital services Outcome 2 : Innovate for new revenues • Personalization, demand forecasting, risk analysis Outcome 3 : Real-time engagement • Interactive customer experience, event-driven automation, fraud detection Outcome 4 : Automate for expansive reach • Automation of business processes and physical infrastructure Driving Business Outcomes via Data Analytics
  • 16. Modern data architectures for business insights at scale
  • 17. Insights to enhance business applications, new digital services Technology: Backend system integration, on-prem data center extension, business application integration, BI provisioning, data lakes, external APIs, access control and logging Common initiatives Insights: 360 view of the business • Legacy data systems migration to enable self-service for business analysts • Integration of all customer data, from orders, payments, interactions • Supplier performance for inventory and vendor management Digitization: Web-service that gives on-demand insights • Delivery of digital content, with behavior tracking, and upsell (or ads) • Ordering system for enterprise customers or consumers Data monetization: Enrich, aggregate, and sell business data • External data enrichment API, including digital marketing platforms • Purchasable data sets of anonymized, domain-enriched insights Outcome 1 : Modernize and Consolidate
  • 18. Modernize and consolidate: Insights to enhance business applications, new digital services. Enhancing business applications and creating new digital services takes a few steps. Business goals often include being an agile, well-run organization and no longer missing opportunities because people make decisions without accurate insights. These initiatives focus on giving important personas fast and secure access to business-relevant insights. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving.]
  • 19. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 1: Define personas and use case requirements (including UI). [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Data analysts, Business users, External buyers.]
  • 20. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 2: Locate the data sources that have the information to extract. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, Data analysts, Business users, External buyers.]
  • 21. Fluentd: Open Source Log Collection https://github.com/fluent/fluentd/ • Fluentd is an open source data collector to unify data collection and consumption • Integration into many data sources (App Logs, Syslogs, Twitter etc.) • Direct integration into AWS
       <source>
         type tail
         format apache2
         path /var/log/apache2/access_log
         tag s3.apache.access
       </source>
       <match s3.*.*>
         type s3
         s3_bucket myweblogs
         path logs/
       </match>
  • 22. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 3: Ingest data through incremental or full loads, across secure connections. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, Data analysts, Business users, External buyers.]
  • 23. A single, large system may perform a single task well, but is often too difficult to adapt and scale
  • 24. A system that is decoupled can adapt to a fast moving business, and can scale up and down with significantly lower barriers
  • 25. Decouple Storage and Compute Traditionally analytical workloads required large databases or data warehouses, with storage and compute close to each other Big Data often benefits from decoupling storage and compute Amazon S3 offers virtually unlimited storage at a per GB/month rate
  • 26. Amazon S3 Highly available object storage 99.999999999% data durability Replicated across 3 facilities Virtually unlimited scale Pay only for usage, no pre-provisioning Event notifications to trigger actions
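The "event notifications to trigger actions" point can be illustrated with a small handler. This is a minimal sketch, assuming the notification is delivered to an AWS Lambda function written in Python; the function name and the sample bucket and key are hypothetical, and only the standard S3 event fields (`Records`, `s3.bucket.name`, `s3.object.key`) are relied on.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Return (bucket, key) pairs for every object named in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical sample event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "myweblogs"},
                "object": {"key": "logs/access+log%3A1.gz"}}}
    ]
}
new_objects = lambda_handler(sample_event, None)
```

In a real pipeline the handler would start the next step (for example an ETL job) for each new object instead of only returning the list.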
  • 27. Amazon EMR Fully managed Hadoop Optimized with S3 Autoscaling for elasticity Transient and long running clusters Integration with AWS Spot Market
  • 28. 1 instance x 100 hours = 100 instances x 1 hour (and with Spot Pricing not only faster but also cheaper)
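The equivalence on this slide can be checked with a short calculation. The sketch below assumes a hypothetical on-demand price of $0.50 per instance-hour and a hypothetical Spot discount of 70%; neither figure comes from the slides, and the wall-clock estimate assumes the job splits evenly across instances.

```python
from decimal import Decimal

# Hypothetical prices, for illustration only (not from the slides).
ON_DEMAND_PRICE_PER_HOUR = Decimal("0.50")
SPOT_DISCOUNT = Decimal("0.70")  # assumed: Spot is 70% cheaper than on-demand


def cluster_cost(instances, hours, price_per_hour):
    """Total cost of running `instances` machines for `hours` each."""
    return instances * hours * price_per_hour


def wall_clock_hours(total_instance_hours, instances):
    """Elapsed time if the work divides evenly across all instances."""
    return Decimal(total_instance_hours) / instances


single = cluster_cost(1, 100, ON_DEMAND_PRICE_PER_HOUR)   # 1 instance x 100 hours
wide = cluster_cost(100, 1, ON_DEMAND_PRICE_PER_HOUR)     # 100 instances x 1 hour
spot_price = ON_DEMAND_PRICE_PER_HOUR * (1 - SPOT_DISCOUNT)
wide_spot = cluster_cost(100, 1, spot_price)              # same cluster on Spot
```

Under these assumptions the two on-demand runs cost the same while the wide cluster finishes in one hour instead of one hundred, and the Spot run is cheaper as well as faster.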
  • 29. Amazon EMR • Amazon EMR supports all common Hadoop Frameworks such as: • Spark, Pig, Hive, Hue, Oozie … • HBase, Presto, Impala … • Decouples storage from compute • Allows independent scaling • Direct Integration with DynamoDB and S3 [Diagram: Amazon S3, Amazon DynamoDB, Amazon EMR]
  • 30. AWS Glue Managed Transform Engine Job Scheduler Data Catalog Built on Apache Spark Integrated with S3, RDS, Redshift & any JDBC-compliant data store
  • 31.
  • 32. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 4: Use Hadoop for large scale ETL, data quality, and preparation [*EMRFS]. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Data analysts, Business users, External buyers.]
  • 33. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 5: Stage all data into centralized, highly available, durable storage for further access. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Data analysts, Business users, External buyers.]
  • 34. Amazon Redshift: Fully managed MPP SQL database - fully relational. Optimised for analytics. Gigabytes to Petabytes. Less than 1/10th the cost of traditional data warehouses.
  • 35. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 6: Load semi-structured into Hadoop, structured into the DWH, and application data into managed legacy application databases. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Data analysts, Business users, External buyers.]
  • 36. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 7: Data is protected through identity and access management and logging. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Data analysts, Business users, External buyers, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 37. Amazon QuickSight: Fast, cloud-powered BI service. Visualizations and ad-hoc analysis. Connectors for AWS and 3rd party sources. In-memory calculation engine (SPICE). $9 per user per month.
  • 38.
  • 39. AWS Marketplace • Pre-Configured machine images ready to be launched into virtual server instances • Launch applications with 1-Click • Pay software licenses by the hour or bring your own license (BYOL)
  • 40. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 8: Data analysts use BI tools of choice to access all serving services. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Amazon QuickSight, Data analysts, Business users, External buyers, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 41. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 9: Business users have enterprise applications enhanced by analytics. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Amazon QuickSight, Data analysts, Business users, External buyers, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 42. Modernize and consolidate: Insights to enhance business applications, new digital services. Step 10: External parties can buy services or data in a governed, secure way. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Amazon QuickSight, Amazon API Gateway, Data analysts, Business users, External buyers, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 43. Modernize and consolidate: Insights to enhance business applications, new digital services. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, Web logs / cookies, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Changed Data, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR Semi-structured, Amazon Redshift Data Warehouse, Amazon RDS Legacy Apps, Amazon QuickSight, Amazon API Gateway, Data analysts, Business users, External buyers, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 44. Personalization, demand forecasting, risk analysis Technology: Advanced analytics, customer segmentations, high volume transactional data, un/semi- structured data, design of experiment, A/B & hypothesis testing, machine learning Common initiatives Personalization: Refine market approaches based on optimal segments • Offer products to new customers based on clusters of similar individuals • Launch share of wallet initiatives, understanding likely total spend • Targeted marketing to capture interests and increase conversion rates Predict demand: Guide business owners to select the best scenarios • Launch items or promotions at the optimal time to maximize response • Modeling for store assortment, product selection, and merchandizing • New product design, based on known market propensities Risk measurement: Create freedom to act by quantifying exposures • Scenario simulation to encourage investments and new offerings • Supply chain analytics allows for faster confirmation of goods to customers Outcome 2 : Innovate for new revenues
  • 45.
  • 46.
  • 47. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Driving net new revenues is realized by business teams that have access to skilled analysts, using platforms that can scale up and out, without IT bottlenecks. Organizations start operating based on what they know about their customers, and can approach new ventures in terms of confidence levels. Product launches, campaigns, supply chain management, packaged services, and customized offerings are designed and executed based on predictive models. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving.]
  • 48. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 1: Personas involved in generating new revenues are data scientists, data analysts (often embedded), business users, and customers/suppliers. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 49. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 2: Advanced analytics are built from a base of traditional data processing. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, AWS Direct Connect, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR, Amazon Redshift, Amazon RDS, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 50. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 3: On-premises storage and databases are connected and converted. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR, Amazon Redshift, Amazon RDS, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 51. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 4: Internet-native data sources, like web and mobile, are captured. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR, Amazon Redshift, Amazon RDS, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 52. Stream in Real Time: Amazon Kinesis • Real-time data processing over large distributed streams • Elastic capacity that scales to millions of events per second • React in real time to incoming stream events • Reliable stream storage replicated across 3 facilities
  • 53. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 5: Streaming un/semi-structured data feeds, like social and devices, are captured. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon EMR, Amazon Redshift, Amazon RDS, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 54. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 6: Log files and other schemaless data are converted to Parquet and staged. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR, Amazon Redshift, Amazon RDS, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 55. Amazon Athena: Interactive query service to analyze data in Amazon S3 directly using standard SQL. No need to move data. No infrastructure to set up and manage. Fast: results within seconds. Pay only for the queries you run.
  • 56. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 7: Data analysts explore and visualize un/semi-structured data. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 57. Amazon Machine Learning • Easy to use, managed machine learning service built for developers • Machine learning technology based on Amazon’s internal systems • Create models using data stored in Amazon S3, Amazon RDS or Amazon Redshift • Request predictions in batch or in real time
  • 58. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 8: Simple analytical models are built against Amazon Machine Learning. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon Machine Learning, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 59. Apache Spark • In-memory analytics cluster using RDD (Resilient Distributed Dataset) for fast processing • Spark MLlib offers machine learning out of the box • Apache Spark can read directly from Amazon S3
       from numpy import array
       from pyspark.mllib.clustering import KMeans, KMeansModel
       # sc is an existing SparkContext (provided by the pyspark shell)
       data = sc.textFile("s3://...")
       parsedData = data.map(lambda line: array([float(x) for x in line.split(' ')]))
       # Train k-means with k=2 clusters, then save and reload the model
       model = KMeans.train(parsedData, 2, maxIterations=10, initializationMode="random")
       model.save(sc, "MyModel")
       sameModel = KMeansModel.load(sc, "MyModel")
  • 60. Intel® Processor Technologies Intel® AVX – Dramatically increases performance for highly parallel HPC workloads such as life science engineering, data mining, financial analysis, media processing Intel® AES-NI – Enhances security with new encryption instructions that reduce the performance penalty associated with encrypting/decrypting data Intel® Turbo Boost Technology – Increases computing power with performance that adapts to spikes in workloads Intel Transactional Synchronization (TSX) Extensions – Enables execution of transactions that are independent to accelerate throughput P state & C state control – provides granular performance tuning for cores and sleep states to improve overall application performance
  • 61. New X1 Instance - Tons of Memory • Designed for large-scale, in-memory applications in the cloud • Ideal for in-memory databases like SAP HANA and big data processing apps like Spark and Presto • Powered by Intel® Xeon® E7 8880 v3 Haswell processors • Features up to 2TB of memory and up to 128 vCPUs per instance • 8X the memory offered by any other Amazon EC2 instance
  • 62. Machine Learning Algorithms • Classification • Sentiment analysis – Do people like my new product? • Linear Regression • Trend prediction – How much revenue next month? • Clustering • Recommendation - Other people bought this! • Association • Market basket analysis – Bundled products • Neural Networks • Pattern recognition - Speech recognition [Tools shown: Amazon Machine Learning, Amazon EMR + Spark MLlib, GPU Optimized EC2 Instance]
  • 63. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 9: Complex analytical models are built against EMR (Spark) clusters. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 64. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 10: Deep learning models are built against MXNet clusters. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 65. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 11: Predictive models and scored datasets are published to data staging. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 66. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 12: Analysts use DWH, EMR, ES to find patterns & measure performance. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 67. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 13: Risk models are evaluated to create new products and assess customers. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 68. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 14: Demand forecasts are loaded into supply chain management systems. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 69. Amazon SNS & Amazon Pinpoint • Amazon SNS is a fully managed, cross-platform mobile push intermediary service • Fully scalable to millions of devices • Amazon Pinpoint lets you create targeted campaigns and measure engagement and results [Diagram: Amazon SNS. Push services: Apple APNS, Google GCM, Amazon ADM, Windows WNS and MPNS, Baidu CP. Devices: Android Phones and Tablets, Apple iPhones and iPads, Kindle Fire Devices, Android Phones and Tablets in China, iOS, Windows Phone Devices.]
  • 70. Innovate for new revenues: Personalization, demand forecasting, risk analysis. Step 15: Personalized offers are broadcast out over notification channels. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Amazon SNS, Amazon Pinpoint, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 71. Innovate for new revenues: Personalization, demand forecasting, risk analysis. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Amazon SNS, Amazon Pinpoint, Data analysts, Data scientists, Business users, Engagement platforms, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 72. See it live in action!
  • 73. Athena & QuickSight Demo: Create visualizations from S3 with Athena & QuickSight. Analyze past flight performance data stored in S3 (Bureau of Transportation Flight Data Statistics, www.transtats.bts.gov). [Diagram: Amazon S3, Amazon Athena, Amazon QuickSight]
  • 74. BREAK Next up: Real-Time Analytics and Engagement
  • 75. Modern data architectures for real-time analytics and engagement
  • 76. Interactive customer experience, event-driven automation, fraud detection Technology: Clickstream/mobile apps/sensor/video (computer vision)/audio (intent comprehension), event detection and pipelining, in-line scoring, serverless compute, computer vision, deep learning Common initiatives Interactive CX: Natural customer journeys with adaptive interfaces • Behavior-based recommendations, improving personalization along the journey • Seamless session transfer across UI, from browser to mobile to physical location • Voice-driven commands, and use of gestures and other natural interfaces Event-driven automation: Full execution of business process driven by an action • Order fulfillment, with real-time update notifications to customer • Fast response to customer complaints/comments over direct or social channels Fraud detection: Protect customer and business w/ real-time anomaly detection • Purchase and payment verification, using behavioral models and location assessment • Application and account opening validation Outcome 3 : Real-time Engagement
  • 77.
  • 78.
  • 79. Personalized content - Account access - Track spending - Check balances - Pay bills - Prevent fraud
  • 80. The Power of Speech: Alexa Alexa, the voice service that powers Echo, provides capabilities, or skills, that enable customers to interact with devices using voice Alexa Skills Kit (ASK) allows everyone to build and publish their own skills Skills can be powered by AWS Lambda
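As a rough illustration of a skill powered by AWS Lambda, the sketch below shows a Python handler that answers Alexa requests. The intent name `SalesSummaryIntent` and the spoken texts are hypothetical placeholders; the handler only relies on the standard request `type` field and the standard response envelope (`version`, `response.outputSpeech`, `shouldEndSession`).

```python
def build_response(text, end_session=True):
    """Wrap plain text in the JSON envelope Alexa expects from a skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Route an incoming Alexa request to a spoken reply."""
    request = event.get("request", {})
    request_type = request.get("type")

    if request_type == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome. What would you like to know?",
                              end_session=False)

    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "SalesSummaryIntent":  # hypothetical intent name
            return build_response("This is a placeholder sales summary.")

    return build_response("Sorry, I did not understand that.")
```

A real skill would replace the placeholder text with a lookup against one of the serving stores described earlier.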
  • 81. Automated Speech Recognition (ASR) Natural Language Processing (NLP) Alexa Skills Kit (ASK) Over 80 services, including Core, Security, Database, Artificial Intelligence, Analytics, Mobile Development
  • 82. See it live in action!
  • 83. Build your own Alexa Skill! Amazon Echo Alexa Skills Kit AWS Lambda Facebook Page
  • 84. Real-time engagement: Interactive customer experience, event-driven automation, fraud detection. Provide superior customer service by responding to opportunities in real time. Fulfilling requests for products or services in an automated fashion creates a strong competitive advantage over those that cannot. Assurance becomes a different challenge as speeds increase, and fraud prevention must be adaptive and fast. Adding another layer of opportunity and complexity is the use of vast streams of data from devices that measure location, video, behaviors, environmental conditions, and more. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving.]
  • 85. Real-time engagement: Interactive customer experience, event-driven automation, fraud detection. Step 1: Real-time engagement requires personas that develop the analytics, and platforms for engaging and automating processes. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Data analysts, Data scientists, Business users, Engagement platforms, Automation / events, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 86. Real-time engagement: Interactive customer experience, event-driven automation, fraud detection. Step 2: Real-time systems are built from a base of advanced data processing. [Diagram layers: Data sources, Ingest, Speed (Real-time), Scale (Batch), Serving. Labels: Transactions, ERP, Web logs / cookies, Connected devices, Social media, AWS Database Migration Service, AWS Direct Connect, AWS Storage Gateway, Internet Interfaces, Amazon Kinesis, AWS Glue, Amazon S3 Raw Data, Amazon EMR ETL, Amazon S3 Clean Data, Amazon S3 Staged Data (Data Lake), Amazon S3 Schemaless, Amazon EMR MLlib, Deep Learning, Amazon ML, Amazon EMR, Amazon Elasticsearch Service, Amazon Redshift, Amazon RDS, Amazon Athena, Data analysts, Data scientists, Business users, Engagement platforms, Automation / events, AWS CloudTrail, AWS IAM, Amazon CloudWatch, AWS KMS.]
  • 87. [Architecture diagram: Real-time engagement] 3. Events are pipelined through Kinesis, into multiple streams, at scale. Newly shown: a second Amazon Kinesis stage.
  • 88. Also possible with Spark Streaming! Amazon Kinesis, EMR with Spark Streaming. Counting tweets on a sliding window (slide pseudocode): KinesisUtils.createStream("twitter-stream").filter(_.getText.contains("Big Data")).countByWindow(Seconds(5))
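The sliding-window count on this slide can be illustrated without a cluster. The following is a minimal plain-Python sketch of the idea, assuming events arrive in time order as (timestamp in seconds, text) pairs. The class name `SlidingWindowCounter` and the sample events are hypothetical; nothing here is a Spark or Kinesis API.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events whose text contains a keyword within the last `window_seconds`."""

    def __init__(self, keyword, window_seconds):
        self.keyword = keyword
        self.window_seconds = window_seconds
        self._timestamps = deque()  # timestamps of matching events still inside the window

    def add(self, timestamp, text):
        """Record one event (timestamps must be non-decreasing) and return the current count."""
        if self.keyword in text:
            self._timestamps.append(timestamp)
        # Evict matches that have fallen out of the window ending at `timestamp`.
        while self._timestamps and self._timestamps[0] <= timestamp - self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps)


# Hypothetical sample events: (seconds since start, tweet text).
events = [
    (0, "Big Data on AWS"),
    (1, "unrelated tweet"),
    (3, "more Big Data"),
    (6, "Big Data again"),
]
counter = SlidingWindowCounter("Big Data", window_seconds=5)
counts = [counter.add(ts, text) for ts, text in events]
print(counts)
```

A streaming engine does the same bookkeeping per partition and merges the results, which is what lets the count scale beyond one machine.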
  • 89. [Architecture diagram: Real-time engagement] 4. Event data is given context and structure in EMR and pushed for batch. Newly shown: Amazon S3 Stream Data and an additional Amazon EMR stage.
  • 90. Amazon Kinesis Firehose • Fully managed data streaming service to ingest and capture data into your storage or data warehouse • Ability to batch load, compress or encrypt streaming data • Elastic to scale to any throughput (no more sharding) • Charged only per GB processed ($0.035 per GB)
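To make the batching, compression, and per-GB pricing points concrete, here is a rough plain-Python sketch. It only imitates the behaviour locally: the function names, the 8 KB batch limit, and the 1 GB = 10^9 bytes convention are assumptions of this sketch, and the $0.035 figure is the one quoted on the slide, so current pricing may differ.

```python
import gzip

PRICE_PER_GB_USD = 0.035  # per-GB price quoted on the slide; current pricing may differ


def batch_records(records, max_batch_bytes):
    """Group newline-delimited records into batches no larger than max_batch_bytes
    (a single oversized record would still get a batch of its own)."""
    batches, current, size = [], [], 0
    for record in records:
        line = record.encode("utf-8") + b"\n"
        if current and size + len(line) > max_batch_bytes:
            batches.append(b"".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        batches.append(b"".join(current))
    return batches


def ingest_cost_usd(total_bytes):
    """Estimated ingest charge, assuming 1 GB = 10**9 bytes (an assumption of this sketch)."""
    return total_bytes / 10**9 * PRICE_PER_GB_USD


# Hypothetical click events standing in for a stream.
records = ['{"event": "click", "id": %d}' % i for i in range(1000)]
batches = batch_records(records, max_batch_bytes=8192)
compressed = [gzip.compress(b) for b in batches]
raw_bytes = sum(len(b) for b in batches)
compressed_bytes = sum(len(c) for c in compressed)
print(len(batches), raw_bytes, compressed_bytes)
print(round(ingest_cost_usd(500 * 10**9), 2))  # estimated charge for 500 GB
```

Batching before compression matters because small individual records compress poorly, while a batch of similar records shares most of its structure.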
  • 91. [Architecture diagram: Real-time engagement] 5. Kinesis Firehose pumps events into a DWH for near real-time analysis. Newly shown: Amazon Kinesis Firehose.
  • 92. AWS Lambda • Use AWS Lambda to clean and massage incoming data • Write code to load data sources (S3, DynamoDB) automatically into your data warehouse (e.g. Amazon Redshift) • React in real time to incoming events in Amazon Kinesis [Diagram: Amazon Kinesis, AWS Lambda, Amazon Redshift]
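A minimal sketch of the "clean and massage" step might look like the following. It relies on the fact that Kinesis delivers each record's payload to Lambda base64-encoded under `Records[].kinesis.data`. The field names `user_id` and `action`, the cleaning rules, and the helper `_kinesis_record` are hypothetical, and the load into the warehouse is left out.

```python
import base64
import json


def clean_record(raw):
    """Normalise one decoded event; return None when it should be dropped.
    The field names (user_id, action) are hypothetical."""
    user_id = str(raw.get("user_id", "")).strip()
    action = str(raw.get("action", "")).strip().lower()
    if not user_id or not action:
        return None
    return {"user_id": user_id, "action": action}


def handler(event, context):
    """Lambda entry point for a Kinesis event source.
    Each record's payload arrives base64-encoded under Records[].kinesis.data."""
    cleaned = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        try:
            raw = json.loads(payload)
        except ValueError:
            continue  # skip payloads that are not valid JSON
        row = clean_record(raw) if isinstance(raw, dict) else None
        if row is not None:
            cleaned.append(row)
    # A real function would now write `cleaned` to the warehouse or a staging bucket.
    return {"received": len(event.get("Records", [])), "kept": len(cleaned), "rows": cleaned}


def _kinesis_record(obj):
    """Build a record shaped like the Kinesis-to-Lambda event, for local testing only."""
    data = obj if isinstance(obj, bytes) else json.dumps(obj).encode("utf-8")
    return {"kinesis": {"data": base64.b64encode(data).decode("ascii")}}


sample_event = {"Records": [
    _kinesis_record({"user_id": " u1 ", "action": "CLICK"}),
    _kinesis_record({"user_id": "", "action": "view"}),
    _kinesis_record(b"not json"),
]}
result = handler(sample_event, None)
print(result)
```

Keeping the cleaning logic in a separate pure function makes it testable locally, without deploying or invoking the function in AWS.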
  • 93. [Architecture diagram: Real-time engagement] 6. The event is streamed to a scoring server for processing. Newly shown: Event Scoring, AWS Lambda.
  • 95. Amazon Polly: Turn text into lifelike speech using deep learning technologies to synthesize speech that sounds like a human voice. Features: unlimited replays; returns an MP3 or audio stream; lightning-fast response; fully managed and low cost.
  • 96. Amazon Polly: Text In, Life-like Speech Out. Text in: “The temperature in WA is 75°F”. Speech out: “The temperature in Washington is 75 degrees Fahrenheit”.
  • 97. Amazon Lex: Conversational interfaces for your applications, powered by the same Natural Language Understanding (NLU) and Automatic Speech Recognition (ASR) models as Alexa. Features: integrated development in the AWS console; triggers AWS Lambda functions; multi-step conversations; continually improving ASR and NLU models; enterprise connectors; fully managed.
  • 98. Intents: A particular goal that the user wants to achieve. Utterances: Spoken or typed phrases that invoke your intent. Slots: Data the user must provide to fulfill the intent. Prompts: Questions that ask the user to input data. Fulfillment: The business logic required to fulfill the user’s intent. Example intent: BookHotel
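The five terms on this slide fit together as a small state machine: an utterance selects an intent, prompts fill its slots one at a time, and fulfillment runs once every slot has a value. The sketch below models that in plain Python. It is not the Lex API, and the slot names, prompts, and utterances for BookHotel are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    name: str
    prompt: str  # question asked while the slot is still empty


@dataclass
class Intent:
    name: str
    utterances: list
    slots: list
    values: dict = field(default_factory=dict)

    def next_prompt(self):
        """Return the prompt for the first unfilled slot, or None when all are filled."""
        for slot in self.slots:
            if slot.name not in self.values:
                return slot.prompt
        return None

    def fill(self, slot_name, value):
        self.values[slot_name] = value

    def fulfill(self):
        """Stand-in for the business logic (for example, a Lambda function)."""
        if self.next_prompt() is not None:
            raise ValueError("cannot fulfill: some slots are still empty")
        return "Booked {Nights} night(s) in {Location} from {CheckInDate}".format(**self.values)


book_hotel = Intent(
    name="BookHotel",
    utterances=["Book a hotel", "I want to make a hotel reservation"],
    slots=[
        Slot("Location", "What city will you be staying in?"),
        Slot("CheckInDate", "What day do you want to check in?"),
        Slot("Nights", "How many nights will you be staying?"),
    ],
)

first_prompt = book_hotel.next_prompt()
book_hotel.fill("Location", "Singapore")
book_hotel.fill("CheckInDate", "2017-04-11")
book_hotel.fill("Nights", 2)
confirmation = book_hotel.fulfill()
print(first_prompt)
print(confirmation)
```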
  • 99. Amazon Rekognition: Image recognition and analysis powered by deep learning, which lets you search, verify, and organize millions of images. Features: easy to use; batch analysis; real-time analysis; continually improving; low cost.
  • 101. Face analysis output: Demographic Data; Facial Landmarks; Sentiment Expressed; Image Quality (Brightness: 25.84, Sharpness: 160); General Attributes
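As an illustration of how these attribute groups might be consumed in code, the sketch below summarises one face from a response dict. The dict shape is modelled on the `FaceDetails` structure that Rekognition's face detection returns, but the sample values are made up apart from the brightness and sharpness figures shown on the slide, and `summarize_face` is a hypothetical helper.

```python
def summarize_face(face):
    """Pull the attribute groups named on the slide out of one face-detail dict."""
    emotions = face.get("Emotions", [])
    top_emotion = max(emotions, key=lambda e: e["Confidence"])["Type"] if emotions else None
    age = face.get("AgeRange", {})
    quality = face.get("Quality", {})
    return {
        "age_range": (age.get("Low"), age.get("High")),   # demographic data
        "gender": face.get("Gender", {}).get("Value"),    # demographic data
        "top_emotion": top_emotion,                       # sentiment expressed
        "landmark_count": len(face.get("Landmarks", [])), # facial landmarks
        "brightness": quality.get("Brightness"),          # image quality
        "sharpness": quality.get("Sharpness"),            # image quality
    }


# Hypothetical response; only Brightness and Sharpness come from the slide.
sample_response = {
    "FaceDetails": [
        {
            "AgeRange": {"Low": 26, "High": 43},
            "Gender": {"Value": "Female", "Confidence": 99.1},
            "Emotions": [
                {"Type": "HAPPY", "Confidence": 92.5},
                {"Type": "CALM", "Confidence": 5.3},
            ],
            "Landmarks": [
                {"Type": "eyeLeft", "X": 0.41, "Y": 0.36},
                {"Type": "eyeRight", "X": 0.58, "Y": 0.35},
            ],
            "Quality": {"Brightness": 25.84, "Sharpness": 160},
        }
    ]
}

summaries = [summarize_face(face) for face in sample_response["FaceDetails"]]
print(summaries[0])
```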
  • 102. [Architecture diagram: Real-time engagement] 7. Language, intent, and image processing are run and sent for scoring. Newly shown: Amazon AI.
  • 103. See it live in action!
  • 104. Serverless Rekognition Demo: Serverless website that uses Rekognition to identify faces and classify pictures. Components: Amazon S3, AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon Rekognition, Mobile. URL: CodeFor.Cloud/image
  • 105. [Architecture diagram: Real-time engagement] 8. Simple analytical models are checked on-demand against Amazon ML.
  • 106. [Architecture diagram: Real-time engagement] 9. Complex analytical models are scored against coded models (PMML).
  • 107. [Architecture diagram: Real-time engagement] 10. Deep learning models are scored against imported models (e.g. JSON).
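Steps 8 to 10 share one pattern: a model is trained elsewhere, exported to a portable format, and loaded by the scoring service to evaluate each event. The sketch below shows that pattern with a deliberately simple stand-in, a logistic regression stored as JSON. The model format, feature names, weights, and threshold are all hypothetical; a PMML or deep learning model would need its own loader.

```python
import json
import math

# Hypothetical exported model: a logistic regression for fraud scoring.
MODEL_JSON = """
{
  "model_type": "logistic_regression",
  "intercept": -2.0,
  "weights": {"amount": 0.002, "is_new_device": 1.5, "failed_logins": 0.8},
  "threshold": 0.5
}
"""


def load_model(text):
    """Parse the exported model and reject formats this sketch does not handle."""
    model = json.loads(text)
    if model.get("model_type") != "logistic_regression":
        raise ValueError("unsupported model type")
    return model


def score(model, event):
    """Return (probability, flagged) for one event; missing features count as 0."""
    z = model["intercept"] + sum(
        weight * float(event.get(name, 0.0)) for name, weight in model["weights"].items()
    )
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, probability >= model["threshold"]


model = load_model(MODEL_JSON)
low_risk = score(model, {"amount": 100, "is_new_device": 0, "failed_logins": 0})
high_risk = score(model, {"amount": 900, "is_new_device": 1, "failed_logins": 2})
print(low_risk, high_risk)
```

Separating the model file from the scoring code means data scientists can ship a new model without redeploying the service that scores events.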
  • 108. [Architecture diagram: Real-time engagement] 11. Scored response to the event is processed to be pushed for action. Newly shown: a second AWS Lambda stage.
  • 109. [Architecture diagram: Real-time engagement] 12. Recommendations are pushed to DynamoDB for low latency serving. Newly shown: Amazon DynamoDB.
  • 110. [Architecture diagram: Real-time engagement] 13. Actions are pushed to RDS and SQS for business process automation. Newly shown: Amazon SQS.
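The last steps of the walkthrough amount to routing each scored response by purpose: recommendations go to a key-value store for fast lookup, and actions go to a queue for downstream automation. A minimal sketch of that routing, using an in-memory dict and `queue.Queue` as stand-ins for the managed services, could look like this. The event shape (`kind`, `customer_id`, `payload`) is hypothetical.

```python
import queue

recommendation_store = {}     # stand-in for a low-latency key-value table
action_queue = queue.Queue()  # stand-in for a message queue


def dispatch(scored_event):
    """Route one scored event: recommendations are stored by customer id for
    fast lookup, while actions are queued for downstream automation."""
    kind = scored_event["kind"]
    if kind == "recommendation":
        recommendation_store[scored_event["customer_id"]] = scored_event["payload"]
    elif kind == "action":
        action_queue.put(scored_event)
    else:
        raise ValueError("unknown scored event kind: %r" % kind)


dispatch({"kind": "recommendation", "customer_id": "c42", "payload": ["offer-17", "offer-3"]})
dispatch({"kind": "action", "customer_id": "c42", "payload": {"type": "block_card"}})

print(recommendation_store["c42"])
print(action_queue.qsize())
```

The split reflects two access patterns: a serving layer reads the latest recommendation per customer, while an automation worker consumes each action exactly once in arrival order.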
  • 111. [Architecture diagram: Real-time engagement] Complete view, showing all components from the preceding steps together, including Amazon DynamoDB and Amazon SQS.
  • 112. See it live in action!
  • 113. Demo: Live Twitter Feed Analysis. Components: Twitter Stream, Amazon Kinesis, AWS Lambda, Amazon Elasticsearch Service. Twitter Blog*: On a typical day (in 2013): • More than 500 million Tweets sent • Average 5,700 TPS. * https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
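The two figures on this slide are consistent with each other, and they give a feel for stream sizing. The short calculation below checks the arithmetic and estimates a shard count, assuming the per-shard write limits of 1,000 records per second and 1 MB per second. The 2,000-byte average record size and the 1 MB = 10^6 bytes convention are assumptions of this sketch, and it sizes for the average rate only, not for peaks.

```python
import math

TWEETS_PER_DAY = 500_000_000       # "more than 500 million Tweets" from the slide
SECONDS_PER_DAY = 24 * 60 * 60

SHARD_RECORDS_PER_SECOND = 1000    # per-shard write limit (records)
SHARD_BYTES_PER_SECOND = 1_000_000 # per-shard write limit (1 MB, taken as 10**6 bytes)
AVG_RECORD_BYTES = 2000            # hypothetical average size of one tweet record

average_tps = TWEETS_PER_DAY / SECONDS_PER_DAY


def shards_needed(records_per_second, avg_record_bytes):
    """Shards required to absorb a steady write rate; the tighter limit wins."""
    by_count = math.ceil(records_per_second / SHARD_RECORDS_PER_SECOND)
    by_bytes = math.ceil(records_per_second * avg_record_bytes / SHARD_BYTES_PER_SECOND)
    return max(by_count, by_bytes)


shard_estimate = shards_needed(average_tps, AVG_RECORD_BYTES)
print(round(average_tps))
print(shard_estimate)
```

With these assumptions the byte limit, not the record limit, determines the shard count, which is typical once records carry full JSON payloads.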
  • 114. Outcome 4: Automate for expansive reach. Automation of self-service, deployment, policy, and quality assurance. Technology: self-service, on-demand provisioning, DevOps, Spot pricing, AWS CloudFormation, security automation, performance monitoring (CW&XR), global rollouts. Common initiatives: Self-service: • Application catalog or portal for all employees, availability determined by role • Service provisioning backed by automation of policy and governance Agile development: Use of DevOps to allow very few resources to deploy globally • CI/CD for software release, build/test, and deployment automation • Templated infrastructure provisioning, and configuration management • Business rules and policies are "gold coded" to be used for all deployments • Use of Security by Design (SbD) to codify network, O/S, and encryption Comprehensive monitoring: Assurance of SLA and issue remediation • Logging and monitoring of all API calls and executions to ensure SLAs are met • Analysis of performance variance for faster root cause analysis
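One way to read "gold coded" policies is as rules that run automatically against every deployment template before it is provisioned. The sketch below shows the idea on a plain dict. The template layout, the two rules (storage encryption and required tags), and all names are hypothetical and are not taken from any AWS template format.

```python
REQUIRED_TAGS = {"owner", "cost-center"}  # hypothetical organisation-wide tag policy


def check_template(template):
    """Return a list of policy violations for a deployment template (empty list = compliant)."""
    violations = []
    for name, resource in sorted(template.get("resources", {}).items()):
        if resource.get("type") == "storage" and not resource.get("encrypted", False):
            violations.append("%s: storage must be encrypted" % name)
        missing = REQUIRED_TAGS - set(resource.get("tags", {}))
        if missing:
            violations.append("%s: missing tags %s" % (name, ", ".join(sorted(missing))))
    return violations


# Hypothetical template with one compliant and one non-compliant resource.
template = {
    "resources": {
        "clean-data": {
            "type": "storage",
            "encrypted": True,
            "tags": {"owner": "analytics", "cost-center": "1234"},
        },
        "raw-data": {
            "type": "storage",
            "tags": {"owner": "analytics"},
        },
    }
}

violations = check_template(template)
for violation in violations:
    print(violation)
```

Running such a check in the CI/CD pipeline turns a written policy into a gate that every deployment passes through, which is what lets a small team deploy widely without manual review.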
  • 115. [Architecture diagram: Automate for expansive reach] Automation of self-service, deployment, policy, and quality assurance. Same components as the real-time engagement architecture, with AWS DevOps added.
  • 116. Next steps for you to be the Big Data hero
  • 117. Sharpen your skills (Singapore). Attend the official AWS Training courses organized by the AWS authorized local training partner, Bespoke Training Services (www.bespoketraining.com). Join the AWS Jumpstart (2 hr) session and hear from our customers and partners on how they enabled their teams and successfully deployed on AWS. You also stand a chance to win a free seat in the courses below. Point of contact: Gilbert Cheo, gilbert@bespoketraining.com. Courses and dates: Architecting on AWS (28 Feb-2 Mar / 14-16 March); System Operations on AWS (22-24 Feb); Developing on AWS (4-6 April); Big Data on AWS (4-6 April). Venue: AWS Singapore, Church Street, Capital Square, #10-01, Singapore 049481
  • 118. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. http://bit.ly/summitsg April 11 – Marina Bay Sands - Singapore Register Now!
  • 119. You, the Big Data hero! Start with the persona; de-couple to scale; experiment and iterate; deploy with automation.

Editor's notes

  1. 50 mins
  2. 10:04 Rodos comment on available
  3. Amazon EMR simplifies big data processing, providing a managed Hadoop framework that makes it easy, fast, and cost-effective for you to distribute and process vast amounts of your data across dynamically scalable Amazon EC2 instances. The EMR File System allows EMR clusters to efficiently and securely use Amazon S3 as an object store for Hadoop. You can store your data in Amazon S3 and use multiple Amazon EMR clusters to process the same data set. Each cluster can be optimized for a particular workload, which can be more efficient than a single cluster serving multiple workloads with different requirements. For example, you might have one cluster that is optimized for I/O and another that is optimized for CPU, each processing the same data set in Amazon S3. Additionally, by storing your input and output data in Amazon S3, you can shut down clusters when they are no longer needed. Amazon EMR makes it easy to use Spot instances so you can save both time and money. Amazon EMR clusters include "core nodes" that run HDFS and "task nodes" that do not; task nodes are ideal for Spot because if the Spot price increases and you lose those instances, you will not lose data stored in HDFS. Amazon EMR supports powerful and proven Hadoop tools such as Hive, Pig, HBase, and Impala. Additionally, it can run distributed computing frameworks besides Hadoop MapReduce, such as Spark or Presto, using bootstrap actions. You can also use Hue and Zeppelin as GUIs for interacting with applications on your cluster.
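The core-versus-task distinction in this note can be expressed as a simple rule over a cluster's instance groups: only stateless task nodes go on Spot. The sketch below uses dicts with the `InstanceRole` and `Market` keys found in EMR instance group definitions. The group names, instance types, counts, and the `spot_risks` helper are hypothetical, and nothing is sent to AWS.

```python
# Roles this sketch keeps on On-Demand capacity: the master coordinates the
# cluster and core nodes run HDFS, so losing either can lose state.
STATEFUL_ROLES = {"MASTER", "CORE"}

# Hypothetical cluster layout; instance types and counts are placeholders.
instance_groups = [
    {"Name": "master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
     "InstanceType": "m4.xlarge", "InstanceCount": 1},
    {"Name": "core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
     "InstanceType": "m4.xlarge", "InstanceCount": 2},
    {"Name": "task", "InstanceRole": "TASK", "Market": "SPOT",
     "InstanceType": "m4.xlarge", "InstanceCount": 4},
]


def spot_risks(groups):
    """Return the names of groups that run on Spot but hold state that an
    interruption would destroy."""
    return [
        group["Name"]
        for group in groups
        if group["Market"] == "SPOT" and group["InstanceRole"] in STATEFUL_ROLES
    ]


risky = spot_risks(instance_groups)
spot_share = (
    sum(g["InstanceCount"] for g in instance_groups if g["Market"] == "SPOT")
    / sum(g["InstanceCount"] for g in instance_groups)
)
print(risky)
print(round(spot_share, 2))
```

Because the input and output data live in Amazon S3, an interrupted task node costs only the work in flight, which is why the layout above can put most of its capacity on Spot.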
  5. ...And by the way, if you thought that these innovative digital use cases were only happening elsewhere in the world, you could not be more wrong. Many large Indian companies are increasing their digital presence and seeing massive success in doing so. [CLICK]
  6. More: https://aws.amazon.com/blogs/aws/ec2-instance-update-x1-sap-hana-t2-nano-websites/