SlideShare ist ein Scribd-Unternehmen logo
1 von 48
4/5/17
Build a Real-Time Streaming
Data Visualization System
with Amazon Kinesis Analytics
Allan MacInnis
Solutions Architect, AWS
What to Expect from the Session
• Streaming data overview
• Amazon Kinesis platform review
• Amazon Kinesis Analytics Overview
• Amazon Kinesis Analytics patterns
• Streaming data end-to-end example and walk-
through
• Amazon Kinesis Analytics Best Practices
Streaming Data Overview
Most data is produced continuously
Mobile Apps Web Clickstream Application Logs
Metering Records IoT Sensors Smart Buildings
[Wed Oct 11 14:32:52
2000] [error] [client
127.0.0.1] client
denied by server
configuration:
/export/home/live/ap/h
tdocs/test
The diminishing value of data
Recent data is highly valuable
• If you act on it in time
• Perishable Insights (M. Gualtieri, Forrester)
Old + Recent data is more valuable
• If you have the means to combine them
Processing real-time, streaming data
• Durable
• Continuous
• Fast
• Correct
• Reactive
• Reliable
What are the key requirements?
Ingest Transform Analyze React Persist
Amazon Kinesis Platform
Overview
Amazon Kinesis makes it easy to work with
real-time streaming data
Amazon Kinesis
Streams
• For Technical Developers
• Collect and stream data
for ordered, replayable,
real-time processing
Amazon Kinesis
Firehose
• For all developers, data
scientists
• Easily load massive
volumes of streaming data
into Amazon S3, Redshift,
ElasticSearch
Amazon Kinesis
Analytics
• For all developers, data
scientists
• Easily analyze data
streams using standard
SQL queries
Amazon Kinesis Streams
• Reliably ingest and durably store streaming data at low
cost
• Build custom real-time applications to process streaming
data
Amazon Kinesis Firehose
• Reliably ingest and deliver batched, compressed, and
encrypted data to S3, Redshift, and Elasticsearch
• Point and click setup with zero administration and
seamless elasticity
Amazon Kinesis Analytics
• Interact with streaming data in real-time using SQL
• Build fully managed and elastic stream processing
applications that process data for real-time visualizations
and alarms
Amazon Kinesis Analytics:
Service Overview
Kinesis Analytics
Pay for only what you use
Automatic elasticity
Standard SQL for analytics
Real-time processing
Easy to use
Use SQL to build real-time applications
Easily write SQL code to process
streaming data
Connect to streaming source
Continuously deliver SQL results
Connect to streaming source
• Streaming data sources include Kinesis
Firehose or Kinesis Streams
• Input formats include JSON, .csv, variable
column, unstructured text
• Each input has a schema; schema is inferred,
but you can edit
• Reference data sources (S3) for data
enrichment
Write SQL code
• Build streaming applications with one-to-many
SQL statements
• Robust SQL support and advanced analytic
functions
• Extensions to the SQL standard to work
seamlessly with streaming data
• Support for at-least-once processing
semantics
Continuously deliver SQL results
• Send processed data to multiple destinations
• S3, Amazon Redshift, Amazon ES (through
Firehose)
• Streams (with AWS Lambda integration for
custom destinations)
• End-to-end processing speed as low as sub-
second
• Separation of processing and data delivery
What are common uses for
Kinesis Analytics?
Generate time series analytics
• Compute key performance indicators over time periods
• Combine with static or historical data in S3 or Amazon Redshift
Analytics
Streams
Firehose
Amazon
Redshift
S3
Streams
Firehose
Custom, real-
time
destinations
Create real-time alarms and notifications
• Build sequences of events from the stream, like user sessions in a
clickstream or app behavior through logs
• Identify events (or a series of events) of interest, and react to the
data through alarms and notifications
Analytics
Streams
Firehose
Streams
Amazon
SNS
Amazon
CloudWatch
Lambda
Feed real-time dashboards
• Validate and transform raw data, and then process to calculate
meaningful statistics
• Send processed data downstream for visualization in BI and
visualization services
Amazon
QuickSight
Analytics
Amazon ES
Amazon
Redshift
Amazon
RDS
Streams
Firehose
Example: Real-time
Dashboard
Real-time Dashboard Demo
amzn.to/bigdata
Example Scenario Requirements
Data to capture every second:
• Total distinct users
• Number of users for each Operating System
• Number of users in each quadrant
Output Requirements
• Update DynamoDB table every second, with each aggregate
values
End-to-End Architecture
Amazon
Kinesis
Stream
Amazon
Kinesis
Analytics
Amazon
Cognito
Amazon
Kinesis
Stream
Amazon
DynamoDB
Amazon
Lambda
Amazon
S3
JavaScript
SDK
Data Input
Source JSON Data
• Once per second, using
JavaScript SDK:
• Unique Cognito ID
(anonymous user)
• OS
• Quadrant
• Data sent to Kinesis
Stream
Amazon
Kinesis
Stream
Amazon
Cognito
Amazon
S3
JavaScript
SDK
{
"recordTime": 1486505943.204,
"cognitoId": "us-east-1:3626e211-d2a3-447b-8231-e1f4e0486f44",
"os": "Android",
"quadrant": "A"
}
How is raw data mapped to a schema?
Amazon Kinesis stream Amazon KinesisAnalytics
cognitoID os quadrant
<guid1> Android A
<guid2> iOS B
Source Data for Kinesis Analytics
{
"recordTime": 1486505943.204,
"cognitoId": "us-east-1:<guid>",
"os": "Android",
"quadrant": "A"
}
How is streaming data accessed with SQL?
STREAM
• Analogous to a TABLE
• Represents continuous data flow
CREATE OR REPLACE STREAM DISTINCT_USER_STREAM(
COGNITO_ID VARCHAR(64),
DEVICE VARCHAR(32),
OS VARCHAR(32),
QUADRANT char(1),
DT TIMESTAMP);
How is streaming data accessed with SQL?
PUMP
• Continuous INSERT query
• Inserts data from one in-application stream to another
CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS
INSERT INTO "DISTINCT_USER_STREAM"
SELECT STREAM DISTINCT
"cognitoId",
...
How do we model our data?
DISTINCT_USERS_STREAM
•COGNITO_ID
•OS
•QUADRANT
•DT
DESTINATION_SQL_STREAM
•UNIQUE_USER_COUNT
•ANDROID_COUNT
•IOS_COUNT
•OTHER_OS_COUNT
•QUADRANT_A_COUNT
•QUADRANT_B_COUNT
•QUADRANT_C_COUNT
•QUADRANT_C_COUNT
SOURCE_STREAM
•cognitoID
•os
•quadrant
Kinesis
stream
Kinesis
Stream
Pump
Kinesis Analytics Application
How do we get distinct user records?
Use PUMP to insert distinct records into in-app STREAM
CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS
INSERT INTO "DISTINCT_USER_STREAM"
SELECT STREAM DISTINCT
"cognitoId",
"device",
"os",
"quadrant",
FLOOR(s.ROWTIME TO SECOND)
FROM "SOURCE_SQL_STREAM_001" s;
DISTINCT_USERS_STREAM
•COGNITO_ID
•OS
•QUADRANT
•DT
SOURCE_STREAM
•cognitoID
•os
•quadrant
How do we aggregate streaming data?
• A common requirement in streaming
analytics is to perform set-based operation(s)
(count, average, max, min,..) over events
that arrive within a specified period of time
• Cannot simply aggregate over an entire table
like typical static database
• How do we define a subset in a potentially infinite
stream?
• Windowing functions!
Windowing Concepts
• Windows can be tumbling or sliding
• Windows are fixed length
Output record will have the timestamp of the end of the window
1 5 4 26 8 6 4
t1 t2 t5 t6t3 t4
Time
Window1 Window2 Window3
Aggregate
Function (Sum)
18 14
Output Events
Comparing Types of Windows
• Output created at the end of the window
• The output of the window will be single event based on the
aggregate function used
Tumbling window
Aggregate per time interval
Sliding window
Windows constantly re-evaluated
How do we aggregate per second?
• Tumbling window, group by time period
CREATE OR REPLACE PUMP "OUTPUT_PUMP" AS
INSERT INTO "DESTINATION_SQL_STREAM"
SELECT STREAM
COUNT(dus.COGNITO_ID) AS UNIQUE_USER_COUNT,
COUNT((CASE WHEN dus.OS = 'Android' THEN COGNITO_ID ELSE null END)) AS ANDROID_COUNT,
COUNT((CASE WHEN dus.OS = 'iOS' THEN COGNITO_ID ELSE null END)) AS IOS_COUNT,
COUNT((CASE WHEN dus.OS = 'Windows Phone' THEN COGNITO_ID ELSE null END)) AS WINDOWS_PHONE_COUNT,
COUNT((CASE WHEN dus.OS = 'other' THEN COGNITO_ID ELSE null END)) AS OTHER_OS_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'A' THEN COGNITO_ID ELSE null END)) AS QUADRANT_A_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'B' THEN COGNITO_ID ELSE null END)) AS QUADRANT_B_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'C' THEN COGNITO_ID ELSE null END)) AS QUADRANT_C_COUNT,
COUNT((CASE WHEN dus.QUADRANT = 'D' THEN COGNITO_ID ELSE null END)) AS QUADRANT_D_COUNT,
ROWTIME
FROM "DISTINCT_USER_STREAM" dus
GROUP BY
FLOOR(dus.ROWTIME TO SECOND);
Output to Kinesis Stream
MENTION_COUNT_STREAM
•UNIQUE_USER_COUNT
•ANDROID_COUNT
•… Amazon
Kinesis Stream
{
"unique_user_count": 96,
"android_count": 50,
"ios_count": 46,
"android_count": 50,
"quadrant_a_count": 80,
"quadrant_b_count ": 10,
"quadrant_c_count ": 3,
"quadrant_d_count ": 3
}
1 record, every second
Processing a Kinesis Streams with AWS Lambda
Shard 1 Shard 2 Shard 3 Shard 4 Shard n
Kinesis Stream
. . .
. . .
• Single instance of Lambda function per shard
• Polls shard 4 times per second
• Lambda function instances created and removed automatically as stream
is scaled
Gets Records
4x per sec
Persist aggregated data in DynamoDB
Amazon
Kinesis Stream
Lambda event
source mapping
Lambda
Function
Amazon
DynamoDB
event.Records.forEach((record) => {
const payload = new Buffer(record.kinesis.data, 'base64').toString('ascii');
var docClient = new AWS.DynamoDB.DocumentClient();
var table = "user-quadrant-data";
var data = JSON.parse(payload);
var params = {
TableName: table,
Item:{
"dataType": "quadrantRollup",
"windowtime": (new Date(data.WINDOW_TIME)).getTime(),
"userCount": data.UNIQUE_USER_COUNT,
"quadrantA": data.QUADRANT_A_COUNT,
"quadrantB": data.QUADRANT_B_COUNT, ...
}
};
docClient.put(params, function(err, data) { ...
Render Dashboard
Amazon
DynamoDB
Amazon
S3
JavaScript
SDK
Static HTML,
JavaScript
Get New Data,
Once per Second
Kinesis Analytics Best
Practices
Managing Applications
Set up Cloudwatch Alarms
• MillisBehindLatest metric tracks how far
behind the application is from the source
• Alarm on MillisBehindLatest metric.
Consider triggering when 1-hour behind, on a
1-minute average. Adjust accordingly for
applications with lower end-to-end processing
needs.
Managing Applications
Increase input parallelism to improve
performance
• By default, a single source in-application
stream is created
• If application is not keeping up with input
stream, consider increasing input parallelism to
create multiple source in-application streams
Managing Applications
Limit number of applications reading from
same source
• Avoid ReadProvisionedThroughputExceeded
exceptions
• For an Amazon Kinesis Streams source, limit
to 2 total applications
• For an Amazon Kinesis Firehose source, limit
to 1 application
Defining Input Schema
• Review and adequately test inferred input
schema
• Manually update schema to handle nested
JSON with greater than 2 levels of depth
• Use SQL functions in your application for
unstructured data
Authoring Application Code
• Avoid time-based windows greater than one
hour
• Keep window sizes small during development
• Use smaller SQL queries, with multiple in-
application streams, rather than a single, large
query
Limits
• Maximum row size in an in-application stream
is 50 KB
• Maximum input parallelism is 10 in-application
streams.
• Each application supports one streaming
source, and one reference data source. The
reference data source can be no larger than 1
GB in size.
Pricing
• Pay only for what you use.
• Charged an hourly rate, based on the average
number of Kinesis Processing Units (KPU)
used to run your application.
• A single KPU provides one vCPU, and 4 GB of
memory.
• $0.11 per KPU-hour (US East).
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Amazon Web Services
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAmazon Web Services
 
Build a Serverless Web Application
Build a Serverless Web ApplicationBuild a Serverless Web Application
Build a Serverless Web ApplicationAmazon Web Services
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure DatabricksJames Serra
 
AWS Lake Formation Deep Dive
AWS Lake Formation Deep DiveAWS Lake Formation Deep Dive
AWS Lake Formation Deep DiveCobus Bernard
 
(BDT208) A Technical Introduction to Amazon Elastic MapReduce
(BDT208) A Technical Introduction to Amazon Elastic MapReduce(BDT208) A Technical Introduction to Amazon Elastic MapReduce
(BDT208) A Technical Introduction to Amazon Elastic MapReduceAmazon Web Services
 
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017Amazon Web Services
 
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021Amazon Web Services Korea
 
Introduction to AWS Cloud Computing
Introduction to AWS Cloud ComputingIntroduction to AWS Cloud Computing
Introduction to AWS Cloud ComputingAmazon Web Services
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsAmazon Web Services
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon Web Services
 
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...Amazon Web Services
 
The AWS Big Data Platform – Overview
The AWS Big Data Platform – OverviewThe AWS Big Data Platform – Overview
The AWS Big Data Platform – OverviewAmazon Web Services
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaAmazon Web Services
 

Was ist angesagt? (20)

Introduction to AWS Glue
Introduction to AWS GlueIntroduction to AWS Glue
Introduction to AWS Glue
 
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
Big Data Analytics Architectural Patterns and Best Practices (ANT201-R1) - AW...
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis Analytics
 
Build a Serverless Web Application
Build a Serverless Web ApplicationBuild a Serverless Web Application
Build a Serverless Web Application
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
Amazon Kinesis
Amazon KinesisAmazon Kinesis
Amazon Kinesis
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
AWS Lake Formation Deep Dive
AWS Lake Formation Deep DiveAWS Lake Formation Deep Dive
AWS Lake Formation Deep Dive
 
(BDT208) A Technical Introduction to Amazon Elastic MapReduce
(BDT208) A Technical Introduction to Amazon Elastic MapReduce(BDT208) A Technical Introduction to Amazon Elastic MapReduce
(BDT208) A Technical Introduction to Amazon Elastic MapReduce
 
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017How to build a data lake with aws glue data catalog (ABD213-R)  re:Invent 2017
How to build a data lake with aws glue data catalog (ABD213-R) re:Invent 2017
 
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021
비즈니스 리더를 위한 디지털 트랜스포메이션 트렌드 - 김지현, 김영현 AWS 사업개발 매니저 :: AWS re:Invent re:Cap 2021
 
Introduction to AWS Cloud Computing
Introduction to AWS Cloud ComputingIntroduction to AWS Cloud Computing
Introduction to AWS Cloud Computing
 
Introduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis AnalyticsIntroduction to Amazon Kinesis Analytics
Introduction to Amazon Kinesis Analytics
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
 
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
Effective Data Lakes: Challenges and Design Patterns (ANT316) - AWS re:Invent...
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
The AWS Big Data Platform – Overview
The AWS Big Data Platform – OverviewThe AWS Big Data Platform – Overview
The AWS Big Data Platform – Overview
 
Cost Optimisation on AWS
Cost Optimisation on AWSCost Optimisation on AWS
Cost Optimisation on AWS
 
Real-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS LambdaReal-time Data Processing Using AWS Lambda
Real-time Data Processing Using AWS Lambda
 

Ähnlich wie Build a Real-time Streaming Data Visualization System with Amazon Kinesis Analytics

Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAdrian Hornsby
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAmazon Web Services
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsAmazon Web Services
 
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...Amazon Web Services
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisAmazon Web Services
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Web Services
 
Path to the future #4 - Ingestão, processamento e análise de dados em tempo real
Path to the future #4 - Ingestão, processamento e análise de dados em tempo realPath to the future #4 - Ingestão, processamento e análise de dados em tempo real
Path to the future #4 - Ingestão, processamento e análise de dados em tempo realAmazon Web Services LATAM
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon KinesisAmazon Web Services
 
React Fast by Processing Streaming Data in Real-Time
React Fast by Processing Streaming Data in Real-TimeReact Fast by Processing Streaming Data in Real-Time
React Fast by Processing Streaming Data in Real-TimeAmazon Web Services
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesAmazon Web Services
 
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Amazon Web Services
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time AnalyticsAmazon Web Services
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...Amazon Web Services
 
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Amazon Web Services
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesAmazon Web Services
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in MotionRuhani Arora
 
Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Amazon Web Services
 
Analyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAnalyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAmazon Web Services
 

Ähnlich wie Build a Real-time Streaming Data Visualization System with Amazon Kinesis Analytics (20)

Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis Analytics
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis Analytics
 
Serverless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis AnalyticsServerless Streaming Data Processing using Amazon Kinesis Analytics
Serverless Streaming Data Processing using Amazon Kinesis Analytics
 
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
AWS re:Invent 2016: Analyzing Streaming Data in Real-time with Amazon Kinesis...
 
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon KinesisSRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
SRV420 Analyzing Streaming Data in Real-time with Amazon Kinesis
 
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
Amazon Kinesis Platform – The Complete Overview - Pop-up Loft TLV 2017
 
Path to the future #4 - Ingestão, processamento e análise de dados em tempo real
Path to the future #4 - Ingestão, processamento e análise de dados em tempo realPath to the future #4 - Ingestão, processamento e análise de dados em tempo real
Path to the future #4 - Ingestão, processamento e análise de dados em tempo real
 
Getting Started with Amazon Kinesis
Getting Started with Amazon KinesisGetting Started with Amazon Kinesis
Getting Started with Amazon Kinesis
 
React Fast by Processing Streaming Data in Real-Time
React Fast by Processing Streaming Data in Real-TimeReact Fast by Processing Streaming Data in Real-Time
React Fast by Processing Streaming Data in Real-Time
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
 
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
Analyzing Streaming Data in Real-time - AWS Summit Cape Town 2018
 
Getting Started with Real-time Analytics
Getting Started with Real-time AnalyticsGetting Started with Real-time Analytics
Getting Started with Real-time Analytics
 
Serverless Real Time Analytics
Serverless Real Time AnalyticsServerless Real Time Analytics
Serverless Real Time Analytics
 
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
AWS April 2016 Webinar Series - Getting Started with Real-Time Data Analytics...
 
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
Getting Started with Amazon Kinesis | AWS Public Sector Summit 2016
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
 
Azure Stream Analytics : Analyse Data in Motion
Azure Stream Analytics  : Analyse Data in MotionAzure Stream Analytics  : Analyse Data in Motion
Azure Stream Analytics : Analyse Data in Motion
 
Implementing Real-Time IoT Stream Processing in Azure
Implementing Real-Time IoT Stream Processing in Azure Implementing Real-Time IoT Stream Processing in Azure
Implementing Real-Time IoT Stream Processing in Azure
 
Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300Analysing All Your Streaming Data - Level 300
Analysing All Your Streaming Data - Level 300
 
Analyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon KinesisAnalyzing Real-time Streaming Data with Amazon Kinesis
Analyzing Real-time Streaming Data with Amazon Kinesis
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...Sebastiano Panichella
 
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptxerickamwana1
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per MVidyaAdsule1
 
GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE
 
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityDon't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityApp Ethena
 
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08LloydHelferty
 
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...soumyapottola
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Sebastiano Panichella
 
Scootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City DeliveryScootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City Deliveryrishi338139
 
cse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitycse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitysandeepnani2260
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxRoquia Salam
 

Kürzlich hochgeladen (11)

Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
 
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per M
 
GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024
 
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityDon't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
 
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
Sunlight Spectacle 2024 Practical Action Launch Event 2024-04-08
 
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
Understanding Post Production changes (PPC) in Clinical Data Management (CDM)...
 
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
 
Scootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City DeliveryScootsy Overview Deck - Pan City Delivery
Scootsy Overview Deck - Pan City Delivery
 
cse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitycse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber security
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptx
 

Build a Real-time Streaming Data Visualization System with Amazon Kinesis Analytics

  • 1. 4/5/17 Build a Real-Time Streaming Data Visualization System with Amazon Kinesis Analytics Allan MacInnis Solutions Architect, AWS
  • 2. What to Expect from the Session • Streaming data overview • Amazon Kinesis platform review • Amazon Kinesis Analytics Overview • Amazon Kinesis Analytics patterns • Streaming data end-to-end example and walk- through • Amazon Kinesis Analytics Best Practices
  • 4. Most data is produced continuously Mobile Apps Web Clickstream Application Logs Metering Records IoT Sensors Smart Buildings [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/h tdocs/test
  • 5. The diminishing value of data Recent data is highly valuable • If you act on it in time • Perishable Insights (M. Gualtieri, Forrester) Old + Recent data is more valuable • If you have the means to combine them
  • 6. Processing real-time, streaming data • Durable • Continuous • Fast • Correct • Reactive • Reliable What are the key requirements? Ingest Transform Analyze React Persist
  • 8. Amazon Kinesis makes it easy to work with real-time streaming data Amazon Kinesis Streams • For Technical Developers • Collect and stream data for ordered, replayable, real-time processing Amazon Kinesis Firehose • For all developers, data scientists • Easily load massive volumes of streaming data into Amazon S3, Redshift, ElasticSearch Amazon Kinesis Analytics • For all developers, data scientists • Easily analyze data streams using standard SQL queries
  • 9. Amazon Kinesis Streams • Reliably ingest and durably store streaming data at low cost • Build custom real-time applications to process streaming data
  • 10. Amazon Kinesis Firehose • Reliably ingest and deliver batched, compressed, and encrypted data to S3, Redshift, and Elasticsearch • Point and click setup with zero administration and seamless elasticity
  • 11. Amazon Kinesis Analytics • Interact with streaming data in real-time using SQL • Build fully managed and elastic stream processing applications that process data for real-time visualizations and alarms
  • 13. Kinesis Analytics Pay for only what you use Automatic elasticity Standard SQL for analytics Real-time processing Easy to use
  • 14. Use SQL to build real-time applications Easily write SQL code to process streaming data Connect to streaming source Continuously deliver SQL results
  • 15. Connect to streaming source • Streaming data sources include Kinesis Firehose or Kinesis Streams • Input formats include JSON, .csv, variable column, unstructured text • Each input has a schema; schema is inferred, but you can edit • Reference data sources (S3) for data enrichment
  • 16. Write SQL code • Build streaming applications with one-to-many SQL statements • Robust SQL support and advanced analytic functions • Extensions to the SQL standard to work seamlessly with streaming data • Support for at-least-once processing semantics
  • 17. Continuously deliver SQL results • Send processed data to multiple destinations • S3, Amazon Redshift, Amazon ES (through Firehose) • Streams (with AWS Lambda integration for custom destinations) • End-to-end processing speed as low as sub- second • Separation of processing and data delivery
  • 18. What are common uses for Kinesis Analytics?
  • 19. Generate time series analytics • Compute key performance indicators over time periods • Combine with static or historical data in S3 or Amazon Redshift Analytics Streams Firehose Amazon Redshift S3 Streams Firehose Custom, real- time destinations
  • 20. Create real-time alarms and notifications • Build sequences of events from the stream, like user sessions in a clickstream or app behavior through logs • Identify events (or a series of events) of interest, and react to the data through alarms and notifications Analytics Streams Firehose Streams Amazon SNS Amazon CloudWatch Lambda
  • 21. Feed real-time dashboards • Validate and transform raw data, and then process to calculate meaningful statistics • Send processed data downstream for visualization in BI and visualization services Amazon QuickSight Analytics Amazon ES Amazon Redshift Amazon RDS Streams Firehose
  • 24. Example Scenario Requirements Data to capture every second: • Total distinct users • Number of users for each Operating System • Number of users in each quadrant Output Requirements • Update DynamoDB table every second, with each aggregate values
  • 26. Data Input Source JSON Data • Once per second, using JavaScript SDK: • Unique Cognito ID (anonymous user) • OS • Quadrant • Data sent to Kinesis Stream Amazon Kinesis Stream Amazon Cognito Amazon S3 JavaScript SDK { "recordTime": 1486505943.204, "cognitoId": "us-east-1:3626e211-d2a3-447b-8231-e1f4e0486f44", "os": "Android", "quadrant": "A" }
  • 27. How is raw data mapped to a schema? Amazon Kinesis stream Amazon KinesisAnalytics cognitoID os quadrant <guid1> Android A <guid2> iOS B Source Data for Kinesis Analytics { "recordTime": 1486505943.204, "cognitoId": "us-east-1:<guid>", "os": "Android", "quadrant": "A" }
  • 28. How is streaming data accessed with SQL? STREAM • Analogous to a TABLE • Represents continuous data flow CREATE OR REPLACE STREAM DISTINCT_USER_STREAM( COGNITO_ID VARCHAR(64), DEVICE VARCHAR(32), OS VARCHAR(32), QUADRANT char(1), DT TIMESTAMP);
  • 29. How is streaming data accessed with SQL? PUMP • Continuous INSERT query • Inserts data from one in-application stream to another CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS INSERT INTO "DISTINCT_USER_STREAM" SELECT STREAM DISTINCT "cognitoId", ...
  • 30. How do we model our data? DISTINCT_USERS_STREAM •COGNITO_ID •OS •QUADRANT •DT DESTINATION_SQL_STREAM •UNIQUE_USER_COUNT •ANDROID_COUNT •IOS_COUNT •OTHER_OS_COUNT •QUADRANT_A_COUNT •QUADRANT_B_COUNT •QUADRANT_C_COUNT •QUADRANT_C_COUNT SOURCE_STREAM •cognitoID •os •quadrant Kinesis stream Kinesis Stream Pump Kinesis Analytics Application
  • 31. How do we get distinct user records? Use PUMP to insert distinct records into in-app STREAM CREATE OR REPLACE PUMP "DISTINCT_USER_PUMP" AS INSERT INTO "DISTINCT_USER_STREAM" SELECT STREAM DISTINCT "cognitoId", "device", "os", "quadrant", FLOOR(s.ROWTIME TO SECOND) FROM "SOURCE_SQL_STREAM_001" s; DISTINCT_USERS_STREAM •COGNITO_ID •OS •QUADRANT •DT SOURCE_STREAM •cognitoID •os •quadrant
  • 32. How do we aggregate streaming data? • A common requirement in streaming analytics is to perform set-based operation(s) (count, average, max, min,..) over events that arrive within a specified period of time • Cannot simply aggregate over an entire table like typical static database • How do we define a subset in a potentially infinite stream? • Windowing functions!
  • 33. Windowing Concepts • Windows can be tumbling or sliding • Windows are fixed length Output record will have the timestamp of the end of the window 1 5 4 26 8 6 4 t1 t2 t5 t6t3 t4 Time Window1 Window2 Window3 Aggregate Function (Sum) 18 14 Output Events
  • 34. Comparing Types of Windows • Output created at the end of the window • The output of the window will be single event based on the aggregate function used Tumbling window Aggregate per time interval Sliding window Windows constantly re-evaluated
  • 35. How do we aggregate per second? • Tumbling window, group by time period CREATE OR REPLACE PUMP "OUTPUT_PUMP" AS INSERT INTO "DESTINATION_SQL_STREAM" SELECT STREAM COUNT(dus.COGNITO_ID) AS UNIQUE_USER_COUNT, COUNT((CASE WHEN dus.OS = 'Android' THEN COGNITO_ID ELSE null END)) AS ANDROID_COUNT, COUNT((CASE WHEN dus.OS = 'iOS' THEN COGNITO_ID ELSE null END)) AS IOS_COUNT, COUNT((CASE WHEN dus.OS = 'Windows Phone' THEN COGNITO_ID ELSE null END)) AS WINDOWS_PHONE_COUNT, COUNT((CASE WHEN dus.OS = 'other' THEN COGNITO_ID ELSE null END)) AS OTHER_OS_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'A' THEN COGNITO_ID ELSE null END)) AS QUADRANT_A_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'B' THEN COGNITO_ID ELSE null END)) AS QUADRANT_B_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'C' THEN COGNITO_ID ELSE null END)) AS QUADRANT_C_COUNT, COUNT((CASE WHEN dus.QUADRANT = 'D' THEN COGNITO_ID ELSE null END)) AS QUADRANT_D_COUNT, ROWTIME FROM "DISTINCT_USER_STREAM" dus GROUP BY FLOOR(dus.ROWTIME TO SECOND);
  • 36. Output to Kinesis Stream MENTION_COUNT_STREAM •UNIQUE_USER_COUNT •ANDROID_COUNT •… Amazon Kinesis Stream { "unique_user_count": 96, "android_count": 50, "ios_count": 46, "android_count": 50, "quadrant_a_count": 80, "quadrant_b_count ": 10, "quadrant_c_count ": 3, "quadrant_d_count ": 3 } 1 record, every second
  • 37. Processing a Kinesis Streams with AWS Lambda Shard 1 Shard 2 Shard 3 Shard 4 Shard n Kinesis Stream . . . . . . • Single instance of Lambda function per shard • Polls shard 4 times per second • Lambda function instances created and removed automatically as stream is scaled Gets Records 4x per sec
  • 38. Persist aggregated data in DynamoDB Amazon Kinesis Stream Lambda event source mapping Lambda Function Amazon DynamoDB event.Records.forEach((record) => { const payload = new Buffer(record.kinesis.data, 'base64').toString('ascii'); var docClient = new AWS.DynamoDB.DocumentClient(); var table = "user-quadrant-data"; var data = JSON.parse(payload); var params = { TableName: table, Item:{ "dataType": "quadrantRollup", "windowtime": (new Date(data.WINDOW_TIME)).getTime(), "userCount": data.UNIQUE_USER_COUNT, "quadrantA": data.QUADRANT_A_COUNT, "quadrantB": data.QUADRANT_B_COUNT, ... } }; docClient.put(params, function(err, data) { ...
  • 41. Managing Applications Set up Cloudwatch Alarms • MillisBehindLatest metric tracks how far behind the application is from the source • Alarm on MillisBehindLatest metric. Consider triggering when 1-hour behind, on a 1-minute average. Adjust accordingly for applications with lower end-to-end processing needs.
  • 42. Managing Applications Increase input parallelism to improve performance • By default, a single source in-application stream is created • If application is not keeping up with input stream, consider increasing input parallelism to create multiple source in-application streams
  • 43. Managing Applications Limit number of applications reading from same source • Avoid ReadProvisionedThroughputExceeded exceptions • For an Amazon Kinesis Streams source, limit to 2 total applications • For an Amazon Kinesis Firehose source, limit to 1 application
  • 44. Defining Input Schema • Review and adequately test inferred input schema • Manually update schema to handle nested JSON with greater than 2 levels of depth • Use SQL functions in your application for unstructured data
  • 45. Authoring Application Code • Avoid time-based windows greater than one hour • Keep window sizes small during development • Use smaller SQL queries, with multiple in- application streams, rather than a single, large query
  • 46. Limits • Maximum row size in an in-application stream is 50 KB • Maximum input parallelism is 10 in-application streams. • Each application supports one streaming source, and one reference data source. The reference data source can be no larger than 1 GB in size.
  • 47. Pricing • Pay only for what you use. • Charged an hourly rate, based on the average number of Kinesis Processing Units (KPU) used to run your application. • A single KPU provides one vCPU, and 4 GB of memory. • $0.11 per KPU-hour (US East).

Hinweis der Redaktion

  1. Narrative: The reality is that most data is produced continuously and is coming at us at lightning speeds due to an explosive growth of real-time data sources. TP: Machine data will make up 40% of our digital universe by 2020 Narrative: Whether it is log data coming from mobile and web applications, purchase data from ecommerce sites, or sensor data from IoT devices, it all delivers information that can help companies learn about what their customers, organization, and business are doing right now. TP: Customer Benefits Improve operational efficiencies, improve customer experiences, new business models Smart building: reduce energy costs, cut maintenance, increase safety and security Smart textiles: monitor skin temperature, monitor stress
  2. Narrative: So how much is this data worth? Well, it depends… Recent data is highly valuable If you act on it in time Perishable Insights (M. Gualtieri, Forrester) Old + Recent data is more valuable If you have the means to combine them Narrative: Processing real-time data as it arrives can let you make decisions much faster and get the most value from your data. But, building your own custom applications to process streaming data is complicated and resource intensive. You need to train or hire developers with the right skillsets, and then wait for months for the applications to be built and fine-tuned, and the operate and scale the application as the business grows. All of this takes lots of time and money, and, at the end of the day, lots of companies just never get there, settle for the status-quo, and live with information that is hours or days old.
  3. Narrative: You need a different set of analytical tools to collect and analyze real-time streaming data than what you have traditionally used for data at rest. With traditional analytics, you gather the information, store it in a database, and analyze it hours, days, or weeks later. Analyzing real-time data requires a different approach. Instead of running database queries on stored data, streaming analytics platforms have to process the data continuously and before the data lands in a database. And streaming data comes in at an incredible rate that can vary up and down all the time. Streaming analytics platforms have to be able to process this data when it arrives, often at speeds of millions and even tens of millions of events per hour. Key requirements of stream processing Durable: Durable ingest so that processing can be repeatable; Continuous - Need to always be processing the latest data Fast: Frequency (micro batches, size of batches, true streaming), and speed (sub-second, minute, hour) Correct: at most once, at least once, and exactly once processing; event time, ingest time, processing time. Reactive: Ability to process and respond in near real-time; feedback mechanisms to send processed data to live applications Reliable: Highly available, fast failovers
  4. Since Amazon Kinesis launch in 2013, the ecosystem evolved and we introduced Kinesis Firehose and Kinesis Analytics. Streams was launched in GA at re:Invent 2014, Firehose at re:Invent 2015, and Analytics was launched in August 2016 We have continuously iterated to make it easier for customers to use streaming data, as well as expand the functionality of real-time processing Together, these three products make up the Amazon Kinesis streaming data platform
  5. Easy administration: Simply create a new stream, and set the desired level of capacity with shards. Scale to match your data throughput rate and volume. Build real-time applications: Perform continual processing on streaming data using Kinesis Client Library (KCL), Apache Spark/Storm, AWS Lambda, and more. Low cost: Cost-efficient for workloads of any scale.
  6. Zero Admin: Capture and deliver streaming data into S3, Redshift, ElasticCache and other AWS destinations without writing an application or managing infrastructure Direct-to-data store integration: Batch, compress, and encrypt streaming data for delivery into S3, and other destinations in as little as 60 secs, set up in minutes Seamless elasticity: Seamlessly scales to match data throughput (feedback: add bullet to discuss why firehose created. Major use case)
  7. Apply SQL on streams: Easily connect to a Kinesis Stream or Firehose Delivery Stream and apply SQL skills. Build real-time applications: Perform continual processing on streaming big data with sub-second processing latencies Easy Scalability : Elastically scales to match data throughput for most workloads Easy and interactive experience: Complete most stream processing use cases in minutes, and easily progress toward sophisticated scenarios
  8. Please stay within brand by using the attached template. I’d recommend being visual – use imagery, font color, bold font, etc. in your slides. Be concise – limit your number of slides and content in them. It’s always good to have a few slides with backup information in case needed. Please also make sure there’s VP/Service Leader approval in place for all the content disclosed in the slides and in your call.
  9. Connect to streaming source Select Kinesis Streams or Firehoses as input data for your SQL code Easily write SQL code to process streaming data - Author applications with a single SQL statement, or build sophisticated applications with multi-step pipelines with advanced analytical functions Continuously deliver SQL results - Configure one to many outputs for your processed data for real-time alerts, visualizations, and batch analytics The are three major components of a streaming application; connecting to an input stream, SQL code, and an output destinations. Inputs include Kinesis Streams and Firehose. Standard SQL code consists of one or more SQL queries which make up an application. Most applications will have one to three SQL statements that perform ETL after initial schematization, and then several SQL queries to generate real-time analytics on the transformed data. This illustrates the point of the “streaming application” concept; you are chaining SQL statements together in serial and parallel through filters, joins, and merges to build a streaming graph within your application. We believe most customers will have between 5-10 SQL statements, but some customers will build sophisticated applications with 50-100 statements. Outputs includes Kinesis Streams and Firehose (S3, Redshift, and Elasticsearch(coming Chicago Summit)). The next outputs will be CloudWatch Metrics and Quicksight, but we have yet to confirm whether these will be place for GA.
  10. complex, nested JSON structures supported Inputs are continuously read, processed, and then written to output destination. This is not like a traditional database; data is continuously processed as opposed to running a query and then seeing a single result set. 1 level of nesting is supported completely; 2-levels of nesting is supported for a single structure. Deeper nesting is supported Deep nesting (2+ levels) is supported by specifying row paths of desired fields Multiple record types are supported through canonical schema and/or by manipulating structures in SQL using string functions
  11. Simple mathematical operators (AVG, STDEV, etc.) String manipulations (SUBSTRING, POSITION) Tumbling, Sliding, and Custom Windows Advanced analytics (random sampling, anomaly detection) Define (streaming) SQL queries against streams. Join streams to other streams and tables. Aggregate queries over a time range or a series of rows. Register specific SQL statements as (stream) views. STREAM is used to represent both input and in-application data that is similar to a continuously updated table Windows are simplified to make them more approachable to the average SQL ROWTIME is a special column for streams to allow user to specify processing windows; this can be used to specify windows by time of processing (recommended) or user event time At least once processing - data will be re-processed until it has been successfully delivered to a configured destination
  12. Delivery speed heavily dependent upon your SQL queries (i.e. streaming ETL will be very fast versus performing 10 minute aggregations) Output formats include JSON and CSV complex, nested JSON structures supported Inputs are continuously read, processed, and then written to output destination. This is not like a traditional database; data is continuously processed as opposed to running a query and then seeing a single result set. 1 level of nesting is supported completely; 2-levels of nesting is supported for a single structure. Deeper nesting is supported Deep nesting (2+ levels) is supported by specifying row paths of desired fields Multiple record types are supported through canonical schema and/or by manipulating structures in SQL using string functions
  13. This type of application is useful in many scenarios, such as: Continuously computing business metrics like the number of purchases, sending results to a data warehouse like Amazon Redshift for visualization with a BIT tool like Amazon Quicksight. OR calculating the number application errors and sending results to Amazon Elasticsearch for visualization with Kibana.   Amazon Kinesis Analytics provides the mechanisms to take raw, real-time data and output processed, meaningful information to a variety of destinations in real-time.
  14. Feedback: break into two, flow for introducing windowing needs revision
  15. Tumbling: Repeat Are non-overlapping An event can belong to only one tumbling window Sliding: Continuously moves forward in time Produces an output only during the occurrence of an event Every window will have at least one event Events can belong to more than one sliding window
  16. Feedback: break into two, flow for introducing windowing needs revision
  17. AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information. AWS Lambda starts running your code within milliseconds of an event such as an image upload, in-app activity, website click, or output from a connected device. You can also use AWS Lambda to create new back-end services where compute resources are automatically triggered based on custom requests. With AWS Lambda you pay only for the requests served and the compute time required to run your code. Billing is metered in increments of 100 milliseconds, making it cost-effective and easy to scale automatically from a few requests per day to thousands per second.
  18. Feedback: put best practices into context that it’s a fully-managed service