With the introduction of the Connect and Streams APIs in 2016, Apache Kafka is becoming the de facto solution for anyone looking to build a streaming platform. The community continues to add capabilities to make it a complete solution for streaming data.
Join us as we review the latest additions in Apache Kafka 0.10.2. We’ll also cover what’s new in Confluent Enterprise 3.2 that makes it possible to run Kafka at scale.
What's new in Confluent 3.2 and Apache Kafka 0.10.2
1. 1
What’s new in Confluent 3.2?
Clarke Patterson
Sr. Director, Product Marketing
2. 2
Attend the whole series!
Simplify Governance for Streaming Data in Apache Kafka
Date: Thursday, April 6, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Gwen Shapira, Product Manager, Confluent
Using Apache Kafka to Analyze Session Windows
Date: Thursday, March 30, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Michael Noll, Product Manager, Confluent
Monitoring and Alerting Apache Kafka with Confluent Control Center
Date: Thursday, March 16, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Nick Dearden, Director, Engineering and Product
Data Pipelines Made Simple with Apache Kafka
Date: Thursday, March 23, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Ewen Cheslack-Postava, Engineer, Confluent
https://www.confluent.io/online-talk/online-talk-series-five-steps-to-production-with-apache-kafka/
What’s New in Apache Kafka 0.10.2 and Confluent 3.2
Date: Thursday, March 9, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Clarke Patterson, Senior Director, Product Marketing
3. 3
Key themes for 3.2
Less Effort
Confluent Control Center brings visibility into the health of a cluster, so it’s easy to surface only the trouble spots that count. Confluent makes operating Kafka a snap.
Monitoring and Alerting in Confluent Control Center
More Apps
Confluent offers the most robust set of clients and connectors, making it easy to onboard more apps in a streaming platform.
.NET client
Bridge to Cloud
Build real-time streaming pipelines directly to Amazon with the new S3 connector.
S3 Connector
4. 4
Apache Kafka™ Connect API – Streaming Data Capture
[Diagram: sources and sinks (JDBC, Mongo, MySQL, Elastic, Cassandra, HDFS) attached through connectors to a Kafka pipeline via the Kafka Connect API]
Fault tolerant
Manage hundreds of data sources and sinks
Preserves data schema
Part of the Apache Kafka project
Integrated within Confluent Platform’s Control Center
5. 5
Single Message Transforms for Kafka Connect
Modify events before storing in Kafka:
• Mask sensitive information
• Add identifiers
• Tag events
• Store lineage
• Remove unnecessary columns
Modify events going out of Kafka:
• Route high priority events to faster data stores
• Direct events to different Elasticsearch indexes
• Cast data types to match destination
• Remove unnecessary columns
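The idea of modifying one event at a time can be sketched in plain Python. This is only a conceptual illustration, not the Kafka Connect API: the helper names (`mask_field`, `insert_field`, `drop_fields`, `apply_chain`) and the event fields are hypothetical, and events are assumed to be simple dictionaries.

```python
def mask_field(field, replacement="****"):
    """Build a transform that overwrites one field with a fixed mask."""
    def transform(record):
        out = dict(record)  # copy so the input event is left untouched
        if field in out:
            out[field] = replacement
        return out
    return transform


def insert_field(field, value):
    """Build a transform that adds (or overwrites) a field, e.g. a source tag."""
    def transform(record):
        out = dict(record)
        out[field] = value
        return out
    return transform


def drop_fields(*fields):
    """Build a transform that removes unnecessary columns."""
    def transform(record):
        return {k: v for k, v in record.items() if k not in fields}
    return transform


def apply_chain(record, transforms):
    """Apply each transform in order to a single event."""
    for transform in transforms:
        record = transform(record)
    return record


# Hypothetical event and transform chain, for illustration only.
chain = [
    mask_field("card_number"),
    insert_field("source", "orders_db"),
    drop_fields("internal_notes"),
]
event = {"order_id": 42, "card_number": "4111111111111111", "internal_notes": "call back"}
result = apply_chain(event, chain)
```

Each transform sees exactly one event and returns a new one, which is why this style suits masking, tagging, and column removal but not aggregations across events.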
6. 6
Single Message Transforms Use Cases
• Data masking: Mask sensitive information while sending it to Kafka.
  • E.g.: You capture data from a relational database to Kafka, but the data includes PCI / PII information and your Kafka cluster is not certified yet. SMTs allow you to mask that information before it reaches Kafka.
• Event routing: Modify an event's destination based on the contents of the event (applies to events that need to be written to different database tables).
  • E.g.: You write events from Kafka to Elasticsearch, but each event needs to go to a different index, based on information in the event itself.
• Event enhancement: Add fields to events while replicating.
  • E.g.: You capture events from multiple data sources to Kafka and want to include information about the source of the data in the event.
• Partitioning: Set the key for the event based on event information before it gets written to Kafka.
  • E.g.: When reading records from a database table, partition the records in Kafka based on customer ID.
• Timestamp conversion: Standardize time-based data when integrating different systems.
  • E.g.: There are many different ways to represent time. Kafka events are often read from logs, which use something like "[2017-01-31 05:21:00,298]", but the key-value store the events are written to prefers dates as "milliseconds since 1970".
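Two of these use cases, timestamp conversion and event routing, are small enough to work through in plain Python. This is a rough sketch, not Kafka Connect code: the function names, the `type` field, and the `events-` index prefix are hypothetical, and the log timestamp is assumed to be in UTC, which the slide does not state.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def log_timestamp_to_epoch_millis(text):
    """Convert a log-style timestamp such as "[2017-01-31 05:21:00,298]"
    to milliseconds since 1970. Assumes the timestamp is in UTC."""
    parsed = datetime.strptime(text.strip("[]"), "%Y-%m-%d %H:%M:%S,%f")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def route_to_index(event, field="type", prefix="events-"):
    """Pick a destination index name from a field inside the event itself."""
    return prefix + str(event.get(field, "unknown")).lower()


millis = log_timestamp_to_epoch_millis("[2017-01-31 05:21:00,298]")
index = route_to_index({"type": "Error", "message": "disk full"})
```

If the logs were written in a local time zone instead, the conversion would need that zone's offset; the UTC assumption here is only for the sake of a self-contained example.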
7. 7
Architecture of Kafka Streams API, a Part of Apache Kafka
[Diagram: Kafka Streams API, producer, and consumers connected to topics in a Kafka cluster]
Key benefits
• No additional cluster
• Easy to run as a service
• Supports large aggregations and joins
• Security and permissions fully integrated from Kafka
Example Use Cases
• Microservices
• Continuous queries
• Continuous transformations
• Event-triggered processes
8. 8
Windowing: how do we find patterns in the noise?
[Figure: events for Alice, Bob, and Dave plotted along an event-time axis]
9. 9
Tumbling windows answer a different type of question
[Figure: events for Alice, Bob, and Dave along an event-time axis, divided into 5-minute windows]
E.g.: “How many downloads did we have per user in the last 5 minutes?”
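The per-user count over fixed 5-minute windows can be sketched in plain Python. This is a conceptual illustration rather than the Kafka Streams API: events are assumed to be `(user, timestamp_ms)` pairs, and the function name and sample data are made up for the example.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling window


def tumbling_window_counts(events, window_ms=WINDOW_MS):
    """Count events per (user, window_start).

    Tumbling windows are fixed-size and non-overlapping, so every event
    falls into exactly one window, found by rounding its timestamp down
    to a multiple of the window size.
    """
    counts = Counter()
    for user, timestamp_ms in events:
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[(user, window_start)] += 1
    return dict(counts)


# Hypothetical download events: (user, event time in milliseconds).
events = [
    ("alice", 0),
    ("alice", 120_000),
    ("bob", 299_999),
    ("alice", 300_000),  # exactly on the boundary: belongs to the next window
    ("dave", 610_000),
]
counts = tumbling_window_counts(events)
```

Because the window is derived from the event's own timestamp (event-time), the result does not depend on the order in which the events arrive.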
10. 10
Session windows allow us to group events based on periods of inactivity
[Figure: events for Alice, Bob, and Dave plotted along an event-time axis]
11. 11
Session windows allow us to group events based on periods of inactivity
[Figure: events for Alice, Bob, and Dave along an event-time axis, grouped into sessions separated by inactivity periods]
E.g.: “How many shows does Alice watch on average per session?”
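Grouping by inactivity can be sketched in plain Python as well. This is not the Kafka Streams API: the function names and sample data are hypothetical, events are assumed to be `(user, timestamp_ms)` pairs, and the sketch assumes that an event arriving exactly at the inactivity gap still belongs to the current session.

```python
from collections import defaultdict


def sessionize(events, gap_ms):
    """Group (user, timestamp_ms) events into per-user sessions.

    A new session starts whenever the time since the user's previous
    event exceeds gap_ms (the inactivity period).
    """
    by_user = defaultdict(list)
    for user, timestamp_ms in events:
        by_user[user].append(timestamp_ms)

    sessions = {}
    for user, stamps in by_user.items():
        user_sessions = []
        for ts in sorted(stamps):
            if user_sessions and ts - user_sessions[-1][-1] <= gap_ms:
                user_sessions[-1].append(ts)  # still within the current session
            else:
                user_sessions.append([ts])  # inactivity gap exceeded: new session
        sessions[user] = user_sessions
    return sessions


def average_events_per_session(user_sessions):
    """Average number of events (e.g. shows watched) per session."""
    return sum(len(s) for s in user_sessions) / len(user_sessions)


MINUTE = 60_000
# Hypothetical viewing events: (user, event time in milliseconds).
events = [
    ("alice", 0),
    ("alice", 2 * MINUTE),
    ("bob", 3 * MINUTE),
    ("alice", 4 * MINUTE),
    ("alice", 30 * MINUTE),
    ("alice", 31 * MINUTE),
]
sessions = sessionize(events, gap_ms=5 * MINUTE)
alice_average = average_events_per_session(sessions["alice"])
```

Unlike tumbling windows, session boundaries here depend on the data itself: Alice's sessions have different lengths because they are defined by when she stops being active.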
18. 18
Confluent 3.2 – C# Client
High performance
Full support of Kafka protocol and features
Supported fully-featured native C# client
Integrates with Confluent’s Schema Registry
Works with any version of Apache Kafka
High reliability – honors Kafka ack settings and retries
19. 19
Confluent 3.2 – JMS Client
Supported Kafka client, implementing the JMS interface
Secure clients with authentication, authorization and encryption
Integrates with Confluent’s Schema Registry
High reliability – supports Kafka and JMS acknowledgments
Support for all JMS message types, headers and properties
20. 20
Confluent 3.2 – Client Security
End-to-end encryption for REST Proxy
Active Directory integration for C# client
21. 21
Kafka Connect API Library of Connectors
* Denotes connectors developed and distributed by Confluent. Extensive validation and testing have been performed.
[Connector logos grouped under: Databases, Datastore/File Store, Analytics, Applications / Other]
22. 22
CP 3.2 – New Certified & Supported Connectors
S3 Connector
• Write Avro and JSON files
• Date and time based partitions
• Exactly-once delivery
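Date and time based partitioning can be illustrated by deriving an object key from a record's timestamp. The key layout below is a hypothetical one chosen for the example, not necessarily what the S3 connector writes, and the function and parameter names are assumptions.

```python
from datetime import datetime, timezone


def time_partitioned_key(topic, partition, start_offset, timestamp_ms, extension="avro"):
    """Build an object key whose path encodes the record's date and hour (UTC).

    Hypothetical layout:
    topics/<topic>/year=YYYY/month=MM/day=DD/hour=HH/<topic>+<partition>+<offset>.<ext>
    """
    dt = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    time_path = dt.strftime("year=%Y/month=%m/day=%d/hour=%H")
    file_name = f"{topic}+{partition}+{start_offset:010d}.{extension}"
    return f"topics/{topic}/{time_path}/{file_name}"


key = time_partitioned_key("orders", 0, 42, 1485840060298)
```

One design point worth noting: the key is a pure function of topic, partition, offset, and timestamp. Under that assumption, a retried upload of the same records lands on the same key and overwrites rather than duplicates, which is one common way to reason about exactly-once delivery to an object store.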
23. 23
Confluent 3.2 – Cluster Health & Administration
Cluster health dashboard
• Monitor the health of your Kafka clusters and get alerts if any problems occur
• Measure system load, performance, and operations
• View aggregate statistics or drill down by broker or topic
Cluster administration
• Monitor topic configurations
24. 24
What’s new in Confluent 3.2?
Feature | Benefit | Apache Kafka | Confluent Open Source | Confluent Enterprise
Single message transformations | Modify single events before storing in Kafka or as they leave Kafka
Session windows | Group events in a stream based on session windows
C# client | Simple library that enables streaming application development within the Kafka framework
Client security | Active Directory integration for C# and end-to-end encryption for REST Proxy
S3 connector | Easily write Avro and JSON files to Amazon S3
JMS client | Supported Kafka client implementing the JMS interface
Cluster health monitoring | Monitor the health of Kafka clusters and get alerts when problems occur
Cluster administration | Simplify the process of administering a Kafka cluster
25. 25
Confluent Completes Kafka
Feature | Benefit | Apache Kafka | Confluent Open Source | Confluent Enterprise
Apache Kafka | High throughput, low latency, high availability, secure distributed streaming platform
Kafka Connect API | Advanced API for connecting external sources/destinations into Kafka
Kafka Streams API | Simple library that enables streaming application development within the Kafka framework
Additional Clients | Supports non-Java clients: C, C++, Python, .NET and several others
REST Proxy | Provides universal access to Kafka from any network connected device via HTTP
Schema Registry | Central registry for the format of Kafka data – guarantees all data is always consumable
Pre-Built Connectors | HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified and supported by Confluent
Confluent Control Center | Enables easy connector management, monitoring and alerting for a Kafka cluster
Auto Data Balancer | Rebalancing data across cluster to remove bottlenecks
Replicator | Multi-datacenter replication simplifies and automates MDC Kafka clusters
Support | Enterprise class support to keep your Kafka environment running at top performance | Community | Community | 24x7x365
27. 27
Why Confluent? More than just enterprise software
Complete support across the entire adoption lifecycle
Confluent Platform
The only enterprise open source streaming platform based entirely on Apache Kafka
Professional Services
Best-practice consultation for future Kafka deployments, and optimization of performance and scalability for existing ones
Enterprise Support
24x7 support for the entire Apache Kafka project, not just a portion of it
Kafka Training
Comprehensive hands-on courses for developers and operators from the Apache Kafka experts
28. 28
Get Started with Apache Kafka Today!
https://www.confluent.io/downloads/
THE place to start with Apache Kafka!
Thoroughly tested and quality assured
More extensible developer experience
Easy upgrade path to Confluent Enterprise
29. 29
Discount code: kafcom17
Use the Apache Kafka community discount code to get $50 off
www.kafka-summit.org
Kafka Summit New York: May 8
Kafka Summit San Francisco: August 28