Node.CQ - Creating Real-time Data Mashups with Node.JS and Adobe CQ
1. NODE.CQ
CREATING REAL-TIME DATA MASHUPS WITH
NODE.JS AND ADOBE CQ
Joshua Miller
NASCAR Digital Media
jsmiller@nascar.com
@jo5h | www.jo5h.com
2. PROBLEM SCENARIO
We want to mix authored content from Adobe CQ with Real-
Time Race Data from our Timing and Scoring system.
Combining Slowly Changing Dimensions such as Driver
Team Name, Vehicle Manufacturer Name, Track Information,
etc. with Constantly Changing Metrics such as Last Lap
Speed, Driver Position, Lap Number, etc.
Adobe CQ is great at managing the authored content, but is
less adept at handling the real-time data. The time it takes to
ingest the data and replicate it is too long – the data will have
already changed.
7. WORKING WITH NODE.JS
SHOULD BE LIKE WORKING
WITH BUILDING BLOCKS
Node.JS has a broad and diverse developer community. If
you want to build something with Node, chances are
someone else has already done the same thing.
Before you start building from scratch, look at the packages
that already exist on NPM (http://npmjs.org)
Using NPM (Node Package Manager), you can install
packages that perform the tasks you need to accomplish.
8. ELEMENTS OF A NODE.JS
APPLICATION
Web Server / Framework
• Express
• Flatiron
Logging Service
• Morgan
• Winston
Configuration
• Nconf
• config
Promise Library
• Q
• promise
Built-In Services
• HTTP / HTTPS
• FileSystem
• Crypto
• Events
• Stream
• Etc.
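As a rough illustration of the Built-In Services listed above, the sketch below serves JSON using only the core HTTP module. The /health route, the response bodies and the names handleRequest and createApp are assumptions made for this example, not part of the original application.

```javascript
'use strict';
const http = require('http');

// Request handler: answers GET /health with a small JSON document and
// everything else with a JSON 404. The route is illustrative only.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ status: 'ok' }));
    return;
  }
  res.writeHead(404, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ error: 'not found' }));
}

// Create (but do not start) the server so the caller chooses the port.
function createApp() {
  return http.createServer(handleRequest);
}
```

Calling createApp().listen(3000) would start the server; the port number is arbitrary. A framework such as Express wraps this same pattern with routing and middleware.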
9. NODE.JS GOTCHAS
Some things about Node.JS are a bit different from working with
other technologies.
• NODE.JS IS ASYNCHRONOUS
Getting familiar with JavaScript Promises and Deferred
Libraries, or understanding and developing very clear callback
chains, is a must for working with Node.JS effectively
• NODE.JS IS A PACKAGE-DRIVEN TECHNOLOGY
Getting comfortable working with a Package Manager (NPM)
is a must for working with Node.JS effectively
• YOUR APPLICATION IS YOUR SERVER
There is no Apache or nginx or IIS to work with. You build
your server, or use a framework like Express or Flatiron
• NODE.JS IS AS FAULT-TOLERANT AS YOU MAKE IT
Building solid functionality with lots of error handling and
good logging is important
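To make the asynchronous and fault-tolerance points concrete, here is a minimal sketch using native Promises (a library such as Q follows the same pattern). The loadLapData function and its sample values are hypothetical stand-ins for a real data source; withFallback is a name invented for this example.

```javascript
'use strict';

// Hypothetical stand-in for any asynchronous data source (HTTP call,
// file read, etc.). The values are sample data for illustration only.
function loadLapData(shouldFail) {
  return new Promise((resolve, reject) => {
    setImmediate(() => {
      if (shouldFail) {
        reject(new Error('timing feed unavailable'));
      } else {
        resolve({ lapNumber: 42, lastLapSpeed: 187.3 });
      }
    });
  });
}

// Wrap an asynchronous call so that a failure is logged and replaced
// with a fallback value instead of surfacing as an unhandled rejection.
function withFallback(promise, fallback, log = console.error) {
  return promise.catch((err) => {
    log(`request failed: ${err.message}`);
    return fallback;
  });
}
```

A caller might write withFallback(loadLapData(false), null).then(render), where render is whatever consumes the result. Whether a fallback is appropriate, or the error should propagate, depends on the feed.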
10. WTF DID YOU JUST BUILD?
Node.JS is Package-Driven and NPM provides you with a
wealth of resources for working with Node, but be careful
what packages you choose. If you see a package that has
25,000 downloads and a vibrant development
history on GitHub then you’re probably safe.
If you’re the only one that has downloaded this
package this calendar year and the last commit
was made in 2010, you might want to keep
looking for a more popular package.
Just because you have bricks in your bin,
you don’t have to use them all together.
12. USING ADOBE CQ’S REST API
WITH NODE.JS
Adobe CQ is built on top of Apache Sling – a Web Framework
that provides a REST API to CRX - the Java Content
Repository that sits beneath Adobe CQ
You can directly query CRX using simple REST commands
and have the output formatted as JSON
JSON data can be directly consumed by the Node.JS
application independent of your website’s front-end
13. MAKING RESTFUL REQUESTS
TO ADOBE CQ CONTENT
It’s simple enough to extract content using the RESTful API
in Adobe CQ. Take for example Race Data stored at the path:
/content/nascar/lookups/events/sprint-cup-series/2014/
You can easily view this data using the following URL:
http://10.196.135.9:4503/content/nascar/lookups/events/sprint
-cup-series/2014.infinity.json
Note the “infinity” selector in the URL: this can be replaced
with a number indicating the node-depth to which you
wish to return data
http://10.196.135.9:4503/content/nascar/lookups/events/sprint
-cup-series/2014.2.json
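The requests above could be issued from Node.JS roughly as follows. This is a sketch using only the core HTTP module; the helper names buildJsonUrl and getJson are assumptions for illustration, and any non-200 response is simply treated as an error.

```javascript
'use strict';
const http = require('http');

// Build a JSON URL of the form <base><path>.<depth>.json
// `depth` is either the string 'infinity' or a non-negative integer.
function buildJsonUrl(baseUrl, contentPath, depth = 'infinity') {
  if (depth !== 'infinity' && !(Number.isInteger(depth) && depth >= 0)) {
    throw new TypeError('depth must be "infinity" or a non-negative integer');
  }
  const trimmed = contentPath.replace(/\/+$/, ''); // drop trailing slashes
  return `${baseUrl}${trimmed}.${depth}.json`;
}

// Fetch a URL and parse the body as JSON. Rejects on network errors,
// non-200 status codes and malformed JSON.
function getJson(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`unexpected status ${res.statusCode} for ${url}`));
        return;
      }
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(JSON.parse(body));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}
```

For example, buildJsonUrl('http://10.196.135.9:4503', '/content/nascar/lookups/events/sprint-cup-series/2014/', 2) produces the second URL shown above, which can then be passed to getJson.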
14. USING THE NODE-DEPTH
SELECTOR WITH ADOBE CQ
USING THE INFINITY
NODE-DEPTH SELECTOR
Returns either all child nodes at the given path, or an array
of the available numeric node-depth selectors if the structure
is deemed too large.
USING A NUMERIC
NODE-DEPTH SELECTOR
Returns data from the root path, and all child nodes at the
node-depth indicated by the selector.
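Because the infinity selector can return two different shapes, calling code has to tell them apart. The sketch below assumes the alternatives arrive as a JSON array of strings ending in ".<n>.json"; that format, and the function name interpretInfinityResponse, are assumptions for illustration and should be checked against a real response.

```javascript
'use strict';

// Given the parsed body of an ".infinity.json" response, report whether it
// is the content tree itself or a list of alternative depth selectors.
// ASSUMPTION: alternatives are strings that end in ".<n>.json".
function interpretInfinityResponse(body) {
  if (!Array.isArray(body)) {
    return { kind: 'content', content: body };
  }
  const depths = body
    .map((entry) => /\.(\d+)\.json$/.exec(String(entry)))
    .filter((match) => match !== null)
    .map((match) => Number(match[1]));
  return {
    kind: 'alternatives',
    depths,
    // Deepest depth on offer, or null if nothing could be parsed.
    maxDepth: depths.length > 0 ? Math.max(...depths) : null,
  };
}
```

When the result is of kind 'alternatives', the caller could retry the request with maxDepth as a numeric selector.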
16. NOW THAT WE HAVE THIS DATA,
HOW DO WE
USE THE DATA
WITH NODE.JS?
17. HOW DO WE USE THIS
DATA?
By itself, the data that comes from CQ is only as useful as
the underlying data structure. The power of this data comes
from our ability to use Node.JS to quickly extract the data and
then mash it up with other data sources.
Using Node.JS, not only can we query data from CRX, we can
query data from a number of sources and combine our CRX
data with other feeds to create new data sources.
This enables us to mix authored content from CRX with Real-
Time data from our Timing and Scoring feed to create a new,
single feed that can be used in our Mobile product.
18. HOW IS THE DATA JOINED
INTO A NEW DATA SOURCE?
Creating the feed mashup is not out-of-the-box functionality
for Node.JS – we have to custom-code a method by which to
join feeds together
Node.JS enables us to build an application using the building
blocks we discussed earlier, but also allows us to create new,
custom blocks with which to build
Without too much effort, we have created a package that
allows feeds to be joined together using the same Primary
and Foreign Key relationships you would find in a typical
RDBMS product.
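The package itself is not shown here, but the idea of a key-based join can be sketched in a few lines. This is not the actual implementation: joinFeeds, the field names and the sample rows are all hypothetical, and the behaviour shown is a simple left join.

```javascript
'use strict';

// Left-join `foreignRows` onto `primaryRows`: every primary row is kept,
// and fields from the matching foreign row (if any) are merged in.
// Fields on the primary row win when both rows share a field name.
function joinFeeds(primaryRows, foreignRows, primaryKey, foreignKey) {
  // Index the foreign feed by its key so each lookup is constant time.
  // If a key occurs more than once, the last row with that key is used.
  const index = new Map();
  for (const row of foreignRows) {
    index.set(row[foreignKey], row);
  }
  return primaryRows.map((row) => {
    const match = index.get(row[primaryKey]);
    return match ? { ...match, ...row } : { ...row };
  });
}

// Hypothetical sample rows: authored content and live metrics.
const authored = [
  { carNumber: 1, teamName: 'Example Racing' },
  { carNumber: 2, teamName: 'Sample Motorsports' },
];
const live = [{ car: 1, position: 3, lastLapSpeed: 180.5 }];

const joined = joinFeeds(authored, live, 'carNumber', 'car');
```

Here the first authored row gains position and lastLapSpeed from the live row, while the second row is passed through unchanged because no live row matches it.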
19. HOW IS THE DATA JOINED
INTO ONE FEED?
• Using simple JSON syntax, we can define a new feed that
is composed of one or more feeds.
• Each feed has a “join” condition that allows the feed to
be joined to the collection based on a specific JSON node
value.
• Special syntax allows for variable replacement from URL
parameters
• Special syntax allows for values from the new feed to be
used throughout the feed
• Includes custom functions such as Date and String
Formatting
• Includes dependency conditions where field values are
calculated and/or displayed based on the value of other
fields
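The actual feed syntax is not reproduced in this text, so the following is only a guess at what such a definition and its URL-parameter replacement might look like. Every field name, both example.com URLs and the {{name}} placeholder syntax are invented for illustration.

```javascript
'use strict';

// HYPOTHETICAL feed definition: the structure, field names, URLs and
// the {{...}} placeholder syntax are all invented for this sketch.
const feedDefinition = {
  name: 'race-mashup',
  feeds: [
    { url: 'http://cq.example.com/content/events/{{year}}.infinity.json', key: 'raceId' },
    { url: 'http://live.example.com/races/{{raceId}}/live.json', join: { on: 'raceId' } },
  ],
};

// Replace each {{name}} placeholder with the URL-encoded value of the
// matching parameter. A placeholder with no matching parameter throws.
function fillPlaceholders(template, params) {
  return template.replace(/\{\{(\w+)\}\}/g, (_, name) => {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error(`missing URL parameter: ${name}`);
    }
    return encodeURIComponent(String(params[name]));
  });
}

// Resolve every feed URL in a definition against the request parameters.
function resolveFeedUrls(definition, params) {
  return definition.feeds.map((feed) => fillPlaceholders(feed.url, params));
}
```

A request carrying year=2014 and raceId=4301 would resolve both URLs before any feed is fetched. Encoding the values keeps a stray character in a parameter from altering the URL structure.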
21. GETTING LIVE DATA FROM
THE RACETRACK
During a race, NASCAR vehicles are monitored via
transponders placed in the cars. As the cars cross over fiber
optic sensors in the track, the data is transmitted to a piece
of software called TimeGear.
TimeGear tracks the speed of each car and its position relative
to the other race cars, and feeds this data into the Timing and
Scoring system.
Timing and Scoring provides a feed that is consumed by
Apex, our Mobile Cacher application, which streams the
JSON feed out to Akamai where the data is consumed by
internal applications and third-parties such as Yahoo!, Fox
Sports and ESPN.
22. INTEGRATING OUR REAL-
TIME DATA FEED
Using the same syntax and the same data providers, we can
query our Real-Time race data directly from Timing and Scoring,
or directly from Akamai to reduce the load on the T&S systems.
Without modifying any code, provided a relationship can be
found in the data, we can now merge any JSON data source into
our feed.
This allows us to merge our Real-Time race statistics right into
our authored CQ content, providing a richer and more in-depth
feed for our Mobile application without the delay of first
ingesting the race data into Adobe CQ.
Now that our data is available in a new format, we can provide a
single stream of data to the NASCAR Mobile application,
reducing the number of calls that need to be made from a
mobile device.
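One way to picture the single-stream idea is to load every source in parallel and merge the results before responding. The sketch below uses stub loaders in place of real CQ and Timing and Scoring requests; buildMashup, combineByCar and the sample rows are hypothetical.

```javascript
'use strict';

// Load every source in parallel and hand the results to `combine`.
// `loaders` maps a label to a function returning a promise of parsed JSON.
async function buildMashup(loaders, combine) {
  const names = Object.keys(loaders);
  const results = await Promise.all(names.map((name) => loaders[name]()));
  const byName = {};
  names.forEach((name, i) => { byName[name] = results[i]; });
  return combine(byName);
}

// Hypothetical stub loaders standing in for the authored-content and
// real-time requests. The rows are sample data for illustration only.
const loaders = {
  authored: async () => [{ carNumber: 1, teamName: 'Example Racing' }],
  live: async () => [{ carNumber: 1, position: 3, lapNumber: 57 }],
};

// Merge live metrics into the authored rows that share a car number.
function combineByCar({ authored, live }) {
  const liveByCar = new Map(live.map((row) => [row.carNumber, row]));
  return authored.map((row) => ({ ...row, ...(liveByCar.get(row.carNumber) || {}) }));
}
```

Calling buildMashup(loaders, combineByCar) resolves to one array that a mobile client can fetch in a single request. Promise.all rejects as soon as any loader fails, so a production version would need to decide how to degrade when one source is down.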
23. EXTENDING OUR DATASET
WITH THIRD-PARTY SERVICES
Given the flexibility of this data aggregator, we can now start
to lay new and powerful data layers from disparate sources on
top of our existing data without having to store that data in
CQ.
For example, we can pull Real-Time Weather Conditions into
our data based on the zip code of the track. We could pull
track records to note if a driver’s lap speed was the fastest in
the track’s history. We could even pull in Sponsor
information based on the current Race Leader.
We accomplish all of this without the need to add to the
storage requirements of our application, or write custom
aggregators for external content.
25. COULDN’T WE HAVE DONE
THIS USING CQ?
Of course, we could have accomplished the same end-result
using only Adobe CQ and some custom Java code. There are
some real benefits to using Node.JS in this scenario though:
• There is no code to compile and new feeds only require
JSON configuration
• Node.JS is an extremely high-throughput platform. We can
serve hundreds of simultaneous connections per second.
• We reduce the load on our CQ environment by offloading
tasks to an application with fewer hardware requirements
• We don’t use a large, complex web framework to deliver
small streams of data with no user interface requirements
26. IS NODE.JS REALLY THAT
MUCH MORE PERFORMANT?
We have used Node.JS for a number of new tasks here at
NASCAR Digital Media lately and have found it to be
incredibly performant. We recently launched a new RaaS
implementation with Gigya and use Node.JS to authenticate
users.
During our load tests, we found that 10 minutes of sustained
load was enough to serve all of the traffic that we expected
the Node service to experience within the entire race season.
In fact, we have found that our load tests typically max out
not because of Node’s inability to serve more requests, but
because MySQL starts to queue requests, or Gigya begins to
throttle requests-per-second.