Humana, like many companies, is tackling the challenge of creating real-time insights from data that is diverse and rapidly changing. This is the story of how we used MongoDB to combine traditional batch approaches with streaming technologies to provide continuous alerting capabilities from real-time data streams.
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-Time Alerting to Improve Business Outcomes
1. From Pharmacist to Analyst: Leveraging MongoDB for Real-Time Alerting to Improve Business Outcomes
Presented by: David Munguia
Humana: Healthcare Services / Platforms
3. HUMANA HCS PLATFORMS
WHO ARE WE?
• High-Performance Rule-Based Analytics
• HEDIS Reporting & Compliance
• Improved Member Outcomes
• Batch and Real-time Deployments
• SpecialtyRx
• Official Kickoff November 2017
• Stream Based Platform
• MongoDB Journey Began Here!
4. HUMANA HCS PLATFORMS: SpecialtyRx
[Architecture diagram: NRT Pipeline, Analytic Services, Batch Driver, Alert Sink, ETL Services, Message Bus]
• Provide Alerting for Specialty Medications (e.g. Humira, Enbrel, etc.) at Point-of-Service
• Supports Daily Batch Runs
• Concurrent ETL Jobs
• Modern Streaming Platform
• MongoDB & Apache Spark Central Technologies
• Language Agnostic (C/C++, Scala, Python)
• Clinical Quality Language Engine (CQL)
7. HUMANA HCS PLATFORMS
“No Battle Plan Survives
Contact with the Enemy.”
Helmuth von Moltke
“Everyone has a plan ‘till you
get punched in the mouth.”
Mike Tyson
8. HUMANA HCS PLATFORMS
LESSON 1 – TRY IT
• I Hated Hearing it!!!
• No Two Use-Cases are the Same
• MongoDB Made it Doable
• Our Mindset
9. HUMANA HCS PLATFORMS
LESSON 2 – USE AVAILABLE RESOURCES
• Befriend your MongoDB Representative
• Sigfrido Narvaez
• Giovanni DeVita
• Engage MongoDB Professional Services
• Suggest 6 Day Engagement over 2 Visits
• When you have Security Concerns
• Consultants have Undocumented Tools
• MongoDB University
• Stack Overflow
• GitHub Repositories!
10. HUMANA HCS PLATFORMS
LESSON 3 – MongoDB ≠ Oracle
• SQL-Let it Go!
• JSON is a Blessing
• $lookup is Like C/C++ goto
• Reserved for Special Cases
• We’re not that Special
• MongoDB has Fewer Features
• But That is a Good Thing
• Simplifies ‘Try-ability’
• Plenty of Usable Build Patterns
• Bucket, Subset, Tree & Graph
11. HUMANA HCS PLATFORMS
LESSON 3 – MongoDB ≠ Oracle
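-- Oracle: the table is declared up front
CREATE TABLE member (
  member_id NUMBER GENERATED BY DEFAULT AS IDENTITY,
  first_name VARCHAR2(50) NOT NULL,
  last_name VARCHAR2(50) NOT NULL,
  PRIMARY KEY (member_id)
);
// MongoDB: no column definitions needed
db.createCollection('member'); // assuming member_id = _id
Or
-- Oracle: join two tables to assemble one member
SELECT *
FROM member a,
     member_address b
WHERE a.member_id = b.member_id
AND a.member_id = 'abcd';
// MongoDB: one lookup by _id (assuming the address is embedded in the member document)
db.member.find( { _id: 'abcd' } );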
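To make the contrast concrete, here is a minimal Python sketch of what the single-document lookup relies on: the address rows that a relational design keeps in a second table are folded into each member document at load time. The function name `embed_addresses`, the `addresses` field, and the sample rows are illustrative assumptions, not taken from the original deck.

```python
from collections import defaultdict
from typing import Dict, List


def embed_addresses(members: List[dict], addresses: List[dict]) -> Dict[str, dict]:
    """Fold member_address rows into member documents keyed by _id."""
    by_member = defaultdict(list)
    for row in addresses:
        # Drop the foreign key; the parent document already carries it as _id.
        by_member[row["member_id"]].append(
            {k: v for k, v in row.items() if k != "member_id"}
        )

    docs = {}
    for m in members:
        docs[m["member_id"]] = {
            "_id": m["member_id"],  # same assumption as the slide: member_id = _id
            "first_name": m["first_name"],
            "last_name": m["last_name"],
            "addresses": by_member.get(m["member_id"], []),
        }
    return docs


# Made-up sample rows, for illustration only.
members = [{"member_id": "abcd", "first_name": "Ada", "last_name": "Lopez"}]
addresses = [{"member_id": "abcd", "city": "Louisville", "state": "KY"}]
docs = embed_addresses(members, addresses)
print(docs["abcd"])
```

With documents shaped this way, the join in the SQL version is replaced by a single key lookup.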
12. HUMANA HCS PLATFORMS
LESSON 4 – SCHEMAS & INDEXING
• Have a Complete Understanding of your Access Patterns
• Utilize Covered Queries
• Index Order Matters
• Your First Version will Suck
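The two indexing bullets can be illustrated with a deliberately simplified Python model. It is a sketch of the rules of thumb only: it ignores multikey indexes, range predicates, sort order, and everything else the real query planner considers. The helper names are hypothetical; the field names are borrowed from the C++ filter shown later in the deck.

```python
from typing import Dict, Sequence, Set


def usable_prefix(index_fields: Sequence[str], equality_fields: Set[str]) -> int:
    """Count how many leading index fields the query pins with equality."""
    n = 0
    for field in index_fields:
        if field not in equality_fields:
            break  # the first gap ends the usable prefix: order matters
        n += 1
    return n


def is_covered(index_fields: Sequence[str],
               filter_fields: Sequence[str],
               projection: Dict[str, int]) -> bool:
    """Simplified test: every filtered and returned field lives in the index."""
    indexed = set(index_fields)
    returned = {f for f, include in projection.items() if include}
    if not returned:
        return False  # no inclusion projection means whole documents are fetched
    if projection.get("_id", 1) and "_id" not in indexed:
        return False  # _id is returned by default unless explicitly excluded
    return set(filter_fields) <= indexed and returned <= indexed


INDEX = ("patient_id", "ndc", "date_filled")

print(usable_prefix(INDEX, {"patient_id", "ndc"}))  # both leading fields used
print(usable_prefix(INDEX, {"ndc"}))                # leading field missing
print(is_covered(INDEX, ["patient_id"], {"ndc": 1, "_id": 0}))
print(is_covered(INDEX, ["patient_id"], {"ndc": 1}))  # _id still returned
```

The last two calls differ only in whether `_id` is excluded, which is a common reason an otherwise covered query still touches documents.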
13. HUMANA HCS PLATFORMS
LESSON 5 – SIZING CONSIDERATIONS
• Stay Clear of 16MB Document Limit
• Keep to Less Than 1MB
• 800B Avg.
• Our Inflection Point: 200M Documents
• Prefer Foreground Indexing
• Background Indexing Can Slow Normal Operations
• Smaller Index Sizes
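A rough back-of-the-envelope calculation puts the figures above in proportion. It assumes the 800-byte average applies across all 200M documents and counts only raw, uncompressed document bytes (no indexes, no storage-engine overhead); neither assumption is stated in the deck.

```python
DOC_COUNT = 200_000_000            # inflection point quoted above
AVG_DOC_BYTES = 800                # average document size quoted above
MAX_DOC_BYTES = 16 * 1024 * 1024   # MongoDB's 16MB per-document limit

raw_bytes = DOC_COUNT * AVG_DOC_BYTES
raw_gb = raw_bytes / 1e9

# How many average-sized documents would fit inside one maximum-sized document.
headroom = MAX_DOC_BYTES // AVG_DOC_BYTES

print(f"raw data: {raw_gb:.0f} GB, headroom below the limit: {headroom}x")
```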
14. HUMANA HCS PLATFORMS
LESSON 6 – PRE-SPLITTING
• Pre-Splitting is the Process of Pre-Chunking and Balancing Data Blocks
• 50% Science, 25% Magic, 25% Voodoo
• Automatic Splitting and Balancing is Slow and Consumes Resources
• The Choice of Shard Key and Pre-Splitting Strategy is Crucial
• When do you Pre-Split? Refer to Lesson 1
15. HUMANA HCS PLATFORMS
LESSON 6 – PRE-SPLITTING
• Shard Key: member identifier
• Non-Monotonic String Type
• Oracle Remnant of Hash Partitioning
16. HUMANA HCS PLATFORMS
LESSON 6 – PRE-SPLITTING
• Shard Key: member identifier
• Non-Monotonic String Type
• Oracle Remnant of Hash Partitioning
• Try 1: Naïve member id Splitter
• Divided min/max keys into n bins
• Keys were “clumpy”
18. HUMANA HCS PLATFORMS
LESSON 6 – PRE-SPLITTING
• Shard Key: member identifier
• Non-Monotonic String Type
• Oracle Remnant
• Try 1: Naïve member id Splitter
• Divided min/max keys into n bins
• Keys were “clumpy”
• Still Created 25-40 Splits When Loading
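Why equal-width bins fail on "clumpy" keys can be shown with a small Python sketch. It uses integers in place of the string member identifiers and made-up key values, purely for illustration; the function names are hypothetical and this is not the splitter used in the project.

```python
from bisect import bisect_right
from typing import List


def equal_width_boundaries(lo: int, hi: int, n_bins: int) -> List[int]:
    """Naive splitter: cut [lo, hi] into n_bins ranges of equal width."""
    width = (hi - lo) / n_bins
    return [lo + round(width * i) for i in range(1, n_bins)]


def chunk_counts(keys: List[int], boundaries: List[int]) -> List[int]:
    """Count how many keys land in each range defined by the boundaries."""
    counts = [0] * (len(boundaries) + 1)
    for key in keys:
        counts[bisect_right(boundaries, key)] += 1
    return counts


# Clumpy keys: almost everything sits at the low end of the key range.
keys = list(range(0, 90)) + [500, 900, 1000]
naive = equal_width_boundaries(min(keys), max(keys), 4)
counts = chunk_counts(keys, naive)
print(naive, counts)
```

One range ends up holding nearly all the keys, so the cluster still has to split and rebalance that range during the load.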
19. HUMANA HCS PLATFORMS
LESSON 6 – PRE-SPLITTING
• Shard Key: member identifier
• Non-Monotonic String Type
• Oracle Remnant
• Try 1: Naïve member id Splitter
• Divided min/max keys into n bins
• Keys were “clumpy”
• Try 2: MongoDB did Initial Work (5 days/Col)
1. Established IOPS, Network, Oplog Params
2. Stored _id, min/max for ns(database) in new Collection
3. Used Python and the “master_chunk” collection to auto-generate pre-splitting JS scripts for each sharded collection.
4. Load Data
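Step 3 can be sketched roughly as follows. The record layout (`ns`, `min`, `max`, with `None` standing in for the open-ended first and last boundaries) is an assumption modeled loosely on chunk metadata; the actual “master_chunk” schema and the generated scripts are not shown in the deck. The emitted lines use the mongo shell helper `sh.splitAt(namespace, query)`.

```python
import json
from typing import Dict, List


def presplit_script(chunks: List[Dict]) -> str:
    """Emit one sh.splitAt() call per saved chunk boundary."""
    lines = []
    for chunk in chunks:
        lower = chunk["min"]
        if lower is None:
            continue  # the first chunk starts at the minimum key: nothing to split
        lines.append(
            f"sh.splitAt({json.dumps(chunk['ns'])}, {json.dumps(lower)});"
        )
    return "\n".join(lines)


# Hypothetical saved boundaries for one sharded collection.
saved_chunks = [
    {"ns": "hcs.member", "min": None, "max": {"member_id": "g"}},
    {"ns": "hcs.member", "min": {"member_id": "g"}, "max": {"member_id": "p"}},
    {"ns": "hcs.member", "min": {"member_id": "p"}, "max": None},
]
script = presplit_script(saved_chunks)
print(script)
```

The generated script would be run against an empty sharded collection before the load, so the data arrives into chunks that already exist.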
21. HUMANA HCS PLATFORMS
APACHE SPARK & MONGODB
[Diagram labels: mongos, CQL, Message Pipeline; versions v1 and v2]
• MongoDB Connector for Apache Spark
• Highly Recommended
• Scalable and Resilient
• Perfect for Many ETL Jobs
• Leverages MongoDB Agg. Pipelines
• Optimized for MongoDB Topology
• Allowed us to Concentrate on Problem
• Thank you Bryan!
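As an illustration of the aggregation-pipeline bullet, the sketch below builds a pipeline as JSON so that filtering and projection happen inside MongoDB rather than in Spark. The function name, field choice, and drug codes are hypothetical. How the string is handed to the connector depends on the connector version (the read option is named `pipeline` in the 2.x/3.x releases and `aggregation.pipeline` in 10.x), so treat the commented call as an assumption to verify against the version in use.

```python
import json
from typing import List


def pushdown_pipeline(target_drugs: List[str]) -> str:
    """Build a $match/$project pipeline to be evaluated by MongoDB."""
    pipeline = [
        {"$match": {"claim_status": "P", "ndc": {"$in": target_drugs}}},
        {"$project": {"_id": 0, "patient_id": 1, "ndc": 1}},
    ]
    return json.dumps(pipeline)


option_value = pushdown_pipeline(["NDC-1", "NDC-2"])  # placeholder drug codes
print(option_value)

# Hypothetical use with PySpark (requires a Spark session and the connector):
# df = spark.read.format("mongodb").option("aggregation.pipeline", option_value).load()
```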
23. HUMANA HCS PLATFORMS
LESSON 7 – TURN THE TABLES
• Exploit New Opportunities
• Active Medications
• Pre-Filtering (CQL)
26. HUMANA HCS PLATFORMS
LESSON 7 – TURN THE TABLES
• Turn Disadvantages into Advantages
• Active Medications
• Pre-Filtering (CQL)
def preFilteringPopulation(
    patient_collection: String,
    engineList: String,
    grpcMessageSize: Int,
    batchType: String): Boolean = {
  //-- Configs assumed to be passed-in through spark-submit
  val sparkSession = SparkSession.builder().getOrCreate()
  var status = true
  try {
    var response = null.asInstanceOf[com.tsi.grpc.EngineServer.BatchPopulationResponse]
    val engines = engineList.split(",").map(_.trim).toList
    var flag = false
    var idx = 0
    // Try the engines in list order; on failure, move on to the next one.
    // NOTE: rundate_str is not defined in this excerpt.
    while (!flag && idx < engines.length) {
      var client = new gRpcEngineClient(engines(idx), grpcMessageSize)
      response = client.batchPopulation(patient_collection, batchType, rundate_str)
      if (response == null || response.status == false) {
        println("Pre-filtering process failed with engine=" + engines(idx))
        idx += 1
      }…
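The excerpt above is cut off, but the visible part follows a familiar pattern: walk a list of engines and stop at the first one that answers successfully. Below is a minimal Python sketch of that pattern in isolation. It is not a port of the elided Scala code; the function name, the fake client, and the engine addresses are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_successful(engines: Sequence[str],
                     call: Callable[[str], Optional[bool]]) -> Optional[str]:
    """Return the first engine whose call succeeds, or None if all fail."""
    for engine in engines:
        try:
            ok = call(engine)
        except Exception as exc:  # treat a transport error like a failed status
            print(f"Pre-filtering process failed with engine={engine}: {exc}")
            continue
        if ok:
            return engine
        print(f"Pre-filtering process failed with engine={engine}")
    return None


def fake_batch_population(engine: str) -> bool:
    """Stand-in for the gRPC call: only the second engine is healthy."""
    return engine.startswith("engine-b")


engine_list = "engine-a:50051, engine-b:50051"
engines = [e.strip() for e in engine_list.split(",")]
chosen = first_successful(engines, fake_batch_population)
print(chosen)
```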
27. HUMANA HCS PLATFORMS: LESSON 7
MongoDB in the Cloud & Other Weirdness
• Use Dedicated IOPS (mongod and ETL Instances)
• Take Advantage of Mutable Resources
• “aws ec2 modify-volume --iops 20000 --volume-id vol-02cbed099c22xxxxx --profile prof1 --region us-east-1”
• Keep Hyperthreading On
• When you have $match Everything Looks Like a Nail
• C++ Stream Builders Are Awesome
• Must Iterate through Iterator when Using $out
• Setting Write Concerns when NOT Writing (not pretty)
28. HUMANA HCS PLATFORMS: LESSON 7
MongoDB in the Cloud & Other Weirdness
}
],
"Encrypted": true,
"VolumeType": "io1",
"VolumeId": "vol-02cbed099c22xxxxx",
"State": "in-use",
"KmsKeyId": "arn:aws:kms:us-east-1:068955290847:key/8f175f51-60fb-4750-bbfe,
"SnapshotId": "",
"Iops": 6000,
"CreateTime": "2019-01-14T19:24:42.150Z",
"Size": 1500
}
]
}
30. HUMANA HCS PLATFORMS: LESSON 7
MongoDB in the Cloud & Other Weirdness
#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/builder/stream/helpers.hpp>

// patientId, targetDrugs, startDate and endDate are defined outside this excerpt.
bsoncxx::builder::stream::document filterBuilder;
filterBuilder
    << "patient_id" << patientId
    << "ndc"
    << bsoncxx::builder::stream::open_document
        << "$in" << targetDrugs
    << bsoncxx::builder::stream::close_document
    << "claim_status" << "P"
    << "date_filled"
    << bsoncxx::builder::stream::open_document
        << "$lte" << endDate
    << bsoncxx::builder::stream::close_document
    << "active_end_date"
    << bsoncxx::builder::stream::open_document
        << "$gte" << startDate
    << bsoncxx::builder::stream::close_document
    << "provider_id"
    << bsoncxx::builder::stream::open_document
        << "$exists" << true
        << "$ne" << ""
    << bsoncxx::builder::stream::close_document;
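For readers less familiar with the C++ stream builder, the query document it assembles can be written out as a plain Python dictionary. This is a sketch for comparison only: the function name and sample values are hypothetical, and the date type is an assumption since the deck does not show how `startDate` and `endDate` are declared.

```python
from datetime import datetime
from typing import Any, Dict, List


def build_claim_filter(patient_id: str, target_drugs: List[str],
                       start_date: datetime, end_date: datetime) -> Dict[str, Any]:
    """Same fields and operators, in the same order, as the builder above."""
    return {
        "patient_id": patient_id,
        "ndc": {"$in": target_drugs},
        "claim_status": "P",
        "date_filled": {"$lte": end_date},
        "active_end_date": {"$gte": start_date},
        # The field must be present and must not be an empty string.
        "provider_id": {"$exists": True, "$ne": ""},
    }


claim_filter = build_claim_filter(
    "patient-1",                 # placeholder identifier
    ["NDC-1", "NDC-2"],          # placeholder drug codes
    datetime(2019, 1, 1),
    datetime(2019, 1, 31),
)
print(claim_filter)
```

Each `open_document` / `close_document` pair in the C++ version corresponds to one nested dictionary here.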
33. HUMANA HCS PLATFORMS
SO, WHAT HAPPENED…
• Went Live May 2019 with Full Membership
• Largest Deployment at Humana
• Over 80 Customer Measures
• Batches Went from 30 to 4 Hours
• Compressed 2 Week ETL to 5 Days
34. HUMANA HCS PLATFORMS
CLOSING
• Large Projects are Daunting, But You Can Do It
• Have a “try it” attitude
• You Have Many Resources
• Let Go of SQL (Spark--I Know)
• Understand Your Access Patterns
• Pre-Splitting
• Turn your Disadvantages to Advantages
35. ACKNOWLEDGEMENTS
Team Members that made this possible:
• Humana
• Christopher Jaramillo
• Sathish John
• Raj Basavaraju
• Walter von Westphalen
• MongoDB
• Sigfrido Narvaez
• Giovanni DeVita
• MongoDB and SoCal Chapter