SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Downloaden Sie, um offline zu lesen
Process Mining meets
Causal Machine Learning:
Discovering Causal Rules
from Event logs
Zahra Dasht Bozorgi, Irene Teinemaa, Marlon Dumas,
Marcello La Rosa, and Artem Polyvyanyy
1
2nd International Conference on Process Mining (ICPM 2020)
Padua, Italy, 4-9 October 2020
Introduction
An approach that addresses the following:
• Discover case-level treatment
recommendations to increase the positive
outcome rate of a process;
• Identify subsets of cases to which a
recommendation should be applied;
• Estimate the causal effect and the incremental
Return-on-Investment (ROI) of a treatment.
2
Preliminaries
What is Causal Inference?
Inferring the effects of any
treatment/policy/intervention/etc.
Examples:
• Effect of a treatment on a disease
• Effect of climate change policy on emissions
• Effect of an action on a business process outcome
3
Notation:
Y denotes outcome
T denotes treatment assignment
X denotes case attributes
Preliminaries
How do we measure causal effects?
Taking the difference between potential outcomes
Example: Loan application Process
Treatment/intervention: calling the customer after offer
Outcome: whether customer selects a loan offer or not
4
World 1
E[Y1] E[Y0]
World 2
Average Treatment Effect:
ATE = E[Y1 – Y0]
Conditional Average Treatment Effect:
CATE = E[Y1 – Y0|X= x]
CATE is also known as Uplift
in marketing literature.
Preliminaries
id T Y Y1 Y0 Y1 – Y0
1 0 0 ? 0 ?
2 1 1 1 ? ?
3 1 0 0 ? ?
4 0 0 ? 0 ?
5 0 1 ? 1 ?
6 1 1 1 ? ?
5
Fundamental Problem of Causal Inference:
Missing Data Problem
Solution: Randomised Experiment
• Expensive
• Time-consuming
• Unethical
Next best solution: Observational Study
Aim
6
Event Log Treatment/Intervention Causal Rules
Proposed Framework
7
Candidate Treatment Identification
Aim: generating recommendations automatically
Method: Action rule mining
• Extension of classification rules
• Suggests actions to change class label
• Based on support
Action Rule: r = [(a: a1) ∧ (b: b1 → b2)] ⇒ [y: y1 → y2]
8
Controllable
Uncontrollable
Input: Case Attributes
Candidate Treatment Identification
9
Rule: r = [(LoanGoal: Home Improvement) ∧ (NumOfTerms: 48 → 84)] ⇒ [Selected: 0 → 1]
with support = 0.05
Pre-condition Action Outcome
Example:
In loan applications where the loan goal is home improvement, changing the number of payback months (terms)
from 48 to 84 months will increase the chance of the customer selecting the loan offer.
Example: Loan application process
Attributes: Loan Goal (Uncontrollable)
Number of terms (Controllable)
Causal Rules Discovery
Aim: Discovering subgroups where an action has a
high causal effect on the outcome
Method: Uplift Trees1
Uplift tree: predicts which cases will have a
positive outcome because a treatment is applied.
101. P. Rzepakowski and S. Jaroszewicz, “Decision trees for uplift modelling with
single and multiple treatments,”Knowl. Inf. Syst., vol. 32, no. 2,2012.
Uplift Tree
11
Splitting criterion:
D(.) a divergence measure
PT probability distribution of the outcome in the treatment group
PC probability distribution of the outcome in the control group
Divergence Metrics:
Kullback-Leibler divergence
Squared Euclidean distance
Chi-squared divergence
Uplift Tree
E[Y|T=1, X=x] – E[Y|T=0, X=x]
Recall:
Uplift = E[Y1 | X=x] – E[Y0 | X=x]
The reason is Confounding.
12
≠
Confounding
Confounding is the presence of a variable that affects both treatment assignment and the outcome.
How do we address confounding in uplift trees?
Normalisation
13
NT and NC are the number of cases in the treatment and control group respectively.
N is the total number of cases. A is a test. And H(.) is entropy.
Example Causal Rule
Example rule:
• Action: [NumOfTerms: 48 → 84]
• Objective: [Selected: 0 → 1]
• Sub-group:
a) Customers whose loan goal is not existing loan takeover AND
b) have a credit score less than 920 AND
c) their offer includes a monthly cost greater than 149 AND
d) and the first withdrawal amount is less than 8304
14
Ranking Rules using Cost-benefit Model
Parameters:
• v: value (benefit) of a positive outcome
• c: impression cost for a treatment
• u: uplift of applying a treatment
• n: size of the treated group
Net value of applying the recommendation:
Net = n × (u × v - c)
15
Dataset and Experimental setup
• The approach was implemented in Python 3.7 using
the ActionRules package for generating candidate
treatments and the CausalML package for
constructing uplift trees.
• We ran the action rule discovery algorithm on this
dataset and discovered 24 rules containing 17
distinct recommendations.
• The uplift tree was constructed for each of the 17
recommendations resulting in 8 causal rules.
• Details of the recommendations can be found in the
paper
16
BPI Challenge 2017:
• 31,509 applications (cases)
• 42,995 offers
• Outcome is whether customer selects the offer
Uncontrollable attributes:
• Application Type
• Loan Goal
• Credit Score
• Requested Amount
Controllable attributes:
• Number of Offers
• Number of Payback Terms
• Monthly Cost
• Initial Withdrawal Amount
Comparison of Results
• Increase number of term
from 48 months to more than 120 months
for customers whose credit scores are between 899 and 943 and their first withdrawal amount
is less than 8,304.
• Decrease the duration of the process
Delays are mostly due to unresponsive customers
17
Future Work
• Incorporating other types of recommendation
• Addressing other goals, such as reduction of cycle time and waste
• Adjusting for unobserved confounding effects
• Conducting complimentary evaluations such as randomised experiments
18
Thank you
Any Questions?
Zahra Dasht Bozorgi
zdashtbozorg@student.unimelb.edu.au
School of Computing and Information Systems
University of Melbourne

Weitere ähnliche Inhalte

Was ist angesagt?

オークションの仕組み
オークションの仕組みオークションの仕組み
オークションの仕組みYosuke YASUDA
 
シンギュラリティを知らずに機械学習を語るな
シンギュラリティを知らずに機械学習を語るなシンギュラリティを知らずに機械学習を語るな
シンギュラリティを知らずに機械学習を語るなhoxo_m
 
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)STAIR Lab, Chiba Institute of Technology
 
ポーカーAIの最新動向 20171031
ポーカーAIの最新動向 20171031ポーカーAIの最新動向 20171031
ポーカーAIの最新動向 20171031Jun Okumura
 
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...ddnpaa
 
金融情報における時系列分析
金融情報における時系列分析金融情報における時系列分析
金融情報における時系列分析Fujio Toriumi
 
Marketing 08 ランダム化比較試験による仮説検証
Marketing 08 ランダム化比較試験による仮説検証Marketing 08 ランダム化比較試験による仮説検証
Marketing 08 ランダム化比較試験による仮説検証Meiji University / 明治大学
 
自然言語処理による議論マイニング
自然言語処理による議論マイニング自然言語処理による議論マイニング
自然言語処理による議論マイニングNaoaki Okazaki
 
Towards Performant Video Recognition
Towards Performant Video RecognitionTowards Performant Video Recognition
Towards Performant Video Recognitioncvpaper. challenge
 
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向SSII
 
古典的見解を越えたオーバーフィッティングの先の世界
古典的見解を越えたオーバーフィッティングの先の世界古典的見解を越えたオーバーフィッティングの先の世界
古典的見解を越えたオーバーフィッティングの先の世界西岡 賢一郎
 
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural NetworksDeep Learning JP
 
[DL輪読会]Are Sixteen Heads Really Better than One?
[DL輪読会]Are Sixteen Heads Really Better than One?[DL輪読会]Are Sixteen Heads Really Better than One?
[DL輪読会]Are Sixteen Heads Really Better than One?Deep Learning JP
 
2020 08 05_dl_DETR
2020 08 05_dl_DETR2020 08 05_dl_DETR
2020 08 05_dl_DETRharmonylab
 
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)STAIR Lab, Chiba Institute of Technology
 
Transformerを用いたAutoEncoderの設計と実験
Transformerを用いたAutoEncoderの設計と実験Transformerを用いたAutoEncoderの設計と実験
Transformerを用いたAutoEncoderの設計と実験myxymyxomatosis
 
レコメンドシステムの社会実装
レコメンドシステムの社会実装レコメンドシステムの社会実装
レコメンドシステムの社会実装西岡 賢一郎
 
時系列予測にTransformerを使うのは有効か?
時系列予測にTransformerを使うのは有効か?時系列予測にTransformerを使うのは有効か?
時系列予測にTransformerを使うのは有効か?Fumihiko Takahashi
 
Kaggle meetup #3 instacart 2nd place solution
Kaggle meetup #3 instacart 2nd place solutionKaggle meetup #3 instacart 2nd place solution
Kaggle meetup #3 instacart 2nd place solutionKazuki Onodera
 

Was ist angesagt? (20)

オークションの仕組み
オークションの仕組みオークションの仕組み
オークションの仕組み
 
シンギュラリティを知らずに機械学習を語るな
シンギュラリティを知らずに機械学習を語るなシンギュラリティを知らずに機械学習を語るな
シンギュラリティを知らずに機械学習を語るな
 
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)
教師なしオブジェクトマッチング(第2回ステアラボ人工知能セミナー)
 
ポーカーAIの最新動向 20171031
ポーカーAIの最新動向 20171031ポーカーAIの最新動向 20171031
ポーカーAIの最新動向 20171031
 
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...
【論文紹介】 Attention Based Spatial-Temporal Graph Convolutional Networks for Traf...
 
金融情報における時系列分析
金融情報における時系列分析金融情報における時系列分析
金融情報における時系列分析
 
Marketing 08 ランダム化比較試験による仮説検証
Marketing 08 ランダム化比較試験による仮説検証Marketing 08 ランダム化比較試験による仮説検証
Marketing 08 ランダム化比較試験による仮説検証
 
自然言語処理による議論マイニング
自然言語処理による議論マイニング自然言語処理による議論マイニング
自然言語処理による議論マイニング
 
Towards Performant Video Recognition
Towards Performant Video RecognitionTowards Performant Video Recognition
Towards Performant Video Recognition
 
GPUいらずの高速動画異常検知
GPUいらずの高速動画異常検知GPUいらずの高速動画異常検知
GPUいらずの高速動画異常検知
 
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向
SSII2020 [OS2-03] 深層学習における半教師あり学習の最新動向
 
古典的見解を越えたオーバーフィッティングの先の世界
古典的見解を越えたオーバーフィッティングの先の世界古典的見解を越えたオーバーフィッティングの先の世界
古典的見解を越えたオーバーフィッティングの先の世界
 
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
[DL輪読会]EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
 
[DL輪読会]Are Sixteen Heads Really Better than One?
[DL輪読会]Are Sixteen Heads Really Better than One?[DL輪読会]Are Sixteen Heads Really Better than One?
[DL輪読会]Are Sixteen Heads Really Better than One?
 
2020 08 05_dl_DETR
2020 08 05_dl_DETR2020 08 05_dl_DETR
2020 08 05_dl_DETR
 
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)
画像キャプションと動作認識の最前線 〜データセットに注目して〜(第17回ステアラボ人工知能セミナー)
 
Transformerを用いたAutoEncoderの設計と実験
Transformerを用いたAutoEncoderの設計と実験Transformerを用いたAutoEncoderの設計と実験
Transformerを用いたAutoEncoderの設計と実験
 
レコメンドシステムの社会実装
レコメンドシステムの社会実装レコメンドシステムの社会実装
レコメンドシステムの社会実装
 
時系列予測にTransformerを使うのは有効か?
時系列予測にTransformerを使うのは有効か?時系列予測にTransformerを使うのは有効か?
時系列予測にTransformerを使うのは有効か?
 
Kaggle meetup #3 instacart 2nd place solution
Kaggle meetup #3 instacart 2nd place solutionKaggle meetup #3 instacart 2nd place solution
Kaggle meetup #3 instacart 2nd place solution
 

Ähnlich wie Process Mining Meets Causal Machine Learning: Discovering Causal Rules From Event Logs

Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Prescriptive Process Monitoring for Cost-Aware Cycle Time ReductionPrescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Prescriptive Process Monitoring for Cost-Aware Cycle Time ReductionMarlon Dumas
 
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...Marlon Dumas
 
Pp ts for machine learning
Pp ts for machine learningPp ts for machine learning
Pp ts for machine learningWrushali Mendre
 
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...DataScienceConferenc1
 
QA_Chapter_01_Dr_B_Dayal_Overview.pptx
QA_Chapter_01_Dr_B_Dayal_Overview.pptxQA_Chapter_01_Dr_B_Dayal_Overview.pptx
QA_Chapter_01_Dr_B_Dayal_Overview.pptxTeshome62
 
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...Eesti Pank
 
Quality Improvement And Cos Reduction
Quality Improvement And Cos ReductionQuality Improvement And Cos Reduction
Quality Improvement And Cos ReductionHenmaidi Alfian
 
Presentation lecture 2nd quantitative techniques
Presentation lecture 2nd quantitative techniquesPresentation lecture 2nd quantitative techniques
Presentation lecture 2nd quantitative techniquesDr.ammara khakwani
 
Analyzing Market Trends And Developments
Analyzing Market Trends And DevelopmentsAnalyzing Market Trends And Developments
Analyzing Market Trends And DevelopmentsShannon Wright
 
Assessing Impacts – Methodology in Practice, Cara Maguire
Assessing Impacts – Methodology in Practice, Cara MaguireAssessing Impacts – Methodology in Practice, Cara Maguire
Assessing Impacts – Methodology in Practice, Cara MaguireOECD Governance
 
Impact Evaluation: Balancing Rigor with Reality
Impact Evaluation: Balancing Rigor with RealityImpact Evaluation: Balancing Rigor with Reality
Impact Evaluation: Balancing Rigor with RealityDonna Smith-Moncrieffe
 
Organizational development research
Organizational development researchOrganizational development research
Organizational development researchAhsan Ali
 
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYPROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYIJITCA Journal
 
What it takes to build a model for detecting patients that defaults from medi...
What it takes to build a model for detecting patients that defaults from medi...What it takes to build a model for detecting patients that defaults from medi...
What it takes to build a model for detecting patients that defaults from medi...Olga Zinkevych
 
PROCESS OF ORGANIZATION DEVELOPMENT
PROCESS OF ORGANIZATION DEVELOPMENTPROCESS OF ORGANIZATION DEVELOPMENT
PROCESS OF ORGANIZATION DEVELOPMENTshagun jain
 
121220012 incentives in the chinese construction industry
121220012 incentives in the chinese construction industry121220012 incentives in the chinese construction industry
121220012 incentives in the chinese construction industryAshutosh Patekar
 

Ähnlich wie Process Mining Meets Causal Machine Learning: Discovering Causal Rules From Event Logs (20)

Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Prescriptive Process Monitoring for Cost-Aware Cycle Time ReductionPrescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction
 
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...
Learning When to Treat Business Processes: Prescriptive Process Monitoring wi...
 
Pp ts for machine learning
Pp ts for machine learningPp ts for machine learning
Pp ts for machine learning
 
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...
[DSC Adria 23] Mirjana Pejic Bach Data mining approach to internal fraud in a...
 
QA_Chapter_01_Dr_B_Dayal_Overview.pptx
QA_Chapter_01_Dr_B_Dayal_Overview.pptxQA_Chapter_01_Dr_B_Dayal_Overview.pptx
QA_Chapter_01_Dr_B_Dayal_Overview.pptx
 
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...
Mykola Herasymovych: Optimizing Acceptance Threshold in Credit Scoring using ...
 
Quality Improvement And Cos Reduction
Quality Improvement And Cos ReductionQuality Improvement And Cos Reduction
Quality Improvement And Cos Reduction
 
Presentation lecture 2nd quantitative techniques
Presentation lecture 2nd quantitative techniquesPresentation lecture 2nd quantitative techniques
Presentation lecture 2nd quantitative techniques
 
action research model
action research modelaction research model
action research model
 
Analyzing Market Trends And Developments
Analyzing Market Trends And DevelopmentsAnalyzing Market Trends And Developments
Analyzing Market Trends And Developments
 
Centricity EDI Product Overview
Centricity EDI Product OverviewCentricity EDI Product Overview
Centricity EDI Product Overview
 
Assessing Impacts – Methodology in Practice, Cara Maguire
Assessing Impacts – Methodology in Practice, Cara MaguireAssessing Impacts – Methodology in Practice, Cara Maguire
Assessing Impacts – Methodology in Practice, Cara Maguire
 
Impact Evaluation: Balancing Rigor with Reality
Impact Evaluation: Balancing Rigor with RealityImpact Evaluation: Balancing Rigor with Reality
Impact Evaluation: Balancing Rigor with Reality
 
Organizational development research
Organizational development researchOrganizational development research
Organizational development research
 
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRYPROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
PROVIDING A METHOD FOR DETERMINING THE INDEX OF CUSTOMER CHURN IN INDUSTRY
 
What it takes to build a model for detecting patients that defaults from medi...
What it takes to build a model for detecting patients that defaults from medi...What it takes to build a model for detecting patients that defaults from medi...
What it takes to build a model for detecting patients that defaults from medi...
 
2011 NCJA presentation
2011 NCJA presentation2011 NCJA presentation
2011 NCJA presentation
 
PROCESS OF ORGANIZATION DEVELOPMENT
PROCESS OF ORGANIZATION DEVELOPMENTPROCESS OF ORGANIZATION DEVELOPMENT
PROCESS OF ORGANIZATION DEVELOPMENT
 
121220012 incentives in the chinese construction industry
121220012 incentives in the chinese construction industry121220012 incentives in the chinese construction industry
121220012 incentives in the chinese construction industry
 
Credit iconip
Credit iconipCredit iconip
Credit iconip
 

Mehr von Marlon Dumas

How GenAI will (not) change your business?
How GenAI will (not)  change your business?How GenAI will (not)  change your business?
How GenAI will (not) change your business?Marlon Dumas
 
Walking the Way from Process Mining to AI-Driven Process Optimization
Walking the Way from Process Mining to AI-Driven Process OptimizationWalking the Way from Process Mining to AI-Driven Process Optimization
Walking the Way from Process Mining to AI-Driven Process OptimizationMarlon Dumas
 
Discovery and Simulation of Business Processes with Probabilistic Resource Av...
Discovery and Simulation of Business Processes with Probabilistic Resource Av...Discovery and Simulation of Business Processes with Probabilistic Resource Av...
Discovery and Simulation of Business Processes with Probabilistic Resource Av...Marlon Dumas
 
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...Marlon Dumas
 
Business Process Optimization: Status and Perspectives
Business Process Optimization: Status and PerspectivesBusiness Process Optimization: Status and Perspectives
Business Process Optimization: Status and PerspectivesMarlon Dumas
 
Why am I Waiting Data-Driven Analysis of Waiting Times in Business Processes
Why am I Waiting Data-Driven Analysis of Waiting Times in Business ProcessesWhy am I Waiting Data-Driven Analysis of Waiting Times in Business Processes
Why am I Waiting Data-Driven Analysis of Waiting Times in Business ProcessesMarlon Dumas
 
Augmented Business Process Management
Augmented Business Process ManagementAugmented Business Process Management
Augmented Business Process ManagementMarlon Dumas
 
Process Mining and Data-Driven Process Simulation
Process Mining and Data-Driven Process SimulationProcess Mining and Data-Driven Process Simulation
Process Mining and Data-Driven Process SimulationMarlon Dumas
 
Modeling Extraneous Activity Delays in Business Process Simulation
Modeling Extraneous Activity Delays in Business Process SimulationModeling Extraneous Activity Delays in Business Process Simulation
Modeling Extraneous Activity Delays in Business Process SimulationMarlon Dumas
 
Business Process Simulation with Differentiated Resources: Does it Make a Dif...
Business Process Simulation with Differentiated Resources: Does it Make a Dif...Business Process Simulation with Differentiated Resources: Does it Make a Dif...
Business Process Simulation with Differentiated Resources: Does it Make a Dif...Marlon Dumas
 
Prescriptive Process Monitoring Under Uncertainty and Resource Constraints
Prescriptive Process Monitoring Under Uncertainty and Resource ConstraintsPrescriptive Process Monitoring Under Uncertainty and Resource Constraints
Prescriptive Process Monitoring Under Uncertainty and Resource ConstraintsMarlon Dumas
 
Robotic Process Mining
Robotic Process MiningRobotic Process Mining
Robotic Process MiningMarlon Dumas
 
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Marlon Dumas
 
Learning Accurate Business Process Simulation Models from Event Logs via Auto...
Learning Accurate Business Process Simulation Models from Event Logs via Auto...Learning Accurate Business Process Simulation Models from Event Logs via Auto...
Learning Accurate Business Process Simulation Models from Event Logs via Auto...Marlon Dumas
 
Process Mining: A Guide for Practitioners
Process Mining: A Guide for PractitionersProcess Mining: A Guide for Practitioners
Process Mining: A Guide for PractitionersMarlon Dumas
 
Process Mining for Process Improvement.pptx
Process Mining for Process Improvement.pptxProcess Mining for Process Improvement.pptx
Process Mining for Process Improvement.pptxMarlon Dumas
 
Data-Driven Analysis of Batch Processing Inefficiencies in Business Processes
Data-Driven Analysis of  Batch Processing Inefficiencies  in Business ProcessesData-Driven Analysis of  Batch Processing Inefficiencies  in Business Processes
Data-Driven Analysis of Batch Processing Inefficiencies in Business ProcessesMarlon Dumas
 
Optimización de procesos basada en datos
Optimización de procesos basada en datosOptimización de procesos basada en datos
Optimización de procesos basada en datosMarlon Dumas
 
Process Mining and AI for Continuous Process Improvement
Process Mining and AI for Continuous Process ImprovementProcess Mining and AI for Continuous Process Improvement
Process Mining and AI for Continuous Process ImprovementMarlon Dumas
 
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...Marlon Dumas
 

Mehr von Marlon Dumas (20)

How GenAI will (not) change your business?
How GenAI will (not)  change your business?How GenAI will (not)  change your business?
How GenAI will (not) change your business?
 
Walking the Way from Process Mining to AI-Driven Process Optimization
Walking the Way from Process Mining to AI-Driven Process OptimizationWalking the Way from Process Mining to AI-Driven Process Optimization
Walking the Way from Process Mining to AI-Driven Process Optimization
 
Discovery and Simulation of Business Processes with Probabilistic Resource Av...
Discovery and Simulation of Business Processes with Probabilistic Resource Av...Discovery and Simulation of Business Processes with Probabilistic Resource Av...
Discovery and Simulation of Business Processes with Probabilistic Resource Av...
 
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...
Can I Trust My Simulation Model? Measuring the Quality of Business Process Si...
 
Business Process Optimization: Status and Perspectives
Business Process Optimization: Status and PerspectivesBusiness Process Optimization: Status and Perspectives
Business Process Optimization: Status and Perspectives
 
Why am I Waiting Data-Driven Analysis of Waiting Times in Business Processes
Why am I Waiting Data-Driven Analysis of Waiting Times in Business ProcessesWhy am I Waiting Data-Driven Analysis of Waiting Times in Business Processes
Why am I Waiting Data-Driven Analysis of Waiting Times in Business Processes
 
Augmented Business Process Management
Augmented Business Process ManagementAugmented Business Process Management
Augmented Business Process Management
 
Process Mining and Data-Driven Process Simulation
Process Mining and Data-Driven Process SimulationProcess Mining and Data-Driven Process Simulation
Process Mining and Data-Driven Process Simulation
 
Modeling Extraneous Activity Delays in Business Process Simulation
Modeling Extraneous Activity Delays in Business Process SimulationModeling Extraneous Activity Delays in Business Process Simulation
Modeling Extraneous Activity Delays in Business Process Simulation
 
Business Process Simulation with Differentiated Resources: Does it Make a Dif...
Business Process Simulation with Differentiated Resources: Does it Make a Dif...Business Process Simulation with Differentiated Resources: Does it Make a Dif...
Business Process Simulation with Differentiated Resources: Does it Make a Dif...
 
Prescriptive Process Monitoring Under Uncertainty and Resource Constraints
Prescriptive Process Monitoring Under Uncertainty and Resource ConstraintsPrescriptive Process Monitoring Under Uncertainty and Resource Constraints
Prescriptive Process Monitoring Under Uncertainty and Resource Constraints
 
Robotic Process Mining
Robotic Process MiningRobotic Process Mining
Robotic Process Mining
 
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
Accurate and Reliable What-If Analysis of Business Processes: Is it Achievable?
 
Learning Accurate Business Process Simulation Models from Event Logs via Auto...
Learning Accurate Business Process Simulation Models from Event Logs via Auto...Learning Accurate Business Process Simulation Models from Event Logs via Auto...
Learning Accurate Business Process Simulation Models from Event Logs via Auto...
 
Process Mining: A Guide for Practitioners
Process Mining: A Guide for PractitionersProcess Mining: A Guide for Practitioners
Process Mining: A Guide for Practitioners
 
Process Mining for Process Improvement.pptx
Process Mining for Process Improvement.pptxProcess Mining for Process Improvement.pptx
Process Mining for Process Improvement.pptx
 
Data-Driven Analysis of Batch Processing Inefficiencies in Business Processes
Data-Driven Analysis of  Batch Processing Inefficiencies  in Business ProcessesData-Driven Analysis of  Batch Processing Inefficiencies  in Business Processes
Data-Driven Analysis of Batch Processing Inefficiencies in Business Processes
 
Optimización de procesos basada en datos
Optimización de procesos basada en datosOptimización de procesos basada en datos
Optimización de procesos basada en datos
 
Process Mining and AI for Continuous Process Improvement
Process Mining and AI for Continuous Process ImprovementProcess Mining and AI for Continuous Process Improvement
Process Mining and AI for Continuous Process Improvement
 
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...
Mine Your Simulation Model: Automated Discovery of Business Process Simulatio...
 

Kürzlich hochgeladen

Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
 

Kürzlich hochgeladen (17)

Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 

Process Mining Meets Causal Machine Learning: Discovering Causal Rules From Event Logs

  • 1. Process Mining meets Causal Machine Learning: Discovering Causal Rules from Event logs Zahra Dasht Bozorgi, Irene Teinemaa, Marlon Dumas, Marcello La Rosa, and Artem Polyvyanyy 1 2nd International Conference on Process Mining (ICPM 2020) Padua, Italy, 4-9 October 2020
  • 2. Introduction An approach that addresses the following: • Discover case-level treatment recommendations to increase the positive outcome rate of a process; • Identify subsets of cases to which a recommendation should be applied; • Estimate the causal effect and the incremental Return-on-Investment (ROI) of a treatment. 2
  • 3. Preliminaries What is Causal Inference? Inferring the effects of any treatment/policy/intervention/etc. Examples: • Effect of a treatment on a disease • Effect of climate change policy on emissions • Effect of an action on a business process outcome 3 Notation: Y denotes outcome T denotes treatment assignment X denotes case attributes
  • 4. Preliminaries How do we measure causal effects? Taking the difference between potential outcomes Example: Loan application Process Treatment/intervention: calling the customer after offer Outcome: whether customer selects a loan offer or not 4 World 1 E[Y1] E[Y0] World 2 Average Treatment Effect: ATE = E[Y1 – Y0] Conditional Average Treatment Effect: CATE = E[Y1 – Y0|X= x] CATE is also known as Uplift in marketing literature.
  • 5. Preliminaries id T Y Y1 Y0 Y1 – Y0 1 0 0 ? 0 ? 2 1 1 1 ? ? 3 1 0 0 ? ? 4 0 0 ? 0 ? 5 0 1 ? 1 ? 6 1 1 1 ? ? 5 Fundamental Problem of Causal Inference: Missing Data Problem Solution: Randomised Experiment • Expensive • Time-consuming • Unethical Next best solution: Observational Study
  • 8. Candidate Treatment Identification Aim: generating recommendations automatically Method: Action rule mining • Extension of classification rules • Suggests actions to change class label • Based on support Action Rule: r = [(a: a1) ∧ (b: b1 → b2)] ⇒ [y: y1 → y2] 8 Controllable Uncontrollable Input: Case Attributes
  • 9. Candidate Treatment Identification 9 Rule: r = [(LoanGoal: Home Improvement) ∧ (NumOfTerms: 48 → 84)] ⇒ [Selected: 0 → 1] with support = 0.05 Pre-condition Action Outcome Example: In loan applications where the loan goal is home improvement, changing the number of payback months (terms) from 48 to 84 months will increase the chance of the customer selecting the loan offer. Example: Loan application process Attributes: Loan Goal (Uncontrollable) Number of terms (Controllable)
  • 10. Causal Rules Discovery Aim: Discovering subgroups where an action has a high causal effect on the outcome Method: Uplift Trees1 Uplift tree: predicts which cases will have a positive outcome because a treatment is applied. 101. P. Rzepakowski and S. Jaroszewicz, “Decision trees for uplift modelling with single and multiple treatments,”Knowl. Inf. Syst., vol. 32, no. 2,2012.
  • 11. Uplift Tree 11 Splitting criterion: D(.) a divergence measure PT probability distribution of the outcome in the treatment group PC probability distribution of the outcome in the control group Divergence Metrics: Kullback-Leibler divergence Squared Euclidean distance Chi-squared divergence
  • 12. Uplift Tree E[Y|T=1, X=x] – E[Y|T=0, X=x] Recall: Uplift = E[Y1 | X=x] – E[Y0 | X=x] The reason is Confounding. 12 ≠
  • 13. Confounding Confounding is the presence of a variable that affects both treatment assignment and the outcome. How do we address confounding in uplift trees? Normalisation 13 NT and NC are the number of cases in the treatment and control group respectively. N is the total number of cases. A is a test. And H(.) is entropy.
  • 14. Example Causal Rule Example rule: • Action: [NumOfTerms: 48 → 84] • Objective: [Selected: 0 → 1] • Sub-group: a) Customers whose loan goal is not existing loan takeover AND b) have a credit score less than 920 AND c) their offer includes a monthly cost greater than 149 AND d) and the first withdrawal amount is less than 8304 14
  • 15. Ranking Rules using Cost-benefit Model Parameters: • v: value (benefit) of a positive outcome • c: impression cost for a treatment • u: uplift of applying a treatment • n: size of the treated group Net value of applying the recommendation: Net = n × (u × v - c) 15
  • 16. Dataset and Experimental setup • The approach was implemented in Python 3.7 using the ActionRules package for generating candidate treatments and the CausalML package for constructing uplift trees. • We ran the action rule discovery algorithm on this dataset and discovered 24 rules containing 17 distinct recommendations. • The uplift tree was constructed for each of the 17 recommendations resulting in 8 causal rules. • Details of the recommendations can be found in the paper 16 BPI Challenge 2017: • 31,509 applications (cases) • 42,995 offers • Outcome is whether customer selects the offer Uncontrollable attributes: • Application Type • Loan Goal • Credit Score • Requested Amount Controllable attributes: • Number of Offers • Number of Payback Terms • Monthly Cost • Initial Withdrawal Amount
  • 17. Comparison of Results • Increase number of term from 48 months to more than 120 months for customers whose credit scores are between 899 and 943 and their first withdrawal amount is less than 8,304. • Decrease the duration of the process Delays are mostly due to unresponsive customers 17
  • 18. Future Work • Incorporating other types of recommendation • Addressing other goals, such as reduction of cycle time and waste • Adjusting for unobserved confounding effects • Conducting complimentary evaluations such as randomised experiments 18
  • 19. Thank you Any Questions? Zahra Dasht Bozorgi zdashtbozorg@student.unimelb.edu.au School of Computing and Information Systems University of Melbourne