Two-day course delivered at the Chinese Business Process Management (BPM) Summer School in Jinan, China, 23-24 August 2018. The course introduces a range of techniques, tools, and algorithms for process monitoring and mining.
3. Back to basics…
1. Any process is better than no process
2. A good process is better than a bad process
3. Even a good process can be improved
4. Any good process eventually becomes a bad process
• …unless continuously cared for
— Michael Hammer
8. Operational process dashboards
• Aimed at process workers & operational managers
• Emphasis on monitoring (detect-and-respond), e.g.:
- Work-in-progress
- Problematic cases – e.g. overdue/at-risk cases
- Resource load
9. Tactical dashboards
• Aimed at process owners / managers
• Emphasis on analysis and management, e.g. detecting bottlenecks
• Typical process performance indicators:
- Cycle times
- Error rates
- Resource utilization
11. Strategic dashboards
• Aimed at executives & managers
• Emphasis on linking process performance to strategic objectives
12. Strategic Performance Dashboard @ Australian Utilities Provider
Key Performance measure per process (Manage Unplanned Outages / Manage Emergencies & Disasters / Manage Work Programming & Resourcing / Manage Procurement):
• Customer Satisfaction: 0.5 / 0.55 / - / 0.2
• Customer Complaint: 0.6 / - / - / 0.5
• Customer Feedback: 0.4 / - / - / 0.8
• Connection Less Than Agreed Time: 0.3 / 0.6 / 0.7 / -
13. Overall Process Performance
[Dashboard figure drilling down from strategy to processes:
• 1st layer – Key Result Areas: Financial, Customer Excellence, Operational Excellence, People, Risk Management, Health & Safety
• 2nd layer – Key Performance indicators: Customer Satisfaction, Customer Complaint, Customer Rating (%), Customer Loyalty Index, Satisfied Customer Index, Market Share (%), Average Time Spent on Plan
• 3rd & 4th layer – Process Performance Measures for the processes Manage Emergencies & Disasters, Manage Procurement, Manage Unplanned Outages (overall scores 0.54, 0.58, 0.67)]
14. Teamwork
Sketch operational and tactical process monitoring dashboards for CVS Pharmacy’s prescription fulfillment process.
Consider the viewpoints of each stakeholder in the process.
17. Process Mining
[Overview diagram: an event log feeds Process Discovery, which produces a discovered model; an event log plus an input model feed Conformance Checking; two event logs (event log and event log’) feed Variants Analysis, which produces difference diagnostics; an event log plus an input model feed Performance Mining, which produces an enhanced model]
22. Process Maps
• A process map of an event log is a graph where:
• Each activity is represented by one node
• An arc from activity A to activity B means that A is directly
followed by B in at least one trace in the log
• Arcs in a process map can be annotated with:
• Absolute frequency: how many times is A directly followed by B?
• Relative frequency: in what percentage of the cases where A is
executed is it directly followed by B?
• Time: what is the average time between the occurrence of A
and the occurrence of B?
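The directly-follows relation behind a process map is easy to compute. The sketch below (a minimal illustration, not from any specific tool; the function name and toy log are made up) counts arc frequencies from a log given as lists of activity labels:

```python
from collections import Counter

def directly_follows(traces):
    """Count how often each pair (A, B) occurs with B directly
    following A across all traces in the log."""
    arcs = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            arcs[(a, b)] += 1
    return arcs

log = [list("abcd"), list("acbd"), list("abd")]
arcs = directly_follows(log)
print(arcs[("a", "b")])            # absolute frequency of arc a -> b
count_a = sum(t.count("a") for t in log)
print(arcs[("a", "b")] / count_a)  # relative frequency of a -> b
```

Dividing an arc count by the number of occurrences of its source activity gives the relative-frequency annotation described above.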
24. Process Maps – Exercise

Case ID | Task Name       | Originator | Timestamp
1       | File Fine       | Anne       | 20-07-2004 14:00:00
2       | File Fine       | Anne       | 20-07-2004 15:00:00
1       | Send Bill       | system     | 20-07-2004 15:05:00
2       | Send Bill       | system     | 20-07-2004 15:07:00
3       | File Fine       | Anne       | 21-07-2004 10:00:00
3       | Send Bill       | system     | 21-07-2004 14:00:00
4       | File Fine       | Anne       | 22-07-2004 11:00:00
4       | Send Bill       | system     | 22-07-2004 11:10:00
1       | Process Payment | system     | 24-07-2004 15:05:00
1       | Close Case      | system     | 24-07-2004 15:06:00
2       | Reminder        | Mary       | 20-08-2004 10:00:00
3       | Reminder        | John       | 21-08-2004 10:00:00
2       | Process Payment | system     | 22-08-2004 09:05:00
2       | Close Case      | system     | 22-08-2004 09:06:00
4       | Reminder        | John       | 22-08-2004 15:10:00
4       | Reminder        | Mary       | 22-08-2004 17:10:00
4       | Process Payment | system     | 29-08-2004 14:01:00
4       | Close Case      | system     | 29-08-2004 17:30:00
3       | Reminder        | John       | 21-09-2004 10:00:00
3       | Reminder        | John       | 21-10-2004 10:00:00
3       | Process Payment | system     | 25-10-2004 14:00:00
3       | Close Case      | system     | 25-10-2004 14:01:00
25. Process Maps in Disco
• Disco (and other commercial process mining tools) use
process maps as the main visualization technique for
event logs
• These tools also provide three types of operations:
1. Abstract the process map:
• Show only most frequent activities
• Show only most frequent arcs
2. Filter the traces in the event log…
26. Types of filters
• Event filters
• Retain only events that fulfil a given condition (e.g. all events
of type “Create purchase order”)
• Performance filter
• Retain traces that have a duration above or below a given
value
• Event pair filter (a.k.a. “follower” filter)
• Retain traces where there is a pair of events that fulfil a given
condition (e.g. “Create invoice” followed by “Create purchase
order”)
• Endpoint filter
• Retain traces that start with or finish with an event that fulfils
a given condition
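The four filter types above can be sketched over a log represented as lists of activity labels. This is a minimal illustration of the semantics, not the API of Disco or any other tool; all function names, parameters, and the duration encoding are assumptions:

```python
def event_filter(log, pred):
    """Retain, within each trace, only the events that satisfy pred."""
    return [[e for e in trace if pred(e)] for trace in log]

def performance_filter(log, durations, min_d=None, max_d=None):
    """Retain traces whose duration lies within the given bounds
    (durations[i] is the cycle time of log[i])."""
    return [t for t, d in zip(log, durations)
            if (min_d is None or d >= min_d) and (max_d is None or d <= max_d)]

def follower_filter(log, first, second):
    """Retain traces where `first` is eventually followed by `second`."""
    return [t for t in log
            if first in t and second in t[t.index(first) + 1:]]

def endpoint_filter(log, start=None, end=None):
    """Retain traces that start with `start` and/or finish with `end`."""
    return [t for t in log
            if (start is None or t[0] == start)
            and (end is None or t[-1] == end)]
```

Note that the event filter removes events inside traces, while the other three keep or drop entire traces — the same distinction the tools make.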
27. Process Maps in Disco (cont.)
In addition to abstracting the process map and filtering the traces in the event log, these tools provide a third type of operation:
3. Enhance the process map
28. Process Map Enhancement
• Nodes and arcs in a process map can be color-coded or thickness-coded to capture:
• Frequency: how often does a given task or a given directly-follows relation occur?
• Time performance: processing times, waiting times, and cycle times of tasks
• More advanced tools support enhancement by other attributes (e.g. cost, revenue) if the data is available.
30. Using Disco, answer the following questions on the
PurchasingExample log:
• How many cases had to settle a dispute with the
purchasing agent?
• Is there a difference in cycle time for the cases that
had to settle a dispute with the purchasing agent,
compared to the ones that did not? Make sure you
only compare cases that actually reach the endpoint
‘Pay invoice’
• Are there any cases where the invoice is released and
authorized by the same resource? And if so, who is
doing this most often?
Exercise
Exercise by Anne Rozinat, Fluxicon
31. One more exercise: Refund process
Consider the dataset of a refund process from an electronics manufacturer.
Customer complaints and the inspection of individual cases indicate that this
process suffers from inefficiencies and overly long cycle times. Assume that only
cases that have reached the ‘Order completed’ event are finished.
Questions:
1. Is it a problem if you take the average cycle time of all cases, including the
ones that have not finished yet?
2. In general, which channel(s) have the biggest problems with missing
documents that need to be requested from the customer?
3. How many customers have received a refund without the product being
received by the electronics manufacturing company? This should not happen
in this process.
4. Has a customer ever received a double payment? This should not happen in
this process.
To complete this exercise, use the log RefundProcess.fbt
32. Process Maps - Limitations
• Process maps over-generalize: some paths in a process map may not exist in the process or may not make sense
• Example: draw the process map of [abc, adc, afce, afec] and check which traces it recognizes for which there is no support in the event log
• Process maps make it difficult to distinguish conditional branching, parallelism, and loops
• See the previous example… or a simpler one: [abcd, acbd]
• Solution: automated BPMN process discovery
• More on this tomorrow…
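The over-generalization in the example above can be checked mechanically: build the directly-follows graph of the log and enumerate the short start-to-end paths it recognizes. This is a throwaway sketch (all names are made up) over traces written as strings:

```python
from collections import defaultdict

def dfg(traces):
    """Build the directly-follows graph of a log (traces as strings)."""
    arcs = defaultdict(set)
    for t in traces:
        for a, b in zip(t, t[1:]):
            arcs[a].add(b)
    return arcs

def accepted(arcs, starts, ends, max_len):
    """All paths through the DFG from a start to an end activity,
    up to max_len activities: everything the process map 'recognizes'."""
    out, stack = set(), [(s,) for s in starts]
    while stack:
        path = stack.pop()
        if path[-1] in ends:
            out.add("".join(path))
        if len(path) < max_len:
            for nxt in arcs[path[-1]]:
                stack.append(path + (nxt,))
    return out

log = ["abc", "adc", "afce", "afec"]
arcs = dfg(log)
starts, ends = {t[0] for t in log}, {t[-1] for t in log}
extra = accepted(arcs, starts, ends, 5) - set(log)
print(sorted(extra))  # traces the map recognizes but the log never shows
```

For this log the recognized-but-unobserved traces include, for instance, afc and abce — exactly the over-generalization the slide warns about.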
33. Process Mining
[Overview diagram repeated]
34. Process Performance Mining
• Dotted charts
• One line per trace; each line contains one point per event
• Each event type is mapped to a colour
• The position of a point denotes its occurrence time (on a relative scale)
• Bird’s-eye view of the timing of different events (e.g. activity end times), but does not allow one to see the processing times of activities
• Timeline diagrams
• One line per trace; each line contains segments capturing the start and end of tasks
• Captures processing time (unlike dotted charts)
• Not scalable to large event logs – good for showing “representative” traces
• Performance-enhanced process maps
• Process maps where nodes are colour-coded w.r.t. a performance measure
• Nodes may represent activities (the default option), but they may also represent resources, in which case arcs denote hand-offs between resources
39. Exercise
• Consider the following event log of a telephone
repair process: http://tinyurl.com/repairLogs
• What are the bottlenecks in this process?
• Which task has the longest waiting time and which one
has the longest processing time?
40. Process Mining
[Overview diagram repeated]
41. Variants Analysis
Given two logs, find the differences and the root causes of variation or deviance between the two logs.
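A simple starting point for variants analysis is to compare how often each trace variant occurs in the two logs. The sketch below (an illustration with made-up names and toy logs, not a description of any tool's method) ranks variants by their frequency gap:

```python
from collections import Counter

def variant_diff(log1, log2):
    """Compare the relative frequency of each trace variant across two
    logs; return rows (variant, freq1, freq2, delta) sorted by |delta|."""
    c1, c2 = Counter(map(tuple, log1)), Counter(map(tuple, log2))
    n1, n2 = len(log1), len(log2)
    rows = []
    for v in set(c1) | set(c2):
        f1, f2 = c1[v] / n1, c2[v] / n2
        rows.append((v, f1, f2, f1 - f2))
    rows.sort(key=lambda r: abs(r[3]), reverse=True)
    return rows

L1 = [("a", "b", "c")] * 8 + [("a", "c")] * 2
L2 = [("a", "b", "c")] * 3 + [("a", "c")] * 7
for variant, f1, f2, delta in variant_diff(L1, L2):
    print(variant, f1, f2, delta)
```

The variants with the largest gaps are the first candidates for root-cause analysis; real techniques (such as the one used in the Suncorp case study below) go further and compare behaviour, not just frequencies.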
42. Case Study: Variants Analysis at Suncorp
[Chart plotting claim variants against an expected performance line, with variants rated OK, Good, or Bad]
43. Variants Analysis via Process Map Comparison
[Side-by-side process maps: simple claims handled quickly vs. simple claims handled slowly — where do they differ?]
S. Suriadi et al.: Understanding Process Behaviours in a Large Insurance Company in Australia: A Case Study. CAiSE 2013
44. Variants analysis - Exercise
We consider a process for handling health insurance claims, for which
we have extracted two event logs, namely L1 and L2. Log L1 contains
all the cases executed in 2011, while L2 contains all cases executed in
2012. The logs are available in the book’s companion website or
directly at: http://tinyurl.com/InsuranceLogs
Based on these logs, answer the following questions using a process
mining tool:
1. What is the cycle time of each log?
2. Where are the bottlenecks (highest waiting times) in each of the
two logs and how do these bottlenecks differ?
3. Describe the differences between the frequency of tasks and the
order in which tasks are executed in 2011 (L1) versus 2012 (L2).
Hint: If you are using process maps, you should consider using the
abstraction slider in your tool to hide some of the most
infrequent arcs so as to make the maps more readable
45. Process Mining
[Overview diagram repeated]
47. Conformance Checking:
Unfitting vs. Additional Behavior
Unfitting behaviour:
• Task C is optional (i.e. may be skipped) in the log
Additional behavior:
• The cycle including IGDF is not observed in the log
Event log:
ABCDEH
ACBDEH
ABCDFH
ACBDFH
ABDEH
ABDFH
48. Conformance Checking in Apromore
Full demo at:
https://www.youtube.com/watch?v=3d00pORc9X8
49. Open-source tools: Apromore (apromore.org)
• Open-source BPM analytics platform, delivered as Software as a Service
• Focus is on end users (business analysts and operations managers), not on data scientists
• Over 40 plugins
50. Key features
• Repository of process models and event logs (BPMN, AML, XPDL, EPML, YAWL, XES, MXML)
• Offers a range of features along the BPM lifecycle:
From logs
• Automated discovery of BPMN models
• Filter noise from a log
• Visualize a log
• Mine process stages
From models
• Structure a model
On logs
• Animate logs
• Compare model-log and log-log
• Detect and characterize drifts
• Measure log complexity
• Mine process performance
• Predict outcomes and performance (via Nirdizati)
On models
• Measure model complexity
• Compare model-model
• Detect clones
• Search for similar models
• Simulate a model
• Merge model variants
• Configure a model with a questionnaire
51. Access Apromore
You can access it in the cloud or download and install a standalone version
Cloud-version
• Node 1(Estonia): http://apromore.cs.ut.ee
• Node 2 (Australia): http://apromore.qut.edu.au
Standalone
• One-click: a lightweight version of Apromore. Simply unzip and run from
localhost
• Full-fledged: for developers and advanced users, this distribution gives you
full control over Apromore
Source code
• Apromore’s source code is open-source, licensed under LGPL 3.0
• The code can be accessed from GitHub
52. ProM: the very first process mining tool
• 600+ plug-ins available for the whole process mining
spectrum
• Open source license
• Download it from www.processmining.org
53. Nirdizati: predictive process monitoring (nirdizati.com)
• Predict process outcome (e.g. “Is this loan offer going to be rejected?”)
• Predict process performance (e.g. “Will this claim take longer than 5 days to be handled?”)
• Predict future events (e.g. “What activity is likely to be executed next? And after that?”)
55. BPMN-Based Process Mining
[Overview diagram repeated]
58. Accuracy of Automatically Discovered Process Models
• Fitness: to what extent does the behaviour observed in the event log fit the process model?
• No unfitting behaviour ⇒ Fitness = 1
• Precision: how much additional behaviour does the process model allow that is not observed in the event log?
• No additional behaviour ⇒ Precision = 1
• Generalization (of an algorithm): given a (partial) event log of a process, to what extent does the discovery algorithm produce models that fit the behaviour of the process that is not observed in the log?
59. Measuring Fitness
• Replay
• Replay each trace against the model
• When a parsing error occurs, repair it locally
• Keep track of the parsing errors
• Does not calculate an exact distance measure!
• Optimal Trace Alignment
• For each trace t in the log, find the trace t’ of the process model such that the string-edit distance between t and t’ is minimal
• Use the string-edit distances to calculate a “distance” between log and model
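The alignment idea can be sketched with plain string-edit distance, under the simplifying assumption that the model's trace set can be enumerated (real aligners work on the model directly; the function names here are made up):

```python
def edit_distance(s, t):
    """Classic Levenshtein distance via dynamic programming (one row)."""
    dp = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        prev, dp[0] = dp[0], i
        for j, b in enumerate(t, 1):
            # deletion, insertion, or (mis)match on the diagonal
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (a != b))
    return dp[-1]

def alignment_fitness(log, model_traces):
    """Average closeness of each log trace to its best-matching model
    trace, normalized to [0, 1] (1 = every trace reproduced exactly)."""
    total = 0.0
    for trace in log:
        total += max(1 - edit_distance(trace, m) / max(len(trace), len(m))
                     for m in model_traces)
    return total / len(log)
```

Unlike the replay heuristic, this yields a proper distance between log and model, which is why alignment-based fitness is the preferred measure.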
81. Accuracy of automatically discovered process models
The accuracy of an automatically discovered process model consists of three quality dimensions:
1. Fitness: the discovered model should allow for the behavior seen in the event log.
A model has perfect fitness if all traces in the log can be replayed from the beginning to the end.
82. Accuracy of process models
The accuracy of an automatically discovered process model consists of three quality dimensions:
1. Fitness
2. Precision (avoid underfitting): the discovered model should not allow for behavior completely unrelated to what was seen in the event log.
83. Accuracy of process models
The accuracy of an automatically discovered process model consists of three quality dimensions:
1. Fitness
2. Precision (avoid underfitting)
3. Generalization (avoid overfitting): the discovered model should generalize the example behavior seen in the event log.
96. Computing fitness: basic approach
L = { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360 }
A “basic approach” to compute fitness is to count the fraction of cases that can be “parsed completely” (i.e., the proportion of cases corresponding to firing sequences leading from [start] to [end]).
97. Computing fitness: basic approach (cont.)
Fitness = 0.97
98. Computing fitness: event-based approach
• In the basic fitness computation, we stop replaying a trace once we encounter a problem and mark the trace as non-fitting.
• An event-based approach to calculating fitness consists of continuing to replay the trace on the model and:
• recording all situations where a transition is forced to fire without being enabled, i.e., counting all missing tokens;
• recording the tokens that remain at the end.
• It uses four counters:
• p = produced tokens
• c = consumed tokens
• m = missing tokens
• r = remaining tokens
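The counters can be maintained in a few lines of code. This is a minimal sketch of token replay on a toy Petri net encoded as {transition: (input places, output places)}; the encoding, names, and example net are illustrative, not from the slides:

```python
from collections import Counter

def replay(trace, net, source, sink):
    """Token replay of one trace on a Petri net given as
    {transition: (input_places, output_places)}.
    Returns the counters (p, c, m, r)."""
    marking = Counter({source: 1})
    p, c, m = 1, 0, 0            # environment puts one token in the source
    for t in trace:
        ins, outs = net[t]
        for place in ins:        # consume input tokens, noting missing ones
            if marking[place] > 0:
                marking[place] -= 1
            else:
                m += 1           # transition fired without being enabled
            c += 1
        for place in outs:       # produce output tokens
            marking[place] += 1
            p += 1
    c += 1                       # environment consumes the token in the sink
    if marking[sink] > 0:
        marking[sink] -= 1
    else:
        m += 1
    r = sum(marking.values())    # tokens left behind
    return p, c, m, r

def trace_fitness(p, c, m, r):
    # standard token-replay fitness: penalize missing and remaining tokens
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

# A tiny sequential net: source -[A]-> p1 -[B]-> sink
net = {"A": (["source"], ["p1"]), "B": (["p1"], ["sink"])}
print(replay(["A", "B"], net, "source", "sink"))  # perfectly fitting trace
print(replay(["B"], net, "source", "sink"))       # skips A: m and r both hit
```

Replaying the skipped-A trace forces B to fire without being enabled (m = 1) and leaves the source token behind (r = 1), exactly the two penalties the slides track.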
106–114. Computing fitness: event-based approach
[Animation: step-by-step token replay of one trace on the Petri net, updating the counters (p, c, m, r) per place; one transition fires without being enabled and one token is left behind. Final counters: p = 12, c = 12, m = 1, r = 1.]
122–123. Computing fitness: event-based approach
L = { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360 }
[Animation: token replay of a trace that does not reach the end; one token remains. Final counters: p = 12, c = 11, m = 0, r = 1.]
129. Computing fitness at log level
L = { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360 }
L(σ) denotes the number of occurrences of a specific trace σ in the log (e.g., if a trace σ appears 200 times in the log, L(σ) = 200).
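At log level, the per-trace counters are weighted by the trace frequencies L(σ) and plugged into the standard token-replay fitness formula (following van der Aalst's token replay; the subscript notation is added here for readability):

```latex
\mathit{fitness}(L, N) =
  \frac{1}{2}\left(1 - \frac{\sum_{\sigma \in L} L(\sigma)\, m_{N,\sigma}}
                            {\sum_{\sigma \in L} L(\sigma)\, c_{N,\sigma}}\right)
+ \frac{1}{2}\left(1 - \frac{\sum_{\sigma \in L} L(\sigma)\, r_{N,\sigma}}
                            {\sum_{\sigma \in L} L(\sigma)\, p_{N,\sigma}}\right)
```

where p, c, m, and r are the produced, consumed, missing, and remaining token counts of replaying trace σ on net N. Weighting the four variants' counters by their frequencies in this log yields the Fitness = 0.998 shown on slide 131.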
130. Computing fitness at log level
L = { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360 }
Per-variant replay counters (p, c, m, r): (13, 13, 0, 0), (12, 12, 1, 1), (13, 13, 0, 0), (12, 11, 0, 1)
131. Computing fitness at log level (cont.)
L = { <a,b,i,j,k,l>10, <a,b,g,j,k,i,l>140, <a,f,g,j,i,k>5, <a,f,g,i,j,k,l>360 }
Per-variant replay counters (p, c, m, r): (13, 13, 0, 0), (12, 12, 1, 1), (13, 13, 0, 0), (12, 11, 0, 1)
Fitness = 0.998
132. Calculating precision
• Precision = 1 ⇒ the behaviour allowed by the model is contained in or equal to the behaviour in the log
• Precision close to 0 ⇒ almost none of the behaviour allowed by the model is observed in the log
• Precision can be calculated as a “difference” between a state space representing the behaviour of the model and a state space representing the behaviour of the log
• Adriano Augusto et al. “Abstract-and-Compare: A Family of Scalable Precision Measures for Automated Process Discovery”. In Proceedings of BPM 2018
142. α-algorithm: the Origin of Process Discovery
van der Aalst, W.M.P., Weijters, A.J.M.M., and Maruster, L. (2003). Workflow Mining: Discovering process models from event logs. IEEE Transactions on Knowledge and Data Engineering.
143. α-algorithm
Basic idea: ordering relations
• Direct succession: x > y iff for some case x is directly followed by y
• Causality: x → y iff x > y and not y > x
• Parallel: x || y iff x > y and y > x
• Unrelated: x # y iff not x > y and not y > x
Example event stream:
case 1: task A; case 2: task A; case 3: task A; case 3: task B; case 1: task B; case 1: task C; case 2: task C; case 4: task A; case 2: task B; ...
Derived relations:
A>B, A>C, B>C, B>D, C>B, C>D, E>F
A→B, A→C, B→D, C→D, E→F
B||C, C||B
Traces: ABCD, ACBD, EF
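The four ordering relations are a small computation over the log. A minimal sketch (the function name is made up; traces are written as strings, matching the ABCD/ACBD/EF example above):

```python
from itertools import product

def alpha_relations(traces):
    """Derive the alpha-algorithm ordering relations from a log:
    direct succession (>), causality (->), parallel (||), unrelated (#)."""
    succ = {(a, b) for t in traces for a, b in zip(t, t[1:])}
    acts = {a for t in traces for a in t}
    causal = {(a, b) for a, b in succ if (b, a) not in succ}
    parallel = {(a, b) for a, b in succ if (b, a) in succ}
    unrelated = {(a, b) for a, b in product(acts, acts)
                 if (a, b) not in succ and (b, a) not in succ}
    return succ, causal, parallel, unrelated

succ, causal, parallel, unrelated = alpha_relations(["abcd", "acbd", "ef"])
print(sorted(causal))  # a->b, a->c, b->d, c->d, e->f, as on the slide
```

Running it on the example reproduces the relations listed above: b and c are parallel (both b > c and c > b hold), while all other successions are causal.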
179. Limitations of the alpha miner
• Completeness: all possible traces of the process (model) need to be in the log
• Short loops: c > b and b > c implies c || b and b || c instead of c → b and b → c
• Self-loops: b → b would require b > b and not b > b (impossible!)
181. Little Thumb to Deal with Noise
van der Aalst, W.M.P. and Weijters, A.J.M.M. (2003). Rediscovering workflow models from event-based data using little thumb. Integrated Computer-Aided Engineering.
190. Process Model discovered with Inductive Miner
• Structured by construction
• Based on process trees
191. Process Discovery Algorithms: The Two Worlds
• High fitness, high precision, high complexity: Heuristic Miner, Fodina Miner
• High fitness, low precision, low complexity: Inductive Miner, Evolutionary Tree Miner
192. Split Miner
Augusto, A., Conforti, R., Dumas, M., and La Rosa, M. (2017). Split Miner: Discovering Accurate and Simple Business Process Models from Event Logs. ICDM 2017.
195. From Event Log to Process Model in 5 Steps
Event Log → 1. Directly-Follows Graph and Loops Discovery → 2. Filtering → 3. Concurrency Discovery → 4. Splits Discovery → 5. Joins Discovery → Process Model
196. Running example: input event log

Trace                  #obs
a » b » c » g » e » h  10
a » b » c » f » g » h  10
a » b » d » g » e » h  10
a » b » d » e » g » h  10
a » b » e » c » g » h  10
a » b » e » d » g » h  10
a » c » b » e » g » h  10
a » c » b » f » g » h  10
a » d » b » e » g » h  10
a » d » b » f » g » h  10

[Pipeline: Event Log → Directly-Follows Graph and Loops Discovery → Filtering → Concurrency Discovery → Splits Discovery → Joins Discovery → Process Model]
197. [Same event log; next pipeline step highlighted]
206. Topics not covered in this class
• Event log filtering
• Removing anomalous or infrequent behaviour from an
event log
• Business process drift detection
• Detecting changes in a business process (over time)
using event logs
• Predictive process monitoring
• Predicting the outcome or a future property of a process
based on an event log containing completed cases, and
an incomplete case