Event processing systems have evolved from early reactive systems to now support predictive capabilities using techniques like machine learning. Key challenges for these systems include handling uncertain and incomplete event data, performing predictive event processing through graphical model techniques, leveraging machine learning to automate pattern discovery and modeling, and evolving from purely reactive systems to proactively take actions based on predictions. Addressing these challenges will help make event processing systems more intelligent and adaptive.
Event processing – State of the art and research challenges AAAI 2011
1. Event processing – State of the art and research challenges AAAI 2011 Tutorial, San Francisco, August 7 th , 2011 Opher Etzion ( [email_address] ) Yagil Engel ( [email_address] )
3. Imagine that… A driver gets notification on the car screen: the person crossing the street is an Alzheimer patient out of his regular route, he lives in 5 King Street. A national park gets information on all cars heading to the park from the car computer; can open more parking lots and notify cars that the park will be closed.
4. Agenda Introduction and roots of event processing Players and architecture of event processing Current state of the art in event processing Challenges in event processing systems Summary I II III IV V
9. What ’s new? The analog: moving from files to DBMS In recent years – architectures, abstractions, and dedicated commercial products emerge to support functionality that was traditionally carried out within regular programming. For some applications it is an improvement in TCO; for others is breaking the cost-effectiveness barrier.
10. What is an event – three views An event is anything that happens, or is contemplated as happening. The happening view The state change view An event is a state of change of anything The detectable condition view An event is a detectable condition that can trigger a notification
12. Many times we react to combination of events within a context The house sensor detects that the child did not arrive home within 2 hours from the scheduled end of classes for the day I want to be notified when my own investment portfolio is down 5% since the start of the trading Day ; have an agent call me when I am available , send SMS when I am in a meeting , and Email when I am out of office .
14. What we actually want to react to are – situations TOLL VILOATOR FRUSTRATED CUSTOMER Sometimes the situation is determined by detecting that some pattern occurred in the Flowing events. Toll violation Frustrated customer Sometimes the events can approximate or indicate with some certainty that the situation has occurred
19. Event processing and Data stream management? Aliases? One of them subset of the other? Totally unrelated concepts?
20. Ancestor: Temporal databases There is a substantial temporal nature to event processing. Recently – also spatial and spatio-temporal functions are being added
25. II: Players and architecture of event processing
26. Event Driven Architecture Event driven architecture: asynchronous, decoupled; each component is autonomic.
27. Fast Flower Delivery Flower Store Van Driver Ranking and Reporting System Bid Request Delivery Bid Assignments, Bid alerts, Assign Alerts Control System GPS Location Location Service Location Driver ’s Guild Ranking and reports Delivery confirmation Pick Up confirmation Ranked drivers / automatic assignment Bid System Store Preferences Delivery Request Assignment System Manual Assignment Assignment Assignments, Pick Up Alert Delivery Alert http://www.ep-ts.com/EventProcessingInAction
28. Event Processing Agent Context Event Channel Event Consumer Event Type Event Producer Global State The seven Building blocks
43. Filter EPA A filter EPA is an EPA that performs filtering only, and has no matching or derivation steps, so it does not transform the input event .
46. Pattern detection example Pattern name: Manual Assignment Preparation Pattern Type: relative N highest Context: Bid Interval Relevant event types: Delivery Bid Pattern parameter: N = 5; value = Ranking Cardinality: Single deferred Find the five highest bids within the bid interval Taken from the Fast Flower Delivery use case
47.
48. Context has three distinct roles (which may be combined) Partition the incoming events The events that relate to each customer are processed separately Grouping events together Different processing for Different context partitions Determining the processing Grouping together events that happened in the same hour at the same location
49. Context Definition A context is a named specification of conditions that groups event instances so that they can be processed in a related way. It assigns each event instance to one or more context partitions . A context may have one or more context dimensions. Temporal Spatial State Oriented Segmentation Oriented
51. Context Types Examples Spatial State Oriented Temporal Context “ Every day between 08:00 and 10:00 AM ” “ A week after borrowing a disk” “ A time window bounded by TradingDayStart and TradingDayEnd events ” “ 3 miles from the traffic accident location ” “ Within an authorized zone in a manufactory ” “ All Children 2-5 years old” “ All platinum customers” “ Airport security level is red” “ Weather is stormy” Segmentation Oriented
59. Microsoft Streaminsights var topfive = (from window in inputStream.Snapshot() from e in window orderby e.f ascending, e.i descending select e).Take(5); var avgCount = from v in inputStream group v by v.i % 4 into eachGroup from window in eachGroup.Snapshot() select new { avgNumber = window.Avg(e => e.number) };
60. Esper EPL – FFD Example /* * Not delivered up after 10 mins (600 secs) of the request target delivery time */ insert into AlertW(requestId, message, driver, timestamp) select a.requestId, "not delivered", a.driver, current_timestamp() from pattern[ every a=Assignment (timer:interval(600 + (a.deliveryTime-current_timestamp)/1000) and not DeliveryConfirmation(requestId = a.requestId) and not NoOneToReceiveMSG(requestId = a.requestId)) ];
65. Performance benchmarks There is a large variance among applications, thus a collection of benchmarks should be devised, and each application should be classified to a benchmark Some classification criteria: Application complexity Filtering rate Required Performance metrics
66. Performance benchmarks – cont. Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases. Volume 13 Issue 2, 2004. Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260 . Previous studies indicate that there is a major performance degradation as application complexity increases.
67. Throughput Input throughput output throughput Processing throughput Measures: number of input events that the system can digest within a given time interval Measures: Total processing times / # of event processed within a given time interval Measures: # of events that were emitted to consumers within a given time interval
68. Latency latency In the E2E level it is defined as the elapsed time FROM the time-point when the producer emits an input event TO the time-point when the consumer receives and output event The latency definition But – input event may not result in output event: It may be filtered out, participate in a pattern but does not result in pattern detection, or participates in deferred operation (e.g. aggregation) Similar definitions for the EPA level, or path level
69.
70. Scalability in event processing: various dimensions # of producers # of input events # of EPA types # of concurrent runtime instances # of concurrent runtime contexts Internal state size # of consumers # of derived events Processing complexity
71. Scalability solutions Significant progress in scalability enablers that provides feasibility for a system based on large scale event sources, event quantities, computations and actuators Smart placements of processing elements with dynamic load balancing Fault tolerance techniques enable trustable automatic processing Virtualization (scale-in) Use of parallel processing – multi-core and GPU processors – without extra programming efforts
75. Uncertain situations False positive: The pattern is matched; The real-world situation does not occur False negative: The pattern is not matched; The real-world situation occurs
102. Proactive Event-Driven Computing (1) predict (states, events) Real-time decision Proactive action events Event processing (filter, transform, match patterns) events Detect / Derive Predict Decide Act events Proactive event-driven computing is a new paradigm aimed at predicting the occurrence of problems or opportunities before they occur, and changing the course of actions to mitigate or leverage them
103. Energy Scenario Detect Predict Decide Act Consumption Level Production Level State Generator Failure Generator Fixed Weather Forecast (sun, wind, temp, storm) Consumption Forecast Production Forecast Outage Prediction Many Failed Generators Prediction Call for Urgent Generators Fix Activate Expensive Diesel Generators Declare “Peak Hours” for Tomorrow Activate Rolling Blackout
104. Detect Monitor shipment progress and various related alerts (traffic, cargo handling time at airport, carriers being late) Predict According to current route, the shipment will be 3 hours late and we will incur high penalty Decide Find alternative route which (given new condition) is faster than previous route Act Generate cargo reservations, reroute shipment Critical Shipment Logistics
105. Personal reschedule Detect I got out of the house 20 Minutes late; there are three spots of traffic congestion on the way to the office; it is raining; and I have an important meeting in 25 minutes! Predict I am not going to get to the meeting, not even close! Decide Check whether there is a qualified person for this meeting that can replace me and has lower priority task for the duration of this meeting and reschedule his/her other obligations; Alternatively, check if there Is another time-slot later on the day for which the meeting can be rescheduled and get a decision! Act Notify all involved on their reschedule.
106. Electric car – battery replacement overload Detect Tracking the cars driving within a certain area and their battery status. Predict In 2 hours the service stations in the area will be out of charged batteries. Decide Whether there are available spare batteries nearby that can be shipped via car, or a helicopter need to be dispatched to ship batteries from the central store. Act Load batteries on selected means of transportation and start the journey! Background: A company leases electric cars that can drive up to 100 miles; it provides both personal and public battery charge spots, and robotic battery replacement service stations as part of the lease.
107. Portfolio tuning Detect Track corporate actions, news, exchange prices, and rumors about all securities in my portfolio Predict My portfolio is going to exceed my personal risk limit within 1 hour Decide Mark the securities to be sold and best timing to sell, find an alternative to buy that retain the risk limit. Act Buy/Sell orders
108.
109.
110.
111.
112.
113. Correctness The ability of a developer to create correct implementation for all cases (including the boundaries) Observation: A substantial amount of effort is invested today in many of the tools to workaround the inability of the language to easily create correct solutions
114. Some correctness topics The right interpretation of language constructs The right order of events The right classification of events to windows
115. The right interpretation of language constructs – example All (E1, E2) – what do we mean? A customer both sells and buys the same security in value of more than $1M within a single day Deal fulfillment: Package arrival and payment arrival 6/3 10:00 7/3 11:00 8/3 11:00 8/3 14:00
116. Fine tuning of the semantics (I) When should the derived event be emitted? When the Pattern is matched ? At the window end?
117. Fine tuning of the semantics (II) How many instances of derived events should be emitted? Only once? Every time there is a match ?
118. Fine tuning of the semantics (III) What happens if the same event happens several times? Only one – first, last, higher/lower value on some predicate? All of them participate in a match?
119. Fine tuning of the semantics (IV) Can we consume or reuse events that participate in a match?
120.
121.
122. Ordering in a distributed environment - possible issues Even if the occurrence time of an event is accurate, it might arrive after some processing has already been done If we used occurrence time of an event as reported by the sources it might not be accurate, due to clock accuracy in the source Most systems order event by detection time – but events may switch their order on the way
123. Clock accuracy in the source Clock synchronization Time server, example: http:// tf.nist.gov/service/its.htm
124.
125. Retrospective compensation Out of order event Recalculation Retraction of previous EPA results Not always possible!
126. Classification to windows - scenario Calculate Statistics for each Player (aggregate per quarter) Calculate Statistics for each Team (aggregate per quarter) Window classification: Player statistics are calculated at the end of each quarter Team statistics are calculated at the end of each quarter based on the players events arrived within the same quarter All instances of player statistics that occur within a quarter window must be classified to the same window, even if they are derived after the window termination.
128. Event processing is an emerging technology Potential for mutually beneficial interaction with AI Make the next generation A vehicle to substantially Change the world Already attracted coverage of analysts and all major software vendors Event Patterns It barely scratched the surface Of its potential