SlideShare ist ein Scribd-Unternehmen logo
1 von 42
Downloaden Sie, um offline zu lesen
Towards Interactive Network Forensics and
            Incident Response

             Matthias Vallentin
             UC Berkeley / ICSI
            vallentin@icir.org

              Boundary Tech Talk
               San Francisco, CA

              November 17, 2011
Motivation

What do the following activities have in common?
    Network troubleshooting
    Incident response
    Network forensics




                                                   2 / 36
Motivation

What do the following activities have in common?
    Network troubleshooting
    Incident response
    Network forensics

 → Data-intensive analysis of past activity
 → Interactive response times often critical




                                                   2 / 36
Motivation

What do the following activities have in common?
    Network troubleshooting
    Incident response
    Network forensics

 → Data-intensive analysis of past activity
 → Interactive response times often critical




“How to build a platform that efficiently supports these activities?”



                                                                2 / 36
Outline



1. Incident Response and Network Forensics



2. Operational Network Monitoring using Bro



3. Building an Interactive Analytics Platform




                                                3 / 36
About


4th -year PhD student at UC Berkeley, advised by Vern Paxson
Working with researchers at ICSI/ICIR and the AMPlab
Interests
   Large-scale network intrusion detection
   High-performance traffic analysis
   Network forensics and incident response
 → with strong operational emphasis
Projects
    The Bro network security monitor
    VAST: Visibility Across Space and Time
    HILTI: High-Level Intermediate Language for Traffic Inspection




                                                                   4 / 36
Outline



1. Incident Response and Network Forensics



2. Operational Network Monitoring using Bro



3. Building an Interactive Analytics Platform




                                                5 / 36
Use Case #1: Classic Incident Response


    Goal: fast and comprehensive analysis of security incidents
   Often begins with an external piece of intelligence
        “IP X serves malware over HTTP”
        “This MD5 hash is malware”
        “Connections to 128.11.5.0/27 at port 42000 are malicious”
   Analysis style: Ad-hoc, interactive, several refinements/adaptions
   Typical operations
        Filter: project, select
        Aggregate: mean, sum, quantile, min/max, histogram, top-k,
        unique
⇒ Concrete starting point, then widen scope (bottom-up)




                                                                       6 / 36
Use Case #2: Network Troubleshooting

    Goal: find root cause of component failure
   Often no specific hint, merely symptomatic feedback
       “I can’t access my Gmail”
   Typical operations
       Zoom: slice activity at different granularities
            Time: seconds, minutes, days, . . .
            Space: layer 2/3/4/7, host, subnet, port, URL, . . .
       Study time series data of activity aggregates
       Find abnormal activity
            “Today we see 20% less outbound DNS compared to yesterday”
            Infer dependency graphs: use joint behavior from past to asses present
            impact [KMV+ 09]
            Judicious machine learning [SP10]
⇒ No concrete starting point, narrow scope (top-down)


                                                                               7 / 36
Use Case #3: Combating Insider Abuse

    Goal: uncover policy violations of personnel
   Analysis procedure: connect the dots
   Insider attack:
        Chain of authorized actions, hard to detect individually
        E.g., data exfiltration
          1. User logs in to internal machine
          2. Copies sensitive document to local machine
          3. Sends document to third party via email
   Typical operations
        Compare activity profiles
             “Jon never logs in to our backup machine at 3am”
             “Seth accessed 10x more files on our servers today”
⇒ Relate temporally distant events, behavior-based detection



                                                                   8 / 36
Outline



1. Incident Response and Network Forensics



2. Operational Network Monitoring using Bro



3. Building an Interactive Analytics Platform




                                                9 / 36
Basic Network Monitoring



               Internet             Tap          Local Network




                                   Monitor




Sites
        UC Berkeley (10 Gbps, 50,000 hosts)
        NCSA, IL (8×10 Gbps, 10,000 hosts)
        LBNL, Berkeley (10 Gbps, 12,000 hosts)
        ICSI, Berkeley (100 Mbps, 250 hosts)
        AirJaldi, India (10 Mbps, 500 hosts)


                                                                 10 / 36
High-Performance Network Monitoring:
     The NIDS Cluster [VSL+ 07]

      Internet                    Tap                Local Network




                                Frontend




                 Worker   ...   Worker     ...   Worker



                                                 Proxy

                                Manager



   Packets
   Logs
   State                         User


                                                                     11 / 36
The Bro Cluster


                                   Internet              Tap               Local Network


We run it operationally at:                            Frontend
    UC Berkeley (26 workers)
    LBNL (15 workers)
                                                                                 Proxy
    NCSA (10 4-core workers)                  Worker   Worker     Worker

Runs at numerous large sites:                                                    Proxy




                                                          ...




                                                                                   ...
    Industry                                  Worker   Worker     Worker
                                                                                 Proxy
    Academia
    Government
                                Packets
                                Logs
                                                       Manager
                                State




                                                                                         12 / 36
The Bro Network Security Monitor

    Fundamentally different from other IDS
    Real-time network analysis framework    User Interface
    Policy-neutral at the core                         Logs        Notifications
    Highly stateful
                                                       Script Interpreter

Key components
                                                                  Events
 1. Event engine
         TCP stream reassembly
                                                        Event Engine
         Protocol analysis
         Policy-neutral
                                                                  Packets
 2. Script interpreter
         “Domain-specific Python”                             Network
         Generate extensive logs
         Apply site policy


                                                                                  13 / 36
From Packets to High-Level Descriptions of Activity

Event declaration
 type connection: record { orig: addr, resp: addr, ... }

 event connection_established(c: connection)
 event http_request(c: connection, method: string, URI: string)
 event http_reply(c: connection, status: string, data: string)




                                                                  14 / 36
From Packets to High-Level Descriptions of Activity

Event declaration
 type connection: record { orig: addr, resp: addr, ... }

 event connection_established(c: connection)
 event http_request(c: connection, method: string, URI: string)
 event http_reply(c: connection, status: string, data: string)



Event instantiation
 connection_established({127.0.0.1, 128.32.244.172, ... })
 http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/index.html")
 http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "<!DOCTYPE ht..")
 http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/favicon.ico")
 http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "xBExEFx..")
 connection_established({127.0.0.1, 128.32.112.224, ... })



                                                                    14 / 36
Event Extraction with Bro
Event and data model
     Rich-typed: first-class networking types (addr, port, subnet, . . . )
     Deep: across the whole network stack
     Fine-grained: detailed protocol-level information
     Expressive: nested data with container types (aka. semi-structured)


 Messages       Application             http_request, smtp_reply, ssl_certificate


 Byte stream     Transport              new_connection, udp_request


 Packets       (Inter)Network           new_packet, packet_contents


 Frames            Link                 arp_request, arp_reply




                                                                             15 / 36
After the Fact: Bro Logs
      Policy-neutral by default: no notion of good or bad
             Forensic investigations highly benefit from unbiased information
      Flexible output formats: ASCII, binary, DB, custom

% more conn.log
#fields ts        id.orig_h         id.orig_p   id.resp_h         id.resp_p   proto   service   duration   obytes ..
1144876741.1198   192.150.186.169   53115       82.94.237.218     80          tcp     http      16.14929   435
1144876612.6063   192.150.186.169   53090       198.189.255.82    80          tcp     http      4.437460   8661
1144876596.5597   192.150.186.169   53051       193.203.227.129   80          tcp     http      0.372440   461
1144876606.7789   192.150.186.169   53082       198.189.255.73    80          tcp     http      0.597711   337
1144876741.4693   192.150.186.169   53116       82.94.237.218     80          tcp     http      16.02667   3027
1144876745.6102   192.150.186.169   53117       66.102.7.99       80          tcp     http      1.004346   422
1144876605.6847   192.150.186.169   53075       207.151.118.143   80          tcp     http      0.029663   347



% more http.log
#fields ts        id.orig_h         id.orig_p   host               uri                     status_code     user_agent ..
1144876741.6335   192.150.186.169   53116       docs.python.org    /lib/lib.css            200             Mozilla/5.0
1144876742.1687   192.150.186.169   53116       docs.python.org    /icons/previous.png     304             Mozilla/5.0
1144876741.2838   192.150.186.169   53115       docs.python.org    /lib/lib.html           200             Mozilla/5.0
1144876742.3337   192.150.186.169   53116       docs.python.org    /icons/up.png           304             Mozilla/5.0
1144876742.3337   192.150.186.169   53116       docs.python.org    /icons/next.png         304             Mozilla/5.0
1144876742.3337   192.150.186.169   53116       docs.python.org    /icons/contents.png     304             Mozilla/5.0
1144876742.3337   192.150.186.169   53116       docs.python.org    /icons/modules.png      304             Mozilla/5.0
1144876742.3338   192.150.186.169   53116       docs.python.org    /icons/index.png        304             Mozilla/5.0
1144876745.6144   192.150.186.169   53117       www.google.com     /                       200             Mozilla/5.0


                                                                                                                 16 / 36
After the Fact: Bro Logs




                           17 / 36
Log Analysis


What do we do with Bro logs?
    Process (ad-hoc analysis)
    Summarize (time series data, histogram/top-k, quantile)
    Correlate (machine learning, statistical tests)
    Age (elevate old data into higher levels of abstraction)
    Visualize




                                                               18 / 36
Log Analysis


What do we do with Bro logs?
    Process (ad-hoc analysis)
    Summarize (time series data, histogram/top-k, quantile)
    Correlate (machine learning, statistical tests)
    Age (elevate old data into higher levels of abstraction)
    Visualize
How do we do it?
    All eggs in one basket
         SIEM: Splunk, ArcSight, NarusInsight, . . . $$$
         VAST
    In-situ processing
         Tools of the trade (awk, sort, uniq, . . . )
         MapReduce / Hadoop




                                                               18 / 36
Outline



1. Incident Response and Network Forensics



2. Operational Network Monitoring using Bro



3. Building an Interactive Analytics Platform




                                                19 / 36
From Ephemeral to Persistent Activity
   Bro events
                                                        User Interface
      Policy-neutral activity
      Ephemeral                                                       Logs        Notifications

      Only inside the Bro process
                                                                      Script Interpreter
    → Can I haz access?
   Broccoli                             3rd-party                                Events
                                       Application
       Send/Receive Bro events




                                                               Comm
                                                     Events
                                        Broccoli                           Event Engine
       Written in C
       Language bindings
                                                                                 Packets
              Ruby
              Python
                                                                          Network
              Perl
→ Send-them-while-they-are-hot

                (Broccoli = Bro client communications library)


                                                                                                 20 / 36
From Ephemeral to Persistent Activity


                               Bro


 Apache


                     Events   Query   Result
 Broccoli   Events



 OpenSSH

                                               Query


            Events                             Result
                                                        User
 Broccoli




                                                               21 / 36
Today’s Open-Source Solutions for Analytics




                                              22 / 36
Caveats in Real-Time Analytics


1. Getting poor performance
       Batch processing (MapReduce)
       Architectural flaws (inflexible MQ)
       Bloated runtime (Java)
2. Losing domain-specific context
       Typing
       Nesting
       Causality

                        “Can we do better?”




                                                23 / 36
Inspiration

1. Dremel
         Columnar storage
         Nested data model
2. Bigtable
         Sharding: distributed tablets
3. GFS
         Single master with meta data
         Locate chunks via master
4. Sawzall
         Aggregators: collection, sample, sum, maximum, quantile, top-k,
         unique
5. FastBit
         Bitmap indexes “work” for high-cardinality attributes



                                                                     24 / 36
Design Philosophy Touch Stones [Lam11]
Storage
    Keep data sorted → reduce seeks, easy random entry
    Shard with access locality → minimize involved nodes
    Store data in columns → don’t waste I/O
    Use append-only disk format → avoid expensive index updates




                                                                  25 / 36
Design Philosophy Touch Stones [Lam11]
Storage
    Keep data sorted → reduce seeks, easy random entry
    Shard with access locality → minimize involved nodes
    Store data in columns → don’t waste I/O
    Use append-only disk format → avoid expensive index updates

Compute
    Use disk appropriately → large sequential reads
    Trade CPU for I/O → type-specific, aggressive compression
    Use pipelined parallelism → hide latency
    Ship compute to data → aggregation serving tree




                                                                  25 / 36
Design Philosophy Touch Stones [Lam11]
Storage
    Keep data sorted → reduce seeks, easy random entry
    Shard with access locality → minimize involved nodes
    Store data in columns → don’t waste I/O
    Use append-only disk format → avoid expensive index updates

Compute
    Use disk appropriately → large sequential reads
    Trade CPU for I/O → type-specific, aggressive compression
    Use pipelined parallelism → hide latency
    Ship compute to data → aggregation serving tree

Query
    Make it user-friendly → declarative query interface
    Provide query hooks → support complex analysis
                                                                  25 / 36
VAST: Visibility Across Space and Time


Visibility
     Deep understanding of the data
     Visualization: you know how to do that already. . .
Across space:
     Unify heterogeneous data formats
     One query language
             Apache logs, SSH logs, Bro events, sensor data, . . .
Across time:
  1. From the ancient past (old historical data)
  2. To subscribing to data that may arrive in the future




                                                                     26 / 36
Queries


Two types
 1. Search: historical query
 2. Feed: live query
 → use case: crawl archive first, then make query permanent
Unify two ends of a spectrum
                           Live                  Historical
      Operation            Push                      Pull
      Latency          O(|Xresult |)             O(|Xdata |)
      Data location     In-memory           Disk (ideally cached)
      Flexibility       Predefined            Ad-hoc, adjustable
      Cost            Pay-As-You-Go              Lumpsum




                                                                    27 / 36
VAST: Architecture Overview


                                     Ingest          Query
Distributed architecture
    Elasticity via MQ middle layer
                                                     Store
    Few component dependencies
DFS: fault-tolerance, replication
Archive: key-value store
    Contains serialized events       Archive

Index: sharded column-store
                                                     Index
    Compressed bitmap indexes
In-memory store
    Caches tablets (LRU)
                                               DFS
    Flushes in batches




                                                             28 / 36
VAST: Ingestion Architecture


                                                       Store
                                      Event
                                                      Indexer
                                      Router
1. Events arrive at Event Router
   1.1 Assign UUID x                                       write
                                      put
   1.2 Put (x, event) in archive                      Tablets
   1.3 Forward event to Indexer
                                                           ripe?
2. Indexer writes event into tablet
                                                       Tablet
   and updates indexes                                Manager

3. Tablet Manager flushes “ripe”                            flush

   tablets                            Archive

       Capacity (space/rows)
                                                      Tablets
       Lifetime
                                                       Index




                                                DFS

                                                                   29 / 36
VAST: Query Architecture


                                                       Store
                                      Query            Query
                                     Manager           Proxy

                                                            query
1. User or NIDS issues query         get
                                                      Tablets
2. Query Manager distributes it
   to relevant nodes                                        LRU

3. Tablet Manager load tablets                         Tablet
                                                      Manager
4. Query Proxy hits index                                   flush
   a Returns direct result           Archive         load

   b Returns set of UUIDs
                                                      Tablets

                                                       Index




                                               DFS

                                                                    30 / 36
Bitmap Indexes


                                        Data        Bitmap Index
                                               b0     b1   b2      b3
Column cardinality: # distinct values
                                         2     0      0    1       0
One bitmap bi for each value i
                                         1     0      1    0       0
Sparse, but compressible
                                         2     0      0    1       0
    WAH [WOSN01]
    COMPAX [FSV10]                       0     1      0    0       0
    Consice [CDP10]                      0     1      0    0       0
Can operate on compressed bitmaps
                                         1     0      1    0       0
    No need to decompress
                                         3     0      0    0       1




                                                                   31 / 36
Conclusion




1. Motivation: incident response, network troubleshooting, insider abuse
2. The Bro network security monitor
       High-performance network monitoring
       Expressive representation of activity
       Publish/subscribe event model
3. Design sketch of a distributed analytics platform




                                                                     32 / 36
Questions?




             33 / 36
References I

A. Colantonio and R. Di Pietro.
Concise: Compressed ’n’ Composable Integer Set.
Information Processing Letters, 110(16):644–650, 2010.
Francesco Fusco, Marc Ph. Stoecklin, and Michail Vlachos.
NET-FLi: On-the-fly Compression, Archiving and Indexing of
Streaming Network Traffic.
Proceedings of the VLDB Endowment, 3:1382–1393, September 2010.
Srikanth Kandula, Ratul Mahajan, Patrick Verkaik, Sharad Agarwal,
Jitendra Padhye, and Paramvir Bahl.
Detailed Diagnosis in Enterprise Networks.
In Proceedings of the ACM SIGCOMM 2009 Conference on Data
Communication, SIGCOMM ’09, pages 243–254, New York, NY, USA,
2009. ACM.


                                                             34 / 36
References II


Andrew Lamb.
Building Blocks for Large Analytic Systems.
In 5th Extremely Large Databases Conference, XLDB ’11, Menlo Park,
California, October 2011.
Robin Sommer and Vern Paxson.
Outside the Closed World: On Using Machine Learning for Network
Intrusion Detection.
In Proceedings of the 2010 IEEE Symposium on Security and Privacy,
SP ’10, pages 305–316, Washington, DC, USA, 2010. IEEE Computer
Society.




                                                               35 / 36
References III


Matthias Vallentin, Robin Sommer, Jason Lee, Craig Leres, Vern
Paxson, and Brian Tierney.
The NIDS Cluster: Scalably Stateful Network Intrusion Detection on
Commodity Hardware.
In Proceedings of the 10th International Conference on Recent
Advances in Intrusion Detection, RAID’07, pages 107–126.
Springer-Verlag, September 2007.
Kesheng Wu, Ekow J. Otoo, Arie Shoshani, and Henrik Nordberg.
Notes on Design and Implementation of Compressed Bit Vectors.
Technical Report LBNL-3161, Lawrence Berkeley National Laboratory,
Berkeley, CA, USA, 94720, 2001.




                                                                36 / 36

Weitere ähnliche Inhalte

Was ist angesagt?

Cloud-forensics
Cloud-forensicsCloud-forensics
Cloud-forensicsanupriti
 
Cloud Computing Forensic Science
 Cloud Computing Forensic Science  Cloud Computing Forensic Science
Cloud Computing Forensic Science David Sweigert
 
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATIONSECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATIONIJNSA Journal
 
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMSDDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMSijfls
 
Statistical analysis of HTTPS reachability
Statistical analysis of HTTPS reachabilityStatistical analysis of HTTPS reachability
Statistical analysis of HTTPS reachabilityAPNIC
 
A Steganography-based Covert Keylogger
A Steganography-based Covert KeyloggerA Steganography-based Covert Keylogger
A Steganography-based Covert KeyloggerCSCJournals
 
Layered Approach for Preprocessing of Data in Intrusion Prevention Systems
Layered Approach for Preprocessing of Data in Intrusion Prevention SystemsLayered Approach for Preprocessing of Data in Intrusion Prevention Systems
Layered Approach for Preprocessing of Data in Intrusion Prevention SystemsEditor IJCATR
 
Network Security: Experiment of Network Health Analysis At An ISP
Network Security: Experiment of Network Health Analysis At An ISPNetwork Security: Experiment of Network Health Analysis At An ISP
Network Security: Experiment of Network Health Analysis At An ISPCSCJournals
 
How to detect middleboxes guidelines on a methodology
How to detect middleboxes guidelines on a methodologyHow to detect middleboxes guidelines on a methodology
How to detect middleboxes guidelines on a methodologycsandit
 
Burning Down the Haystack to Find the Needle: Security Analytics in Action
Burning Down the Haystack to Find the Needle:  Security Analytics in ActionBurning Down the Haystack to Find the Needle:  Security Analytics in Action
Burning Down the Haystack to Find the Needle: Security Analytics in ActionJosh Sokol
 
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENT
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENTUSING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENT
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENTIJNSA Journal
 
Network Attack and Intrusion Prevention System
Network Attack and  Intrusion Prevention System Network Attack and  Intrusion Prevention System
Network Attack and Intrusion Prevention System Deris Stiawan
 
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...IJNSA Journal
 
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...AngelinaJacobs2
 

Was ist angesagt? (19)

Cloud-forensics
Cloud-forensicsCloud-forensics
Cloud-forensics
 
Cloud Computing Forensic Science
 Cloud Computing Forensic Science  Cloud Computing Forensic Science
Cloud Computing Forensic Science
 
4777.team c.final
4777.team c.final4777.team c.final
4777.team c.final
 
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATIONSECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
SECURITY CONSIDERATION IN PEER-TO-PEER NETWORKS WITH A CASE STUDY APPLICATION
 
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMSDDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMS
 
Statistical analysis of HTTPS reachability
Statistical analysis of HTTPS reachabilityStatistical analysis of HTTPS reachability
Statistical analysis of HTTPS reachability
 
Blug Talk
Blug TalkBlug Talk
Blug Talk
 
A Steganography-based Covert Keylogger
A Steganography-based Covert KeyloggerA Steganography-based Covert Keylogger
A Steganography-based Covert Keylogger
 
Layered Approach for Preprocessing of Data in Intrusion Prevention Systems
Layered Approach for Preprocessing of Data in Intrusion Prevention SystemsLayered Approach for Preprocessing of Data in Intrusion Prevention Systems
Layered Approach for Preprocessing of Data in Intrusion Prevention Systems
 
Network Security: Experiment of Network Health Analysis At An ISP
Network Security: Experiment of Network Health Analysis At An ISPNetwork Security: Experiment of Network Health Analysis At An ISP
Network Security: Experiment of Network Health Analysis At An ISP
 
How to detect middleboxes guidelines on a methodology
How to detect middleboxes guidelines on a methodologyHow to detect middleboxes guidelines on a methodology
How to detect middleboxes guidelines on a methodology
 
Paper1
Paper1Paper1
Paper1
 
Burning Down the Haystack to Find the Needle: Security Analytics in Action
Burning Down the Haystack to Find the Needle:  Security Analytics in ActionBurning Down the Haystack to Find the Needle:  Security Analytics in Action
Burning Down the Haystack to Find the Needle: Security Analytics in Action
 
Network Miner Network forensics
Network Miner Network forensicsNetwork Miner Network forensics
Network Miner Network forensics
 
Ew25914917
Ew25914917Ew25914917
Ew25914917
 
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENT
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENTUSING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENT
USING A DEEP UNDERSTANDING OF NETWORK ACTIVITIES FOR SECURITY EVENT MANAGEMENT
 
Network Attack and Intrusion Prevention System
Network Attack and  Intrusion Prevention System Network Attack and  Intrusion Prevention System
Network Attack and Intrusion Prevention System
 
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...
AUTHENTICATION USING TRUST TO DETECT MISBEHAVING NODES IN MOBILE AD HOC NETWO...
 
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...
[❤PDF❤] The Basics of Digital Forensics The Primer for Getting Started in Dig...
 

Ähnlich wie Matthias Vallentin - Towards Interactive Network Forensics and Incident Response, Boundary Tech Talks November 17, 2011

LO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware AnalysisLO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware AnalysisPietro De Nicolao
 
Collecting and analyzing network-based evidence
Collecting and analyzing network-based evidenceCollecting and analyzing network-based evidence
Collecting and analyzing network-based evidenceCSITiaesprime
 
Novetta Cyber Analytics
Novetta Cyber AnalyticsNovetta Cyber Analytics
Novetta Cyber AnalyticsNovetta
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009Ian Foster
 
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentApache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentHostedbyConfluent
 
A comparative study of social network analysis tools
A comparative study of social network analysis toolsA comparative study of social network analysis tools
A comparative study of social network analysis toolsDavid Combe
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxTrayan Iliev
 
Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environmentsDocker, Inc.
 
Evolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in MotionEvolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in Motionconfluent
 
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN Algorithm
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN AlgorithmIRJET - Network Traffic Monitoring and Botnet Detection using K-ANN Algorithm
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN AlgorithmIRJET Journal
 
Triple DES Model
Triple DES ModelTriple DES Model
Triple DES ModelJean Arnett
 
Nt1310 Data Link Layer
Nt1310 Data Link LayerNt1310 Data Link Layer
Nt1310 Data Link LayerMeghan Howard
 
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?Cloud Camp Milan 2K9 Telecom Italia: Where P2P?
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?Gabriele Bozzi
 
CloudCamp Milan 2009: Telecom Italia
CloudCamp Milan 2009: Telecom ItaliaCloudCamp Milan 2009: Telecom Italia
CloudCamp Milan 2009: Telecom ItaliaGabriele Bozzi
 
OSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOpenStorageSummit
 
DoS Forensic Exemplar Comparison to a Known Sample
DoS Forensic Exemplar Comparison to a Known SampleDoS Forensic Exemplar Comparison to a Known Sample
DoS Forensic Exemplar Comparison to a Known SampleCSCJournals
 
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...IRJET Journal
 
Anomaly Detection at Scale
Anomaly Detection at ScaleAnomaly Detection at Scale
Anomaly Detection at ScaleJeff Henrikson
 

Ähnlich wie Matthias Vallentin - Towards Interactive Network Forensics and Incident Response, Boundary Tech Talks November 17, 2011 (20)

LO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware AnalysisLO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
LO-PHI: Low-Observable Physical Host Instrumentation for Malware Analysis
 
Collecting and analyzing network-based evidence
Collecting and analyzing network-based evidenceCollecting and analyzing network-based evidence
Collecting and analyzing network-based evidence
 
Novetta Cyber Analytics
Novetta Cyber AnalyticsNovetta Cyber Analytics
Novetta Cyber Analytics
 
Computing Outside The Box June 2009
Computing Outside The Box June 2009Computing Outside The Box June 2009
Computing Outside The Box June 2009
 
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentApache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
 
A comparative study of social network analysis tools
A comparative study of social network analysis toolsA comparative study of social network analysis tools
A comparative study of social network analysis tools
 
Making Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFluxMaking Machine Learning Easy with H2O and WebFlux
Making Machine Learning Easy with H2O and WebFlux
 
Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environments
 
Sideband_SB_020316
Sideband_SB_020316Sideband_SB_020316
Sideband_SB_020316
 
Evolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in MotionEvolution from EDA to Data Mesh: Data in Motion
Evolution from EDA to Data Mesh: Data in Motion
 
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN Algorithm
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN AlgorithmIRJET - Network Traffic Monitoring and Botnet Detection using K-ANN Algorithm
IRJET - Network Traffic Monitoring and Botnet Detection using K-ANN Algorithm
 
Triple DES Model
Triple DES ModelTriple DES Model
Triple DES Model
 
Nt1310 Data Link Layer
Nt1310 Data Link LayerNt1310 Data Link Layer
Nt1310 Data Link Layer
 
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?Cloud Camp Milan 2K9 Telecom Italia: Where P2P?
Cloud Camp Milan 2K9 Telecom Italia: Where P2P?
 
CloudCamp Milan 2009: Telecom Italia
CloudCamp Milan 2009: Telecom ItaliaCloudCamp Milan 2009: Telecom Italia
CloudCamp Milan 2009: Telecom Italia
 
OSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal SternOSS Presentation Keynote by Hal Stern
OSS Presentation Keynote by Hal Stern
 
DoS Forensic Exemplar Comparison to a Known Sample
DoS Forensic Exemplar Comparison to a Known SampleDoS Forensic Exemplar Comparison to a Known Sample
DoS Forensic Exemplar Comparison to a Known Sample
 
Grid Computing
Grid ComputingGrid Computing
Grid Computing
 
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...
EFFICIENT IDENTIFICATION AND REDUCTION OF MULTIPLE ATTACKS ADD VICTIMISATION ...
 
Anomaly Detection at Scale
Anomaly Detection at ScaleAnomaly Detection at Scale
Anomaly Detection at Scale
 

Kürzlich hochgeladen

Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncObject Automation
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8DianaGray10
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 

Kürzlich hochgeladen (20)

Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation Inc
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 

Matthias Vallentin - Towards Interactive Network Forensics and Incident Response, Boundary Tech Talks November 17, 2011

  • 1. Towards Interactive Network Forensics and Incident Response Matthias Vallentin UC Berkeley / ICSI vallentin@icir.org Boundary Tech Talk San Francisco, CA November 17, 2011
  • 2. Motivation What do the following activities have in common? Network troubleshooting Incident response Network forensics 2 / 36
  • 3. Motivation What do the following activities have in common? Network troubleshooting Incident response Network forensics → Data-intensive analysis of past activity → Interactive response times often critical 2 / 36
  • 4. Motivation What do the following activities have in common? Network troubleshooting Incident response Network forensics → Data-intensive analysis of past activity → Interactive response times often critical “How to build a platform that efficiently supports these activities?” 2 / 36
  • 5. Outline 1. Incident Response and Network Forensics 2. Operational Network Monitoring using Bro 3. Building an Interactive Analytics Platform 3 / 36
  • 6. About 4th -year PhD student at UC Berkeley, advised by Vern Paxson Working with researchers at ICSI/ICIR and the AMPlab Interests Large-scale network intrusion detection High-performance traffic analysis Network forensics and incident response → with strong operational emphasis Projects The Bro network security monitor VAST: Visibility Across Space and Time HILTI: High-Level Intermediate Language for Traffic Inspection 4 / 36
  • 7. Outline 1. Incident Response and Network Forensics 2. Operational Network Monitoring using Bro 3. Building an Interactive Analytics Platform 5 / 36
  • 8. Use Case #1: Classic Incident Response Goal: fast and comprehensive analysis of security incidents Often begins with an external piece of intelligence “IP X serves malware over HTTP” “This MD5 hash is malware” “Connections to 128.11.5.0/27 at port 42000 are malicious” Analysis style: Ad-hoc, interactive, several refinements/adaptions Typical operations Filter: project, select Aggregate: mean, sum, quantile, min/max, histogram, top-k, unique ⇒ Concrete starting point, then widen scope (bottom-up) 6 / 36
  • 9. Use Case #2: Network Troubleshooting Goal: find root cause of component failure Often no specific hint, merely symptomatic feedback “I can’t access my Gmail” Typical operations Zoom: slice activity at different granularities Time: seconds, minutes, days, . . . Space: layer 2/3/4/7, host, subnet, port, URL, . . . Study time series data of activity aggregates Find abnormal activity “Today we see 20% less outbound DNS compared to yesterday” Infer dependency graphs: use joint behavior from past to asses present impact [KMV+ 09] Judicious machine learning [SP10] ⇒ No concrete starting point, narrow scope (top-down) 7 / 36
  • 10. Use Case #3: Combating Insider Abuse Goal: uncover policy violations of personnel Analysis procedure: connect the dots Insider attack: Chain of authorized actions, hard to detect individually E.g., data exfiltration 1. User logs in to internal machine 2. Copies sensitive document to local machine 3. Sends document to third party via email Typical operations Compare activity profiles “Jon never logs in to our backup machine at 3am” “Seth accessed 10x more files on our servers today” ⇒ Relate temporally distant events, behavior-based detection 8 / 36
  • 11. Outline 1. Incident Response and Network Forensics 2. Operational Network Monitoring using Bro 3. Building an Interactive Analytics Platform 9 / 36
  • 12. Basic Network Monitoring Internet Tap Local Network Monitor Sites UC Berkeley (10 Gbps, 50,000 hosts) NCSA, IL (8×10 Gbps, 10,000 hosts) LBNL, Berkeley (10 Gbps, 12,000 hosts) ICSI, Berkeley (100 Mbps, 250 hosts) AirJaldi, India (10 Mbps, 500 hosts) 10 / 36
  • 13. High-Performance Network Monitoring: The NIDS Cluster [VSL+ 07] Internet Tap Local Network Frontend Worker ... Worker ... Worker Proxy Manager Packets Logs State User 11 / 36
  • 14. The Bro Cluster Internet Tap Local Network We run it operationally at: Frontend UC Berkeley (26 workers) LBNL (15 workers) Proxy NCSA (10 4-core workers) Worker Worker Worker Runs at numerous large sites: Proxy ... ... Industry Worker Worker Worker Proxy Academia Government Packets Logs Manager State 12 / 36
  • 15. The Bro Network Security Monitor Fundamentally different from other IDS Real-time network analysis framework User Interface Policy-neutral at the core Logs Notifications Highly stateful Script Interpreter Key components Events 1. Event engine TCP stream reassembly Event Engine Protocol analysis Policy-neutral Packets 2. Script interpreter “Domain-specific Python” Network Generate extensive logs Apply site policy 13 / 36
  • 16. From Packets to High-Level Descriptions of Activity Event declaration type connection: record { orig: addr, resp: addr, ... } event connection_established(c: connection) event http_request(c: connection, method: string, URI: string) event http_reply(c: connection, status: string, data: string) 14 / 36
  • 17. From Packets to High-Level Descriptions of Activity Event declaration type connection: record { orig: addr, resp: addr, ... } event connection_established(c: connection) event http_request(c: connection, method: string, URI: string) event http_reply(c: connection, status: string, data: string) Event instantiation connection_established({127.0.0.1, 128.32.244.172, ... }) http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/index.html") http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "<!DOCTYPE ht..") http_request({127.0.0.1, 128.32.244.172, ..}, "GET", "/favicon.ico") http_reply({127.0.0.1, 128.32.244.172, ..}, "200", "xBExEFx..") connection_established({127.0.0.1, 128.32.112.224, ... }) 14 / 36
  • 18. Event Extraction with Bro Event and data model Rich-typed: first-class networking types (addr, port, subnet, . . . ) Deep: across the whole network stack Fine-grained: detailed protocol-level information Expressive: nested data with container types (aka. semi-structured) Messages Application http_request, smtp_reply, ssl_certificate Byte stream Transport new_connection, udp_request Packets (Inter)Network new_packet, packet_contents Frames Link arp_request, arp_reply 15 / 36
  • 19. After the Fact: Bro Logs Policy-neutral by default: no notion of good or bad Forensic investigations highly benefit from unbiased information Flexible output formats: ASCII, binary, DB, custom % more conn.log #fields ts id.orig_h id.orig_p id.resp_h id.resp_p proto service duration obytes .. 1144876741.1198 192.150.186.169 53115 82.94.237.218 80 tcp http 16.14929 435 1144876612.6063 192.150.186.169 53090 198.189.255.82 80 tcp http 4.437460 8661 1144876596.5597 192.150.186.169 53051 193.203.227.129 80 tcp http 0.372440 461 1144876606.7789 192.150.186.169 53082 198.189.255.73 80 tcp http 0.597711 337 1144876741.4693 192.150.186.169 53116 82.94.237.218 80 tcp http 16.02667 3027 1144876745.6102 192.150.186.169 53117 66.102.7.99 80 tcp http 1.004346 422 1144876605.6847 192.150.186.169 53075 207.151.118.143 80 tcp http 0.029663 347 % more http.log #fields ts id.orig_h id.orig_p host uri status_code user_agent .. 1144876741.6335 192.150.186.169 53116 docs.python.org /lib/lib.css 200 Mozilla/5.0 1144876742.1687 192.150.186.169 53116 docs.python.org /icons/previous.png 304 Mozilla/5.0 1144876741.2838 192.150.186.169 53115 docs.python.org /lib/lib.html 200 Mozilla/5.0 1144876742.3337 192.150.186.169 53116 docs.python.org /icons/up.png 304 Mozilla/5.0 1144876742.3337 192.150.186.169 53116 docs.python.org /icons/next.png 304 Mozilla/5.0 1144876742.3337 192.150.186.169 53116 docs.python.org /icons/contents.png 304 Mozilla/5.0 1144876742.3337 192.150.186.169 53116 docs.python.org /icons/modules.png 304 Mozilla/5.0 1144876742.3338 192.150.186.169 53116 docs.python.org /icons/index.png 304 Mozilla/5.0 1144876745.6144 192.150.186.169 53117 www.google.com / 200 Mozilla/5.0 16 / 36
  • 20. After the Fact: Bro Logs 17 / 36
  • 21. Log Analysis What do we do with Bro logs? Process (ad-hoc analysis) Summarize (time series data, histogram/top-k, quantile) Correlate (machine learning, statistical tests) Age (elevate old data into higher levels of abstraction) Visualize 18 / 36
  • 22. Log Analysis What do we do with Bro logs? Process (ad-hoc analysis) Summarize (time series data, histogram/top-k, quantile) Correlate (machine learning, statistical tests) Age (elevate old data into higher levels of abstraction) Visualize How do we do it? All eggs in one basket SIEM: Splunk, ArcSight, NarusInsight, . . . $$$ VAST In-situ processing Tools of the trade (awk, sort, uniq, . . . ) MapReduce / Hadoop 18 / 36
  • 23. Outline 1. Incident Response and Network Forensics 2. Operational Network Monitoring using Bro 3. Building an Interactive Analytics Platform 19 / 36
  • 24. From Ephemeral to Persistent Activity Bro events User Interface Policy-neutral activity Ephemeral Logs Notifications Only inside the Bro process Script Interpreter → Can I haz access? Broccoli 3rd-party Events Application Send/Receive Bro events Comm Events Broccoli Event Engine Written in C Language bindings Packets Ruby Python Network Perl → Send-them-while-they-are-hot (Broccoli = Bro client communications library) 20 / 36
  • 25. From Ephemeral to Persistent Activity Bro Apache Events Query Result Broccoli Events OpenSSH Query Events Result User Broccoli 21 / 36
  • 26. Today’s Open-Source Solutions for Analytics 22 / 36
  • 27. Caveats in Real-Time Analytics 1. Getting poor performance Batch processing (MapReduce) Architectural flaws (inflexible MQ) Bloated runtime (Java) 2. Losing domain-specific context Typing Nesting Causality “Can we do better?” 23 / 36
  • 28. Inspiration 1. Dremel Columnar storage Nested data model 2. Bigtable Sharding: distributed tablets 3. GFS Single master with meta data Locate chunks via master 4. Sawzall Aggregators: collection, sample, sum, maximum, quantile, top-k, unique 5. FastBit Bitmap indexes “work” for high-cardinality attributes 24 / 36
  • 29. Design Philosophy Touch Stones [Lam11] Storage Keep data sorted → reduce seeks, easy random entry Shard with access locality → minimize involved nodes Store data in columns → don’t waste I/O Use append-only disk format → avoid expensive index updates 25 / 36
  • 30. Design Philosophy Touch Stones [Lam11] Storage Keep data sorted → reduce seeks, easy random entry Shard with access locality → minimize involved nodes Store data in columns → don’t waste I/O Use append-only disk format → avoid expensive index updates Compute Use disk appropriately → large sequential reads Trade CPU for I/O → type-specific, aggressive compression Use pipelined parallelism → hide latency Ship compute to data → aggregation serving tree 25 / 36
  • 31. Design Philosophy Touch Stones [Lam11] Storage Keep data sorted → reduce seeks, easy random entry Shard with access locality → minimize involved nodes Store data in columns → don’t waste I/O Use append-only disk format → avoid expensive index updates Compute Use disk appropriately → large sequential reads Trade CPU for I/O → type-specific, aggressive compression Use pipelined parallelism → hide latency Ship compute to data → aggregation serving tree Query Make it user-friendly → declarative query interface Provide query hooks → support complex analysis 25 / 36
  • 32. VAST: Visibility Across Space and Time Visibility Deep understanding of the data Visualization: you know how to do that already. . . Across space: Unify heterogeneous data formats One query language Apache logs, SSH logs, Bro events, sensor data, . . . Across time: 1. From the ancient past (old historical data) 2. To subscribing to data that may arrive in the future 26 / 36
  • 33. Queries Two types 1. Search: historical query 2. Feed: live query → use case: crawl archive first, then make query permanent Unify two ends of a spectrum Live Historical Operation Push Pull Latency O(|Xresult |) O(|Xdata |) Data location In-memory Disk (ideally cached) Flexibility Predefined Ad-hoc, adjustable Cost Pay-As-You-Go Lumpsum 27 / 36
  • 34. VAST: Architecture Overview Ingest Query Distributed architecture Elasticity via MQ middle layer Store Few component dependencies DFS: fault-tolerance, replication Archive: key-value store Contains serialized events Archive Index: sharded column-store Index Compressed bitmap indexes In-memory store Caches tablets (LRU) DFS Flushes in batches 28 / 36
  • 35. VAST: Ingestion Architecture Store Event Indexer Router 1. Events arrive at Event Router 1.1 Assign UUID x write put 1.2 Put (x, event) in archive Tablets 1.3 Forward event to Indexer ripe? 2. Indexer writes event into tablet Tablet and updates indexes Manager 3. Tablet Manager flushes “ripe” flush tablets Archive Capacity (space/rows) Tablets Lifetime Index DFS 29 / 36
  • 36. VAST: Query Architecture Store Query Query Manager Proxy query 1. User or NIDS issues query get Tablets 2. Query Manager distributes it to relevant nodes LRU 3. Tablet Manager load tablets Tablet Manager 4. Query Proxy hits index flush a Returns direct result Archive load b Returns set of UUIDs Tablets Index DFS 30 / 36
  • 37. Bitmap Indexes Data Bitmap Index b0 b1 b2 b3 Column cardinality: # distinct values 2 0 0 1 0 One bitmap bi for each value i 1 0 1 0 0 Sparse, but compressible 2 0 0 1 0 WAH [WOSN01] COMPAX [FSV10] 0 1 0 0 0 Consice [CDP10] 0 1 0 0 0 Can operate on compressed bitmaps 1 0 1 0 0 No need to decompress 3 0 0 0 1 31 / 36
  • 38. Conclusion 1. Motivation: incident response, network troubleshooting, insider abuse 2. The Bro network security monitor High-performance network monitoring Expressive representation of activity Publish/subscribe event model 3. Design sketch of a distributed analytics platform 32 / 36
  • 39. Questions? 33 / 36
  • 40. References I A. Colantonio and R. Di Pietro. Concise: Compressed ’n’ Composable Integer Set. Information Processing Letters, 110(16):644–650, 2010. Francesco Fusco, Marc Ph. Stoecklin, and Michail Vlachos. NET-FLi: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic. Proceedings of the VLDB Endowment, 3:1382–1393, September 2010. Srikanth Kandula, Ratul Mahajan, Patrick Verkaik, Sharad Agarwal, Jitendra Padhye, and Paramvir Bahl. Detailed Diagnosis in Enterprise Networks. In Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication, SIGCOMM ’09, pages 243–254, New York, NY, USA, 2009. ACM. 34 / 36
  • 41. References II Andrew Lamb. Building Blocks for Large Analytic Systems. In 5th Extremely Large Databases Conference, XLDB ’11, Menlo Park, California, October 2011. Robin Sommer and Vern Paxson. Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. In Proceedings of the 2010 IEEE Symposium on Security and Privacy, SP ’10, pages 305–316, Washington, DC, USA, 2010. IEEE Computer Society. 35 / 36
  • 42. References III Matthias Vallentin, Robin Sommer, Jason Lee, Craig Leres, Vern Paxson, and Brian Tierney. The NIDS Cluster: Scalably Stateful Network Intrusion Detection on Commodity Hardware. In Proceedings of the 10th International Conference on Recent Advances in Intrusion Detection, RAID’07, pages 107–126. Springer-Verlag, September 2007. Kesheng Wu, Ekow J. Otoo, Arie Shoshani, and Henrik Nordberg. Notes on Design and Implementation of Compressed Bit Vectors. Technical Report LBNL-3161, Lawrence Berkeley National Laboratory, Berkeley, CA, USA, 94720, 2001. 36 / 36