Persistent Data Structures

    Living in a world where nothing
   changes but everything evolves
                  - or -
A complete idiot's guide to immutability
Java                            Haskell



                            vs




● Warm, soft and cute            ● Strange, unfamiliar alien
● Imperative                     ● Purely functional
● Object oriented                ● Everything is different
● Just like good old             ● Shocking news! It's not
  Basic, but with classes          like Basic!
Haskell does not have variables!
Imagine a dialect of Java where everything is final by default
  class LinkedList {
   class Node {
     final Node next, prev;
     final Object value;
   }

      final Node head, tail;

      void add(final Object v) {
        for (final Node n = head; n != null; n = n.next) {
        ...
        }
      }
  }


   All fields, parameters and variables are automatically
 immutable, the final is implied everywhere, and there is no
                      way to get rid of it
Haskell does not have variables!
Callouts added to the same code: "But it doesn't make sense!", "It won't work!", "It does for me!"
What is a variable?

var·y/ˈve(ə)rē/
vary, varied, varying

 ● — verb (used with object)
Definition: to change or alter, as in form, appearance,
character, or substance

 ● — verb (used without object)
Definition: to undergo change in appearance, form, substance,
character, etc

 ● — synonyms:
modify, mutate
"Variables" in Haskell

 ● Must be assigned once declared

   YES: int a = 1;          NO: int a;

 ● Cannot be reassigned

   YES: final int a = 1; NO: a = 2;

These are mathematical variables, not imperative ones!
When everything is immutable

There is no notion of time:

 ● Functions take old values, produce new values, nothing is
   changed in-place
 ● It does not matter when a function was called, it only
   matters what arguments it was called with

There is no notion of identity:

 ● Everything is a value, complex data structures are values
   too
 ● There is no way to tell if a == b, only if a.equals(b)
 ● In other words, values are never identical to each other, but
   may be equal
I want my linked list!

Basic terminology:

 ● Ephemeral data structure — everything that is not
   persistent. Most Java data structures (lists, sets, etc.) are
   ephemeral.

 ● Persistent data structure — immutable data structure with
   history. No in-place modifications. Operations on it create
   new versions. Older versions are always available. That. Is.
   Simple.

 ● The persistence property has nothing to do with persistent
   storage, like disks! This is a completely different story.
I want my linked list!

 ● In imperative languages, like Java, most data structures are
   ephemeral by default
Designing persistent data structures is somewhat awkward and
not always efficient

 ● In purely functional languages, like Haskell, all data
   structures are automatically persistent!
There is just no other way to make data structures
History of updates




      Making an update to a persistent data structure instance
always creates a new instance that contains this update.
         The current version is left unmodified.
Why should I bother?

      Is it fun? Hell yeah!




 But is it practical? Let's see!
The free lunch is over!
"The biggest sea change in software development
 since the OO revolution is knocking at the door,
  and its name is Concurrency." — Herb Sutter


                                      Commodity hardware (my laptop)




The need for writing correct multi-threaded code
           is constantly increasing
Concurrent data structures are hard!

Want a concurrent ephemeral linked list?
Here are some implementation strategies:

 ● Coarse-grained synchronization
 ● Fine-grained synchronization
 ● Optimistic synchronization
 ● Lazy synchronization
All lock-based — no composition, deadlocks, etc

 ● Non-blocking synchronization in different flavors
And if you need the size of a list, you are in trouble!
Concurrent data structures are hard!

● Making mutable concurrent data structures requires inter-thread
  coordination within these structures

● Locks and atomic references all over the place

● Decades of research by academia with many attempts

● Sophisticated algorithms that are hard to reason about, test
  and prove

● Several different ways to solve the same problems, each
  with its own pros and cons
Concurrent data structures are hard!

Yes, but are persistent data structures actually simpler?
Just give up mutability!

● Persistent data structures are easy to reason about in
  a concurrent environment

● The behavior does not depend on how many threads are
  trying to "modify" it at once

● Therefore persistent data structures are very easy to test
  and debug
The whole picture

 ● Persistent data structures alone are not sufficient
They are an essential part of the picture, but not the whole
answer to concurrency
 ● Inter-thread coordination is needed
Threads still need to know what the other threads are doing to
agree on a common outcome

 ● But it can be added "outside"
Which gives us complete separation of concerns
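
One possible reading of coordination added "outside" is sketched below. It assumes a single atomic reference that always points at the current version of an immutable list; `ImmutableList`, `SharedLog` and `SharedLogDemo` are hypothetical names, not part of the talk. The list itself contains no locks at all.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable singly linked list: prepending never changes an existing list.
final class ImmutableList<T> {
    final T head;
    final ImmutableList<T> tail; // null marks the end of the list
    final int size;              // cached, so size is available without traversal

    ImmutableList(T head, ImmutableList<T> tail) {
        this.head = head;
        this.tail = tail;
        this.size = (tail == null ? 0 : tail.size) + 1;
    }
}

// All inter-thread coordination lives here, outside the data structure.
final class SharedLog<T> {
    private final AtomicReference<ImmutableList<T>> current = new AtomicReference<>();

    // Atomically replaces the current version with a new one; retried on contention.
    void add(T value) {
        current.updateAndGet(old -> new ImmutableList<T>(value, old));
    }

    // Readers get a stable snapshot that no writer can ever change.
    ImmutableList<T> snapshot() {
        return current.get();
    }

    int size() {
        ImmutableList<T> s = current.get();
        return s == null ? 0 : s.size;
    }
}

final class SharedLogDemo {
    // Two threads add 1000 items each to the same log.
    static int run() {
        SharedLog<Integer> log = new SharedLog<>();
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) log.add(i);
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // 2000: no update is lost
    }
}
```

The function passed to `updateAndGet` may run more than once under contention, which is harmless here because building a new list version has no side effects.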
The whole picture

Solving the concurrency challenge in a modern language:

 ● Scala Way — Persistent data structures with message
   passing

 ● Clojure Way — Persistent data structures with software
   transactional memory

 ● Will likely be mixed in the future
Last few words on concurrency

● Persistent data structures are slower than ephemeral ones
  in sequential use

● But not that much slower!

● We can forgive it, since they give you more functionality,
  and ephemeral data structures are simply less capable

● And in the multiprocessor era, it is better to make things
  scalable rather than fast
Efficient persistent data structures

We want persistent data structures to be space and time
efficient:

 ● Structural sharing
We want to reuse as many fragments of the previous version
as possible
 ● Path copying
We want to copy as few pieces as possible
 ● Maybe, just maybe lazy evaluation (where available)
We don't want nasty pathological cases
A case study

● Let's make some persistent data
  structures in Java

● All these structures consist of
  classes with only final fields

Callout: "Why are you looking at me?!"

● With good amortized asymptotic
  complexity in most cases
Our plan

Let's start with some trivial examples

 ● Stack

 ● Queue

 ● Tree

Then proceed with more advanced structures

 ● Hash Table

 ● Finger Tree
Trivial Example — Persistent Stack
class Stack<T> {
 final T v;           // value held by this node
 final Stack<T> next; // rest of the stack; null marks the empty stack

 // It's just a singly linked list of nodes.
 Stack() {
   v = null;
   next = null;
 }

 Stack(T v, Stack<T> next) {
   this.v = v;
   this.next = next;
 }
 ...
Trivial Example — Persistent Stack
class Stack<T> {
 ...
 // Returns a new stack; this one is left unchanged.
 Stack<T> push(T v) {
   return new Stack<T>(v, this);
 }

 T peek() {
   if (next == null) // empty stack
     throw new NoSuchElementException();
   return v;
 }

 // The previous version is simply the rest of the list.
 Stack<T> pop() {
   if (next == null) // empty stack
     throw new NoSuchElementException();
   return next;
 }
}
Trivial Example — Persistent Stack




      Structural sharing in persistent stack
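
To make the sharing concrete, here is a minimal self-contained sketch of the same stack. The class is renamed to the hypothetical `PStack`, and `isEmpty()` is an added helper that the code above leaves elided. Two versions are pushed from one base, and both keep pointing at the very same base nodes.

```java
import java.util.NoSuchElementException;

// Hypothetical self-contained variant of the persistent stack.
final class PStack<T> {
    final T v;
    final PStack<T> next; // null only for the empty stack

    PStack() {
        this(null, null);
    }

    private PStack(T v, PStack<T> next) {
        this.v = v;
        this.next = next;
    }

    boolean isEmpty() {
        return next == null;
    }

    // Allocates exactly one node; the old stack becomes the tail of the new one.
    PStack<T> push(T value) {
        return new PStack<T>(value, this);
    }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return v;
    }

    PStack<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return next;
    }
}

final class PStackDemo {
    public static void main(String[] args) {
        PStack<String> base = new PStack<String>().push("a").push("b");
        PStack<String> left = base.push("c");  // one version
        PStack<String> right = base.push("d"); // another version from the same base

        System.out.println(left.peek());                // c
        System.out.println(right.peek());               // d
        System.out.println(base.peek());                // b, the base is unchanged
        System.out.println(left.pop() == right.pop());  // true, both share the base
    }
}
```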
Trivial Example — Persistent Stack


      Looks familiar?
     The versions tree!
Trivial Example — Persistent Stack



    Also known as
   Spaghetti stack or
     Cactus stack
Persistent Queue




It's just two stacks combined:

 ● Back stack to enqueue items
 ● Front stack to dequeue items

When the front stack is empty, reverse the back stack and use it as the front stack.
Persistent Queue
class Queue<T> {
 // back stack - push elements here
 final Stack<T> b;
 // front stack - pop elements from here
 final Stack<T> f;

 Queue() {
   b = f = new Stack<T>();
 }

 Queue(Stack<T> b, Stack<T> f) {
   this.b = b;
   this.f = f;
 }

 // the front stack is empty only when the whole queue is empty
 boolean isEmpty() {
   return f.isEmpty();
 }
 ...
Persistent Queue
class Queue<T> {
 ...
 static <T> Queue<T> check(Stack<T> b, Stack<T> f) {
   if (f.isEmpty())
     // front is exhausted: the reversed back stack becomes the new front,
     // and the empty stack f becomes the new back
     return new Queue<T>(f, b.reverse());
   else
     return new Queue<T>(b, f);
 }

 Queue<T> push(T v) {
   return check(b.push(v), f);
 }

 Queue<T> pop() {
   if (isEmpty()) {
     throw new NoSuchElementException();
   }
   return check(b, f.pop());
 }
}
Persistent Queue
class Queue<T> {
 ...
 T peek() {
   if (isEmpty()) {
     throw new NoSuchElementException();
   }
   return f.peek();
 }
}

class Stack<T> {
 ...
 // Builds a new stack with the elements in the opposite order: O(n)
 Stack<T> reverse() {
   if (isEmpty() || next.isEmpty())
     return this; // zero or one element: nothing to reverse
   Stack<T> r = new Stack<T>();
   for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) {
     r = r.push(s.peek());
   }
   return r;
 }
}
Persistent Queue




Structural sharing in persistent queue
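
The two-stack queue can be exercised end to end with the compact sketch below. It is a self-contained approximation, not the code from the slides: `Cell` and `PQueue` are hypothetical names, and the front/back invariant is enforced in the private constructor instead of a separate `check` method.

```java
import java.util.NoSuchElementException;

// Minimal immutable stack used by the queue (hypothetical helper).
final class Cell<T> {
    final T v;
    final Cell<T> next; // null only for the empty stack

    Cell() { this.v = null; this.next = null; }
    Cell(T v, Cell<T> next) { this.v = v; this.next = next; }

    boolean isEmpty() { return next == null; }

    Cell<T> push(T x) { return new Cell<T>(x, this); }

    // O(n): pushes every element onto a fresh stack, which reverses the order.
    Cell<T> reverse() {
        Cell<T> r = new Cell<T>();
        for (Cell<T> s = this; !s.isEmpty(); s = s.next) r = r.push(s.v);
        return r;
    }
}

final class PQueue<T> {
    final Cell<T> back;  // elements are pushed here
    final Cell<T> front; // elements are popped from here

    PQueue() { this(new Cell<T>(), new Cell<T>()); }

    // Invariant: front is empty only when the whole queue is empty.
    private PQueue(Cell<T> back, Cell<T> front) {
        if (front.isEmpty()) {
            this.front = back.reverse();
            this.back = front; // reuse the empty stack as the new back
        } else {
            this.front = front;
            this.back = back;
        }
    }

    boolean isEmpty() { return front.isEmpty(); }

    PQueue<T> push(T v) { return new PQueue<T>(back.push(v), front); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return front.v;
    }

    PQueue<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return new PQueue<T>(back, front.next);
    }
}

final class PQueueDemo {
    public static void main(String[] args) {
        PQueue<Integer> q0 = new PQueue<Integer>().push(1).push(2).push(3);
        PQueue<Integer> q1 = q0.pop();
        System.out.println(q0.peek()); // 1, the old version still sees its first element
        System.out.println(q1.peek()); // 2
    }
}
```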
Persistent Queue

Beware pathological cases!

 ● What if the front stack is empty, but the back stack is full?

 ● And we are going to pop from the same queue N times

 ● Then we get N back stack reversals!

 ● Lazy evaluation to the rescue — use lazy streams instead of
   strict stacks
Persistent Queue


                               But there is a better way
                                   to design a queue!




Monoidally Annotated 2-3 Finger Tree is a versatile data
structure that can be used to build efficient lists, deques,
priority queues, interval trees, ropes, etc.

It is more complex, so we will take a look at it later.
Persistent Tree

● It is trivial to convert any ephemeral tree to a persistent one
  by means of path copying

● It works for binary trees, 2-3 trees, B-trees, etc

● The shape of the tree is not affected, only the mutating
  algorithms change

● In a balanced binary tree only O(log N) nodes need to be
  copied per update, which is quite efficient

● The secret to all persistent data structures is that they all
  are trees! (Yes, lists and hash tables are trees too)
Persistent Tree
Simple Persistent Binary Tree

class SimpleBinaryTree {
 static class Node<K, V> {
   final K key;
   final V value;
   final Node<K, V> l, r; // left and right subtrees, null when absent

   Node(K key, V value, Node<K, V> l, Node<K, V> r) {
     this.key = key;
     this.value = value;
     this.l = l;
     this.r = r;
   }
 }
 ...

Simple Persistent Binary Tree

class SimpleBinaryTree {
 ...
 static <K extends Comparable<K>, V> Node<K, V> insert(
     Node<K, V> n, K key, V value) {
   if (n == null) {
     return new Node<K, V>(key, value, null, null); // new leaf
   }
   int cmp = key.compareTo(n.key);
   if (cmp < 0) {
     // copy this node, rebuild the left path, share the right subtree
     return new Node<K, V>(n.key, n.value,
      insert(n.l, key, value), n.r);
   }
   if (cmp > 0) {
     // copy this node, share the left subtree, rebuild the right path
     return new Node<K, V>(n.key, n.value,
      n.l, insert(n.r, key, value));
   }
   // same key: replace the value, share both subtrees
   return new Node<K, V>(key, value, n.l, n.r);
 }
}
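
Path copying is easiest to see by comparing object identities before and after an insert. The sketch below is a self-contained, unbalanced variant under the hypothetical name `PTree`, with an added `get` lookup that the slides do not show. After inserting into the right side, the left subtree of the new version is the very same object as in the old version.

```java
// Hypothetical self-contained variant of the simple persistent binary tree.
final class PTree<K extends Comparable<K>, V> {
    final K key;
    final V value;
    final PTree<K, V> l, r; // null when the subtree is absent

    PTree(K key, V value, PTree<K, V> l, PTree<K, V> r) {
        this.key = key;
        this.value = value;
        this.l = l;
        this.r = r;
    }

    // Copies only the nodes on the path from the root to the insertion point.
    static <K extends Comparable<K>, V> PTree<K, V> insert(PTree<K, V> n, K key, V value) {
        if (n == null) return new PTree<K, V>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) return new PTree<K, V>(n.key, n.value, insert(n.l, key, value), n.r);
        if (cmp > 0) return new PTree<K, V>(n.key, n.value, n.l, insert(n.r, key, value));
        return new PTree<K, V>(key, value, n.l, n.r);
    }

    // Plain binary search; returns null when the key is absent.
    static <K extends Comparable<K>, V> V get(PTree<K, V> n, K key) {
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.l : n.r;
        }
        return null;
    }
}

final class PTreeDemo {
    public static void main(String[] args) {
        PTree<Integer, String> v1 = null;
        v1 = PTree.insert(v1, 5, "five");
        v1 = PTree.insert(v1, 3, "three");
        v1 = PTree.insert(v1, 8, "eight");

        PTree<Integer, String> v2 = PTree.insert(v1, 9, "nine");

        System.out.println(v2.l == v1.l);     // true: untouched subtree is shared
        System.out.println(v2.r == v1.r);     // false: nodes on the path are copied
        System.out.println(PTree.get(v1, 9)); // null: the old version is unchanged
        System.out.println(PTree.get(v2, 9)); // nine
    }
}
```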
Persistent Tree

Multiple definitions of persistence:

 ● Immutable data structure with history
 ● Committed to a persistent storage

Append-only databases and file systems:

 ● CouchDB uses an append-only B-tree
 ● RethinkDB makes an append-only variant of MySQL
 ● ZFS, BTRFS implement copy-on-write transactions
   and snapshots

Nothing is new under the moon!
Persistent Map

interface Map<K, V> {
  // get value for a key, or null if not found
  V get(K key);
  // make key/value association
  Map<K, V> put(K key, V value);
  // remove key/value association
  Map<K, V> remove(K key);
}




             Remember, no in-place updates
             Mutations create new instances
Persistent Map

Implementation Strategy

 ● Persistent red-black tree for ordered keys
   Time complexity: O(log n)

 ● Persistent hash table for hashable keys
   Time complexity: O(log32 n), effectively O(1)
Persistent Hash Table

But how do we implement it?
Copying the whole table would be too expensive!
Persistent Hash Table

Here's the idea: partition the hash table into smaller
pieces and organize them as a persistent tree




Nice idea, but how do we navigate in such a tree?
Prefix Tree/Trie
Search is guided by individual letters of a string key




Hash code is just a string of digits!
Persistent Hash Table in Prefix Tree

Represent 32-bit hash codes as strings of 5-bit symbols:

hashCode = 0xCAFEBABE

level    6   5      4      3      2      1      0
bits     11  00101  01111  11101  01110  10101  11110
symbol   3   5      15     29     14     21     30
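
The split can be reproduced with one shift and one mask per level. This is a small sketch; `HashSymbols`, `BITS` and `MASK` are names chosen here for illustration.

```java
// Hypothetical helper that cuts a 32-bit hash code into 5-bit symbols.
final class HashSymbols {
    static final int BITS = 5;               // bits per symbol
    static final int MASK = (1 << BITS) - 1; // 31, selects the lowest 5 bits

    // Symbol at the given level; level 0 is the lowest 5 bits of the hash code.
    static int symbol(int hashCode, int level) {
        return (hashCode >>> (level * BITS)) & MASK;
    }

    public static void main(String[] args) {
        int hash = 0xCAFEBABE;
        StringBuilder sb = new StringBuilder();
        for (int level = 6; level >= 0; level--) {
            sb.append(symbol(hash, level));
            if (level > 0) sb.append(' ');
        }
        System.out.println(sb); // 3 5 15 29 14 21 30
    }
}
```

The unsigned shift `>>>` matters: `0xCAFEBABE` is negative as a Java `int`, and a signed shift would smear the sign bit into the top symbol.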
Persistent Hash Table

     hashCode = ... xxxxx xxxxx xxxxx xxxxx




Each item is either a key/value pair or a subtree
Persistent Hash Table

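class PersistentHashMap<K, V> {
 abstract static class Item<K, V> {}

 static class Node<K, V> extends Item<K, V> {
   // one slot per 5-bit symbol; Java cannot create a generic array
   // directly, hence the unchecked cast
   @SuppressWarnings("unchecked")
   final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
 }

 static class Entry<K, V> extends Item<K, V> {
   final int hashCode; // cached hash code of the key
   final K key;
   final V value;
   final Entry<K, V> next; // next entry in the chain
   ...
 }
}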
Persistent Hash Table

class PersistentHashMap<K, V> {
 V get(K key) {
   return root.find(key.hashCode(), key, 0); // start at level 0
 }

 static class Node<K, V> extends Item<K, V> {
  V find(int hashCode, K key, int level) {
    // 5-bit symbol of the hash code at this level
    int index = (hashCode >>> (level * 5)) & 31;
    Item<K, V> item = children[index];
    if (item instanceof Node) { // subtree: descend one level
      return ((Node<K, V>) item)
       .find(hashCode, key, level + 1);
    }
    if (item instanceof Entry) { // key/value pair: compare keys
      return ((Entry<K, V>) item)
       .find(hashCode, key);
    }
    return null; // empty slot: the key is absent
  }
 }
}
Persistent Hash Table

Do not waste space!

      class PersistentHashMap {
       class Node<K, V> {
         final Item<K, V>[] children = (Item<K, V>[]) new Item[32];
       }
      }


 ● Most of the children would be null on deeper levels

 ● The number of arrays grows exponentially as we go deeper

 ● Need to find a way to compact the tree

 ● Simply get rid of nulls in arrays!
Persistent Hash Table

    class Node<K, V> {
      final int mask;
      final Item<K, V>[] children;

      Node(int mask) {
        this.mask = mask;
        this.children = (Item<K, V>[]) new Item[Integer.bitCount(mask)];
      }
    }


● Mask is a 32-bit integer whose bits are set to 1 only for those
  array elements that are not null

● Array stores only non-null elements. Its size is the number
  of 1 bits in the mask. Array size varies from 2 to 32
  elements.

● Overhead for null array element is just one bit. Quite good!
Persistent Hash Table

● To test that array has element at index i, simply test if ith bit
  in the mask is 1:

  if ((mask & (1 << i)) != 0) { ...

● To get offset to ith element in the array, count number of 1
  bits lower than i in the mask:

  int offset = bitCount(mask & ((1 << i) - 1));
  if (children[offset] instanceof ...
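
Both bit tricks can be tried out in isolation. The sketch below assumes a hypothetical `BitmapNode` that stores plain values instead of map items; it uses the standard `Integer.bitCount` and returns a new node on every update, leaving the old one untouched.

```java
// Hypothetical compact node: 32 logical slots, only occupied ones are stored.
final class BitmapNode<T> {
    final int mask;       // bit i is 1 when logical slot i is occupied
    final Object[] items; // occupied slots only, in slot order

    BitmapNode() {
        this(0, new Object[0]);
    }

    private BitmapNode(int mask, Object[] items) {
        this.mask = mask;
        this.items = items;
    }

    // Is logical slot i occupied?
    boolean has(int i) {
        return (mask & (1 << i)) != 0;
    }

    // Position of slot i in the compact array: number of occupied slots below i.
    int offset(int i) {
        return Integer.bitCount(mask & ((1 << i) - 1));
    }

    @SuppressWarnings("unchecked")
    T get(int i) {
        return has(i) ? (T) items[offset(i)] : null;
    }

    // Returns a new node with slot i set to value.
    BitmapNode<T> with(int i, T value) {
        int off = offset(i);
        Object[] copy;
        if (has(i)) {
            copy = items.clone(); // same size, one element replaced
        } else {
            copy = new Object[items.length + 1]; // make room at position off
            System.arraycopy(items, 0, copy, 0, off);
            System.arraycopy(items, off, copy, off + 1, items.length - off);
        }
        copy[off] = value;
        return new BitmapNode<T>(mask | (1 << i), copy);
    }
}

final class BitmapNodeDemo {
    public static void main(String[] args) {
        BitmapNode<String> n = new BitmapNode<String>()
            .with(5, "five").with(30, "thirty").with(3, "three");
        System.out.println(n.items.length); // 3 stored elements for 32 logical slots
        System.out.println(n.offset(30));   // 2
        System.out.println(n.get(5));       // five
        System.out.println(n.get(4));       // null
    }
}
```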
Persistent List

interface Seq<T> {
  T head(); // get first element
  Seq<T> tail(); // get list without first element
  Seq<T> cons(T v); // prepend element at the head
  Seq<T> snoc(T v); // append element at the tail
  Seq<T> concat(Seq<T> that); // join two lists
  int size(); // get number of elements
  T get(int index); // get Nth element
  Seq<T> set(int index, T v); // new list with Nth element replaced
}




             Remember, no in-place updates
             Mutations create new instances
Persistent List

● There are quite a few ways to implement persistent lists

● But we will not be studying them

● Instead, we will turn our attention to finger trees

● Soon, it will be clear why
Finger Trees

● An incredibly elegant, simple and efficient data structure

● Oh so very versatile, functional programmer's Swiss Army
  knife

● Basic data structure for building random access sequences,
  deques, priority queues, ropes, interval trees, etc.

● Let's define it in stages
Persistent leafy 2-3 trees

Let's begin with a simple data structure — leafy 2-3 tree

 ● Every intermediate node has either two children or three
   children

 ● All values are stored in leaves

 ● Perfectly balanced: all leaves are at the same level
Persistent leafy 2-3 trees

Leaves contain interesting values, but what is stored in nodes?
Annotated leafy 2-3 trees

● There must be a way to find interesting values in a tree

● We need to guide search from the root of a tree to its leaves

● Let's add special annotations to nodes

● Use these annotations to find values
Size annotated leafy 2-3 trees

● Each intermediate node is annotated with the size of a
  subtree rooted at this node

● Makes it trivial to find any leaf by its index

● Starting from root, test if index is in the range of its left
  (middle) or right subtree, and repeat recursively for that
  subtree, until a leaf is found
Size annotated leafy 2-3 trees




     Looks like random access list
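
The index search can be sketched as follows. To stay short, the sketch uses binary nodes only, whereas a real 2-3 tree would also have three-child nodes with a middle subtree; `SizedTree`, `SizedLeaf` and `SizedNode` are hypothetical names.

```java
// Hypothetical size-annotated leafy tree (binary nodes only, for brevity).
abstract class SizedTree<T> {
    abstract int size();
    abstract T get(int index);
}

final class SizedLeaf<T> extends SizedTree<T> {
    final T value;

    SizedLeaf(T value) {
        this.value = value;
    }

    int size() {
        return 1; // a leaf holds exactly one element
    }

    T get(int index) {
        if (index != 0) throw new IndexOutOfBoundsException();
        return value;
    }
}

final class SizedNode<T> extends SizedTree<T> {
    final SizedTree<T> left, right;
    final int size; // annotation: number of leaves below this node

    SizedNode(SizedTree<T> left, SizedTree<T> right) {
        this.left = left;
        this.right = right;
        this.size = left.size() + right.size();
    }

    int size() {
        return size;
    }

    // Go left if the index falls into the left subtree, otherwise go right
    // with the index shifted by the size of the left subtree.
    T get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        return index < left.size()
            ? left.get(index)
            : right.get(index - left.size());
    }
}

final class SizedTreeDemo {
    static SizedTree<String> sample() {
        return new SizedNode<String>(
            new SizedNode<String>(new SizedLeaf<String>("a"), new SizedLeaf<String>("b")),
            new SizedNode<String>(new SizedLeaf<String>("c"), new SizedLeaf<String>("d")));
    }

    public static void main(String[] args) {
        SizedTree<String> t = sample();
        System.out.println(t.size()); // 4
        System.out.println(t.get(2)); // c
    }
}
```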
Priority annotated leafy 2-3 trees

● Each intermediate node is annotated with the highest
  priority of an element in its subtree

● Makes it trivial to find value with the highest priority

● Starting from root, find the subtree with the highest priority
  and descend recursively into it, until a leaf is found
Priority annotated leafy 2-3 trees




         Looks like priority queue
Monoids

● One interface to unify size, priority (and more!) annotations
  on trees

● A set of values with a "zero" element 0 and a binary
  associative operation ⊕

● Monoid laws:
  0⊕a = a
  a⊕0 = a
  a⊕(b⊕c) = (a⊕b)⊕c
Monoid examples

● Strings with empty string and concatenation
  "" + "a" = "a", "a" + "" = "a"
  "a" + ("b" + "c") = ("a" + "b") + "c"

● Integers with zero and addition
  0 + 1 = 1, 1 + 0 = 1
  1 + (2 + 3) = (1 + 2) + 3

● Integers with one and multiplication
  1 * 2 = 2, 2 * 1 = 2
  2 * (3 * 4) = (2 * 3) * 4

● And many more of them! (Monoids are everywhere)
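
The three examples can be written down and folded over lists in a few lines. This is a minimal sketch using a hypothetical `MonoidOf` class that stores the zero element and the operation as fields; it is a different shape from an interface implemented by the values themselves.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical monoid as a plain value: a zero element plus an associative operation.
final class MonoidOf<T> {
    final T zero;
    final BinaryOperator<T> op;

    MonoidOf(T zero, BinaryOperator<T> op) {
        this.zero = zero;
        this.op = op;
    }

    // Combines all values left to right; an empty list yields the zero element.
    T fold(List<T> values) {
        T acc = zero;
        for (T v : values) acc = op.apply(acc, v);
        return acc;
    }

    static final MonoidOf<String> CONCAT =
        new MonoidOf<String>("", (a, b) -> a + b);
    static final MonoidOf<Integer> SUM =
        new MonoidOf<Integer>(0, (a, b) -> a + b);
    static final MonoidOf<Integer> PRODUCT =
        new MonoidOf<Integer>(1, (a, b) -> a * b);
}

final class MonoidOfDemo {
    public static void main(String[] args) {
        System.out.println(MonoidOf.CONCAT.fold(Arrays.asList("a", "b", "c"))); // abc
        System.out.println(MonoidOf.SUM.fold(Arrays.asList(1, 2, 3)));          // 6
        System.out.println(MonoidOf.PRODUCT.fold(Arrays.asList(2, 3, 4)));      // 24
    }
}
```

Associativity is what makes such folds safe to regroup: a tree may combine the measures of its subtrees in any bracketing and still get the same result as a left-to-right fold.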
Monoid interface

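interface Monoid<T extends Monoid<T>> {
  T unit();
  T combine(T that);
}

// Illustration only: java.lang.String is final and cannot be changed
class String implements Monoid<String> {
 ...

    public String unit() {
      return ""; // the "zero" element
    }

    public String combine(String that) {
      return this + that; // concatenation
    }
}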
Size monoid

class Size implements Monoid<Size> {
 final int size;

    Size(int size) {
      this.size = size;
    }

    public Size unit() {
      return new Size(0); // an empty tree has size zero
    }

    public Size combine(Size that) {
      return new Size(this.size + that.size); // sizes add up
    }
}
Priority monoid

class Priority implements Monoid<Priority> {
 final int priority; // smaller number means higher priority

    Priority(int priority) {
      this.priority = priority;
    }

    public Priority unit() {
      return new Priority(Integer.MAX_VALUE); // lowest possible priority
    }

    public Priority combine(Priority that) {
      return new Priority(
       Math.min(this.priority, that.priority)); // keep the best of the two
    }
}
But where do we get monoids from?

● Monoids have nice property of composability

● We can get more monoids by combining existing ones

● But where do we get initial monoids to begin with?

● We need a way to measure values!

● Those measures must be monoids, obviously
    interface Measured<M extends Monoid<M>> {
      M measure();
    }
Let's make a sketch of annotated tree
/** <V> is the type of values
   <M> is the type of monoidal measures of values
   Pseudocode! */
class Tree<M extends Monoid<M>, V extends Measured<M>>
   implements Measured<M> {

 abstract class Leaf<M, V> extends Tree<M, V> {
   final V value;
   override abstract M measure(); // defined per monoid
 }

 class Node<M, V> extends Tree<M, V> {
  final Tree<M, V> left, right;
  final M m; // cached measure of the whole subtree
  Node(Tree<M, V> l, Tree<M, V> r) {
    left = l; right = r;
    m = l.measure().combine(r.measure());
  }
  override final M measure() {
    return m;
  }
 }
}
Let's make a sketch of annotated tree
// Pseudocode!
...
class Leaf<V> extends Tree<Size, V> {
  final V value;

    override final Size measure() {
      return new Size(1); // every leaf counts as one element
    }
}

...
class Leaf<V> extends Tree<Priority, V> {
  final V value;

    override final Priority measure() {
      return new Priority(value.priority()); // priority of the stored value
    }
}
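
For readers who want something that compiles, here is one way the pseudocode could be turned into plain Java. It is a sketch under simplifying assumptions: leaves receive their measure as a constructor argument instead of asking the value for it, nodes are binary, and the classes carry an `M` prefix (`MTree`, `MLeaf`, `MNode`) to mark them as hypothetical.

```java
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

interface Measured<M extends Monoid<M>> {
    M measure();
}

final class Size implements Monoid<Size> {
    final int size;
    Size(int size) { this.size = size; }
    public Size unit() { return new Size(0); }
    public Size combine(Size that) { return new Size(size + that.size); }
}

final class Priority implements Monoid<Priority> {
    final int priority; // smaller number means higher priority
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }
    public Priority combine(Priority that) {
        return new Priority(Math.min(priority, that.priority));
    }
}

abstract class MTree<M extends Monoid<M>, V> implements Measured<M> {}

final class MLeaf<M extends Monoid<M>, V> extends MTree<M, V> {
    final V value;
    final M m; // measure of this single value, supplied by the caller

    MLeaf(V value, M m) {
        this.value = value;
        this.m = m;
    }

    public M measure() { return m; }
}

final class MNode<M extends Monoid<M>, V> extends MTree<M, V> {
    final MTree<M, V> left, right;
    final M m; // cached: combined measure of both subtrees

    MNode(MTree<M, V> l, MTree<M, V> r) {
        left = l;
        right = r;
        m = l.measure().combine(r.measure());
    }

    public M measure() { return m; }
}

final class MTreeDemo {
    static MTree<Size, String> bySize() {
        return new MNode<Size, String>(
            new MNode<Size, String>(
                new MLeaf<Size, String>("a", new Size(1)),
                new MLeaf<Size, String>("b", new Size(1))),
            new MLeaf<Size, String>("c", new Size(1)));
    }

    static MTree<Priority, String> byPriority() {
        return new MNode<Priority, String>(
            new MNode<Priority, String>(
                new MLeaf<Priority, String>("low", new Priority(5)),
                new MLeaf<Priority, String>("urgent", new Priority(2))),
            new MLeaf<Priority, String>("later", new Priority(9)));
    }

    public static void main(String[] args) {
        System.out.println(bySize().measure().size);         // 3
        System.out.println(byPriority().measure().priority); // 2
    }
}
```

The same `MNode` class serves both trees; only the monoid changes, which is the point of the annotation scheme.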
But that is not finger tree yet!
Finger Tree




... is just an annotated tree of annotated 2-3 trees!
Finger Tree




Digits, 2-3 trees, fingers and nested levels
Finger Tree

A little bit of Haskell would not hurt:

data Node v a = Node2 v a a | Node3 v a a a

data Digit v a = One v a
        | Two v a a
        | Three v a a a
        | Four v a a a a

data FingerTree v a = Empty
           | Single a
           | Deep v
             (Digit v a)                -- prefix
             (FingerTree v (Node v a))  -- nested tree of nodes
             (Digit v a)                -- suffix
Finger Tree

class FingerTree<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {

 class Empty<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {}

 class Single<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {
  final T v; // the only element
  final M m; // its cached measure
 }

 class Deep<M extends Monoid<M>, T extends Measured<M>>
    extends FingerTree<M, T> {
  final Digit<M, T> prefix;               // one to four elements on the left
  final FingerTree<M, Node<M, T>> middle; // nested tree of 2-3 nodes
  final Digit<M, T> suffix;               // one to four elements on the right
  final M m;                              // cached measure of the whole tree
 }
}
Finger Tree

class Digit<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {
 final M m; // cached combined measure of the elements

 class One<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a;
 }

 class Two<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b;
 }

 class Three<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b, c;
 }

 class Four<M extends Monoid<M>, T extends Measured<M>>
    extends Digit<M, T> {
  final T a, b, c, d;
 }
}
Finger Tree

class Node<M extends Monoid<M>, T extends Measured<M>>
   implements Measured<M> {
 final M m; // cached combined measure of the children

 class Node2<M extends Monoid<M>, T extends Measured<M>>
    extends Node<M, T> {
  final T a, b;
 }

 class Node3<M extends Monoid<M>, T extends Measured<M>>
    extends Node<M, T> {
  final T a, b, c;
 }
}
Finger Tree Interface

Basic operations:

 ● cons, snoc: prepend/append an element
 ● concat: join two trees
 ● split: find prefix, element and suffix using a predicate

Beyond the scope of this presentation, sorry
Finger Tree Performance

Amortized bounds:

                 Finger Tree           2-3 Tree   List
 ● cons, snoc    O(1)                  O(log n)   O(1)/O(n)
 ● head, last    O(1)                  O(log n)   O(1)/O(n)
 ● concat        O(log min(ℓ1, ℓ2))    O(log n)   O(n)
 ● split         O(log min(n, ℓ-n))    O(log n)   O(n)
 ● index         O(log min(n, ℓ-n))    O(log n)   O(n)
Thanks!

Questions?

More Related Content

Viewers also liked

Introduction of data structure
Introduction of data structureIntroduction of data structure
Introduction of data structureeShikshak
 
Introduction to data structures and Algorithm
Introduction to data structures and AlgorithmIntroduction to data structures and Algorithm
Introduction to data structures and AlgorithmDhaval Kaneria
 
Lecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsLecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsAakash deep Singhal
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)Arvind Devaraj
 
DATA STRUCTURES
DATA STRUCTURESDATA STRUCTURES
DATA STRUCTURESbca2010
 

Viewers also liked (7)

Data Structures for Robotic Learning
Data Structures for Robotic LearningData Structures for Robotic Learning
Data Structures for Robotic Learning
 
Introduction of data structure
Introduction of data structureIntroduction of data structure
Introduction of data structure
 
Introduction to data structures and Algorithm
Introduction to data structures and AlgorithmIntroduction to data structures and Algorithm
Introduction to data structures and Algorithm
 
Data Structure
Data StructureData Structure
Data Structure
 
Lecture 1 data structures and algorithms
Lecture 1 data structures and algorithmsLecture 1 data structures and algorithms
Lecture 1 data structures and algorithms
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)
 
DATA STRUCTURES
DATA STRUCTURESDATA STRUCTURES
DATA STRUCTURES
 

Similar to Persistent Data Structures by @aradzie

Programming picaresque
Programming picaresqueProgramming picaresque
Programming picaresqueBret McGuire
 
If You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongIf You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongMario Fusco
 
ParaSail
ParaSail  ParaSail
ParaSail AdaCore
 
Why we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingWhy we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingMario Fusco
 
BCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersBCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersMiles Sabin
 
An Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersAn Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersMiles Sabin
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Martin Odersky
 
A Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersA Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersMiles Sabin
 
GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015Fabrízio Mello
 
Introductiontoprogramminginscala
IntroductiontoprogramminginscalaIntroductiontoprogramminginscala
IntroductiontoprogramminginscalaAmuhinda Hungai
 
Java.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapJava.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapSrinivasan Raghvan
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxCoding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxmary772
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdfHiroshi Ono
 

Similar to Persistent Data Structures by @aradzie (20)

Programming picaresque
Programming picaresqueProgramming picaresque
Programming picaresque
 
If You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are WrongIf You Think You Can Stay Away from Functional Programming, You Are Wrong
If You Think You Can Stay Away from Functional Programming, You Are Wrong
 
ParaSail
ParaSail  ParaSail
ParaSail
 
Lockless
LocklessLockless
Lockless
 
Yes scala can!
Yes scala can!Yes scala can!
Yes scala can!
 
Why we cannot ignore Functional Programming
Why we cannot ignore Functional ProgrammingWhy we cannot ignore Functional Programming
Why we cannot ignore Functional Programming
 
BCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java DevelopersBCS SPA 2010 - An Introduction to Scala for Java Developers
BCS SPA 2010 - An Introduction to Scala for Java Developers
 
An Introduction to Scala for Java Developers
An Introduction to Scala for Java DevelopersAn Introduction to Scala for Java Developers
An Introduction to Scala for Java Developers
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
A Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java DevelopersA Brief Introduction to Scala for Java Developers
A Brief Introduction to Scala for Java Developers
 
GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015GSoC2014 - Uniritter Presentation May, 2015
GSoC2014 - Uniritter Presentation May, 2015
 
Introductiontoprogramminginscala
IntroductiontoprogramminginscalaIntroductiontoprogramminginscala
Introductiontoprogramminginscala
 
Java.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmapJava.util.concurrent.concurrent hashmap
Java.util.concurrent.concurrent hashmap
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Java best practices
Java best practicesJava best practices
Java best practices
 
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docxCoding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
Coding Assignment 3CSC 330 Advanced Data Structures, Spri.docx
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 
scalaliftoff2009.pdf
scalaliftoff2009.pdfscalaliftoff2009.pdf
scalaliftoff2009.pdf
 

Persistent Data Structures by @aradzie

  • 1. Persistent Data Structures Living in a world where nothing changes but everything evolves - or - A complete idiot's guide to immutability
  • 2. Java vs Haskell. Java: ● Warm, soft and cute ● Imperative ● Object oriented ● Just like good old Basic, but with classes. Haskell: ● Strange, unfamiliar alien ● Purely functional ● Everything is different ● Shocking news! It's not like Basic!
  • 3. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = n.next) { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere, and there is no way to get rid of it
  • 4. Haskell does not have variables! Imagine a dialect of Java where everything is final by default class LinkedList { class Node { final Node next, prev; final Object value; } final Node head, tail; void add(final Object v) { for (final Node n = head; n != null; n = n.next) { ... } } } All fields, parameters and variables are automatically immutable, the final is implied everywhere. Reactions: "But it doesn't make sense!" "It won't work!" "It does for me!"
  • 5. What is a variable? var·y/ˈve(ə)rē/ vary, varied, varying ● — verb (used with object) Definition: to change or alter, as in form, appearance, character, or substance ● — verb (used without object) Definition: to undergo change in appearance, form, substance, character, etc ● — synonyms: modify, mutate
  • 6. "Variables" in Haskell ● Must be assigned once declared YES: int a = 1; NO: int a; ● Cannot be reassigned YES: final int a = 1; NO: a = 2; These are mathematical variables, not imperative ones!
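In plain Java, the closest thing to these two rules is a `final` local that is initialized right where it is declared. A minimal sketch, with a class and method name made up purely for illustration:

```java
// Illustrative only: SingleAssignment and sumOfSquares are hypothetical names.
final class SingleAssignment {
    // Instead of reassigning a variable, each step binds a new name to a new value.
    static int sumOfSquares(final int a, final int b) {
        final int aa = a * a;     // bound once at declaration
        final int bb = b * b;     // bound once at declaration
        // aa = aa + bb;          // would not compile: aa is final
        final int sum = aa + bb;  // a new value gets a new name
        return sum;
    }
}
```

Every name here behaves like a mathematical variable: once it stands for a value, it stands for that value for the rest of its scope.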
  • 7. When everything is immutable There is no notion of time: ● Functions take old values, produce new values, nothing is changed in-place ● It does not matter when a function was called, it only matters what arguments it was called with There is no notion of identity: ● Everything is a value, complex data structures are values too ● There is no way to tell if a == b, only if a.equals(b) ● In other words, values are never identical to each other, but may be equal
  • 8. I want my linked list! Basic terminology: ● Ephemeral data structure — everything that is not persistent. Most Java data structures (lists, sets, etc.) are ephemeral. ● Persistent data structure — immutable data structure with history. No in-place modifications. Operations on it create new versions. Older versions are always available. That. Is. Simple. ● The persistence property has nothing to do with persistent storage, like disks! This is a completely different story.
  • 9. I want my linked list! ● In imperative languages, like Java, most data structures are ephemeral by default Designing persistent data structures is somewhat awkward and not always efficient ● In purely functional languages, like Haskell, all data structures are automatically persistent! There is just no other way to make data structures
  • 10. History of updates Making an update to an instance of a persistent data structure always creates a new instance that contains this update. The current version is left unmodified.
  • 11. Why should I bother? Is it fun? Hell yeah! But is it practical? Let's see!
  • 12. The free lunch is over! "The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency." — Herb Sutter A commodity hardware (my laptop) The need for writing correct multi-threaded code is constantly increasing
  • 13. Concurrent data structures are hard! Want a concurrent ephemeral linked list? Here are some implementation strategies: ● Coarse-grained synchronization ● Fine-grained synchronization ● Optimistic synchronization ● Lazy synchronization All lock-based — no composition, deadlocks, etc ● Non-blocking synchronization in different flavors And if you need the size of a list, you are in trouble!
  • 14. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to reason about, test and prove ● Several different ways to solve the same problems, each with its own pros and cons
  • 15. Concurrent data structures are hard! ● Making mutable concurrent data structures requires inter-thread coordination within these structures ● Locks and atomic references all over the place ● Decades of research by academia with many attempts ● Sophisticated algorithms that are hard to test and prove ● Several different ways to solve the same problems, each with its own pros and cons. Question: "Yes, but are persistent data structures actually simpler?"
  • 16. Just give up mutability! ● Persistent data structures are easy to reason about in concurrent environment ● The behavior does not depend on how many threads are trying to "modify" it at once ● Therefore persistent data structures are very easy to test and debug
  • 17. The whole picture ● Persistent data structures alone are not sufficient They are an essential part of the picture, but not the whole answer to concurrency ● Inter-thread coordination is needed Threads still need to know what each other thread is doing to agree on a common outcome ● But it can be added "outside" Which gives us complete separation of concerns
  • 18. The whole picture Solving concurrency challenge in a modern language: ● Scala Way — Persistent data structures with message passing ● Clojure Way — Persistent data structures with software transactional memory ● Will likely be mixed in the future
  • 19. Last few words on concurrency ● Persistent data structures are slower than ephemeral ones in sequential use ● But not that much slower! ● We can forgive it, since they give you more functionality, and ephemeral data structures are simply less capable ● And in multiprocessor era, it is better to make things scalable rather than fast
  • 20. Efficient persistent data structures We want persistent data structures to be space and time efficient: ● Structural sharing We want to reuse as many fragments of the previous version as possible ● Path copying We want to copy as few pieces as possible ● Maybe, just maybe lazy evaluation (where available) We don't want nasty pathological cases
  • 21. A case study ● Let's make some persistent data structures in Java ● All these structures consist of classes with only final fields ● With good amortized asymptotic complexity in most cases (Caption: "Why are you looking at me?!")
  • 22. Our plan Let's start with some trivial examples ● Stack ● Queue ● Tree Then proceed with more advanced structures ● Hash Table ● Finger Tree
  • 23. Trivial Example — Persistent Stack It's just a singly linked list of nodes class Stack<T> { final T v; (a) final Stack<T> next; (b) Stack() { v = null; next = null; } Stack(T v, Stack<T> next) { this.v = v; this.next = next; } ... Source Code 1/2
  • 24. Trivial Example — Persistent Stack class Stack<T> { ... Stack<T> push(T v) { return new Stack<T>(v, this); (a) } T peek() { if (next == null) throw new NoSuchElementException(); return v; (b) } Stack<T> pop() { if (next == null) throw new NoSuchElementException(); return next; (c) } Source Code 2/2
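The two source slides above can be assembled into something that compiles and runs. This is a sketch rather than the author's complete code: the class is renamed `PStack` so it does not clash with `java.util.Stack`, `isEmpty()` is added, and the empty stack is assumed to be a sentinel node whose `next` is null.

```java
import java.util.NoSuchElementException;

// Persistent stack: every operation returns a stack, nothing is modified in place.
final class PStack<T> {
    private final T value;         // null only in the empty sentinel
    private final PStack<T> next;  // null only in the empty sentinel

    PStack() { this(null, null); }

    private PStack(T value, PStack<T> next) {
        this.value = value;
        this.next = next;
    }

    boolean isEmpty() { return next == null; }

    // The new version is one fresh node pointing at the old version.
    PStack<T> push(T v) { return new PStack<>(v, this); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return value;
    }

    // The "popped" version is simply the node below; it already exists.
    PStack<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return next;
    }
}

final class PStackDemo {
    public static void main(String[] args) {
        PStack<String> v1 = new PStack<String>().push("a");
        PStack<String> v2 = v1.push("b");  // one version built on v1
        PStack<String> v3 = v1.push("c");  // another version, also built on v1
        System.out.println(v2.peek());             // b
        System.out.println(v3.peek());             // c
        System.out.println(v2.pop() == v3.pop());  // true: both share the v1 node
    }
}
```

Both `v2` and `v3` keep working after the other is created, and `v1` is never copied. That shared tail is the structural sharing the next slides draw as a version tree.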
  • 25. Trivial Example — Persistent Stack Structural sharing in persistent stack
  • 26. Trivial Example — Persistent Stack Looks familiar? The versions tree!
  • 27. Trivial Example — Persistent Stack Also known as Spaghetti stack or Cactus stack
  • 28. Persistent Queue It's just two stacks combined: ● Back stack to enqueue items ● Front stack to dequeue items When front stack is empty, reverse back stack and use it as front stack
  • 29. Persistent Queue class Queue<T> { // back stack - push elements here final Stack<T> b; (a) // front stack - pop elements from here final Stack<T> f; (b) Queue() { b = f = new Stack<T>(); } Queue(Stack<T> b, Stack<T> f) { this.b = b; this.f = f; } boolean isEmpty() { return f.isEmpty(); (c) } ... Source Code 1/3
  • 30. Persistent Queue class Queue<T> { ... static <T> Queue<T> check(Stack<T> b, Stack<T> f) { if (f.isEmpty()) return new Queue<T>(f, b.reverse()); (a) else return new Queue<T>(b, f); (b) } Queue<T> push(T v) { return check(b.push(v), f); } Queue<T> pop() { if (isEmpty()) { throw new NoSuchElementException(); } return check(b, f.pop()); } Source Code 2/3
  • 31. Persistent Queue class Queue<T> { ... T peek() { if (isEmpty()) { throw new NoSuchElementException(); } return f.peek(); } class Stack<T> { ... Stack<T> reverse() { if (isEmpty() || next.isEmpty()) return this; Stack<T> r = new Stack<T>(); for (Stack<T> s = this; !s.isEmpty(); s = s.pop()) { r = r.push(s.peek()); } return r; } Source Code 3/3
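Putting the three queue slides together gives roughly the following. It is a self-contained sketch under a few assumptions: the classes are renamed `QStack` and `PQueue`, the stack is a stripped-down copy of the one shown earlier, and `check` builds a fresh empty back stack instead of reusing the empty front stack (same effect, arguably easier to read).

```java
import java.util.NoSuchElementException;

// Minimal persistent stack, just enough to support the queue below.
final class QStack<T> {
    final T value;
    final QStack<T> next;  // null only in the empty sentinel

    QStack() { this.value = null; this.next = null; }
    QStack(T value, QStack<T> next) { this.value = value; this.next = next; }

    boolean isEmpty() { return next == null; }
    QStack<T> push(T v) { return new QStack<T>(v, this); }

    QStack<T> reverse() {
        QStack<T> r = new QStack<T>();
        for (QStack<T> s = this; !s.isEmpty(); s = s.next) r = r.push(s.value);
        return r;
    }
}

// Persistent FIFO queue: enqueue on the back stack, dequeue from the front stack.
final class PQueue<T> {
    private final QStack<T> back;
    private final QStack<T> front;

    PQueue() { this(new QStack<T>(), new QStack<T>()); }
    private PQueue(QStack<T> back, QStack<T> front) { this.back = back; this.front = front; }

    // Restores the invariant: the front stack is empty only if the whole queue is empty.
    private static <T> PQueue<T> check(QStack<T> back, QStack<T> front) {
        if (front.isEmpty()) return new PQueue<T>(new QStack<T>(), back.reverse());
        return new PQueue<T>(back, front);
    }

    boolean isEmpty() { return front.isEmpty(); }

    PQueue<T> push(T v) { return check(back.push(v), front); }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException();
        return front.value;
    }

    PQueue<T> pop() {
        if (isEmpty()) throw new NoSuchElementException();
        return check(back, front.next);
    }
}
```

The invariant maintained by `check` is what lets `isEmpty()` and `peek()` look only at the front stack.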
  • 32. Persistent Queue Structural sharing in persistent queue
  • 33. Persistent Queue Beware pathological cases! ● What if the front stack is empty, but the back stack is full? ● And we are going to pop from the same queue N times ● Then we get N back stack reversals! ● Lazy evaluation to the rescue — use lazy streams instead of strict stacks
  • 34. Persistent Queue But there is a better way to design queue! Monoidally Annotated 2-3 Finger Tree is a versatile data structure that can be used to build efficient lists, deques, priority queues, interval trees, ropes, etc. It is more complex, we will take a look at it later.
  • 35. Persistent Tree ● It is trivial to convert any ephemeral tree to a persistent one by means of path copying ● It works for binary trees, 2-3 trees, B-trees, etc ● The shape of tree is not affected, only mutating algorithms ● In a balanced binary tree at most log N nodes need to be copied — quite efficient ● The secret to all persistent data structures is that they all are trees! (Yes, lists and hash tables are trees too)
  • 37. Simple Persistent Binary Tree class SimpleBinaryTree { static class Node { final K key; (a) final V value; (b) final Node l, r; (c) Node(K key, V value, Node l, Node r) { this.key = key; this.value = value; this.l = l; this.r = r; } } ... Source Code 1/2
  • 38. Simple Persistent Binary Tree class SimpleBinaryTree { ... static Node insert(Node n, K key, V value) { if (n == null) { return new Node(key, value, null, null); (a) } int cmp = key.compareTo(n.key); (b) if (cmp < 0) { return new Node(n.key, n.value, (c) insert(n.l, key, value), n.r); } if (cmp > 0) { return new Node(n.key, n.value, (d) n.l, insert(n.r, key, value)); } return new Node(key, value, n.l, n.r); (e) } Source Code 2/2
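The slide code leaves the type parameters `K` and `V` undeclared. Below is a compilable sketch of the same path-copying insert, assuming keys are `Comparable` and that `null` stands for the empty tree; the class name `PTree` and the `get` helper are additions for illustration.

```java
// Unbalanced persistent binary search tree; null represents the empty tree.
final class PTree<K extends Comparable<K>, V> {
    final K key;
    final V value;
    final PTree<K, V> left, right;

    PTree(K key, V value, PTree<K, V> left, PTree<K, V> right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Copies only the nodes on the path from the root to the insertion point;
    // every subtree off that path is reused by reference.
    static <K extends Comparable<K>, V> PTree<K, V> insert(PTree<K, V> n, K key, V value) {
        if (n == null) return new PTree<K, V>(key, value, null, null);
        int cmp = key.compareTo(n.key);
        if (cmp < 0) return new PTree<K, V>(n.key, n.value, insert(n.left, key, value), n.right);
        if (cmp > 0) return new PTree<K, V>(n.key, n.value, n.left, insert(n.right, key, value));
        return new PTree<K, V>(key, value, n.left, n.right);  // replace the value for an existing key
    }

    // Plain lookup; returns null when the key is absent.
    static <K extends Comparable<K>, V> V get(PTree<K, V> n, K key) {
        while (n != null) {
            int cmp = key.compareTo(n.key);
            if (cmp == 0) return n.value;
            n = cmp < 0 ? n.left : n.right;
        }
        return null;
    }
}
```

After inserting a key that lands in the right subtree, the new root and the old root are different objects, yet their `left` fields point at the very same node. The old root still answers lookups exactly as before.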
  • 39. Persistent Tree Multiple definitions of persistence: ● Immutable data structure with history ● Committed to a persistent storage Append only databases and file systems: ● CouchDB uses append only B-Tree ● RethinkDB makes append only variant of MySQL ● ZFS, BTRFS implement copy-on-write transactions and snapshots Nothing is new under the moon!
  • 40. Persistent Map interface Map<K, V> { // get value for a key, or null if not found V get(K key); // make key/value association Map<K, V> put(K key, V value); // remove key/value association Map<K, V> remove(K key); } Remember, no in-place updates Mutations create new instances
  • 41. Persistent Map Implementation Strategy ● Persistent red-black tree for ordered keys Time complexity — O(log n) ● Persistent hash table for hashable keys Time complexity — O(1)
  • 42. Persistent Hash Table But how do we implement it? Copying the whole table would be too expensive!
  • 43. Persistent Hash Table Here's the idea: partition hash table into smaller pieces, organize them as a persistent tree Nice idea, but how do we navigate in such a tree?
  • 44. Prefix Tree/Trie Search is guided by individual letters of a string key Hash code is just a string of digits!
  • 45. Persistent Hash Table in Prefix Tree Represent 32-bit hash codes as strings of 5-bit symbols: hashCode = CAFEBABE (base 16)
    level    6   5      4      3      2      1      0
    bits     11  00101  01111  11101  01110  10101  11110
    symbol   3   5      15     29     14     21     30
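The split into symbols is just a shift and a mask. A small sketch that reproduces the numbers on this slide; `HashSymbols` and `symbol` are hypothetical names, while the shift-and-mask expression is the one used in `find` a few slides later.

```java
final class HashSymbols {
    static final int BITS = 5;                // bits per symbol
    static final int MASK = (1 << BITS) - 1;  // 31, i.e. binary 11111

    // Extracts the 5-bit symbol at the given level; level 0 is the lowest five bits.
    static int symbol(int hashCode, int level) {
        return (hashCode >>> (level * BITS)) & MASK;
    }

    public static void main(String[] args) {
        int h = 0xCAFEBABE;
        StringBuilder sb = new StringBuilder();
        for (int level = 6; level >= 0; level--) {
            sb.append(symbol(h, level));
            if (level > 0) sb.append(' ');
        }
        System.out.println(sb);  // 3 5 15 29 14 21 30
    }
}
```

Level 6 only has two bits left to offer, because 32 is not a multiple of 5. The unsigned shift `>>>` matters here: with `>>` a negative hash code would drag sign bits into the top symbol.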
  • 46. Persistent Hash Table hashCode = ... xxxxx xxxxx xxxxx xxxxx Each item is either a key/value pair or a subtree
  • 47. Persistent Hash Table class PersistentHashMap { abstract class Item<K, V> {} class Node<K, V> extends Item<K, V> { final Item<K, V>[] children = new Item[32]; (a) } class Entry<K, V> extends Item<K, V> { final int hashCode; (b) final K key; (c) final V value; (d) final Entry<K, V> next; (e) } Source Code 1/2
  • 48. Persistent Hash Table class PersistentHashMap { V get(K key) { return root.find(key.hashCode(), key, 0); (a) } class Node<K, V> extends Item<K, V> { V find(int hashCode, K key, int level) { int index = (hashCode >>> (level * 5)) & 31; (b) Item<K, V> item = children[index]; (c) if (item instanceof Node) { (d) return ((Node<K, V>) item) (e) .find(hashCode, key, level + 1); } if (item instanceof Entry) { (f) return ((Entry<K, V>) item) (g) .find(hashCode, key); } return null; } Source Code 2/2
  • 49. Persistent Hash Table Do not waste space! class PersistentHashMap { class Node<K, V> { final Item<K, V>[] children = new Item[32]; (a) } ● Most of the children would be null on deeper levels ● The number of arrays grows exponentially as we go deeper ● Need to find a way to compact tree ● Simply get rid of nulls in arrays!
  • 50. Persistent Hash Table class Node<K, V> { final int mask; (a) final Item<K, V>[] children = new Item[bitCount(mask)]; (b) } ● Mask is a 32-bit integer whose bits are set to 1 only for those array elements that are not null ● Array stores only non-null elements. Its size is the number of 1 bits in the mask. Array size varies from 2 to 32 elements. ● Overhead for null array element is just one bit. Quite good!
  • 51. Persistent Hash Table ● To test that array has element at index i, simply test if ith bit in the mask is 1: if ((mask & (1 << i)) != 0) { ... ● To get offset to ith element in the array, count number of 1 bits lower than i in the mask: int offset = bitCount(mask & ((1 << i) - 1)); if (children[offset] instanceof ...
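Here is how the mask trick might look as a standalone node. This is a sketch, not the deck's implementation: `BitmapNode`, `has`, `offset`, `get` and `with` are illustrative names, children are plain `Object`s to keep it short, and `with` copies the array so the original node stays untouched.

```java
// A compact 32-way node: only occupied slots are stored, and a bit mask records which ones.
final class BitmapNode {
    final int mask;           // bit i is 1 exactly when logical slot i is occupied
    final Object[] children;  // occupied slots only, in ascending slot order

    BitmapNode(int mask, Object[] children) {
        this.mask = mask;
        this.children = children;
    }

    static BitmapNode empty() { return new BitmapNode(0, new Object[0]); }

    boolean has(int i) { return (mask & (1 << i)) != 0; }

    // Position of slot i in the compact array: the number of occupied slots below i.
    int offset(int i) { return Integer.bitCount(mask & ((1 << i) - 1)); }

    Object get(int i) { return has(i) ? children[offset(i)] : null; }

    // Returns a new node with slot i set to child; this node is left unchanged.
    BitmapNode with(int i, Object child) {
        int off = offset(i);
        if (has(i)) {
            Object[] copy = children.clone();
            copy[off] = child;
            return new BitmapNode(mask, copy);
        }
        Object[] copy = new Object[children.length + 1];
        System.arraycopy(children, 0, copy, 0, off);
        copy[off] = child;
        System.arraycopy(children, off, copy, off + 1, children.length - off);
        return new BitmapNode(mask | (1 << i), copy);
    }
}
```

`Integer.bitCount` is typically compiled down to a single population-count instruction on current hardware, so the indirection through the mask costs very little.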
  • 52. Persistent List interface Seq<T> { T head(); // get first element Seq<T> tail(); // get list without first element Seq<T> cons(T v); // prepend element at head Seq<T> snoc(T v); // append element at tail Seq<T> concat(Seq<T> that); // join two lists int size(); // get number of elements T get(int index); // get Nth element Seq<T> set(int index, T v); // set Nth element } Remember, no in-place updates Mutations create new instances
  • 53. Persistent List ● There are quite a few ways to implement persistent lists ● But we will not be studying them ● Instead, we will turn our attention to finger trees ● Soon, it will be clear why
  • 54. Finger Trees ● An incredibly elegant, simple and efficient data structure ● Oh so very versatile, functional programmer's Swiss Army knife ● Basic data structure for building random access sequences, deques, priority queues, ropes, interval trees, etc. ● Let's define it in stages
  • 55. Persistent leafy 2-3 trees Let's begin with a simple data structure — leafy 2-3 tree ● Every intermediate node has either two children or three children ● All values are stored in leafs ● Perfectly balanced — all leafs are at the same level
  • 57. Persistent leafy 2-3 trees Leafs contain interesting values, but what is stored in nodes?
  • 58. Annotated leafy 2-3 trees ● There must be a way to find interesting values in a tree ● We need to guide search from the root of a tree to its leafs ● Let's add special annotations to nodes ● Use these annotations to find values
  • 59. Size annotated leafy 2-3 trees ● Each intermediate node is annotated with the size of a subtree rooted at this node ● Makes it trivial to find any leaf by its index ● Starting from root, test if index is in the range of its left (middle) or right subtree, and repeat recursively for that subtree, until a leaf is found
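The index search described here can be sketched as follows. All names (`SizedTree`, `SizedLeaf`, `SizedNode`) are made up for this example, and the tree is built by hand, so balancing and insertion are deliberately left out.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of a size-annotated leafy 2-3 tree with lookup by index.
abstract class SizedTree<V> {
    abstract int size();        // the annotation: number of leaves in this subtree
    abstract V get(int index);  // the leaf at the given zero-based position
}

final class SizedLeaf<V> extends SizedTree<V> {
    final V value;
    SizedLeaf(V value) { this.value = value; }

    int size() { return 1; }

    V get(int index) {
        if (index != 0) throw new IndexOutOfBoundsException("index " + index);
        return value;
    }
}

final class SizedNode<V> extends SizedTree<V> {
    final List<SizedTree<V>> children;  // two or three subtrees
    final int size;                     // cached sum of the children's sizes

    SizedNode(SizedTree<V> a, SizedTree<V> b) { this(Arrays.asList(a, b)); }
    SizedNode(SizedTree<V> a, SizedTree<V> b, SizedTree<V> c) { this(Arrays.asList(a, b, c)); }

    private SizedNode(List<SizedTree<V>> children) {
        this.children = children;
        int total = 0;
        for (SizedTree<V> child : children) total += child.size();
        this.size = total;
    }

    int size() { return size; }

    // Walk the children left to right, skipping whole subtrees by their size.
    V get(int index) {
        int i = index;
        for (SizedTree<V> child : children) {
            if (i < child.size()) return child.get(i);
            i -= child.size();
        }
        throw new IndexOutOfBoundsException("index " + index);
    }
}
```

Each step discards whole subtrees using only the cached sizes, so a lookup touches one node per level.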
  • 60. Size annotated leafy 2-3 trees Looks like random access list
  • 61. Priority annotated leafy 2-3 trees ● Each intermediate node is annotated with the highest priority of an element in its subtree ● Makes it trivial to find value with the highest priority ● Starting from root, find the subtree with the highest priority and descend recursively into it, until a leaf is found
  • 62. Priority annotated leafy 2-3 trees Looks like priority queue
  • 63. Monoids ● One interface to unify size, priority (and more!) annotations on trees ● A set of values with a "zero" element 0 and a binary associative operation ⊕ ● Monoid laws: 0⊕a = a a⊕0 = a a⊕(b⊕c) = (a⊕b)⊕c
  • 64. Monoid examples ● Strings with empty string and concatenation "" + "a" = "a", "a" + "" = "a" "a" + ("b" + "c") = ("a" + "b") + "c" ● Integers with zero and addition 0 + 1 = 1, 1 + 0 = 1 1 + (2 + 3) = (1 + 2) + 3 ● Integers with one and multiplication 1 * 2 = 2, 2 * 1 = 2 2 * (3 * 4) = (2 * 3) * 4 ● And many more of them! (Monoids are everywhere)
  • 65. Monoid interface interface Monoid<T extends Monoid<T>> { T unit(); T combine(T that); } class String implements Monoid<String> { ... String unit() { return ""; (a) } String combine(String that) { return this + that; (b) } }
  • 66. Size monoid class Size implements Monoid<Size> { final int size; (a) Size(int size) { this.size = size; } Size unit() { return new Size(0); (b) } Size combine(Size that) { return new Size(this.size + that.size); (c) } }
  • 67. Priority monoid class Priority implements Monoid<Priority> { final int priority; (a) Priority(int priority) { this.priority = priority; } Priority unit() { return new Priority(Integer.MAX_VALUE); (b) } Priority combine(Priority that) { return new Priority( Math.min(this.priority, that.priority)); (c) } }
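For reference, here are the interface and both monoids in a form that compiles, plus a fold that uses them. Two things differ from the slides and should be read as assumptions: the methods are declared `public` (Java requires that when implementing an interface), and the `Monoids.concat` helper is hypothetical. Because `unit()` is an instance method in this design, the fold needs some existing value to obtain the unit from.

```java
// The monoid interface from the slides: a unit element and an associative combine.
interface Monoid<T extends Monoid<T>> {
    T unit();
    T combine(T that);
}

final class Size implements Monoid<Size> {
    final int size;
    Size(int size) { this.size = size; }
    public Size unit() { return new Size(0); }  // 0 + a = a
    public Size combine(Size that) { return new Size(this.size + that.size); }
}

final class Priority implements Monoid<Priority> {
    final int priority;  // a smaller number means a higher priority
    Priority(int priority) { this.priority = priority; }
    public Priority unit() { return new Priority(Integer.MAX_VALUE); }  // min(MAX_VALUE, a) = a
    public Priority combine(Priority that) {
        return new Priority(Math.min(this.priority, that.priority));
    }
}

final class Monoids {
    // Combines all items left to right, starting from the unit.
    // The 'any' argument is only used to get hold of the unit element.
    static <T extends Monoid<T>> T concat(T any, Iterable<T> items) {
        T acc = any.unit();
        for (T item : items) acc = acc.combine(item);
        return acc;
    }
}
```

The same `concat` sums sizes or picks the best priority depending only on the type it is given, which is the point of putting both annotations behind one interface.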
  • 68. But where do we get monoids from? ● Monoids have nice property of composability ● We can get more monoids by combining existing ones ● But where do we get initial monoids to begin with? ● We need a way to measure values! ● Those measures must be monoids, obviously interface Measured<M extends Monoid> { M measure(); }
  • 69. Let's make a sketch of annotated tree /** <V> is the type of values <M> is the type of monoidal measures of values */ class Tree<M extends Monoid<M>, V extends Measured<M>> implements Measured<M> { abstract class Leaf<M, V> extends Tree<M, V> { final V value; override abstract M measure(); } class Node<M, V> extends Tree<M, V> { final Tree<M, V> left, right; final M m; Node(Tree<M, V> l, Tree<M, V> r) { left = l; right = r; m = l.measure().combine(r.measure()); } override final M measure() { return m; } Pseudocode!
  • 70. Let's make a sketch of annotated tree ... class Leaf<V> extends Tree<Size, V> { final V value; override final Size measure() { return new Size(1); } } ... class Leaf<V> extends Tree<Priority, V> { final V value; override final Priority measure() { return new Priority(value.priority()); } } Pseudocode!
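The pseudocode on these two slides can be turned into compilable Java along the following lines. This is a sketch under assumptions, not the deck's source: leaves delegate to their value's measure() instead of being specialised per monoid, binary nodes stand in for 2-3 nodes, and Item and Lookup are hypothetical names added to show how a size annotation gives indexed access.

```java
interface Monoid<T extends Monoid<T>> { T unit(); T combine(T that); }
interface Measured<M extends Monoid<M>> { M measure(); }

final class Size implements Monoid<Size> {
    final int size;
    Size(int size) { this.size = size; }
    @Override public Size unit() { return new Size(0); }
    @Override public Size combine(Size that) { return new Size(this.size + that.size); }
}

abstract class Tree<M extends Monoid<M>, V extends Measured<M>> implements Measured<M> {}

final class Leaf<M extends Monoid<M>, V extends Measured<M>> extends Tree<M, V> {
    final V value;
    Leaf(V value) { this.value = value; }
    @Override public M measure() { return value.measure(); }   // assumption: the value measures itself
}

final class Node<M extends Monoid<M>, V extends Measured<M>> extends Tree<M, V> {
    final Tree<M, V> left, right;
    final M m;                                                  // cached annotation
    Node(Tree<M, V> l, Tree<M, V> r) {
        left = l;
        right = r;
        m = l.measure().combine(r.measure());
    }
    @Override public M measure() { return m; }
}

// Hypothetical element type: every item counts as one element.
final class Item implements Measured<Size> {
    final String name;
    Item(String name) { this.name = name; }
    @Override public Size measure() { return new Size(1); }
}

final class Lookup {
    // Finds the i-th leaf value by comparing i with the size of each left subtree.
    static <V extends Measured<Size>> V at(Tree<Size, V> t, int i) {
        if (i < 0 || i >= t.measure().size) throw new IndexOutOfBoundsException("index " + i);
        while (t instanceof Node) {
            Node<Size, V> n = (Node<Size, V>) t;
            int leftSize = n.left.measure().size;
            if (i < leftSize) { t = n.left; } else { i -= leftSize; t = n.right; }
        }
        return ((Leaf<Size, V>) t).value;
    }
}
```

Swapping Size for a priority monoid changes what the cached annotation means without touching Tree, Leaf or Node, which is the point of abstracting over the measure.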
  • 71. But that is not finger tree yet!
  • 72. Finger Tree ... is just an annotated tree of annotated 2-3 trees!
  • 73. Finger Tree Digits, 2-3 trees, fingers and nested levels
  • 74. Finger Tree A little bit of Haskell would not hurt: data Node v a = Node2 v a a | Node3 v a a a data Digit v a = One v a | Two v a a | Three v a a a | Four v a a a a data FingerTree v a = Empty | Single a | Deep v (Digit v a) (FingerTree v (Node v a)) (Digit v a)
  • 75. Finger Tree class FingerTree<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { class Empty<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> {} class Single<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final T v; final M m; class Deep<M extends Monoid<M>, T extends Measured<M>> extends FingerTree<M, T> { final Digit<M, T> prefix; final FingerTree<M, Node<M, T>> middle; final Digit<M, T> suffix; final M m; Source Code 1/3
  • 76. Finger Tree class Digit<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; class One<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a; class Two<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b; class Three<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c; class Four<M extends Monoid<M>, T extends Measured<M>> extends Digit<M, T> { final T a, b, c, d; Source Code 2/3
  • 77. Finger Tree class Node<M extends Monoid<M>, T extends Measured<M>> implements Measured<M> { final M m; class Node2<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b; class Node3<M extends Monoid<M>, T extends Measured<M>> extends Node<M, T> { final T a, b, c; Source Code 3/3
  • 78. Finger Tree Interface Basic operations: ● cons, snoc: prepend/append an element ● concat: join two trees ● split: find prefix, element and suffix using a predicate Beyond the scope of this presentation, sorry
  • 79. Finger Tree Performance Amortized bounds (Finger Tree | 2-3 Tree | List): ● cons, snoc: O(1) | O(log n) | O(1)/O(n) ● head, last: O(1) | O(log n) | O(1)/O(n) ● concat: O(log min(ℓ1, ℓ2)) | O(log n) | O(n) ● split: O(log min(n, ℓ-n)) | O(log n) | O(n) ● index: O(log min(n, ℓ-n)) | O(log n) | O(n)