com.sun.electric.database.geometry.btree
Class BTree<K extends java.io.Serializable & java.lang.Comparable,V extends java.io.Serializable,S extends java.io.Serializable>

java.lang.Object
  extended by com.sun.electric.database.geometry.btree.BTree<K,V,S>

public class BTree<K extends java.io.Serializable & java.lang.Comparable,V extends java.io.Serializable,S extends java.io.Serializable>
extends java.lang.Object

A B+Tree implemented using PageStorage.

This is a B-Plus-Tree; values are stored only in leaf nodes.

Usage Notes

Each element in a BTree is conceptually a triple <ordinal,key,value> where "key" is a user-supplied key (belonging to a type that is Comparable), "value" is a user-supplied value (no restrictions) and "ordinal" is an integer indicating the number of keys in the tree less than this one. Note that the ordinal is not actually stored in the tree, and inserting a new value can potentially modify the ordinals of all preexisting elements! Each of the getXXX() methods takes one of these three coordinates (Ord, Key, or Val) and returns one of the others, or else a count (Num). Additionally, the getXXXFromKey() methods include floor/ceiling versions that take an upper/lower bound and search for the largest/smallest key which is less/greater than the one supplied.

The BTree supports appending a new element (that is, inserting a value with a key greater than any key in the table) in near-constant time (Actually log*(n)) and all other queries in log(n) time. All operations are done in a single pass down the BTree from the root to the leaves; this brings two benefits: the data structure can be made concurrent with very little lock contention and it can support copy-on-write shadow versions.

You must distinguish between insert() and replace() ahead of time; you can't call insert() on a key that is already in the tree or replace() on one that isn't. In order to replace() on a BTree with a summary the summary product operation must be commutative and invertible, and you must know the old value which you are replacing. This lets us update the interior node invariants as we walk down the tree and avoid having to walk back up afterwards. If you're not sure if a key is in the tree, just do a get() -- the net result is two passes over the tree, which is what we'd have to do anyways if the user didn't distinguish insert() from replace(). We're just offering the option to double the performance in the case where the user already knows if the key is in the tree or not.

You can associate a summary with each leaf node of the BTree. In order to do this, you must provide an instance of AssociativeOperation to the BTree when you construct it. The AssociativeOperation knows the key and value type of the BTree, and must know two things:

The process of merging two summaries must be associative; if we want to merge three summaries (ABC) it must not matter if we merge them as ((AB)C) or (A(BC)). Technically this makes the merge operation a semigroup. In exchange for providing all of this information you can ask the BTree to calculate the summary of any contiguous region of the keyspace in log(n) time. For example, this can be used to answer "min/max over this range" queries very efficiently.

Implementation Notes

We proactively split nodes as soon as they become full rather than waiting for them to become overfull. This has a space overhead of 1/NUM_KEYS_PER_PAGE, but puts an O(1) bound on the number of pages written per operation (number of pages read is still O(log n)). It also makes the walk routine tail-recursive.

Each node of the BTree uses one page of the PageStorage; we don't yet support situations where a single page is too small for one key and one value.

The coding style in this file is pretty unusual; it looks a lot like "Java as a better C". This is mainly because Hotspot is remarkably good at inlining methods, but remarkably bad (still, even in Java 1.7) at figuring out when it's safe to unbox values. So I expend a lot of effort trying not to create boxed values, but don't worry at all about the overhead of method calls, particularly when made via "final" references (nearly always inlined). I don't code this way very often -- I reserve this for the 2% of my code that bears 98% of the performance burden. There is anecdotal evidence that using the server JVM rather than the client JVM yields a dramatic increase in performance; this coding style is especially friendly to the server JVM's optimization techniques.

Keys, values, and summary elements are stored in unboxed form. This means that each of these types must know how to serialize itself to and deserialize itself from a sequence of bytes, and it must be possible to perform the important operations (comparison for keys, product for summary values) on these values in unboxed form. The reasons for this are set out in the previous paragraph. See the package btree.unboxed for further details.


Constructor Summary
BTree(CachingPageStorage ps, UnboxedComparable<K> uk, Unboxed<V> uv, UnboxedFunction<Pair<K,V>,S> summarize, AssociativeOperation<S> combine)
          Create a BTree.
 
Method Summary
 void clear()
          remove all entries
static void clearStats()
          debugging method; may go away in future releases
static void dumpStats(java.io.PrintStream pw)
          debugging method; may go away in future releases
 V getKeyFromKeyNext(K key)
          returns the least key strictly greater than the argument
 V getKeyFromKeyPrev(K key)
          returns the greatest key strictly less than the argument
 K getKeyFromOrd(int ord)
          returns the i^th key in the tree
 int getNumFromKeys(K min, K max)
          Returns the number of entries in the tree with a key between min and max inclusive; if either min or max is null it is treated as negative or positive infinity (respectively)
 int getOrdFromKey(K key)
          returns the ordinal of the given key, or -1 if not found
 int getOrdFromKeyCeiling(K key)
          returns the ordinal of the smallest key greater than or equal to the one supplied
 int getOrdFromKeyFloor(K key)
          returns the ordinal of the largest key less than or equal to the one supplied
 S getSummaryFromKeys(K min, K max)
          compute the summary of all (key,value) pairs between min and max, inclusive
 V getValFromKey(K key)
          returns the value in the tree, or null if not found
 V getValFromKeyCeiling(K key)
          returns the value of the smallest key greater than or equal to the one supplied
 V getValFromKeyFloor(K key)
          returns the value of the largest key less than or equal to the one supplied
 V getValFromOrd(int ord)
          returns the i^th value in the tree
 void insert(K key, V val)
          will throw an exception if the key is already in the tree
 V remove(K key)
          returns value previously in the tree; will throw an exception if the key is not already in the tree
 V replace(K key, V val)
          returns value previously in the tree; will throw an exception if the key is not already in the tree
 int size()
          same as getNumFromKeys(null,null)
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BTree

public BTree(CachingPageStorage ps,
             UnboxedComparable<K> uk,
             Unboxed<V> uv,
             UnboxedFunction<Pair<K,V>,S> summarize,
             AssociativeOperation<S> combine)
Create a BTree.

Parameters:
ps - the PageStorage to hold the underlying bytes
uk - the unboxed type for keys (must be comparable)
uv - the unboxed type for values
summarize - the function which summarizes a single (key,value) pair
combine - the function which associatively combines two summaries
Method Detail

getNumFromKeys

public int getNumFromKeys(K min,
                          K max)
Returns the number of entries in the tree with a key between min and max inclusive; if either min or max is null it is treated as negative or positive infinity (respectively)


size

public int size()
same as getNumFromKeys(null,null)


getValFromKey

public V getValFromKey(K key)
returns the value in the tree, or null if not found


getValFromKeyFloor

public V getValFromKeyFloor(K key)
returns the value of the largest key less than or equal to the one supplied


getValFromKeyCeiling

public V getValFromKeyCeiling(K key)
returns the value of the smallest key greater than or equal to the one supplied


getOrdFromKey

public int getOrdFromKey(K key)
returns the ordinal of the given key, or -1 if not found


getOrdFromKeyFloor

public int getOrdFromKeyFloor(K key)
returns the ordinal of the largest key less than or equal to the one supplied


getOrdFromKeyCeiling

public int getOrdFromKeyCeiling(K key)
returns the ordinal of the smallest key greater than or equal to the one supplied


getKeyFromKeyNext

public V getKeyFromKeyNext(K key)
returns the least key strictly greater than the argument


getKeyFromKeyPrev

public V getKeyFromKeyPrev(K key)
returns the greatest key strictly less than the argument


getValFromOrd

public V getValFromOrd(int ord)
returns the i^th value in the tree


getKeyFromOrd

public K getKeyFromOrd(int ord)
returns the i^th key in the tree


insert

public void insert(K key,
                   V val)
will throw an exception if the key is already in the tree


replace

public V replace(K key,
                 V val)
returns value previously in the tree; will throw an exception if the key is not already in the tree


remove

public V remove(K key)
returns value previously in the tree; will throw an exception if the key is not already in the tree


clear

public void clear()
remove all entries


getSummaryFromKeys

public S getSummaryFromKeys(K min,
                            K max)
compute the summary of all (key,value) pairs between min and max, inclusive


clearStats

public static void clearStats()
debugging method; may go away in future releases


dumpStats

public static void dumpStats(java.io.PrintStream pw)
debugging method; may go away in future releases