bloomfilter (revision 6049116fb66c9cd530ff90350fc92c0a80119ec9) - OpenGrok cross reference for /inferno-os/man/2/bloomfilter

     BLOOMFILTER 2
 NAME
Bloomfilter - Bloom filters
 SYNOPSIS
.EX
include "sets.m";
include "bloomfilter.m";
bloomfilter := load Bloomfilter Bloomfilter->PATH;

init: fn();
filter: fn(d: array of byte, logm, k: int): Sets->Set;

 DESCRIPTION
A Bloom filter is a method of representing a set to support probabilistic
membership queries. It uses independent hash functions of members
of the set to set elements of a bit-vector.
 Init should be called first to initialise the module.
 Filter returns a Set
 s representing the Bloom filter for the single-member
set
 { d }.  K independent hash functions are used, each of range
 "[0, 2^" logm ), to return a Bloom filter
 2^ logm bits wide. It is an error if
 logm is less than 3 or greater than 30.

Bloom filters can be combined by set union.
The set represented by Bloom filter
 a is not a subset of another
 b if there are any members in
 a that are not in
 b . Together,
 logm ,  k , and
 n (the number of members in the set)
determine the
 "false positve" rate (the probability that a membership test will not eliminate
a member that is not in fact in the set).
The probability of a
 "false positive" is approximately (1-e^(-kn/(2^logm))^k.
For a given false positive rate,
 f , a useful formula to determine appropriate parameters
is: k=ceil(-log₂(f)),
and logm=ceil(log₂(nk)).
 EXAMPLES
Create a 128 bit-wide bloom filter
 f representing all the elements
in the string array
 elems , with
 k =6. .EX
 A, B, None: import Sets;
 for(i:=0; i<len elems; i++)
 f = f.X(A|B, filter(array of byte elems[i], 7, 6));

Test whether the string
 s is a member of
 f . If there were 12 elements in
 elems , the probability of a false positive would be
approximately 0.0063.
.EX
 if(filter(array of byte s, 7, 6).X(A&~B, f).eq(None))
 sys->print("'%s' might be a member of f\en", s);

 SOURCE
 /appl/lib/bloomfilter.b  SEE ALSO
 sets (2)