README.md: 11 additions & 11 deletions
@@ -22,7 +22,7 @@ The following additional types are implemented, but less tested:

 ## Reference

-* Thomas Mueller Graf, Daniel Lemire, [Binary Fuse Filters: Fast and Smaller Than Xor Filters](http://arxiv.org/abs/2201.01174), Journal of Experimental Algorithmics 27, 2022. DOI: 10.1145/3510449
+* Thomas Mueller Graf, Daniel Lemire, [Binary Fuse Filters: Fast and Smaller Than Xor Filters](http://arxiv.org/abs/2201.01174), Journal of Experimental Algorithmics 27, 2022. DOI: 10.1145/3510449
 * Thomas Mueller Graf, Daniel Lemire, [Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters](https://arxiv.org/abs/1912.08258), Journal of Experimental Algorithmics 25 (1), 2020. DOI: 10.1145/3376122

 ## Usage
@@ -31,17 +31,13 @@ The following additional types are implemented, but less tested:
 To use the XOR and Binary Fuse filters, first prepare an array of keys, then construct the filter:
All filters implement the `Filter` interface and support the `mayContain(long key)` method to check if a key might be in the set. Note that false positives are possible, but false negatives are not.

+### Generating the Hash Values
+
+The library is written to process `long` values that are meant to be hash values. Though you do not need to use
+cryptographically strong hashing, you should make sure that your hash functions are reasonable: they should
+not generate too many collisions (two objects mapping to the same `long` value).

 ### Serialization and Deserialization

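The hashing guidance added in this hunk can be illustrated with a small sketch. It shows one possible way to turn strings into `long` keys, assuming a 64-bit FNV-1a hash followed by the MurmurHash3 64-bit finalizer as a mixing step. The `KeyHashing` class and its methods are hypothetical helpers written for illustration; they are not part of the fastfilter library, and any reasonable non-cryptographic 64-bit hash could be substituted.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the fastfilter library.
final class KeyHashing {

    // 64-bit FNV-1a over the raw bytes of the input.
    static long fnv1a64(byte[] data) {
        long h = 0xcbf29ce484222325L;   // FNV-1a 64-bit offset basis
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;        // FNV 64-bit prime
        }
        return h;
    }

    // MurmurHash3 64-bit finalizer: spreads input bits across the whole word.
    static long mix64(long z) {
        z ^= z >>> 33;
        z *= 0xff51afd7ed558ccdL;
        z ^= z >>> 33;
        z *= 0xc4ceb9fe1a85ec53L;
        z ^= z >>> 33;
        return z;
    }

    // Hash a string to a 64-bit key via its UTF-8 encoding.
    static long hash64(String s) {
        return mix64(fnv1a64(s.getBytes(StandardCharsets.UTF_8)));
    }

    // Hash every item and drop duplicate hash values, so the key array is unique.
    static long[] toKeys(String[] items) {
        return Arrays.stream(items)
                .mapToLong(KeyHashing::hash64)
                .distinct()
                .toArray();
    }

    public static void main(String[] args) {
        long[] keys = toKeys(new String[] {"apple", "banana", "apple"});
        System.out.println(keys.length + " unique keys: " + Arrays.toString(keys));
    }
}
```

The resulting `long[]` is the kind of key array the README describes preparing before constructing a filter. Lookups would then hash the query with the same function before calling `mayContain(long key)`.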
@@ -60,25 +61,24 @@ Filters can be serialized to and deserialized from a `ByteBuffer` for persistenc
 ```java
 import java.nio.ByteBuffer;

-// Assuming you have a constructed filter, e.g., Xor8 xor8 = Xor8.construct(keys);
fastfilter/src/main/java/org/fastfilter/xor/Xor16.java: 3 additions & 0 deletions
@@ -6,7 +6,10 @@
 import org.fastfilter.utils.Hash;

 /**
+ * The Xor16 filter implementation is experimental. We recommend using XorBinaryFuse16 instead. Use at your own risks.
+ *
  * The xor filter, a new algorithm that can replace a Bloom filter.
+ * Thomas Mueller Graf, Daniel Lemire, [Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters](https://arxiv.org/abs/1912.08258), Journal of Experimental Algorithmics 25 (1), 2020. DOI: 10.1145/3376122
  *
  * It needs 1.23 log(1/fpp) bits per key. It is related to the BDZ algorithm [1]
fastfilter/src/main/java/org/fastfilter/xor/Xor8.java: 4 additions & 0 deletions
@@ -6,8 +6,12 @@
 import org.fastfilter.Filter;
 import org.fastfilter.utils.Hash;

+
 /**
+ * The Xor8 filter implementation is experimental. We recommend using XorBinaryFuse8 instead. Use at your own risks.
+ *
  * The xor filter, a new algorithm that can replace a Bloom filter.
+ * Thomas Mueller Graf, Daniel Lemire, [Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters](https://arxiv.org/abs/1912.08258), Journal of Experimental Algorithmics 25 (1), 2020. DOI: 10.1145/3376122
  *
  * It needs 1.23 log(1/fpp) bits per key. It is related to the BDZ algorithm [1]
fastfilter/src/main/java/org/fastfilter/xor/XorBinaryFuse16.java: 59 additions & 46 deletions
@@ -7,6 +7,7 @@

 /**
  * The xor binary fuse filter, a new algorithm that can replace a Bloom filter.
+ * Thomas Mueller Graf, Daniel Lemire, [Binary Fuse Filters: Fast and Smaller Than Xor Filters](http://arxiv.org/abs/2201.01174), Journal of Experimental Algorithmics 27, 2022. DOI: 10.1145/3510449
  */
 public class XorBinaryFuse16 implements Filter {

@@ -78,6 +79,15 @@ private static int mod3(int x) {
         return x;
     }

+    /**
+     * Constructs a new XorBinaryFuse16 filter from the given array of keys.
+     * The filter is designed to have a low false positive rate while being space-efficient.
+     * The keys array should contain unique values. The array may be mutated during construction
+     * (e.g., sorted and deduplicated) if the algorithm detects that there are likely too many duplicates.
+     *
+     * @param keys the array of long keys to add to the filter
+     * @return a new XorBinaryFuse16 filter containing all the keys