Add hierarchical structure to the graph index #402

Open
wants to merge 101 commits into
base: main
from
Changes from 1 commit
101 commits
8572202
degen dataset, tested in memory
Feb 7, 2025
5bc9c85
extract IntMap, removing @deprecated put()
Feb 7, 2025
2cf61be
SparseIntMap
Feb 7, 2025
2907a1a
move javadoc into IntMap
Feb 10, 2025
ae2fb7a
multilayer part 1
Feb 10, 2025
b6b4bf0
almost builds
Feb 10, 2025
b1e864c
r/m cache stuff so it builds
Feb 10, 2025
d174556
fixes
Feb 10, 2025
b17226a
don't try to write to disk, it doesn't work yet
Feb 10, 2025
98934c1
fixes
Feb 10, 2025
2b8900a
Bypass the order check in NodeQueue.copyFrom when other.size == 1
marianotepper Feb 10, 2025
08f848f
Search all levels when building the graph but only update the layers …
marianotepper Feb 10, 2025
b75000e
tests build
Feb 10, 2025
f2edbb5
add support for copying from opposite-order queues
Feb 10, 2025
1ef1010
need to update snapshot during construction to see other nodes that w…
Feb 11, 2025
0114f5a
fix copyFrom better
Feb 11, 2025
4ac9144
put back .parallel in build()
Feb 11, 2025
7c167a7
address TODOs in GIB. short edges computation is gone
Feb 11, 2025
6f54bd3
fix TODOs in GraphIndex
Feb 11, 2025
d46d8e1
refactor getNodes and the underlying IntMap methods
Feb 11, 2025
807a6c7
tests build
Feb 11, 2025
ddf8ffa
fix NodeArray overload of addNode to add layers if needed
Feb 11, 2025
7770ec1
need to reset `visited` between levels
Feb 11, 2025
7aafffb
copy the vector being searched for, this was a pre-existing bug in th…
Feb 11, 2025
700bad1
fix `visited` better and fix adding evicted nodes from higher le
Feb 11, 2025
7bc2072
cleanup
Feb 11, 2025
a5f8d6b
r/m printlns
Feb 11, 2025
909b782
insertSorted ignores duplicates by design, not correct to assert that…
Feb 11, 2025
66b807d
acceptOrds should only be consulted in L0
Feb 11, 2025
bdfbbcc
add multi-layer validateIndex
Feb 11, 2025
ae90234
move OnDiskGraphIndex and Writer into v3 package
Feb 11, 2025
79a18e1
fix tests for v3 OnDiskGraphIndex
Feb 11, 2025
0dbdf38
Revert "move OnDiskGraphIndex and Writer into v3 package"
Feb 11, 2025
64dc9d7
ef=200, prune=false
Feb 11, 2025
c3ca431
SeparatedFeature
Feb 12, 2025
e53e784
move Features into disk.feature package
Feb 12, 2025
46ad560
version 4 OnDiskGraphIndex can handle multiple layers
Feb 12, 2025
b4283ef
r/m heavyweight asserts
Feb 12, 2025
64a4d1c
add support for different degree per level, and implement GIB.load
Feb 12, 2025
2297965
fix ramBytesUsed TODOs
Feb 12, 2025
30a4064
push the candidates seen so far back onto the queue for the next layer
Feb 12, 2025
0ba73d5
Use ef=1 in the upper layers during construction
marianotepper Feb 12, 2025
2a08442
Merge remote-tracking branch 'origin/hnsw-2' into hnsw-2
marianotepper Feb 12, 2025
baab488
Add one additional constructor for convenience that uses the default …
marianotepper Feb 12, 2025
4708491
fix confusing names
Feb 12, 2025
ece68f6
fix load and add loadV3 for backwards compatibility
Feb 12, 2025
f0f10c8
add assert
Feb 12, 2025
5ec8b03
fix different layer degree
Feb 12, 2025
61fa5e6
restrict search to live nodes
Feb 12, 2025
656b55b
add back improveConnections
Feb 12, 2025
f4d2d42
Store maxOverflowFactor in OnHeapGraphIndex and apply it to each leve…
marianotepper Feb 12, 2025
bacf4b3
merge
Feb 12, 2025
3c44838
fix merge better
Feb 12, 2025
a2f259e
version 4.0.0-hnsw.1
Feb 13, 2025
3a4b2e3
Fix BQVectors.ramBytesUsed when compressedVectors is not yet initialized
jkni Feb 13, 2025
524602b
Stagewise hierarchy construction with batches
marianotepper Feb 14, 2025
3fb9aad
Set efConstructionGrid to 100
marianotepper Feb 14, 2025
d6936a4
Remove commented line
marianotepper Feb 28, 2025
717d089
Update hyperparameters
marianotepper Mar 3, 2025
03b0c6e
Replace buildLevelwise by a batched incremental construction
marianotepper Mar 3, 2025
8d2398c
Enable optional addHierarchy parameter in the constructors of GraphIn…
marianotepper Mar 3, 2025
68b285f
Remove printing code
marianotepper Mar 3, 2025
19b1461
Add the with and without hierarchy cases to the tests and benchmarks
marianotepper Mar 3, 2025
ca6f99f
Increase mGrid to 64
marianotepper Mar 3, 2025
f1fbcae
Fix documentation of searchOneLayer
marianotepper Mar 4, 2025
96fcdaf
Make pruning configurable at runtime
marianotepper Mar 4, 2025
4c53a24
Make rng final in GraphIndexBuilder
marianotepper Mar 4, 2025
d3f5745
Use static final int BUILD_BATCH_SIZE in GraphIndexBuilder.build
marianotepper Mar 4, 2025
369e685
Add check to the GraphIndexBuilder constructor preventing from using …
marianotepper Mar 4, 2025
cf97039
Expose parameters in Bench
marianotepper Mar 4, 2025
939396a
Do not use pruning during index construction
marianotepper Mar 5, 2025
0a3c0e1
Use the 0-th level to check for an empty graph in GraphIndexBuilder
marianotepper Mar 5, 2025
346ea3a
Change unit from seconds to milliseconds
marianotepper Mar 6, 2025
9dce154
Add 5.0 to overqueryGrid
marianotepper Mar 6, 2025
f191546
Add async batched construction
marianotepper Mar 7, 2025
d45856c
Set pruneSearch=false by default in GraphSearcher
marianotepper Mar 12, 2025
0dd41e4
Add last commits from hnsw-2
marianotepper Mar 13, 2025
a78bc95
Code cleanup in GraphIndexBuilder.java
marianotepper Mar 13, 2025
26b4930
Update Bench with new parameters
marianotepper Mar 13, 2025
8ac9484
Merge branch 'main' into hnsw-3
marianotepper Mar 13, 2025
85c4889
Fix merging issues
marianotepper Mar 13, 2025
54eab1f
Fix merging issues in jmh
marianotepper Mar 13, 2025
ababca3
Add license notice to SparseIntMap.java
marianotepper Mar 13, 2025
bcccf33
Add license notice to IntMap.java
marianotepper Mar 13, 2025
a5b8c25
Make sure top level is printed by GraphIndex.prettyPrint
marianotepper Mar 17, 2025
aef6fd9
When adding random nodes in removeDeletedNodes, make sure they are fr…
marianotepper Mar 17, 2025
a913fed
Remove redundant @Test decorator
marianotepper Mar 17, 2025
b53f165
In GraphSearcher.search, use all nodes in the top layers, only use th…
marianotepper Mar 17, 2025
04b002c
Fix OnDiskGraphIndex.getNodes to account for the new layout
marianotepper Mar 17, 2025
87deaa7
After deletions, OnHeapGraphIndex.getMaxLevel can have a maxLevel tha…
marianotepper Mar 17, 2025
0553fd5
Use 4 neighbors in TestDeletions instead of 2
marianotepper Mar 17, 2025
7321c8d
Update comment in GraphIndex
marianotepper Mar 24, 2025
916c1c7
Changes following the PR review
marianotepper Mar 24, 2025
9af51bf
Add to the Javadoc of all constructors
marianotepper Mar 24, 2025
ffaafc1
Remove unused lock member
marianotepper Mar 24, 2025
88d2553
Fix typo
marianotepper Mar 24, 2025
43a9c0d
Add reason for deprecation
marianotepper Mar 24, 2025
57ebc52
Add javadocs to GraphIndex.java
marianotepper Mar 24, 2025
1db1448
Update layerInfo to contain the correct size for L0
marianotepper Mar 24, 2025
984cdf9
Add javadoc for GraphSearcher.usePruning
marianotepper Mar 24, 2025
a617b29
Merge branch 'main' into hnsw-3
marianotepper Mar 24, 2025
Add license notice to SparseIntMap.java
marianotepper committed Mar 13, 2025
commit ababca35f09a18cd49baaa41ffe04c5808045a04
@@ -1,3 +1,19 @@
/*
 * Copyright DataStax, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package io.github.jbellis.jvector.util;

import io.github.jbellis.jvector.graph.NodesIterator;