Add external retriever to usearch so vector nodes can be externally stored and managed #171

Closed


Ngalstyan4
Contributor

This is incomplete but hopefully has enough to clarify the approach.
What's missing?

  • Allow setting external retriever functions in usearch_init, via the configuration struct
  • Allow passing an optional opaque pointer in set_retriever that will be passed to the external retrievers
  • Wherever there are sets of functions with different type parameters, currently only the functions corresponding to f32 are properly instrumented.

Note: this is based off of the lookup label PR so changes from that PR also appear here.
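
To make the approach above concrete, here is a minimal sketch of the callback-plus-opaque-pointer pattern, covering f32 only. Every name in it (`example_retriever_t`, `example_index_t`, `example_set_retriever`, `example_store_t`) is hypothetical and chosen for illustration; it is not the usearch API and not the code in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: given the caller's opaque pointer and a node
 * slot, return a pointer to the vector stored outside the index. */
typedef float const* (*example_retriever_t)(void* user_state, size_t slot);

/* Hypothetical stand-in for an index that owns no vectors itself. */
typedef struct {
    example_retriever_t retriever; /* must be set before any distance call */
    void* user_state;              /* passed back to the retriever untouched */
    size_t dimensions;
} example_index_t;

/* Register the external retriever together with its opaque pointer. */
static void example_set_retriever(example_index_t* index,
                                  example_retriever_t retriever,
                                  void* user_state) {
    index->retriever = retriever;
    index->user_state = user_state;
}

/* Squared L2 distance between a query and an externally stored vector. */
static float example_squared_l2(example_index_t const* index,
                                float const* query, size_t slot) {
    assert(index->retriever != NULL);
    float const* stored = index->retriever(index->user_state, slot);
    float sum = 0.0f;
    for (size_t i = 0; i < index->dimensions; ++i) {
        float delta = query[i] - stored[i];
        sum += delta * delta;
    }
    return sum;
}

/* Example external storage: a flat row-major buffer owned by the caller. */
typedef struct {
    float const* rows;
    size_t dimensions;
} example_store_t;

/* Retriever for the buffer above; recovers its state from the opaque pointer. */
static float const* example_store_retrieve(void* user_state, size_t slot) {
    example_store_t const* store = (example_store_t const*)user_state;
    return store->rows + slot * store->dimensions;
}
```

The opaque pointer lets the retriever reach caller-owned state, such as a buffer manager or a page cache, without relying on globals, which is why it is listed as a missing piece.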

@ashvardanian
Contributor

Thanks again, @Ngalstyan4! Our plans seem well aligned. In terms of implementation, there may be a better alternative using batch-evaluated metrics expected in 1.0.0. Will share more details soon.

ashvardanian added a commit to ashvardanian/usearch that referenced this pull request Aug 4, 2023
ashvardanian pushed a commit that referenced this pull request Aug 5, 2023
# [0.23.0](v0.22.3...v0.23.0) (2023-08-05)

### Add

* `Matches` and `BatchMatches` simple API ([1b40f13](1b40f13))
* Add node offsets in a serialized file ([c600ffd](c600ffd))
* Batch add ([74860d6](74860d6))
* Batch add test ([5f99b05](5f99b05))
* Changing the metric at runtime ([d7bfac7](d7bfac7))
* Compactions ([434c1da](434c1da))
* efficiency estimate in `recall_members` ([64a60b4](64a60b4))
* Exact search shortcut ([a005084](a005084)), closes [#176](#176)
* Multi-`Index` lookups ([c5b7ccd](c5b7ccd))
* Parallel View ([ed3f845](ed3f845))
* Prefetching functionality for external memory ([b544ddb](b544ddb)), closes [#170](#170) [#171](#171)
* Streaming and in-memory serialization in C++ ([7da44a2](7da44a2))
* Vector alignment ([ea230e0](ea230e0))

### Break

* Final patches for 1.0 release ([8d557e2](8d557e2))

### Docs

* add descriptions of match-related classes ([637e5ef](637e5ef))
* Annotating C 99 and GoLang interfaces ([4b910a8](4b910a8))
* Documenting Python tests ([1f89e0a](1f89e0a))
* Shorten name ([9a6a01c](9a6a01c))
* Spelling and details ([6f25ed9](6f25ed9))
* Spelling and links ([20566e0](20566e0))
* TypeScript docs factual errors ([fe8103c](fe8103c))
* Update benchmarking sections ([96baa09](96baa09))

### Fix

* `reset` and serialization code ([11d7844](11d7844))
* Avoid exception if `.xbin` file is missing ([4863bea](4863bea))
* Avoid spawning needless threads ([9dff0fb](9dff0fb))
* Concurrent file access issues in tests ([5ae6db1](5ae6db1))
* Dead-lock on post-removal insertions ([284b058](284b058)), closes [#175](#175)
* Exception handling for `index_dense_metadata` ([d9627ba](d9627ba))
* Heap overflow for fractional-size scalars ([459abcd](459abcd))
* Imports in Python benchmarks ([cffe507](cffe507))
* Inferring OS-dependent file path in Python ([7743709](7743709)), closes [#174](#174)
* JavaScript bindings ([ee04856](ee04856))
* JS keys should be `bigint` ([e1fbec4](e1fbec4)), closes [#178](#178)
* Memory leak and multi-index lookup overflow ([597b0d5](597b0d5))
* Narrowing conversions for WASM 32-bit builds ([79add97](79add97))
* Portable way of matching 32-bit builds ([604e634](604e634))
* Progress reporting issue ([b2565e5](b2565e5))
* Reclaiming file descriptor ([05e908f](05e908f))
* Report error if `reserve` hasn't been called ([f94f358](f94f358))
* Typo in metric name ([34f5530](34f5530))
* Undefined behaviour on duplicate labels ([c04a5cc](c04a5cc))

### Improve

* `usearch_remove` C99 interface ([2072540](2072540))
* Align allocations to page size by default ([134a6f0](134a6f0))
* Broader types support in `usearch.io` ([b1a1439](b1a1439))
* Exposing search stats to users ([2779ffc](2779ffc))
* Feature-complete GoLang bindings ([e2058d1](e2058d1))
* More flexibility for Python args ([6aa06cb](6aa06cb))
* Out-of-bounds checks ([54cecb6](54cecb6))
* Task scheduling with STL threads ([9131287](9131287))

### Make

* Add CMake for C builds ([4d2127b](4d2127b))
* All targets enabled for debugging ([ea0f835](ea0f835))
* Build only WASM tests ([372738b](372738b))
* Typescript ([dacfbed](dacfbed))
* Upgrade to the newest SimSIMD ([368d853](368d853))

### Refactor

* `label_t` to `key_t` ([0d6c800](0d6c800))
* Add ([5d62180](5d62180))
* Index serialization in a file ([ba72585](ba72585))
* JS and GoLang tests ([a45fc40](a45fc40))
* Keep only batch requests in CPython ([44c0318](44c0318))
* Rename `f8` to `i8` to match IEEE ([c37f80b](c37f80b))
* Revert `Matches` ([5731e70](5731e70))
* Splitting proximity-graphs and vectors ([e996b38](e996b38))
* Use Executor instead of std::thread ([c3a3693](c3a3693))
* Vector alignment issue ([b02d0ad](b02d0ad))

### Test

* Set vector alignment ([0acb54a](0acb54a))
* Wrong buffer size caused illegal access ([830e280](830e280))
@Ngalstyan4
Contributor Author

A more up-to-date implementation of this is in #335.

@Ngalstyan4 Ngalstyan4 closed this Mar 3, 2024