semi-technologies/weaviate
Weaviate is a cloud-native, modular, real-time vector search engine
repo name | semi-technologies/weaviate |
repo link | https://github.com/semi-technologies/weaviate |
homepage | https://www.semi.technology/developers/weaviate/current/ |
language | Go |
size (curr.) | 787827 kB |
stars (curr.) | 1137 |
created | 2016-03-30 |
license | BSD 3-Clause “New” or “Revised” License |
Weaviate GraphQL demo on a news article dataset, covering the Transformers module, GraphQL usage, semantic search, _additional{} features, Q&A, and the Aggregate{} function. You can try the demo on this dataset in the GUI here: semantic search, Q&A, Aggregate.
Description
Weaviate is a cloud-native, real-time vector search engine (aka neural search engine or deep search engine). There are modules for specific use cases such as semantic search, plugins to integrate Weaviate into any application of your choice, and a console to visualize your data.
GraphQL - RESTful - vector search engine - vector database - neural search engine - semantic search - HNSW - deep search - machine learning - kNN
Features
Weaviate makes it easy to use state-of-the-art AI models while giving you the scalability, ease of use, safety and cost-effectiveness of a purpose-built vector database. Most notably:
- Fast queries: Weaviate typically performs a 10-nearest-neighbor (10-NN) search over millions of objects in considerably less than 100 ms.
- Any media type with Weaviate Modules: Use state-of-the-art AI model inference (e.g. Transformers) for text, images, etc. at search and query time to let Weaviate manage the process of vectorizing your data for you, or import your own vectors.
- Combine vector and scalar search: Weaviate allows for efficient combined vector and scalar searches, e.g. “articles related to the COVID-19 pandemic published within the past 7 days”. Weaviate stores both your objects and the vectors and makes sure the retrieval of both is always efficient. There is no need for third-party object storage.
- Real-time and persistent: Weaviate lets you search through your data even if it’s currently being imported or updated. In addition, every write is written to a Write-Ahead-Log (WAL) so that writes are persisted immediately, even when a crash occurs.
- Horizontal scalability: Scale Weaviate for your exact needs, e.g. high availability, maximum ingestion, largest possible dataset size, maximum queries per second, etc. (Currently under development, ETA Fall 2021)
- Cost-effectiveness: Very large datasets do not need to be kept entirely in memory in Weaviate. At the same time, available memory can be used to increase the speed of queries. This allows for a conscious speed/cost trade-off to suit every use case.
- Graph-like connections between objects: Make arbitrary connections between your objects in a graph-like fashion to resemble real-life connections between your data points. Traverse those connections using GraphQL.
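To make the idea of a combined vector and scalar search from the feature list more concrete, here is a minimal conceptual sketch in Go. It is not Weaviate's implementation or API: Weaviate relies on an HNSW index, whereas this sketch simply scans every object. The `Article` type, its `AgeDays` property, and the `FilteredKNN` function are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Article is a hypothetical object: a scalar property (AgeDays, the number
// of days since publication) stored next to its vector embedding.
type Article struct {
	Title   string
	AgeDays int
	Vector  []float64
}

// Hit pairs an article with its distance to the query vector.
type Hit struct {
	Article  Article
	Distance float64
}

// cosineDistance returns 1 minus the cosine similarity, so smaller values
// mean more similar vectors. Mismatched, empty, or zero-norm vectors are
// treated as dissimilar (distance 1).
func cosineDistance(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 1
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 1
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB))
}

// FilteredKNN applies the scalar filter first, then ranks the remaining
// objects by distance to the query and returns at most k hits.
// This is a brute-force scan, used here only to show the concept.
func FilteredKNN(objs []Article, query []float64, keep func(Article) bool, k int) []Hit {
	hits := make([]Hit, 0, len(objs))
	for _, o := range objs {
		if !keep(o) {
			continue
		}
		hits = append(hits, Hit{Article: o, Distance: cosineDistance(o.Vector, query)})
	}
	sort.SliceStable(hits, func(i, j int) bool {
		return hits[i].Distance < hits[j].Distance
	})
	if k < 0 {
		k = 0
	}
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}

func main() {
	// Toy two-dimensional vectors; real embeddings have hundreds of dimensions.
	articles := []Article{
		{Title: "Pandemic update", AgeDays: 2, Vector: []float64{0.9, 0.1}},
		{Title: "Sports roundup", AgeDays: 1, Vector: []float64{0.1, 0.9}},
		{Title: "Pandemic retrospective", AgeDays: 40, Vector: []float64{1, 0}},
	}
	query := []float64{1, 0} // stands in for the embedding of the search phrase

	// Scalar condition: published within the past 7 days.
	recent := func(a Article) bool { return a.AgeDays <= 7 }

	for _, h := range FilteredKNN(articles, query, recent, 10) {
		fmt.Printf("%s (distance %.3f)\n", h.Article.Title, h.Distance)
	}
}
```

Filtering before ranking keeps objects that fail the scalar condition out of the result set entirely, which is the behaviour the "published within the past 7 days" example describes. A production engine would combine the filter with an approximate nearest-neighbor index instead of scanning all objects.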
Documentation
You can find detailed documentation in the developers section of our website, or go directly to one of the docs using the links in the list below.
Additional reading
- Weaviate is an open-source search engine powered by ML, vectors, graphs, and GraphQL (ZDNet)
- Weaviate, an ANN Database with CRUD support (DB-Engines.com)
- A sub-50ms neural search with DistilBERT and Weaviate (Towards Data Science)
- Getting Started with Weaviate Python Library (Towards Data Science)
- Industry use cases (SeMI Technologies)
Examples
You can find code examples here
Support
- Stack Overflow for questions
- GitHub for issues
- Slack channel to connect
- Newsletter to stay in the know