# stratum-embeddings

Unified embedding providers with caching, batch processing, and vector storage for the STRATUMIOPS ecosystem.

## Features

- **Multiple Providers**: FastEmbed (local), OpenAI, Ollama
- **Smart Caching**: In-memory caching with configurable TTL
- **Batch Processing**: Efficient batch embedding with automatic chunking
- **Vector Storage**: LanceDB (scale-first) and SurrealDB (graph-first)
- **Fallback Support**: Automatic failover between providers
- **Feature Flags**: Modular compilation for minimal dependencies

## Architecture

```text
┌─────────────────────────────────────────┐
│            EmbeddingService             │
│    (facade with caching + fallback)     │
└─────────────┬───────────────────────────┘
              │
    ┌─────────┴─────────┐
    ▼                   ▼
┌─────────────┐   ┌─────────────┐
│  Providers  │   │    Cache    │
│             │   │             │
│ • FastEmbed │   │ • Memory    │
│ • OpenAI    │   │ • (Sled)    │
│ • Ollama    │   │             │
└─────────────┘   └─────────────┘
```

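The fallback path shown in the diagram is exercised by `examples/fallback_demo` (see Examples below). As an illustration only, wiring a primary and a secondary provider could look roughly like this; `OllamaProvider` and `with_fallback` are placeholder names, not confirmed API:

```rust
// Illustrative sketch only: `OllamaProvider` and `with_fallback` are placeholder
// names; consult examples/fallback_demo for the actual fallback API.
use stratum_embeddings::{EmbeddingService, FastEmbedProvider};

let primary = FastEmbedProvider::small()?;
let service = EmbeddingService::new(primary);
// With the `ollama-provider` feature enabled, a secondary provider could be
// attached so that requests failing on the primary are retried against it:
// let service = service.with_fallback(OllamaProvider::default());
```
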
## Quick Start

### Basic Usage

```rust
use stratum_embeddings::{EmbeddingService, FastEmbedProvider, MemoryCache, EmbeddingOptions};
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Local FastEmbed model plus an in-memory cache (1000 entries, 5-minute TTL).
    let provider = FastEmbedProvider::small()?;
    let cache = MemoryCache::new(1000, Duration::from_secs(300));
    let service = EmbeddingService::new(provider).with_cache(cache);

    let options = EmbeddingOptions::default_with_cache();
    let embedding = service.embed("Hello world", &options).await?;

    println!("Generated {} dimensions", embedding.len());
    Ok(())
}
```

### Batch Processing

```rust
// Reuses `service` and `options` from the Basic Usage example above.
let texts = vec![
    "Text 1".to_string(),
    "Text 2".to_string(),
    "Text 3".to_string(),
];

let result = service.embed_batch(texts, &options).await?;
println!(
    "Embeddings: {}, Cached: {}",
    result.embeddings.len(),
    result.cached_count
);
```

### Vector Storage

#### LanceDB (Provisioning, Vapora)

```rust
use stratum_embeddings::{LanceDbStore, VectorStore, VectorStoreConfig};

// An "embeddings" table of 384-dimensional vectors, persisted under ./data.
let config = VectorStoreConfig::new(384);
let store = LanceDbStore::new("./data", "embeddings", config).await?;

// `embedding`, `query_embedding`, and `metadata` are assumed to be in scope,
// e.g. produced by the EmbeddingService above.
store.upsert("doc1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?;
```

#### SurrealDB (Kogral)

```rust
use stratum_embeddings::{SurrealDbStore, VectorStore, VectorStoreConfig};

// In-memory SurrealDB instance with a "concepts" table of 384-dimensional vectors.
let config = VectorStoreConfig::new(384);
let store = SurrealDbStore::new_memory("concepts", config).await?;

// As above, `embedding`, `query_embedding`, and `metadata` are assumed to be in scope.
store.upsert("concept1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?;
```

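Putting the two halves together, a minimal end-to-end sketch using the FastEmbed provider and the in-memory SurrealDB store (roughly the `kogral` stack). The metadata value and the exact shape of the search results are assumptions; the individual calls mirror the snippets above:

```rust
use std::time::Duration;
use stratum_embeddings::{
    EmbeddingOptions, EmbeddingService, FastEmbedProvider, MemoryCache, SurrealDbStore,
    VectorStore, VectorStoreConfig,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let service = EmbeddingService::new(FastEmbedProvider::small()?)
        .with_cache(MemoryCache::new(1000, Duration::from_secs(300)));
    let options = EmbeddingOptions::default_with_cache();

    // Configured dimension follows the snippets above (384).
    let store = SurrealDbStore::new_memory("concepts", VectorStoreConfig::new(384)).await?;

    // Embed a document and upsert it together with some metadata.
    let embedding = service.embed("Graphs model relationships", &options).await?;
    let metadata = Default::default(); // illustrative: whatever metadata type `upsert` expects
    store.upsert("concept1", &embedding, metadata).await?;

    // Embed the query text, then look up the 5 closest stored vectors.
    let query = service.embed("How are relationships modelled?", &options).await?;
    let _results = store.search(&query, 5, None).await?;
    Ok(())
}
```
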
## Feature Flags

### Providers

- `fastembed-provider` (default) - Local embeddings via fastembed
- `openai-provider` - OpenAI API embeddings
- `ollama-provider` - Ollama local server embeddings
- `all-providers` - All embedding providers

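Provider-specific code paths can be gated on the same features. A minimal sketch for the `fastembed-provider` feature (inside this crate, or in a crate that forwards the feature under the same name), reusing the constructor from the Quick Start:

```rust
// Only compiled when the `fastembed-provider` feature is enabled; without it,
// neither the import nor the function exists.
#[cfg(feature = "fastembed-provider")]
use stratum_embeddings::FastEmbedProvider;

#[cfg(feature = "fastembed-provider")]
fn local_provider() -> Result<FastEmbedProvider, Box<dyn std::error::Error>> {
    Ok(FastEmbedProvider::small()?)
}
```
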
### Cache

- `memory-cache` (default) - In-memory caching with moka
- `persistent-cache` - Persistent cache with sled
- `all-cache` - All cache backends

### Vector Storage

- `lancedb-store` - LanceDB vector storage (columnar, disk-native)
- `surrealdb-store` - SurrealDB vector storage (graph + vector)
- `all-stores` - All storage backends

### Project Presets

- `kogral` - fastembed + memory + surrealdb
- `provisioning` - openai + memory + lancedb
- `vapora` - all-providers + memory + lancedb
- `full` - Everything enabled

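Downstream crates can opt into a preset when adding the dependency, e.g. `cargo add stratum-embeddings --no-default-features --features kogral` (pick the preset matching the target project).
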
## Examples

Run examples with:

```bash
cargo run --example basic_usage --features=default
cargo run --example fallback_demo --features=fastembed-provider,ollama-provider
cargo run --example lancedb_usage --features=lancedb-store
cargo run --example surrealdb_usage --features=surrealdb-store
```

## Provider Comparison

| Provider  | Type  | Cost                  | Dimensions | Use Case           |
|-----------|-------|-----------------------|------------|--------------------|
| FastEmbed | Local | Free                  | 384-1024   | Dev, privacy-first |
| OpenAI    | Cloud | $0.02-0.13 / 1M tokens | 1536-3072  | Production RAG     |
| Ollama    | Local | Free                  | 384-1024   | Self-hosted        |

## Storage Backend Comparison

| Backend   | Best For         | Strength                     | Scale (vectors) |
|-----------|------------------|------------------------------|-----------------|
| LanceDB   | RAG, traces      | Columnar, IVF-PQ index       | Billions        |
| SurrealDB | Knowledge graphs | Unified graph+vector queries | Millions        |

## Configuration

Environment variables:

```bash
# FastEmbed
FASTEMBED_MODEL=bge-small-en

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small

# Ollama
OLLAMA_MODEL=nomic-embed-text
OLLAMA_BASE_URL=http://localhost:11434
```

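How the crate consumes these variables internally is not covered here; if they need to be resolved manually (for tests or custom wiring), a small sketch with `std::env`, falling back to the values shown above:

```rust
use std::env;

// Fall back to the values shown above when a variable is unset.
let ollama_url = env::var("OLLAMA_BASE_URL")
    .unwrap_or_else(|_| "http://localhost:11434".to_string());
let openai_model = env::var("OPENAI_MODEL")
    .unwrap_or_else(|_| "text-embedding-3-small".to_string());

// The API key has no sensible default; fail loudly instead.
let openai_key = env::var("OPENAI_API_KEY")
    .map_err(|_| "OPENAI_API_KEY is not set")?;
```
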
## Development

```bash
cargo check -p stratum-embeddings --all-features
cargo test -p stratum-embeddings --all-features
cargo clippy -p stratum-embeddings --all-features -- -D warnings
```

## License

MIT OR Apache-2.0