# stratum-embeddings
Unified embedding providers with caching, batch processing, and vector storage for the STRATUMIOPS ecosystem.
## Features
- **Multiple Providers**: FastEmbed (local), OpenAI, Ollama
- **Smart Caching**: In-memory caching with configurable TTL
- **Batch Processing**: Efficient batch embedding with automatic chunking
- **Vector Storage**: LanceDB (scale-first) and SurrealDB (graph-first)
- **Fallback Support**: Automatic failover between providers
- **Feature Flags**: Modular compilation for minimal dependencies
## Architecture
```text
┌─────────────────────────────────────────┐
│            EmbeddingService             │
│    (facade with caching + fallback)     │
└──────────────┬──────────────────────────┘
       ┌───────┴───────┐
       ▼               ▼
┌─────────────┐ ┌─────────────┐
│  Providers  │ │    Cache    │
│             │ │             │
│ • FastEmbed │ │ • Memory    │
│ • OpenAI    │ │ • (Sled)    │
│ • Ollama    │ │             │
└─────────────┘ └─────────────┘
```
## Quick Start
### Basic Usage
```rust
use stratum_embeddings::{
    EmbeddingService, FastEmbedProvider, MemoryCache, EmbeddingOptions,
};
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Local FastEmbed model; runs on-device, no API key required.
    let provider = FastEmbedProvider::small()?;
    // In-memory cache: up to 1000 entries with a 5-minute TTL.
    let cache = MemoryCache::new(1000, Duration::from_secs(300));
    let service = EmbeddingService::new(provider).with_cache(cache);

    let options = EmbeddingOptions::default_with_cache();
    let embedding = service.embed("Hello world", &options).await?;
    println!("Generated {} dimensions", embedding.len());
    Ok(())
}
```
### Batch Processing
```rust
let texts = vec![
    "Text 1".to_string(),
    "Text 2".to_string(),
    "Text 3".to_string(),
];

// `cached_count` reports how many inputs were served from the cache.
let result = service.embed_batch(texts, &options).await?;
println!(
    "Embeddings: {}, Cached: {}",
    result.embeddings.len(),
    result.cached_count
);
```
### Vector Storage
#### LanceDB (Provisioning, Vapora)
```rust
use stratum_embeddings::{LanceDbStore, VectorStore, VectorStoreConfig};

// The dimension (384 here) must match your embedding provider's output.
let config = VectorStoreConfig::new(384);
let store = LanceDbStore::new("./data", "embeddings", config).await?;

store.upsert("doc1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?;
```
#### SurrealDB (Kogral)
```rust
use stratum_embeddings::{SurrealDbStore, VectorStore, VectorStoreConfig};

let config = VectorStoreConfig::new(384);
// `new_memory` starts an in-process, in-memory SurrealDB instance.
let store = SurrealDbStore::new_memory("concepts", config).await?;

store.upsert("concept1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?;
```
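Both snippets assume `embedding`, `query_embedding`, and `metadata` are already in scope. Putting the pieces together, here is a minimal end-to-end sketch using only the calls shown above (the JSON shape of `metadata` is an assumption; check the crate docs for the exact type):
```rust
// Sketch: embed a document, persist it, then query by similarity.
// `service`, `store`, and `options` are built as in the snippets above.
let text = "Rust is a systems programming language.";
let embedding = service.embed(text, &options).await?;

// Assumption: metadata is a JSON-like value.
let metadata = serde_json::json!({ "source": "readme" });
store.upsert("doc-1", &embedding, metadata).await?;

// Querying with the same vector should return "doc-1" as the top hit.
let results = store.search(&embedding, 5, None).await?;
```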
## Feature Flags
### Providers
- `fastembed-provider` (default) - Local embeddings via fastembed
- `openai-provider` - OpenAI API embeddings
- `ollama-provider` - Ollama local server embeddings
- `all-providers` - All embedding providers
### Cache
- `memory-cache` (default) - In-memory caching with moka
- `persistent-cache` - Persistent cache with sled
- `all-cache` - All cache backends
### Vector Storage
- `lancedb-store` - LanceDB vector storage (columnar, disk-native)
- `surrealdb-store` - SurrealDB vector storage (graph + vector)
- `all-stores` - All storage backends
### Project Presets
- `kogral` - fastembed + memory + surrealdb
- `provisioning` - openai + memory + lancedb
- `vapora` - all-providers + memory + lancedb
- `full` - Everything enabled
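To compile only what a given project needs, disable default features and enable a preset, as in this sketch of a `Cargo.toml` entry (the version number is a placeholder):
```toml
[dependencies]
stratum-embeddings = { version = "0.1", default-features = false, features = ["kogral"] }
```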
## Examples
Run examples with:
```bash
cargo run --example basic_usage --features=default
cargo run --example fallback_demo --features=fastembed-provider,ollama-provider
cargo run --example lancedb_usage --features=lancedb-store
cargo run --example surrealdb_usage --features=surrealdb-store
```
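The `fallback_demo` example pairs FastEmbed with Ollama. The pattern looks roughly like the sketch below; `with_fallback` and this `OllamaProvider::new` signature are assumptions, so consult the example source for the real API:
```rust
// Hypothetical sketch of automatic failover between providers.
// `with_fallback` and this `OllamaProvider::new` signature are assumed,
// not confirmed: see examples/fallback_demo.rs for the actual API.
let primary = FastEmbedProvider::small()?;
let backup = OllamaProvider::new("http://localhost:11434", "nomic-embed-text")?;
let service = EmbeddingService::new(primary).with_fallback(backup);
```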
## Provider Comparison
| Provider | Type | Cost | Dimensions | Use Case |
|----------|------|------|------------|----------|
| FastEmbed | Local | Free | 384-1024 | Dev, privacy-first |
| OpenAI | Cloud | $0.02-0.13 / 1M tokens | 1536-3072 | Production RAG |
| Ollama | Local | Free | 384-1024 | Self-hosted |
## Storage Backend Comparison
| Backend | Best For | Strength | Scale |
|---------|----------|----------|-------|
| LanceDB | RAG, traces | Columnar, IVF-PQ index | Billions |
| SurrealDB | Knowledge graphs | Unified graph+vector queries | Millions |
## Configuration
Environment variables:
```bash
# FastEmbed
FASTEMBED_MODEL=bge-small-en
# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small
# Ollama
OLLAMA_MODEL=nomic-embed-text
OLLAMA_BASE_URL=http://localhost:11434
```
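If you wire providers by hand, the same settings can be read with `std::env`; a sketch, with defaults mirroring the values above:
```rust
use std::env;

// Fall back to the documented defaults when a variable is unset.
let ollama_url = env::var("OLLAMA_BASE_URL")
    .unwrap_or_else(|_| "http://localhost:11434".to_string());
let ollama_model = env::var("OLLAMA_MODEL")
    .unwrap_or_else(|_| "nomic-embed-text".to_string());
let openai_key = env::var("OPENAI_API_KEY").ok(); // None if not set
```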
## Development
```bash
cargo check -p stratum-embeddings --all-features
cargo test -p stratum-embeddings --all-features
cargo clippy -p stratum-embeddings --all-features -- -D warnings
```
## License
MIT OR Apache-2.0