stratum-embeddings

Unified embedding providers with caching, batch processing, and vector storage for the STRATUMIOPS ecosystem.

Features

  • Multiple Providers: FastEmbed (local), OpenAI, Ollama
  • Smart Caching: In-memory caching with configurable TTL
  • Batch Processing: Efficient batch embedding with automatic chunking
  • Vector Storage: LanceDB (scale-first) and SurrealDB (graph-first)
  • Fallback Support: Automatic failover between providers (see the sketch after this list)
  • Feature Flags: Modular compilation for minimal dependencies
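
Failover is configured on the service itself. A minimal sketch, assuming a `with_fallback` builder and an `OllamaProvider::new(url, model)` constructor; both names are assumptions, not confirmed by this README:

use stratum_embeddings::{EmbeddingService, FastEmbedProvider, OllamaProvider};

// Hypothetical wiring: try the Ollama server first, fall back to local FastEmbed.
let primary = OllamaProvider::new("http://localhost:11434", "nomic-embed-text")?;
let fallback = FastEmbedProvider::small()?;
let service = EmbeddingService::new(primary).with_fallback(fallback);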

Architecture

┌─────────────────────────────────────────┐
│         EmbeddingService                │
│  (facade with caching + fallback)       │
└─────────────┬───────────────────────────┘
              │
    ┌─────────┴─────────┐
    ▼                   ▼
┌─────────────┐   ┌─────────────┐
│  Providers  │   │    Cache    │
│             │   │             │
│ • FastEmbed │   │ • Memory    │
│ • OpenAI    │   │ • (Sled)    │
│ • Ollama    │   │             │
└─────────────┘   └─────────────┘
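
The facade dispatches to whichever provider is configured. As an illustration of the abstraction this implies, here is a rough sketch of a provider trait; the names, error type, and the async-trait dependency are all assumptions, not the crate's actual items:

use async_trait::async_trait;

// Illustrative only: the real trait in stratum-embeddings may differ.
#[async_trait]
pub trait Provider: Send + Sync {
    /// Embed one text into a fixed-length vector.
    async fn embed(&self, text: &str)
        -> Result<Vec<f32>, Box<dyn std::error::Error + Send + Sync>>;
    /// Output dimensionality, e.g. 384 for small local models.
    fn dimensions(&self) -> usize;
}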

Quick Start

Basic Usage

use stratum_embeddings::{
    EmbeddingService, FastEmbedProvider, MemoryCache, EmbeddingOptions
};
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Small local model (384 dimensions for the default bge-small-en); no API key needed.
    let provider = FastEmbedProvider::small()?;
    // Cache up to 1,000 embeddings with a 5-minute TTL.
    let cache = MemoryCache::new(1000, Duration::from_secs(300));
    let service = EmbeddingService::new(provider).with_cache(cache);

    let options = EmbeddingOptions::default_with_cache();
    let embedding = service.embed("Hello world", &options).await?;

    println!("Generated {} dimensions", embedding.len());
    Ok(())
}

Batch Processing

// `service` and `options` carry over from the Quick Start example above.
let texts = vec![
    "Text 1".to_string(),
    "Text 2".to_string(),
    "Text 3".to_string(),
];

// Texts already in the cache are returned without re-embedding.
let result = service.embed_batch(texts, &options).await?;
println!("Embeddings: {}, Cached: {}",
    result.embeddings.len(),
    result.cached_count
);
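
Because batches go through the same cache, re-submitting a text embedded above should register as a hit, assuming the memory cache from Quick Start is attached and its TTL has not expired:

// Second pass over a known text: expect cached_count == 1.
let again = service.embed_batch(vec!["Text 1".to_string()], &options).await?;
println!("Cached on second pass: {}", again.cached_count);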

Vector Storage

LanceDB (Provisioning, Vapora)

use stratum_embeddings::{LanceDbStore, VectorStore, VectorStoreConfig};

// Dimension must match the provider's output (384 for the small FastEmbed model).
let config = VectorStoreConfig::new(384);
let store = LanceDbStore::new("./data", "embeddings", config).await?;

// `embedding`, `metadata`, and `query_embedding` come from your embedding step.
store.upsert("doc1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?; // top 10, no filter

SurrealDB (Kogral)

use stratum_embeddings::{SurrealDbStore, VectorStore, VectorStoreConfig};

let config = VectorStoreConfig::new(384);
// In-memory SurrealDB instance; convenient for tests and local development.
let store = SurrealDbStore::new_memory("concepts", config).await?;

store.upsert("concept1", &embedding, metadata).await?;
let results = store.search(&query_embedding, 10, None).await?;
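
Both snippets import the same `VectorStore` trait, so indexing code can stay generic over the backend. A sketch under assumptions: the `upsert` signature is inferred from the calls above, and the metadata type (`serde_json::Value`) is a stand-in; check the trait definition for the real types:

use stratum_embeddings::VectorStore;

// Hypothetical helper that accepts either LanceDbStore or SurrealDbStore.
async fn index_document<S: VectorStore>(
    store: &S,
    id: &str,
    embedding: &[f32],
    metadata: serde_json::Value,
) -> Result<(), Box<dyn std::error::Error>> {
    store.upsert(id, embedding, metadata).await?;
    Ok(())
}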

Feature Flags

Providers

  • fastembed-provider (default) - Local embeddings via fastembed
  • openai-provider - OpenAI API embeddings
  • ollama-provider - Ollama local server embeddings
  • all-providers - All embedding providers

Cache

  • memory-cache (default) - In-memory caching with moka
  • persistent-cache - Persistent cache with sled
  • all-cache - All cache backends

Vector Storage

  • lancedb-store - LanceDB vector storage (columnar, disk-native)
  • surrealdb-store - SurrealDB vector storage (graph + vector)
  • all-stores - All storage backends

Project Presets

  • kogral - fastembed + memory + surrealdb
  • provisioning - openai + memory + lancedb
  • vapora - all-providers + memory + lancedb
  • full - Everything enabled
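
Presets are plain Cargo features, so selecting one is a dependency declaration. A minimal Cargo.toml sketch (the version is a placeholder, and disabling default features alongside a preset is an assumption worth verifying):

[dependencies]
stratum-embeddings = { version = "0.1", default-features = false, features = ["kogral"] }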

Examples

Run examples with:

cargo run --example basic_usage --features=default
cargo run --example fallback_demo --features=fastembed-provider,ollama-provider
cargo run --example lancedb_usage --features=lancedb-store
cargo run --example surrealdb_usage --features=surrealdb-store

Provider Comparison

Provider    Type    Cost                       Dimensions   Use Case
FastEmbed   Local   Free                       384-1024     Dev, privacy-first
OpenAI      Cloud   $0.02-0.13 per 1M tokens   1536-3072    Production RAG
Ollama      Local   Free                       384-1024     Self-hosted

Storage Backend Comparison

Backend     Best For           Strength                         Scale
LanceDB     RAG, traces        Columnar storage, IVF-PQ index   Billions of vectors
SurrealDB   Knowledge graphs   Unified graph + vector queries   Millions of vectors

Configuration

Environment variables:

# FastEmbed
FASTEMBED_MODEL=bge-small-en

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small

# Ollama
OLLAMA_MODEL=nomic-embed-text
OLLAMA_BASE_URL=http://localhost:11434
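
Providers presumably read these at construction time; if you want to inspect or default them yourself, the standard library is enough. A minimal sketch using only `std::env`, with defaults mirroring the values above:

use std::env;

// Fall back to the documented defaults when a variable is unset.
let base_url = env::var("OLLAMA_BASE_URL")
    .unwrap_or_else(|_| "http://localhost:11434".into());
let model = env::var("OLLAMA_MODEL")
    .unwrap_or_else(|_| "nomic-embed-text".into());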

Development

cargo check -p stratum-embeddings --all-features
cargo test -p stratum-embeddings --all-features
cargo clippy -p stratum-embeddings --all-features -- -D warnings

License

MIT OR Apache-2.0