# stratum-llm

Unified LLM abstraction for the stratumiops ecosystem with automatic provider detection, fallback chains, and smart caching.

## Features

- **Credential Auto-detection**: Automatically finds CLI credentials (Claude, OpenAI) and API keys
- **Provider Fallback**: Circuit breaker pattern with automatic failover across providers
- **Smart Caching**: xxHash-based request deduplication reduces duplicate API calls (see the sketch after Quick Start)
- **Kogral Integration**: Inject project context from the knowledge base (optional)
- **Cost Tracking**: Transparent cost estimation across all providers
- **Multiple Providers**: Anthropic Claude, OpenAI, DeepSeek, Ollama
## Quick Start

```rust
use stratum_llm::{UnifiedClient, Message, Role};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UnifiedClient::auto()?;

    let messages = vec![Message {
        role: Role::User,
        content: "What is Rust?".to_string(),
    }];

    let response = client.generate(&messages, None).await?;
    println!("{}", response.content);

    Ok(())
}
```
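The **Smart Caching** and **Cost Tracking** features pair naturally with this snippet. Here is a hedged sketch continuing the variables above; note the `cost_cents` field name is an assumption for illustration, not confirmed API:

```rust
// NOTE: `cost_cents` is an assumed field name for illustration;
// check the crate's response type for the real cost-tracking API.
let first = client.generate(&messages, None).await?;
println!("estimated cost: {:.4} cents", first.cost_cents);

// An identical request is deduplicated via its xxHash key, so this call
// should be served from the cache rather than hitting the provider again.
let second = client.generate(&messages, None).await?;
assert_eq!(first.content, second.content);
```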
## Provider Priority

1. **CLI credentials** (subscription-based, no per-token cost) - preferred
2. **API keys** from environment variables
3. **Local models** (Ollama)

The client automatically detects available credentials and builds a fallback chain.
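For illustration, assuming stratum-llm reads the providers' conventional environment variables, a typical setup looks like:

```bash
# API keys via the providers' conventional environment variables
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export DEEPSEEK_API_KEY="sk-..."

# Ollama is picked up when a local server is reachable
# (default: http://localhost:11434)
ollama serve &
```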
## Cargo Features

### Default Features

```toml
[dependencies]
stratum-llm = "0.1"
```

Includes: Anthropic, OpenAI, Ollama
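The Quick Start example also needs an async runtime; a minimal manifest for it would add tokio (the `macros` and `rt-multi-thread` features are what `#[tokio::main]` expects):

```toml
[dependencies]
stratum-llm = "0.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```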
### All Features

```toml
[dependencies]
stratum-llm = { version = "0.1", features = ["all"] }
```

Includes: All providers, CLI detection, Kogral integration, Prometheus metrics
### Custom Feature Set

```toml
[dependencies]
stratum-llm = { version = "0.1", features = ["anthropic", "deepseek", "kogral"] }
```

Available features:

- `anthropic` - Anthropic Claude API
- `openai` - OpenAI API
- `deepseek` - DeepSeek API
- `ollama` - Ollama local models
- `claude-cli` - Claude CLI credential detection
- `kogral` - Kogral knowledge base integration
- `metrics` - Prometheus metrics
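Assuming the crate follows Cargo's usual additive-feature convention, the footprint can be trimmed further by opting out of the defaults (hedged; whether each provider feature stands alone is not confirmed here):

```toml
[dependencies]
stratum-llm = { version = "0.1", default-features = false, features = ["ollama"] }
```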
## Advanced Usage

### With Kogral Context

```rust
let client = UnifiedClient::builder()
    .auto_detect()?
    .with_kogral()
    .build()?;

let response = client
    .generate_with_kogral(&messages, None, Some("rust"), None)
    .await?;
```
### Custom Fallback Strategy

```rust
use stratum_llm::{FallbackStrategy, ProviderChain};

let chain = ProviderChain::from_detected()?
    .with_strategy(FallbackStrategy::OnRateLimitOrUnavailable);

let client = UnifiedClient::builder()
    .with_chain(chain)
    .build()?;
```
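If every provider in the chain fails (or each circuit breaker is open), `generate` returns an error. A sketch of handling that terminal case, using only the calls shown above:

```rust
match client.generate(&messages, None).await {
    Ok(response) => println!("{}", response.content),
    // The concrete error type isn't documented here; anything implementing
    // std::error::Error displays fine for logging purposes.
    Err(e) => eprintln!("all providers in the fallback chain failed: {e}"),
}
```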
### Cost Budget

```rust
let chain = ProviderChain::from_detected()?
    .with_strategy(FallbackStrategy::OnBudgetExceeded {
        budget_cents: 10.0,
    });
```
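The budget is expressed in cents, so `budget_cents: 10.0` sets a ten-cent threshold, which presumably triggers failover once the tracked cost estimate crosses it. As in the previous example, the chain is then handed to the builder:

```rust
let client = UnifiedClient::builder()
    .with_chain(chain)
    .build()?;

// Requests route to the first provider until the estimated spend
// exceeds 10 cents, then fail over down the chain.
let response = client.generate(&messages, None).await?;
```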
## Examples

Run examples with:

```bash
cargo run --example basic_usage
cargo run --example with_kogral --features kogral
cargo run --example fallback_demo
```
## License

MIT OR Apache-2.0