An F# library providing streaming-capable bindings to whisper.cpp, designed from the ground up to support both real-time transcription and batch processing scenarios.
- Comparable to Whisper.NET: Push-to-talk batch support with enhanced streaming capabilities
- True Streaming Support: Real-time transcription using whisper.cpp's state management
- Unified API: Single IWhisperClient interface for both batch and streaming modes
- Token-Level Access: Confidence scores and timestamps for fine-grained control
- Language Detection: Automatic language identification with confidence scores
- Platform Optimized: Automatic GPU detection (CUDA, OpenCL, CoreML) with CPU fallback
- F# Idiomatic: Leverages discriminated unions, async workflows, and observables
- Zero-Copy Operations: Efficient memory management for audio buffers
- Robust Error Handling: Result types with comprehensive error discrimination
dotnet add package WhisperFS

WhisperFS automatically downloads and manages the appropriate native runtime for your platform:
- Windows: CUDA, OpenCL, AVX2, AVX, or CPU variants
- macOS: CoreML optimized, OpenCL, or CPU variants
- Linux: CUDA, OpenCL, or CPU variants
For detailed GPU acceleration support including OpenCL for AMD/Intel GPUs, see Native Libraries Documentation.
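
To make the fallback idea concrete, here is a minimal sketch of choosing a runtime variant from a per-platform preference list. It is not WhisperFS's actual selection code: the `Platform` and `RuntimeVariant` types, the `candidates` and `selectRuntime` functions, and the assumption that the variant lists above are in priority order are all hypothetical.

```fsharp
// Hypothetical names throughout; this only illustrates "try the best variant, fall back to CPU".
type Platform = Windows | MacOS | Linux

type RuntimeVariant = Cuda | OpenCL | CoreML | Avx2 | Avx | Cpu

/// Preference order per platform, assumed to mirror the variant lists above.
let candidates platform =
    match platform with
    | Windows -> [ Cuda; OpenCL; Avx2; Avx; Cpu ]
    | MacOS -> [ CoreML; OpenCL; Cpu ]
    | Linux -> [ Cuda; OpenCL; Cpu ]

/// Pick the first preferred variant that the probe reports as usable.
/// `isAvailable` stands in for real hardware and driver detection.
let selectRuntime (isAvailable: RuntimeVariant -> bool) platform =
    candidates platform
    |> List.tryFind isAvailable
    |> Option.defaultValue Cpu

// Example: a Linux machine without CUDA but with an OpenCL driver.
let chosen = selectRuntime (fun v -> v = OpenCL || v = Cpu) Linux
printfn "Selected runtime: %A" chosen
```

Passing the availability probe as a function keeps the selection logic testable without any GPU present.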
open WhisperFS

// Build a client with fluent configuration.
// Note: let! is only valid inside an async { ... } workflow.
let! clientResult =
    WhisperBuilder()
        .WithModel(ModelType.Base)
        .WithLanguage("en")
        .WithGpu()
        .Build()

match clientResult with
| Ok client ->
    // Process audio file
    let! result = client.ProcessFileAsync("audio.wav")
    match result with
    | Ok transcription ->
        printfn "Text: %s" transcription.FullText
        printfn "Duration: %A" transcription.Duration
        // Access segments with timestamps
        for segment in transcription.Segments do
            printfn "[%.2f-%.2f] %s"
                segment.StartTime
                segment.EndTime
                segment.Text
    | Error err ->
        printfn "Transcription failed: %A" err
| Error err ->
    printfn "Failed to create client: %A" err

open WhisperFS
open System.Reactive.Linq

// Create streaming client (inside an async { ... } workflow)
let! clientResult =
    WhisperBuilder()
        .WithModel(ModelType.Base)
        .WithStreaming(chunkMs = 1000, overlapMs = 200)
        .WithTokenTimestamps()
        .Build()

match clientResult with
| Ok client ->
    // Create audio source (e.g., from microphone)
    let audioSource = AudioCapture.CreateMicrophone(sampleRate = 16000)
    // Process stream with real-time updates
    client.ProcessStream(audioSource)
    |> Observable.subscribe (function
        | PartialTranscription(text, tokens, confidence) ->
            printfn "Partial: %s (confidence: %.2f)" text confidence
        | FinalTranscription(text, tokens, segments) ->
            printfn "Final: %s" text
        | ProcessingError msg ->
            printfn "Error: %s" msg
        | _ -> ())
    |> ignore
| Error err ->
    printfn "Failed to create streaming client: %A" err

// Detect the spoken language (inside an async { ... } workflow)
let! detection = client.DetectLanguageAsync(audioSamples)
match detection with
| Ok lang ->
    printfn "Detected language: %s (confidence: %.2f)"
        lang.Language
        lang.Confidence
    // Access probabilities for all languages
    for KeyValue(language, probability) in lang.Probabilities do
        if probability > 0.01f then
            printfn " %s: %.2f%%" language (probability * 100.0f)
| Error err ->
    printfn "Language detection failed: %A" err

let! client =
    WhisperBuilder()
        .WithModel(ModelType.LargeV3)
        .WithLanguageDetection()        // Auto-detect language
        .WithBeamSearch(beamSize = 5)   // Better accuracy
        .WithTemperature(0.0f)          // Deterministic output
        .WithPrompt("Technical terms: API, GPU, CPU, RAM")
        .WithTokenTimestamps()          // Enable token-level timestamps
        .WithMaxSegmentLength(30)       // Segment length in seconds
        .WithThreads(8)                 // Parallel processing
        .Build()

type TranscriptionEvent =
    | PartialTranscription of text:string * tokens:Token list * confidence:float32
    | FinalTranscription of text:string * tokens:Token list * segments:Segment list
    | ContextUpdate of contextData:byte[]
    | ProcessingError of error:string

type IWhisperClient =
    abstract member ProcessAsync: samples:float32[] -> Async<Result<TranscriptionResult, WhisperError>>
    abstract member ProcessStream: audioStream:IObservable<float32[]> -> IObservable<TranscriptionEvent>
    abstract member ProcessFileAsync: path:string -> Async<Result<TranscriptionResult, WhisperError>>
    abstract member DetectLanguageAsync: samples:float32[] -> Async<Result<LanguageDetection, WhisperError>>
    abstract member Reset: unit -> unit
    abstract member StreamingMode: bool with get, set

type WhisperError =
    | ModelLoadError of message:string
    | ProcessingError of code:int * message:string
    | InvalidAudioFormat of message:string
    | StateError of message:string
    | NativeLibraryError of message:string
    | TokenizationError of message:string
    | OutOfMemory
    | Cancelled

WhisperFS provides full backward compatibility with Whisper.NET through the IWhisperProcessor interface:
// Existing Whisper.NET code
let processor =
    whisperFactory.CreateBuilder()
        .WithLanguage("en")
        .Build()
let! result = processor.ProcessAsync(audioFile)

// WhisperFS - identical API
let processor =
    whisperFactory.CreateBuilder()
        .WithLanguage("en")
        .Build()
let! result = processor.ProcessAsync(audioFile)

| Feature | Whisper.NET | WhisperFS |
|---|---|---|
| Streaming | No | Yes: real-time with state management |
| Token Confidence | No | Yes: per-token probabilities |
| Language Detection | No | Yes: with confidence scores |
| Custom Prompts | No | Yes: context hints for technical terms |
| Beam Search | No | Yes: configurable parameters |
| Error Handling | Exceptions | Result types |
| Observables | No | Yes: reactive extensions |
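
The Error Handling row is easiest to appreciate in code. The sketch below copies the WhisperError cases listed in this README into a local type so that it compiles on its own; the `describe` and `report` helpers are hypothetical and not part of the library. It shows how a Result forces the caller to handle every failure case explicitly rather than catching exceptions.

```fsharp
// Local copy of the WhisperError cases from this README, so the sketch is self-contained.
type WhisperError =
    | ModelLoadError of message: string
    | ProcessingError of code: int * message: string
    | InvalidAudioFormat of message: string
    | StateError of message: string
    | NativeLibraryError of message: string
    | TokenizationError of message: string
    | OutOfMemory
    | Cancelled

/// Hypothetical helper: turn an error into a user-facing message.
/// The compiler warns if a case is left unhandled.
let describe error =
    match error with
    | ModelLoadError msg -> sprintf "Could not load model: %s" msg
    | ProcessingError (code, msg) -> sprintf "Processing failed (code %d): %s" code msg
    | InvalidAudioFormat msg -> sprintf "Unsupported audio: %s" msg
    | StateError msg -> sprintf "Invalid state: %s" msg
    | NativeLibraryError msg -> sprintf "Native library problem: %s" msg
    | TokenizationError msg -> sprintf "Tokenization failed: %s" msg
    | OutOfMemory -> "Out of memory"
    | Cancelled -> "Operation cancelled"

/// Hypothetical helper: render either outcome of a transcription call.
let report (result: Result<string, WhisperError>) =
    match result with
    | Ok text -> sprintf "Text: %s" text
    | Error err -> sprintf "Error: %s" (describe err)

printfn "%s" (report (Ok "hello world"))
printfn "%s" (report (Error (ProcessingError (3, "decoder stalled"))))
```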
# Clone the repository
git clone https://github.com/yourusername/WhisperFS.git
cd WhisperFS
# Build the solution
dotnet build
# Run tests
dotnet test
# Pack NuGet packages
dotnet pack -c Release

WhisperFS is designed for optimal performance:
- Memory Efficient: Streaming processes audio in chunks, not loading entire files
- Platform Optimized: Automatically uses GPU acceleration when available
- Parallel Processing: Configurable thread count for CPU processing
- Zero-Copy: Direct memory access for native interop
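
As an illustration of chunked processing without copying, the following sketch splits a sample buffer into overlapping windows using the 1000 ms chunk and 200 ms overlap values from the streaming example. The `chunkWithOverlap` function is hypothetical and says nothing about how WhisperFS buffers audio internally; it only shows the arithmetic and the use of `ArraySegment` views in place of copied arrays.

```fsharp
open System

/// Split `samples` into windows of `chunkMs` milliseconds, each sharing
/// `overlapMs` milliseconds with the previous one. Windows are ArraySegment
/// views over the input array, so no sample data is copied.
/// The final window may be shorter than the others.
let chunkWithOverlap (sampleRate: int) (chunkMs: int) (overlapMs: int) (samples: float32[]) : seq<ArraySegment<float32>> =
    let chunkSize = sampleRate * chunkMs / 1000
    let step = chunkSize - sampleRate * overlapMs / 1000
    if chunkSize <= 0 || step <= 0 then
        invalidArg "overlapMs" "The overlap must be shorter than the chunk."
    seq {
        for start in 0 .. step .. samples.Length - 1 do
            let length = min chunkSize (samples.Length - start)
            yield ArraySegment<float32>(samples, start, length)
    }

// Two seconds of silence at 16 kHz.
let audio : float32[] = Array.zeroCreate 32000
for window in chunkWithOverlap 16000 1000 200 audio do
    printfn "offset=%d length=%d" window.Offset window.Count
```

With these settings each window holds 16000 samples and a new window starts every 12800 samples, so consecutive windows share 3200 samples (200 ms).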
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
MIT License - see LICENSE for details.
- whisper.cpp - The underlying C++ implementation
- Whisper.NET - Inspiration for .NET bindings
- OpenAI Whisper - The original Whisper model
- Documentation
- Issue Tracker
- Discussions
