How to Build a .NET AI Orchestration Library: A Step-by-Step Guide

Posted by u/Merekku · 2026-05-02 13:13:22

Introduction

If you're a .NET developer building AI features, you've likely felt the pain. The ecosystem is dominated by Python-first frameworks like LangChain, LlamaIndex, and DSPy. When a .NET port exists, it's incomplete, fights C# idioms, and uses exceptions in ways that make async code fragile. But you can create your own orchestration library—one that feels native to .NET. This guide walks you through the key decisions and implementations behind WeaveLLM, a .NET 8 AI orchestration library designed for C#. By the end, you'll have a clear blueprint to build your own or adapt existing solutions.

What You Need

  • .NET 8 SDK installed
  • A C# IDE (Visual Studio, Rider, VS Code with C# extensions)
  • Basic knowledge of C#, asynchronous programming, and generic types
  • Familiarity with IAsyncEnumerable<T> (recommended)
  • An LLM provider (OpenAI, Azure OpenAI, etc.) with API access
  • NuGet packages: Microsoft.Extensions.Logging, System.Text.Json

Step-by-Step Guide

Step 1: Define Your Core Types and Error Handling Strategy

The first step is to replace exceptions with a result type. In exception-based frameworks, method signatures like Task<string> hide failure paths. In .NET, you can use railway-oriented programming to make errors explicit. Create a ChainResult<T> class that encapsulates success, failure, token usage, timing, and metadata. This type should be the return type for every operation. Here's a simplified version:

public sealed class ChainResult<T>
{
    public bool IsSuccess { get; }
    public T? Value { get; }
    public ChainError? Error { get; }
    public TokenUsage? TokenUsage { get; }
    public TimeSpan Duration { get; }
    public IReadOnlyDictionary<string, object> Metadata { get; }

    private ChainResult(bool isSuccess, T? value, ChainError? error, TokenUsage? usage) =>
        (IsSuccess, Value, Error, TokenUsage, Metadata) =
            (isSuccess, value, error, usage, new Dictionary<string, object>());

    public static ChainResult<T> Success(T value, TokenUsage? usage = null) =>
        new(true, value, null, usage);

    public static ChainResult<T> Failure(ChainError error) =>
        new(false, default, error, null);

    // Map transforms the value on success and propagates failure untouched;
    // add Bind the same way for steps that themselves return a ChainResult.
    public ChainResult<TOut> Map<TOut>(Func<T, TOut> map) =>
        IsSuccess ? ChainResult<TOut>.Success(map(Value!), TokenUsage)
                  : ChainResult<TOut>.Failure(Error!);
}

Why this matters: errors become first-class values. You can compose chains without nested try/catch blocks, and every signature advertises its failure path instead of hiding it behind a thrown exception.
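
As a quick illustration of the composition this enables (a hypothetical snippet using only the Map combinator from the sketch above):

// Each step transforms the value only if the previous one succeeded;
// a failure anywhere short-circuits the rest without try/catch.
var raw = ChainResult<string>.Success("  The answer is 42.  ");

var trimmed = raw.Map(s => s.Trim());     // ChainResult<string>
var length  = trimmed.Map(s => s.Length); // ChainResult<int>

Console.WriteLine(length.IsSuccess
    ? $"Length: {length.Value}"
    : $"Error: {length.Error}");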

Step 2: Implement Compile-Time Safe Generic Chains

LangChain's Python roots mean it relies on runtime type checks. For .NET, leverage generics and constraints to catch mismatches during compilation. Define an IChain<TInput, TOutput> interface with a single method:

public interface IChain<TInput, TOutput>
{
    ValueTask<ChainResult<TOutput>> ExecuteAsync(TInput input, CancellationToken token = default);
}

Then build concrete chains like LLMChain<string, string>, PromptChain<TInput, TOutput>, or MapChain<TInput, TOutput>. Use generic constraints where needed (e.g., where TInput : notnull). The key is to prevent developers from accidentally chaining incompatible types. For example, you cannot pipe a chain returning int into one expecting string without an explicit mapping.
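
To make that guarantee concrete, here is a sketch of a sequencing combinator (SequentialChain is an illustrative name, not a prescribed API). The middle type parameter TMid is what forces adjacent chains to agree:

// Pipes one chain into another; the first chain's output type must
// match the second chain's input type, so mismatches fail to compile.
public sealed class SequentialChain<TIn, TMid, TOut> : IChain<TIn, TOut>
{
    private readonly IChain<TIn, TMid> _first;
    private readonly IChain<TMid, TOut> _second;

    public SequentialChain(IChain<TIn, TMid> first, IChain<TMid, TOut> second) =>
        (_first, _second) = (first, second);

    public async ValueTask<ChainResult<TOut>> ExecuteAsync(TIn input, CancellationToken token = default)
    {
        var mid = await _first.ExecuteAsync(input, token);
        if (!mid.IsSuccess) return ChainResult<TOut>.Failure(mid.Error!); // short-circuit on failure
        return await _second.ExecuteAsync(mid.Value!, token);
    }
}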

Step 3: Build a Middleware Pipeline Inspired by ASP.NET Core

One of the best patterns in .NET is the ASP.NET Core middleware pipeline. You can adapt this for AI orchestration. Create a ChainMiddleware class that takes a delegate representing the next step. Middleware components can handle concerns like logging, caching, retries, rate limiting, and token counting—without cluttering your chain logic.

public delegate ValueTask<ChainResult<TOutput>> ChainDelegate<TInput, TOutput>(TInput input, CancellationToken token);

public abstract class ChainMiddleware<TInput, TOutput>
{
    // Set by the pipeline builder when the middleware is registered.
    public ChainDelegate<TInput, TOutput> Next { get; set; } = default!;

    public abstract ValueTask<ChainResult<TOutput>> InvokeAsync(TInput input, CancellationToken token);
}

Register middleware in a builder class, similar to IApplicationBuilder. This gives you a composable, testable pipeline that executes in order. You can swap implementations without changing the chain’s core logic.
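
For example, a retry component might look like this (hypothetical; the attempt count and backoff values are arbitrary illustration choices):

// A hypothetical retry middleware: re-invokes the rest of the pipeline
// on failure, up to a fixed number of attempts.
public sealed class RetryMiddleware<TInput, TOutput> : ChainMiddleware<TInput, TOutput>
{
    private const int MaxAttempts = 3;

    public override async ValueTask<ChainResult<TOutput>> InvokeAsync(TInput input, CancellationToken token)
    {
        ChainResult<TOutput> result = default!;
        for (var attempt = 1; attempt <= MaxAttempts; attempt++)
        {
            result = await Next(input, token);
            if (result.IsSuccess) return result;

            if (attempt < MaxAttempts) // simple linear backoff between attempts
                await Task.Delay(TimeSpan.FromMilliseconds(200 * attempt), token);
        }
        return result; // the last failure wins
    }
}

Because retries live in middleware rather than in the chains, you can unit-test the policy in isolation and reuse it across every chain in the pipeline.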

Step 4: Add Streaming Support Using IAsyncEnumerable<T>

LLM responses often come in chunks. Instead of waiting for the full response, use .NET's IAsyncEnumerable<T> to stream tokens as they arrive. This improves perceived performance and enables real-time scenarios like chatbots. Design your chain to have a streaming variant:

public interface IStreamingChain<TInput, TOutput> : IChain<TInput, TOutput>
{
    IAsyncEnumerable<ChainToken<TOutput>> StreamAsync(TInput input, CancellationToken token = default);
}

Where ChainToken<T> can represent partial results, errors, or final tokens. Use yield return and await foreach to consume the stream. This pattern is native to C# and provides efficient, backpressure-aware streaming.
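
Here is a sketch of both sides of that contract. The shape of ChainToken<T> below (a Delta payload plus an IsFinal flag), the StreamCompletionAsync provider call, and the variable names are all assumptions for illustration:

// One possible shape for ChainToken<T> - not a prescribed design.
public sealed record ChainToken<T>(T? Delta, bool IsFinal, ChainError? Error = null);

// Producing side: an async iterator yields tokens as the provider emits chunks.
// [EnumeratorCancellation] (System.Runtime.CompilerServices) flows the token
// passed to GetAsyncEnumerator into this method.
public async IAsyncEnumerable<ChainToken<string>> StreamAsync(
    string input, [EnumeratorCancellation] CancellationToken token = default)
{
    await foreach (var chunk in _provider.StreamCompletionAsync(input, token))
        yield return new ChainToken<string>(chunk, IsFinal: false);

    yield return new ChainToken<string>(default, IsFinal: true);
}

// Consuming side: await foreach gives natural backpressure, because the
// producer only advances as fast as the consumer iterates.
await foreach (var t in chain.StreamAsync("Summarize this document", ct))
{
    if (!t.IsFinal) Console.Write(t.Delta);
}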

Step 5: Integrate Telemetry and Observability

Production AI applications need monitoring. Wrap your pipeline to emit metrics: token usage per chain, latency, and error rates. Use System.Diagnostics.Metrics or extend with OpenTelemetry. The ChainResult from Step 1 already carries duration, token usage, and metadata; capture timestamps in your middleware and attach them to each result. Then expose a ChainMetrics class that integrates with .NET's dependency injection and logging so you can alert on failures or cost spikes.

public class ChainMetrics
{
    private static readonly Meter s_meter = new("WeaveLLM");
    private readonly Counter<int> _tokenCounter = s_meter.CreateCounter<int>("weavellm.tokens");
    private readonly Histogram<double> _chainDuration = s_meter.CreateHistogram<double>("weavellm.chain.duration_ms");

    public void RecordChain<T>(ChainResult<T> result)
    {
        _chainDuration.Record(result.Duration.TotalMilliseconds);
        if (result.TokenUsage is { } usage) _tokenCounter.Add(usage.TotalTokens); // assumes TokenUsage exposes TotalTokens
    }
}

Attach this to each step via a dedicated MetricsMiddleware, sketched below.
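
A minimal sketch of that middleware, building on the ChainMiddleware base class from Step 3 (it assumes the chain itself stamps Duration and TokenUsage onto the result; the middleware only reports them):

// A hypothetical metrics middleware: runs the rest of the pipeline,
// then forwards the finished result to ChainMetrics.
public sealed class MetricsMiddleware<TInput, TOutput> : ChainMiddleware<TInput, TOutput>
{
    private readonly ChainMetrics _metrics;

    public MetricsMiddleware(ChainMetrics metrics) => _metrics = metrics;

    public override async ValueTask<ChainResult<TOutput>> InvokeAsync(TInput input, CancellationToken token)
    {
        var result = await Next(input, token);
        _metrics.RecordChain(result); // records duration and token usage in one place
        return result;
    }
}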

Tips for Success

  • Start small: Build only the core types and one chain first. Avoid over-engineering the pipeline until you have a working example calling an LLM.
  • Test with async: Write unit tests using ValueTask and IAsyncEnumerable to ensure cancellation propagation works correctly.
  • Keep exceptions for truly exceptional cases: Use ChainResult for expected failures (rate limits, bad requests). Throw exceptions only for bugs (e.g., null input when not allowed).
  • Leverage existing .NET patterns: Mirror ASP.NET Core's UseMiddleware<T> pattern for easy registration and use IServiceCollection for dependency injection (see the registration sketch after this list).
  • Document type contracts: Since you're relying on generics, write clear XML comments on what each chain expects and returns. This helps your team compose chains correctly instead of puzzling over compiler errors.
  • Plan for versioning: As LLM APIs change, your library should abstract providers behind a common interface. Consider a provider plugin model from day one.
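
For the dependency injection tip in particular, a registration extension might look like this (AddWeaveLLM, the lifetimes, and the registered types are illustrative, not a published API):

// Hypothetical DI wiring, mirroring ASP.NET Core conventions.
// Requires using Microsoft.Extensions.DependencyInjection.
public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddWeaveLLM(this IServiceCollection services)
    {
        services.AddSingleton<ChainMetrics>();                // one metrics sink per app
        services.AddTransient(typeof(MetricsMiddleware<,>));  // open generic registrations
        services.AddTransient(typeof(RetryMiddleware<,>));
        return services;
    }
}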

By following these steps, you'll create a .NET-native AI orchestration library that feels natural to C# developers—no fighting against Python idioms, no runtime surprises, and full support for modern async patterns. WeaveLLM is one such implementation, but the principles apply to any .NET project aiming to tame LLM complexity.