Provider Pattern
Overview
In the previous chapter, we confirmed that the pipeline is configured inside the Initialize method. So how exactly is this pipeline built? If you have used LINQ, operators like Select and Where should be familiar. The Provider pattern uses these LINQ-style declarative operators, plus pipeline-specific ones such as Collect and Combine, to compose a data pipeline that extracts and transforms the needed information from source code. Our project’s ObservablePortGenerator also uses this pattern to find classes with the [GenerateObservablePort] attribute and transform them into ObservableClassInfo.
Learning Objectives
Core Learning Objectives
- Understand the difference between IncrementalValuesProvider and IncrementalValueProvider
  - Distinguishing between multiple values (0..N) and a single value (exactly 1)
- Learn pipeline composition using LINQ-style operators
  - Roles and usage timing of Select, Where, Collect, and Combine
- Analyze the actual pipeline structure of ObservablePortGenerator
  - ForAttributeWithMetadataName -> Where -> Collect flow
What is a Provider?
A Provider is the core element for composing the source generator’s data pipeline. Just as you chain Select and Where on IEnumerable<T> in LINQ, you chain operators with the same names on Providers to declaratively express the process of extracting and transforming needed information from source code.
    Provider Pipeline Flow
    ======================

    Source Code
                   |
                   v
    +------------------------------+
    | SyntaxProvider               |  Extract nodes from source
    | (ForAttributeWithMeta...)    |
    +--------------+---------------+
                   |
                   v
    +------------------------------+
    | Select                       |  Data transformation
    | (Syntax -> needed info)      |
    +--------------+---------------+
                   |
                   v
    +------------------------------+
    | Where                        |  Filtering
    | (select valid only)          |
    +--------------+---------------+
                   |
                   v
    +------------------------------+
    | Collect                      |  Collect into array
    | (individual items -> array)  |
    +--------------+---------------+
                   |
                   v
    RegisterSourceOutput (code generation)

Two Provider Types
IncrementalValuesProvider
Represents multiple (0 or more) values:
    // Multiple classes may have the [GenerateObservablePort] attribute
    IncrementalValuesProvider<ObservableClassInfo> provider = context.SyntaxProvider
        .ForAttributeWithMetadataName(...);

    // 0: No classes with the attribute
    // 1: One class
    // N: Multiple classes

IncrementalValueProvider
Represents exactly one value:
    // A compilation always has exactly one set of options.
    // They are projected out of CompilationProvider with Select.
    IncrementalValueProvider<CompilationOptions> options = context.CompilationProvider
        .Select((compilation, _) => compilation.Options);

    // Converting with Collect yields a single value
    IncrementalValueProvider<ImmutableArray<ObservableClassInfo>> collected = provider.Collect();

Key Operators
Select and Where have the same meaning as their LINQ counterparts, while Collect and Combine are specific to the Provider pipeline. The key difference from LINQ is that these operators are integrated with the compiler’s incremental caching system, which reuses previous results when the input has not changed.
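To make the idea of "reusing previous results" concrete, here is a toy model written as a small console program. It is only an illustration: the CachedSelect class is hypothetical, has nothing to do with Roslyn's real implementation, and simply remembers the output it produced for each input so that an unchanged input does not trigger the transform again.

```csharp
using System;
using System.Collections.Generic;

// Two "builds" over three inputs; only the first input changes in between.
var stage = new CachedSelect<string, string>(text => text.ToUpperInvariant());

stage.Run(new[] { "class A", "class B", "class C" });   // first build: 3 transforms
stage.Run(new[] { "class A2", "class B", "class C" });  // second build: 1 transform

Console.WriteLine(stage.TransformCalls); // prints 4

// Hypothetical toy stage (not a Roslyn type): caches one output per input value.
public sealed class CachedSelect<TIn, TOut> where TIn : notnull
{
    private readonly Func<TIn, TOut> _transform;
    private readonly Dictionary<TIn, TOut> _cache = new();

    public CachedSelect(Func<TIn, TOut> transform) => _transform = transform;

    // Number of times the transform actually ran (cache misses).
    public int TransformCalls { get; private set; }

    public IReadOnlyList<TOut> Run(IEnumerable<TIn> inputs)
    {
        var results = new List<TOut>();
        foreach (var input in inputs)
        {
            // The lookup relies on Equals/GetHashCode of the input,
            // which is why equality of the data matters for caching.
            if (!_cache.TryGetValue(input, out var output))
            {
                output = _transform(input);
                TransformCalls++;
                _cache[input] = output;
            }
            results.Add(output);
        }
        return results;
    }
}
```

The point to take away is that the cache lookup depends entirely on how the input compares for equality. The compiler's real bookkeeping is more involved, but the same dependency on equality applies.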
Select - Data Transformation
    // SyntaxNode -> class name
    var classNames = context.SyntaxProvider
        .ForAttributeWithMetadataName(...)
        .Select((ctx, _) => ctx.TargetSymbol.Name);

    // ObservableClassInfo -> code to generate
    var codes = provider
        .Select((info, _) => GenerateCode(info));

Where - Filtering
    // Select only valid items
    var validClasses = provider
        .Where(x => x != ObservableClassInfo.None);

    // Select only public classes
    var publicClasses = provider
        .Where(x => x.IsPublic);

Collect - Collect Into Array
    // IncrementalValuesProvider<T> -> IncrementalValueProvider<ImmutableArray<T>>
    var collected = provider.Collect();

    // Useful when processing multiple items at once
    context.RegisterSourceOutput(collected, (ctx, items) =>
    {
        foreach (var item in items)
        {
            ctx.AddSource(...);
        }
    });

Combine - Combine Two Providers
    // Combine class info + compilation options
    // (the options are projected out of CompilationProvider with Select)
    var combined = provider.Combine(
        context.CompilationProvider.Select((compilation, _) => compilation.Options));

    context.RegisterSourceOutput(combined, (ctx, pair) =>
    {
        var classInfo = pair.Left;
        var options = pair.Right;
        // ...
    });

Actual Code: ObservablePortGenerator
Now that we have examined individual operators, let us see how they are combined in our project.
    private static IncrementalValuesProvider<ObservableClassInfo> RegisterSourceProvider(
        IncrementalGeneratorInitializationContext context)
    {
        // Stage 1: Generate fixed code (Attribute definition)
        context.RegisterPostInitializationOutput(ctx =>
            ctx.AddSource(
                hintName: GenerateObservablePortAttributeFileName,
                sourceText: SourceText.From(GenerateObservablePortAttribute, Encoding.UTF8)));

        // Stage 2: Configure pipeline
        return context
            .SyntaxProvider
            // Select only classes with [GenerateObservablePort] attribute
            .ForAttributeWithMetadataName(
                fullyQualifiedMetadataName: FullyQualifiedAttributeName,
                predicate: IsClass,                   // Syntax-level filter
                transform: MapToObservableClassInfo)  // Semantic info extraction
            // Exclude invalid items
            .Where(x => x != ObservableClassInfo.None);
    }

Pipeline Composition Patterns
Pattern 1: Simple Transformation
    // Extract only class names
    var classNames = context.SyntaxProvider
        .ForAttributeWithMetadataName("MyAttribute", ...)
        .Select((ctx, _) => ctx.TargetSymbol.Name);

    context.RegisterSourceOutput(classNames, (ctx, name) =>
    {
        ctx.AddSource($"{name}.g.cs", $"// Generated for {name}");
    });

Pattern 2: Complex Data Structure
    // Transform into detailed info record
    var classInfos = context.SyntaxProvider
        .ForAttributeWithMetadataName("MyAttribute", ...)
        .Select((ctx, _) => new ClassInfo(
            Name: ctx.TargetSymbol.Name,
            Namespace: ctx.TargetSymbol.ContainingNamespace.ToString(),
            Methods: GetMethods(ctx.TargetSymbol)));

    context.RegisterSourceOutput(classInfos, (ctx, info) =>
    {
        var code = GenerateCode(info);
        ctx.AddSource($"{info.Name}.g.cs", code);
    });

Pattern 3: Batch Processing
    // Process all classes at once
    var allClasses = context.SyntaxProvider
        .ForAttributeWithMetadataName("MyAttribute", ...)
        .Collect(); // Collect into ImmutableArray

    context.RegisterSourceOutput(allClasses, (ctx, classes) =>
    {
        // Generate summary file. Each class name is written as a comment line
        // so that Summary.g.cs remains valid C#.
        var summary = string.Join("\n", classes.Select(c => $"// {c.Name}"));
        ctx.AddSource("Summary.g.cs", $"// Generated {classes.Length} classes\n{summary}");

        // Generate file for each class
        foreach (var cls in classes)
        {
            ctx.AddSource($"{cls.Name}.g.cs", GenerateCode(cls));
        }
    });

Pattern 4: Conditional Combination
    // Generate different code based on compilation options
    // (the options are projected out of CompilationProvider with Select)
    var withOptions = provider
        .Combine(context.CompilationProvider.Select((compilation, _) => compilation.Options));

    context.RegisterSourceOutput(withOptions, (ctx, pair) =>
    {
        var (classInfo, options) = pair;

        string code = options.OptimizationLevel == OptimizationLevel.Debug
            ? GenerateDebugCode(classInfo)
            : GenerateReleaseCode(classInfo);

        ctx.AddSource($"{classInfo.Name}.g.cs", code);
    });

Caching and Performance
The most important reason for using the Provider pattern is automatic caching. At each stage of the pipeline, if the input is the same as before, the compiler retrieves the result from cache and skips processing.
    Behavior During Incremental Build
    =================================

    1. File A modified
           |
           v
    2. Pipeline re-execution
       - File A: Process anew
       - File B: Retrieved from cache (processing skipped)
       - File C: Retrieved from cache (processing skipped)
           |
           v
    3. Code regenerated only for changed file A

Caching Considerations
    // Bad example: Non-deterministic data
    .Select((ctx, _) => new ClassInfo(
        Name: ctx.TargetSymbol.Name,
        Timestamp: DateTime.Now  // Different value every time!
    ))

    // Good example: Deterministic data
    .Select((ctx, _) => new ClassInfo(
        Name: ctx.TargetSymbol.Name,
        Namespace: ctx.TargetSymbol.ContainingNamespace.ToString()
    ))

Data Model Design
For caching to work correctly, the data model must have value semantics. Two objects with the same content must be judged equal by Equals so the compiler can recognize “no change” and utilize the cache. This is why our project’s ObservableClassInfo is defined as a readonly record struct. Note that the generated Equals compares each field using that field type’s default equality, so reference-type fields such as List<T> are compared by reference rather than element by element.
    // Using readonly record struct (value semantics + automatic Equals/GetHashCode)
    public readonly record struct ObservableClassInfo
    {
        public readonly string Namespace;
        public readonly string ClassName;
        public readonly List<MethodInfo> Methods;
        public readonly List<ParameterInfo> BaseConstructorParameters;
        public readonly Location? Location;

        // None pattern: use empty object instead of null
        public static readonly ObservableClassInfo None = new(
            string.Empty,
            string.Empty,
            new List<MethodInfo>(),
            new List<ParameterInfo>(),
            null);

        public ObservableClassInfo(
            string @namespace,
            string className,
            List<MethodInfo> methods,
            List<ParameterInfo> baseConstructorParameters,
            Location? location)
        {
            Namespace = @namespace;
            ClassName = className;
            Methods = methods;
            BaseConstructorParameters = baseConstructorParameters;
            Location = location;
        }
    }

    // Constructor-based class
    public class MethodInfo
    {
        public string Name { get; }
        public List<ParameterInfo> Parameters { get; }
        public string ReturnType { get; }

        public MethodInfo(string name, List<ParameterInfo> parameters, string returnType)
        {
            Name = name;
            Parameters = parameters;
            ReturnType = returnType;
        }
    }

    public class ParameterInfo
    {
        public string Name { get; }
        public string Type { get; }
        public RefKind RefKind { get; }
        public bool IsCollection { get; }

        public ParameterInfo(string name, string type, RefKind refKind)
        {
            Name = name;
            Type = type;
            RefKind = refKind;
            IsCollection = CollectionTypeHelper.IsCollectionType(type);
        }
    }

Summary at a Glance
The Provider pattern composes the source generation pipeline in the same declarative style as LINQ, while providing automatic caching at each stage to guarantee incremental build performance. Applying value semantics to data models is the key prerequisite for caching.
| Provider Type | Value Count | Purpose |
|---|---|---|
| IncrementalValuesProvider<T> | 0..N | Processing multiple items |
| IncrementalValueProvider<T> | Exactly 1 | Single value, Collect result |

| Operator | Function | Returns |
|---|---|---|
| Select | Transform | Same Provider type |
| Where | Filter | ValuesProvider |
| Collect | Collect into array | ValueProvider |
| Combine | Pair with another Provider | Same Provider type as the left operand, holding (Left, Right) tuples |
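The provider types and operators summarized above can be seen together in one small generator. The following is a minimal sketch, not the project's actual generator: SketchGenerator and the attribute name Sketch.MarkerAttribute are hypothetical, and it assumes a project that references a Microsoft.CodeAnalysis.CSharp package recent enough to include ForAttributeWithMetadataName.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;

[Generator]
public sealed class SketchGenerator : IIncrementalGenerator // hypothetical name
{
    public void Initialize(IncrementalGeneratorInitializationContext context)
    {
        // 0..N values: one class name per class carrying the (hypothetical) attribute.
        IncrementalValuesProvider<string> names = context.SyntaxProvider
            .ForAttributeWithMetadataName(
                "Sketch.MarkerAttribute",
                predicate: static (node, _) => node is ClassDeclarationSyntax,
                transform: static (ctx, _) => ctx.TargetSymbol.Name)
            .Where(static name => name.Length > 0);

        // Exactly one value: the optimization level of the current compilation.
        IncrementalValueProvider<OptimizationLevel> level = context.CompilationProvider
            .Select(static (compilation, _) => compilation.Options.OptimizationLevel);

        // Combine keeps the kind of the left operand: still 0..N values,
        // each one now a (Left, Right) tuple.
        IncrementalValuesProvider<(string Left, OptimizationLevel Right)> perClass =
            names.Combine(level);

        // Collect turns 0..N values into exactly one ImmutableArray.
        IncrementalValueProvider<ImmutableArray<string>> all = names.Collect();

        // RegisterSourceOutput accepts both provider kinds.
        context.RegisterSourceOutput(perClass, static (spc, pair) =>
            spc.AddSource($"{pair.Left}.Sketch.g.cs", $"// {pair.Left} built as {pair.Right}"));

        context.RegisterSourceOutput(all, static (spc, items) =>
            spc.AddSource("Sketch.Summary.g.cs", $"// {items.Length} annotated classes"));
    }
}
```

Projecting only OptimizationLevel out of the compilation, instead of combining with the whole Compilation object, keeps the combined input small and comparable, which is the property the caching discussion above depends on.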
Q1: How do IncrementalValuesProvider<T> and IncrementalValueProvider<T> differ?
A: IncrementalValuesProvider<T> provides 0 or more values as a stream and supports operators like Select and Where. IncrementalValueProvider<T> provides exactly one value. Collect() always returns this single-value type, while Combine() returns it only when the left-hand provider is itself an IncrementalValueProvider<T>. Both types can be passed to RegisterSourceOutput when registering code generation.
Q2: Why does using readonly record struct for data models improve caching performance?
A: record struct automatically generates value-based Equals/GetHashCode. Roslyn compares previous and current results at each pipeline stage and skips subsequent stages if they are identical. Accurate value comparison ensures higher cache hit rates and reduces unnecessary code regeneration.
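One detail is worth checking in your own models: the Equals generated for a record struct compares each field with that field's own equality, so a List<T> field is compared by reference. The console sketch below illustrates the difference. WithList, WithSequence, and EquatableList<T> are hypothetical names used only for this demonstration, and the wrapper is just one possible way to get element-wise comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = new WithList("Orders", new List<string> { "Add", "Remove" });
var b = new WithList("Orders", new List<string> { "Add", "Remove" });
Console.WriteLine(a == b); // False: the two List instances are compared by reference

var c = new WithSequence("Orders", new EquatableList<string>(new[] { "Add", "Remove" }));
var d = new WithSequence("Orders", new EquatableList<string>(new[] { "Add", "Remove" }));
Console.WriteLine(c == d); // True: the wrapper compares element by element

// Record struct whose collection field only has reference equality.
public readonly record struct WithList(string ClassName, List<string> Methods);

// Record struct whose collection field implements IEquatable<T>.
public readonly record struct WithSequence(string ClassName, EquatableList<string> Methods);

// Hypothetical wrapper: an immutable snapshot with sequence equality.
public sealed class EquatableList<T> : IEquatable<EquatableList<T>>
{
    private readonly T[] _items;

    public EquatableList(IEnumerable<T> items) => _items = items.ToArray();

    public bool Equals(EquatableList<T>? other) =>
        other is not null && _items.SequenceEqual(other._items);

    public override bool Equals(object? obj) => Equals(obj as EquatableList<T>);

    public override int GetHashCode()
    {
        // Simple order-sensitive hash over the elements.
        unchecked
        {
            int hash = 17;
            foreach (var item in _items)
            {
                hash = hash * 31 + (item?.GetHashCode() ?? 0);
            }
            return hash;
        }
    }
}
```

If a model holds collections, a comparison like the first one reports "changed" even when the content is identical, so the stage after it runs again. Whether that matters for a given generator depends on how expensive the later stages are.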
Q3: In what situations is the Combine operator used?
A: It is used when you need to combine data extracted from source code with external information like compilation options. For example, to generate different code depending on Debug/Release mode, you can project the options out of context.CompilationProvider with Select, pair them with the class data through provider.Combine(...), and reference both during code generation.
Now that we understand the full flow of the Provider pipeline, next we examine ForAttributeWithMetadataName, the most frequently used API at the pipeline’s starting point. We will see how this API optimizes attribute-based filtering and why it is 10-100x faster compared to manual implementation.