Syntax API
Overview
In the previous chapter, we examined the full picture of Roslyn’s architecture. Now we dive deep into its first layer: the Syntax API.
The Syntax API provides a structural representation of source code. The predicate stage that quickly filters “does a certain class have a specific attribute?” in source generators is exactly the domain of the Syntax API. Our project’s Selectors.IsClass is a typical example. However, the Syntax API alone cannot determine the full name of a type or whether it implements an interface, and recognizing this limitation is the starting point for understanding the Semantic API in the next chapter.
Learning Objectives
Core Learning Objectives
- Understand the differences between SyntaxNode, SyntaxToken, and SyntaxTrivia
  - The roles and relationships of the three elements that compose a Syntax Tree
- Learn Syntax Tree traversal methods
  - Traversal APIs such as DescendantNodes(), ChildNodes(), and Ancestors()
- Learn how to use the Syntax API in source generators
  - Implementing syntax-level filtering in the predicate of ForAttributeWithMetadataName
Components of a Syntax Tree
A Syntax Tree consists of three elements:

    Syntax Tree Components
    ======================

    SyntaxNode (node)
    ├── Unit representing grammatical structure
    ├── Examples: class declaration, method declaration, if statement
    └── Contains child nodes or tokens

    SyntaxToken (token)
    ├── Smallest grammatical unit
    ├── Examples: keyword (class), identifier (User), operator (+)
    └── Contains Leading/Trailing Trivia

    SyntaxTrivia (trivia)
    ├── Insignificant text
    ├── Examples: whitespace, line breaks, comments
    └── Attached to tokens

Understanding Through Example
    // Original code
    public class User { }

    Syntax Tree Structure
    =====================

    ClassDeclarationSyntax (node)
    ├── Modifiers: [public] (token)
    │   └── TrailingTrivia: [whitespace]
    ├── Keyword: [class] (token)
    │   └── TrailingTrivia: [whitespace]
    ├── Identifier: [User] (token)
    │   └── TrailingTrivia: [whitespace]
    ├── OpenBraceToken: [{] (token)
    │   └── TrailingTrivia: [whitespace]
    └── CloseBraceToken: [}] (token)

Whitespace that follows a token on the same line is attached to that token as trailing trivia. Leading trivia holds whatever precedes a token from the start of its line, such as indentation or comments on the lines above.

Key SyntaxNode Types
Each C# grammar element has a corresponding SyntaxNode. The most frequently used in source generator development are declaration-related nodes. Our project primarily works with ClassDeclarationSyntax and InterfaceDeclarationSyntax.

    Declaration-related (most frequently used in source generators)
    ===============================================================
    CompilationUnitSyntax          Entire file
    NamespaceDeclarationSyntax     Namespace
    ClassDeclarationSyntax         Class        <- Used in Selectors.IsClass
    InterfaceDeclarationSyntax     Interface    <- Used in Selectors.IsInterface
    MethodDeclarationSyntax        Method
    PropertyDeclarationSyntax      Property
    FieldDeclarationSyntax         Field
    ParameterSyntax                Parameter

    Statement-related
    =================
    BlockSyntax                    { } block
    IfStatementSyntax              if statement
    ForStatementSyntax             for statement
    ReturnStatementSyntax          return statement
    ExpressionStatementSyntax      Expression statement

    Expression-related
    ==================
    InvocationExpressionSyntax     Method call
    MemberAccessExpressionSyntax   Member access (a.b)
    LiteralExpressionSyntax        Literal (5, "hello")
    IdentifierNameSyntax           Identifier (variable name)

Syntax Tree Traversal
DescendantNodes - All Descendant Nodes

    using System;
    using System.Linq;
    using Microsoft.CodeAnalysis.CSharp;
    using Microsoft.CodeAnalysis.CSharp.Syntax;

    string code = """
        public class User
        {
            public int Id { get; set; }
            public string Name { get; set; }
        }
        """;

    var tree = CSharpSyntaxTree.ParseText(code);
    var root = tree.GetRoot();

    // Find all property declarations
    var properties = root
        .DescendantNodes()
        .OfType<PropertyDeclarationSyntax>();

    foreach (var prop in properties)
    {
        Console.WriteLine($"{prop.Type} {prop.Identifier}");
    }
    // Output:
    // int Id
    // string Name

ChildNodes - Direct Children Only
    var classDecl = root
        .DescendantNodes()
        .OfType<ClassDeclarationSyntax>()
        .First();

    // Only direct children of the class (properties, methods, etc.)
    var members = classDecl.ChildNodes();

    foreach (var member in members)
    {
        Console.WriteLine($"Member kind: {member.Kind()}");
    }
    // Output:
    // Member kind: PropertyDeclaration
    // Member kind: PropertyDeclaration

Ancestors - Parent Nodes
    var property = root
        .DescendantNodes()
        .OfType<PropertyDeclarationSyntax>()
        .First();

    // Parent nodes of the property
    var ancestors = property.Ancestors();

    foreach (var ancestor in ancestors)
    {
        Console.WriteLine($"Parent: {ancestor.Kind()}");
    }
    // Output:
    // Parent: ClassDeclaration
    // Parent: CompilationUnit

Usage in Source Generators
predicate of ForAttributeWithMetadataName

    context.SyntaxProvider
        .ForAttributeWithMetadataName(
            "MyNamespace.GenerateObservablePortAttribute",
            // predicate: Uses Syntax API
            predicate: (node, cancellationToken) =>
            {
                // Check if node is a class
                return node is ClassDeclarationSyntax classDecl
                    // Check if it's a public class (determinable from Syntax alone)
                    && classDecl.Modifiers.Any(SyntaxKind.PublicKeyword);
            },
            transform: (ctx, ct) => /* ... */
        );

Actual Code: Selectors.cs
    // Selectors.cs from the Functorium project
    namespace Functorium.SourceGenerators.Abstractions;

    public static class Selectors
    {
        /// <summary>
        /// Checks if the node is a class declaration.
        /// </summary>
        public static bool IsClass(SyntaxNode node, CancellationToken cancellationToken)
            => node is ClassDeclarationSyntax;

        /// <summary>
        /// Checks if the node is an interface declaration.
        /// </summary>
        public static bool IsInterface(SyntaxNode node, CancellationToken cancellationToken)
            => node is InterfaceDeclarationSyntax;
    }

SyntaxToken Usage
Accessing Token Information

    var classDecl = root
        .DescendantNodes()
        .OfType<ClassDeclarationSyntax>()
        .First();

    // Class name token
    SyntaxToken identifier = classDecl.Identifier;
    Console.WriteLine($"Name: {identifier.Text}");          // User
    Console.WriteLine($"Position: {identifier.SpanStart}"); // character position
    Console.WriteLine($"Kind: {identifier.Kind()}");        // IdentifierToken

    // Modifier tokens
    var modifiers = classDecl.Modifiers;
    foreach (var modifier in modifiers)
    {
        Console.WriteLine($"Modifier: {modifier.Text}");    // public
    }

Checking Specific Modifiers
    // Check if public
    bool isPublic = classDecl.Modifiers.Any(SyntaxKind.PublicKeyword);

    // Check if partial
    bool isPartial = classDecl.Modifiers.Any(SyntaxKind.PartialKeyword);

    // Check if abstract
    bool isAbstract = classDecl.Modifiers.Any(SyntaxKind.AbstractKeyword);

SyntaxTrivia Usage
Used when comment or whitespace information is needed:

    string code = """
        /// <summary>
        /// User information
        /// </summary>
        public class User { }
        """;

    var tree = CSharpSyntaxTree.ParseText(code);
    var classDecl = tree.GetRoot()
        .DescendantNodes()
        .OfType<ClassDeclarationSyntax>()
        .First();

    // Trivia before the public keyword (including comments)
    var leadingTrivia = classDecl.GetLeadingTrivia();

    foreach (var trivia in leadingTrivia)
    {
        if (trivia.IsKind(SyntaxKind.SingleLineDocumentationCommentTrivia))
        {
            Console.WriteLine($"Documentation comment found: {trivia}");
        }
    }

Pattern Matching and Syntax API
C# pattern matching makes it easy to analyze Syntax nodes:

    // Method analysis
    void AnalyzeMethod(SyntaxNode node)
    {
        if (node is MethodDeclarationSyntax method)
        {
            // Method name
            var name = method.Identifier.Text;

            // Return type (Syntax level)
            var returnType = method.ReturnType switch
            {
                PredefinedTypeSyntax predefined => predefined.Keyword.Text,
                IdentifierNameSyntax identifier => identifier.Identifier.Text,
                GenericNameSyntax generic => $"{generic.Identifier}<...>",
                _ => "unknown"
            };

            // Parameter list
            var parameters = method.ParameterList.Parameters
                .Select(p => $"{p.Type} {p.Identifier}")
                .ToList();

            Console.WriteLine($"{returnType} {name}({string.Join(", ", parameters)})");
        }
    }

Limitations of the Syntax API
Clearly recognizing these limitations is the key takeaway of learning the Syntax API. The Syntax API alone cannot determine type information. This is precisely why our project’s ObservablePortGenerator first filters with ClassDeclarationSyntax in the predicate, then relies on the Semantic API (ctx.TargetSymbol) in the transform:

    string code = """
        public class Example
        {
            public void Process(User user) { }
        }
        """;

    // Parse the snippet first; the tree is purely syntactic
    var tree = CSharpSyntaxTree.ParseText(code);

    var method = tree.GetRoot()
        .DescendantNodes()
        .OfType<MethodDeclarationSyntax>()
        .First();

    var parameter = method.ParameterList.Parameters.First();

    // What Syntax alone can tell
    Console.WriteLine(parameter.Type!.ToString());   // "User" (string)
    Console.WriteLine(parameter.Identifier.Text);    // "user"

    // What Syntax alone cannot tell
    // - Is User a class or interface?
    // - What is User's namespace?
    // - Which assembly is User defined in?
    // -> Such information requires the Semantic API

Summary at a Glance
The Syntax API is a tool for quickly traversing the structure of source code. In source generators, it is primarily used for first-pass filtering in the predicate stage, while detailed analysis requiring type resolution is delegated to the Semantic API.
| Component | Role | Example |
|---|---|---|
| SyntaxNode | Grammatical structure | ClassDeclarationSyntax |
| SyntaxToken | Smallest grammatical unit | public, User |
| SyntaxTrivia | Whitespace, comments | spaces, // comment |

| Traversal Method | Description |
|---|---|
| DescendantNodes() | All descendant nodes |
| ChildNodes() | Direct children only |
| Ancestors() | All ancestor nodes (parent, grandparent, and so on) |
| GetLeadingTrivia() | Leading trivia |
| GetTrailingTrivia() | Trailing trivia |
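
The three components in the first table can be observed directly by walking the tokens of a parsed snippet. The following is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp package is referenced; TokenTriviaSketch and Describe are hypothetical names introduced only for illustration, not part of the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper for illustration; not part of the Functorium project.
public static class TokenTriviaSketch
{
    // Returns one line per token: its kind, its text, and how many
    // trivia pieces are attached before and after it.
    public static List<string> Describe(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantTokens()
            // The parser appends an end-of-file token; skip it here.
            .Where(token => !token.IsKind(SyntaxKind.EndOfFileToken))
            .Select(token =>
                $"{token.Kind()} '{token.Text}' " +
                $"leading={token.LeadingTrivia.Count} trailing={token.TrailingTrivia.Count}")
            .ToList();
    }
}
```

Calling `TokenTriviaSketch.Describe("public class User { }")` yields five lines, one per token. The spaces between tokens show up as trailing trivia of the token they follow, which is why the closing brace reports no trivia at all.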
Q1: Why does the source generator’s predicate use only the Syntax API?
A: The predicate is called for every candidate syntax node, so it must execute quickly. The Syntax API only inspects the already parsed tree, making it low-cost, while the Semantic API performs type resolution, making it expensive. The performance optimization pattern is to perform fast first-pass filtering to reduce candidates, then use the Semantic API only in the transform.
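
The same two-stage idea can be tried outside a generator. The sketch below is an illustration under assumptions, not the project's implementation: it assumes the Microsoft.CodeAnalysis.CSharp package is referenced, and TwoStageFilterSketch, IsPublicClass, and FullNamesOfPublicClasses are hypothetical names. A cheap syntax check narrows the candidates, and the semantic model is consulted only for the nodes that pass.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper for illustration; not part of the Functorium project.
public static class TwoStageFilterSketch
{
    // Stage 1 (syntax only): a cheap shape check, the same idea as a generator predicate.
    public static bool IsPublicClass(SyntaxNode node) =>
        node is ClassDeclarationSyntax c && c.Modifiers.Any(SyntaxKind.PublicKeyword);

    // Stage 2 (semantic): resolve symbols only for nodes that passed stage 1.
    public static List<string> FullNamesOfPublicClasses(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "Sketch",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        return tree.GetRoot()
            .DescendantNodes()
            .Where(IsPublicClass)
            .Select(node => model.GetDeclaredSymbol((ClassDeclarationSyntax)node)!.ToDisplayString())
            .ToList();
    }
}
```

For a source such as `namespace App.Models { public class User { } internal class Hidden { } }`, only User survives the syntax filter, and the namespace-qualified name App.Models.User becomes available only after the symbol is resolved.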
Q2: When is SyntaxTrivia used in actual source generators?
A: In most source generators, you rarely need to work with SyntaxTrivia directly. However, it is used when analyzing XML documentation comments to reflect them in generated code, or when building code formatting tools that need whitespace and line break information.
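
As one way the documentation-comment case might look, the sketch below reads the summary text of a class through the structured form of its leading trivia. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced and the default parse options, which parse documentation comments; DocCommentSketch and GetSummary are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper for illustration; not part of the Functorium project.
public static class DocCommentSketch
{
    // Returns the trimmed text inside <summary>...</summary> of the first class,
    // or null when the class has no documentation comment or no summary element.
    public static string? GetSummary(string source)
    {
        var classDecl = CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        // A documentation comment is "structured" trivia: the trivia wraps a small syntax tree.
        var doc = classDecl.GetLeadingTrivia()
            .Select(trivia => trivia.GetStructure())
            .OfType<DocumentationCommentTriviaSyntax>()
            .FirstOrDefault();

        var summary = doc?.ChildNodes()
            .OfType<XmlElementSyntax>()
            .FirstOrDefault(e => e.StartTag.Name.LocalName.Text == "summary");
        if (summary is null) return null;

        // Token.Text excludes the "///" markers, which are trivia on the XML text tokens.
        var text = string.Concat(summary.Content
            .OfType<XmlTextSyntax>()
            .SelectMany(x => x.TextTokens)
            .Select(token => token.Text));
        return text.Trim();
    }
}
```

A generator could copy the returned string into the documentation of the code it emits. A class without a documentation comment simply yields null.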
Q3: What is the criterion for choosing between DescendantNodes() and ChildNodes()?
A: ChildNodes() returns only direct children, so its scope is narrow and it does little work. DescendantNodes() recursively traverses all descendant nodes. Use DescendantNodes() when finding specific nodes in nested structures, and ChildNodes() when looking only one level down.
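
A nested class makes the difference visible. The sketch below assumes the Microsoft.CodeAnalysis.CSharp package is referenced; TraversalScopeSketch and CountProperties are hypothetical names introduced for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper for illustration; not part of the Functorium project.
public static class TraversalScopeSketch
{
    // Counts property declarations under the first (outermost) class,
    // once one level down and once across the whole subtree.
    public static (int Direct, int All) CountProperties(string source)
    {
        var outer = CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        // ChildNodes(): only members declared directly in the outer class.
        int direct = outer.ChildNodes().OfType<PropertyDeclarationSyntax>().Count();

        // DescendantNodes(): also reaches members of nested types.
        int all = outer.DescendantNodes().OfType<PropertyDeclarationSyntax>().Count();

        return (direct, all);
    }
}
```

For `public class Outer { public int A { get; set; } public class Inner { public int B { get; set; } } }`, the direct count is 1 and the full count is 2, because property B belongs to the nested class.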
As we saw in the limitations of the Syntax API, semantic information such as the full name of a type or whether it implements an interface cannot be obtained through syntax analysis alone. In the next chapter, we learn the Semantic API that goes beyond these limitations.