Data Collection

With the environment ready, it is time to collect the raw data that will serve as the ingredients for the release notes. In Phase 2, two C# scripts are executed in sequence to extract per-component changes and the complete Public API. Since the data created in this step becomes the foundation for all subsequent analysis and document writing, accurate and complete collection is critical.

All work is performed from the script directory.

Terminal window
cd .release-notes/scripts

The first script, AnalyzeAllComponents.cs, explores the Git history to analyze which files changed and which commits were made for each component (project). The Base/Target range determined in Phase 1 is passed as arguments.

For first deployments, analysis starts from the initial commit.

Terminal window
# Find the initial commit SHA
FIRST_COMMIT=$(git rev-list --max-parents=0 HEAD)
# Analyze from the initial commit to the current HEAD
dotnet AnalyzeAllComponents.cs --base "$FIRST_COMMIT" --target HEAD

For subsequent releases, the previous release branch is used as the baseline.

Terminal window
dotnet AnalyzeAllComponents.cs --base origin/release/1.0 --target HEAD

This script produces two types of output.

Individual component analysis files (.analysis-output/*.md) are generated one per component. Each file contains overall change statistics such as the number of added/modified/deleted files, a complete commit history for the component, contributor information, and a commit list classified into features/bug fixes/Breaking Changes. These files are the primary input when analyzing commits and extracting features in Phase 3.

Analysis summary (.analysis-output/analysis-summary.md) is a high-level overview of all component changes. It shows the number of changed files per component and the list of generated analysis files at a glance.

The second script, ExtractApiChanges.cs, builds projects and extracts Public APIs from DLLs. If Step 1 tells us “what changed”, Step 2 tells us “what the current API looks like exactly.”

Terminal window
dotnet ExtractApiChanges.cs

This script produces three key outputs.

Uber API file (all-api-changes.txt) is the single source of truth for this workflow. Generated in the .analysis-output/api-changes-build-current/ directory, it contains every Public API definition of the current build with exact parameter names and types. When writing code examples in Phase 4, any API that does not appear in this file must not be documented. This is the most important safeguard against non-existent APIs ending up in the release notes.
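A lookup against the Uber file can be as small as a fixed-string search. The sketch below is a hypothetical helper (the name api_in_uber_file is not part of the workflow's scripts) and assumes only that each API identifier appears verbatim somewhere in the file's text.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow's scripts.
# Succeeds (exit 0) when the identifier appears as a whole word in the Uber file.
api_in_uber_file() {
  local name="$1"
  local uber_file="$2"
  # -F: treat the name as a literal string
  # -w: match whole words only, so "ErrorCode" does not match "ErrorCodeFactory"
  # -q: print nothing, report the result through the exit status
  grep -Fqw -- "$name" "$uber_file"
}

# Usage, run from .release-notes/scripts:
#   api_in_uber_file ErrorCodeFactory \
#     .analysis-output/api-changes-build-current/all-api-changes.txt \
#     && echo "found in the Uber file" || echo "not in the Uber file"
```

A whole-word match only shows that the name exists. Parameter names and types still need to be read from the matching lines before an example is written.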

API change diff (api-changes-diff.txt) is the Git diff of the .api folder, used for automatic Breaking Changes detection. It can objectively identify deleted or signature-changed APIs, catching Breaking Changes that might be missed by commit messages alone.
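To illustrate how a diff exposes candidate Breaking Changes, the following sketch lists every line that a unified diff removes. It assumes api-changes-diff.txt uses the standard unified format produced by git diff; the helper name removed_api_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow's scripts.
# Prints the content of every line that a unified diff removes.
removed_api_lines() {
  local diff_file="$1"
  # Keep lines starting with "-", drop the "--- a/path" file headers,
  # then strip the leading "-" marker from what remains.
  grep -E '^-' "$diff_file" | grep -Ev '^--- ' | sed 's/^-//'
}

# Usage, run from .release-notes/scripts:
#   removed_api_lines .analysis-output/api-changes-build-current/api-changes-diff.txt
```

A removed line is only a candidate. When a matching added line follows it, the change is a signature change rather than a deletion, so each hit still needs review.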

Individual API files (Src/*/.api/*.cs) define each assembly’s Public API in C# source code format. They are tracked by Git to manage API change history.

The complete file structure generated after script execution is as follows.

.release-notes/scripts/
└── .analysis-output/
    ├── analysis-summary.md              # Overall summary
    ├── Functorium.md                    # Src/Functorium analysis
    ├── Functorium.Testing.md            # Src/Functorium.Testing analysis
    ├── Docs.md                          # Docs analysis
    └── api-changes-build-current/
        ├── all-api-changes.txt          # Uber file (all APIs)
        ├── api-changes-summary.md       # API summary
        └── api-changes-diff.txt         # API differences

Let’s look at the format of each component file.

# Analysis for Src/Functorium
Generated: 2025-12-19 10:30:00
Comparing: origin/release/1.0 -> HEAD
## Change Summary
[git diff --stat output]
## All Commits
[Commit SHA and message list]
## Top Contributors
[Commits per contributor]
## Categorized Commits
### Feature Commits
[feat, feature, add pattern commits]
### Bug Fixes
[fix, bug pattern commits]
### Breaking Changes
[breaking, BREAKING, !: pattern commits]

AnalyzeAllComponents.cs automatically classifies commits using Conventional Commits patterns. More refined analysis occurs in Phase 3, but the initial classification at this step provides the baseline data.

Feature Commits

Search keywords: feat, feature, add

Examples:
- feat: Add user authentication
- add: Logging feature
- feature(api): New endpoint

Bug Fixes

Search keywords: fix, bug

Examples:
- fix: Handle null reference exception
- bug: Fix memory leak

Breaking Changes

Search conditions (OR):

  1. Contains breaking or BREAKING string
  2. ! pattern after type (e.g., feat!:, fix!:)

Examples:
- feat!: Change API response format
- feat!: BREAKING CHANGE: Change API format
- fix: breaking: Compatibility change
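
Other commit types

| Type     | Description              | Included in Release Notes |
|----------|--------------------------|---------------------------|
| docs     | Documentation change     | Usually omitted           |
| refactor | Refactoring              | Usually omitted           |
| perf     | Performance improvement  | Included                  |
| test     | Add/modify tests         | Omitted                   |
| build    | Build system change      | Usually omitted           |
| chore    | Other changes            | Omitted                   |
| ci       | CI configuration change  | Omitted                   |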
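
The keyword rules for features, bug fixes, and Breaking Changes can be sketched as a small shell function. This is an illustration, not the logic inside AnalyzeAllComponents.cs: the function name classify_commit is hypothetical, the feature and fix keywords are assumed to match only as a Conventional Commits type prefix, and a commit that matches several rules is reported once, with Breaking Changes taking precedence.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the actual logic of AnalyzeAllComponents.cs.
# Prints one of: breaking, feature, fix, other.
classify_commit() {
  local subject="$1"
  # Breaking: the subject contains "breaking" or "BREAKING",
  # or a "!" sits directly before the colon that ends the type (feat!:, fix(api)!:).
  if printf '%s\n' "$subject" | grep -Eq 'breaking|BREAKING|^[a-z]+(\([^)]*\))?!:'; then
    echo "breaking"
  # Feature: type is feat, feature, or add, with an optional (scope).
  elif printf '%s\n' "$subject" | grep -Eq '^(feat|feature|add)(\([^)]*\))?:'; then
    echo "feature"
  # Bug fix: type is fix or bug, with an optional (scope).
  elif printf '%s\n' "$subject" | grep -Eq '^(fix|bug)(\([^)]*\))?:'; then
    echo "fix"
  else
    echo "other"
  fi
}

# Usage: classify every commit subject in the analyzed range.
#   git log --format=%s origin/release/1.0..HEAD | while IFS= read -r subject; do
#     printf '%s\t%s\n' "$(classify_commit "$subject")" "$subject"
#   done
```

Checking the Breaking Changes rule first keeps a subject such as feat!: out of the plain feature bucket, so incompatible changes are not lost among ordinary features.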

After script execution, verify that the outputs were properly generated.

Terminal window
# Check number of component files
ls -1 .analysis-output/*.md | wc -l
# Verify main components exist
ls .analysis-output/Functorium*.md
# Check analysis summary
cat .analysis-output/analysis-summary.md

Terminal window
# Check Uber file existence and size
wc -l .analysis-output/api-changes-build-current/all-api-changes.txt
# Check key APIs (example)
grep -c "ErrorCodeFactory" .analysis-output/api-changes-build-current/all-api-changes.txt
# Check API files
ls Src/*/.api/*.cs
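
The manual checks above can also be folded into one pass/fail function. The sketch below is a hypothetical helper that assumes the output layout shown earlier in this section. It requires all four files to exist and additionally requires the Uber file to be non-empty, since an empty diff file can be legitimate when no API changed.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow's scripts.
# Returns non-zero if an expected Phase 2 output is missing or the Uber file is empty.
verify_phase2_outputs() {
  local out_dir="${1:-.analysis-output}"
  local api_dir="$out_dir/api-changes-build-current"
  local status=0
  local file
  for file in \
    "$out_dir/analysis-summary.md" \
    "$api_dir/all-api-changes.txt" \
    "$api_dir/api-changes-summary.md" \
    "$api_dir/api-changes-diff.txt"
  do
    if [ ! -f "$file" ]; then
      echo "MISSING: $file" >&2
      status=1
    fi
  done
  # -s is true only for a file that exists and is larger than zero bytes
  if [ ! -s "$api_dir/all-api-changes.txt" ]; then
    echo "EMPTY OR MISSING: $api_dir/all-api-changes.txt" >&2
    status=1
  fi
  return "$status"
}

# Usage, run from .release-notes/scripts:
#   verify_phase2_outputs .analysis-output && echo "Phase 2 outputs look complete"
```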

On success, completion is reported in the following form.

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase 2: Data Collection Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Generated component analysis files:
  analysis-summary.md
  Functorium.md (31 files, 19 commits)
  Functorium.Testing.md (18 files, 13 commits)
  Docs.md (38 files, 37 commits)
Generated API files:
  all-api-changes.txt (Uber file)
  api-changes-summary.md
  api-changes-diff.txt
  Src/Functorium/.api/Functorium.cs
  Src/Functorium.Testing/.api/Functorium.Testing.cs
Location: .release-notes/scripts/.analysis-output/

If AnalyzeAllComponents.cs fails, the error is reported together with troubleshooting steps.

Script execution failed: AnalyzeAllComponents.cs
Error: <error message>
Troubleshooting:
  1. Delete .analysis-output folder and retry
     rmdir /s /q .analysis-output (Windows)
     rm -rf .analysis-output (Linux/Mac)
  2. Clear NuGet cache
     dotnet nuget locals all --clear
  3. Terminate dotnet processes (Windows)
     taskkill /F /IM dotnet.exe

If ExtractApiChanges.cs fails, the error is reported together with possible causes and a resolution.

API extraction failed: ExtractApiChanges.cs
Error: <error message>
Possible causes:
  1. Build error: Project does not build
  2. No DLL: No build output
  3. No API: No public types
Resolution:
  1. Verify project build
     dotnet build -c Release
  2. Fix build errors and retry

Data collection is performed only once at the start of the workflow. Re-running scripts during Phase 4 document writing can overwrite analysis results and break consistency. Since the Uber file is the single source of truth for all API verification, all documented APIs must exist in this file. Commit analysis results are provided on a feature basis, directly informing the section structure of the release notes.
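
One way to enforce the collect-once rule is a guard that refuses to start when earlier output is present. The sketch below is a hypothetical helper, not part of the workflow's scripts, and it assumes the .analysis-output directory does not exist before the first collection.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the workflow's scripts.
# Fails when the output directory already exists, so an accidental second
# collection cannot silently overwrite data that later phases rely on.
ensure_not_collected() {
  local out_dir="${1:-.analysis-output}"
  if [ -e "$out_dir" ]; then
    echo "Refusing to collect again: $out_dir already exists." >&2
    echo "Delete it deliberately if a fresh collection is intended." >&2
    return 1
  fi
}

# Usage, run from .release-notes/scripts:
#   ensure_not_collected .analysis-output \
#     && dotnet AnalyzeAllComponents.cs --base origin/release/1.0 --target HEAD \
#     && dotnet ExtractApiChanges.cs
```

Deleting the folder remains the explicit way to start over, which matches the troubleshooting step for a failed run.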

Q1: Does the execution order of AnalyzeAllComponents.cs and ExtractApiChanges.cs matter?

A: Yes. AnalyzeAllComponents.cs is run first to collect per-component commit history, and then ExtractApiChanges.cs is run to extract the current build’s Public API. The data generated by the two scripts is used for different purposes, but Phase 3 must analyze both results together to write complete release notes.

Q2: What problems arise from re-running data collection during the workflow?

A: Files in the .analysis-output/ folder are overwritten, breaking data consistency. The commit list already analyzed in Phase 3 and the Uber file referenced in Phase 4 could differ, so data collection should be performed only once at the start of the workflow.

Q3: What happens if an API not in the Uber file is included in the release notes?

A: It is detected as an “API accuracy error” in Phase 5 validation. Since the Uber file is extracted from compiled DLLs, APIs not in it either do not actually exist or have internal access level. Documenting such APIs would cause users to encounter compilation errors when trying to use non-existent APIs.

Once data collection is complete, proceed to Phase 3: Commit Analysis to transform raw data into meaningful features.