Commit b6108b3

feat(docs): update tutorial for building ytx .NET global tool
📝 Revised the tutorial to enhance clarity and provide updated information on building the `ytx` tool.
📁 Modified: _posts/2025-08-31-building-ytx-a-youtube-transcript-extractor-as-a-dotnet-global-tool.md
🔧 Improved descriptions and summaries to better reflect the extraction of YouTube transcripts and metadata as JSON
⚙️ Expanded CI/CD pipeline details with GitHub Actions for production readiness and best practices
📦 Included insights on using AI tools to accelerate the development process and emphasized design-first and documentation-driven approaches
1 parent 16c8c60 commit b6108b3

File tree

1 file changed

+197
-58
lines changed

_posts/2025-08-31-building-ytx-a-youtube-transcript-extractor-as-a-dotnet-global-tool.md

Lines changed: 197 additions & 58 deletions
@@ -1,8 +1,8 @@
11
---
22
layout: post
33
title: Building ytx - A YouTube Transcript Extractor as a .NET Global Tool
4-
description: How to build a .NET Global Tool that extracts YouTube video metadata and transcripts as structured JSON, from concept to NuGet publication
5-
summary: A tutorial on creating ytx, a .NET global tool that extracts YouTube video titles, descriptions, and transcripts as JSON using YoutubeExplode and automated CI/CD.
4+
description: Build a .NET Global Tool to extract YouTube transcripts and metadata as JSON. Learn YoutubeExplode, CLI argument parsing, NuGet packaging, and GitHub Actions automation from concept to publication.
5+
summary: Complete tutorial on creating ytx, a .NET global tool for extracting YouTube video titles, descriptions, and transcripts as JSON. Covers YoutubeExplode library, caption selection logic, JSON serialization, NuGet packaging, and automated CI/CD with GitHub Actions.
66
cover_image: /images/ytx-dotnet-tool-cover.svg
77
tags:
88
- dotnet-global-tools
@@ -19,7 +19,7 @@ tags:
1919

2020
Sometimes you need to extract structured data from YouTube videos for analysis, documentation, or automation. While there are various web-based solutions, having a command-line tool that outputs clean JSON makes integration with scripts and pipelines much easier.
2121

22-
I built `ytx` - a .NET Global Tool that extracts YouTube video metadata and transcripts as structured JSON. The tool takes a YouTube URL and returns the video title, description, and full transcript with timestamps in both raw text and markdown formats.
22+
I built **ytx** - a .NET Global Tool that extracts YouTube video metadata and transcripts as structured JSON. The tool takes a YouTube URL and returns the video title, description, and full transcript with timestamps in both raw text and markdown formats. This post walks you through building your own .NET global tool from scratch, covering architecture design, caption handling, JSON serialization, NuGet packaging, and setting up automated CI/CD with GitHub Actions.
2323

2424
**The Problem** 🎯
2525

@@ -294,66 +294,199 @@ The tool produces clean, structured JSON:
294294

295295
The markdown transcript format makes it easy to create documentation with clickable timestamps that jump directly to specific moments in the video.
296296

297-
**Automated CI/CD Pipeline** 🤖
297+
**Production-Ready CI/CD Pipeline with GitHub Actions** 🤖
298298

299-
To streamline releases, I set up GitHub Actions to automatically:
300-
- Build and test the project
301-
- Increment version numbers
302-
- Publish to NuGet
303-
- Create GitHub releases
299+
To streamline releases and reduce manual work, I set up GitHub Actions to automatically handle the entire release pipeline. Unlike simple workflows, this production pipeline:
300+
- Runs on every push to `master` (only when source files change, avoiding redundant builds)
301+
- Allows manual triggers with version bump selection (patch, minor, or major)
302+
- Automatically increments semantic versions in your `.csproj` file
303+
- Commits version changes back to the repository with git tags
304+
- Builds, packs, and publishes to NuGet, using `--skip-duplicate` so a re-run does not fail on an already-published version
305+
- Creates GitHub releases with auto-generated release notes
306+
- Supports multiple .NET versions (8.x and 9.x) for maximum compatibility
304307

305-
The workflow file (`.github/workflows/publish.yml`) handles version bumping:
308+
The complete workflow file (`.github/workflows/publish.yml`) handles all of this:
306309

307310
```yaml
308311
name: Publish NuGet (ytx)
309312

310313
on:
311-
push:
312-
branches: [ master ]
313314
workflow_dispatch:
314315
inputs:
315-
version_bump:
316-
description: 'Version bump type'
316+
bump:
317+
description: 'Version bump type (major|minor|patch)'
317318
required: true
318319
default: 'patch'
319-
type: choice
320-
options:
321-
- patch
322-
- minor
323-
- major
320+
push:
321+
branches: [ "master" ]
322+
paths:
323+
- 'src/Ytx/**'
324+
- '.github/workflows/publish.yml'
325+
326+
permissions:
327+
contents: write
328+
packages: read
329+
330+
env:
331+
PROJECT_DIR: src/Ytx
332+
CSPROJ: src/Ytx/Ytx.csproj
333+
NUPKG_DIR: nupkg
334+
NUGET_SOURCE: https://api.nuget.org/v3/index.json
324335

325336
jobs:
326-
publish:
337+
build-pack-publish:
327338
runs-on: ubuntu-latest
328339
steps:
329-
- uses: actions/checkout@v4
330-
340+
- name: Checkout
341+
uses: actions/checkout@v4
342+
331343
- name: Setup .NET
332344
uses: actions/setup-dotnet@v4
333345
with:
334346
dotnet-version: |
335-
8.0.x
336-
9.0.x
347+
9.x
348+
8.x
337349
338-
- name: Bump version
339-
run: |
340-
# Script to increment version in .csproj
341-
VERSION_TYPE="{% raw %}${{ github.event.inputs.version_bump || 'patch' }}{% endraw %}"
342-
./scripts/bump-version.sh "$VERSION_TYPE"
350+
- name: Restore
351+
run: dotnet restore $PROJECT_DIR
343352

344-
- name: Build and Pack
353+
- name: Determine and bump version
354+
id: bump
355+
shell: bash
345356
run: |
346-
dotnet restore src/Ytx
347-
dotnet build src/Ytx -c Release
348-
dotnet pack src/Ytx -c Release
357+
set -euo pipefail
358+
CURR=$(grep -oPm1 '(?<=<Version>)[^<]+' "$CSPROJ")
359+
echo "Current version: $CURR"
360+
IFS='.' read -r MAJ MIN PAT <<< "$CURR"
361+
BUMP="{% raw %}${{ github.event.inputs.bump || 'patch' }}{% endraw %}"
362+
case "$BUMP" in
363+
major) MAJ=$((MAJ+1)); MIN=0; PAT=0 ;;
364+
minor) MIN=$((MIN+1)); PAT=0 ;;
365+
patch|*) PAT=$((PAT+1)) ;;
366+
esac
367+
NEW="$MAJ.$MIN.$PAT"
368+
echo "New version: $NEW"
369+
sed -i "s|<Version>$CURR</Version>|<Version>$NEW</Version>|" "$CSPROJ"
370+
echo "version=$NEW" >> "$GITHUB_OUTPUT"
371+
372+
- name: Commit version bump
373+
if: {% raw %}${{ github.ref == 'refs/heads/master' }}{% endraw %}
374+
run: |
375+
git config user.name "github-actions[bot]"
376+
git config user.email "github-actions[bot]@users.noreply.github.com"
377+
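git add {% raw %}${{ env.CSPROJ }}{% endraw %}
378+
git commit -m "chore: bump version to {% raw %}${{ steps.bump.outputs.version }}{% endraw %}"
379+
git tag -a "v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}" -m "ytx v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}"  # annotated tag: --follow-tags does not push lightweight tags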
380+
git push --follow-tags
381+
382+
- name: Build
383+
run: dotnet build $PROJECT_DIR -c Release --no-restore
384+
385+
- name: Pack
386+
run: dotnet pack $PROJECT_DIR -c Release --no-build -o "$NUPKG_DIR"
349387

350388
- name: Publish to NuGet
389+
env:
390+
NUGET_API_KEY: {% raw %}${{ secrets.NUGET_API_KEY }}{% endraw %}
351391
run: |
352-
dotnet nuget push nupkg/*.nupkg \
353-
--api-key {% raw %}${{ secrets.NUGET_API_KEY }}{% endraw %} \
354-
--source https://api.nuget.org/v3/index.json
392+
dotnet nuget push $NUPKG_DIR/*.nupkg \
393+
--api-key "$NUGET_API_KEY" \
394+
--source "$NUGET_SOURCE" \
395+
--skip-duplicate
396+
397+
- name: Create GitHub Release
398+
uses: softprops/action-gh-release@v2
399+
with:
400+
tag_name: v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}
401+
name: ytx v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}
402+
generate_release_notes: true
355403
```
356404
405+
**Understanding the Workflow Architecture** 🏗️
406+
407+
This workflow implements several production best practices that help .NET developers distribute global tools effectively:
408+
409+
**Environment Variables for DRY Principle:**
410+
The `env:` block defines reusable values (`PROJECT_DIR`, `CSPROJ`, `NUPKG_DIR`, `NUGET_SOURCE`) referenced throughout the workflow. This approach keeps configuration centralized—change a directory path once, and it updates everywhere. This is crucial when managing complex multi-project solutions or adjusting package output locations.
411+
412+
**Permissions Block:**
413+
The `permissions:` section restricts the workflow to only what it needs:
414+
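- `contents: write`: Required to create commits and tags and to push them back to the repository
415+
- `packages: read`: Grants read-only access to GitHub Packages; publishing to nuget.org is authorized by the `NUGET_API_KEY` secret, not by this permission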
416+
417+
This follows the principle of least privilege, improving security by preventing the workflow from performing unauthorized actions.
418+
419+
**Smart Trigger Configuration:**
420+
```yaml
421+
on:
422+
push:
423+
branches: [ "master" ]
424+
paths:
425+
- 'src/Ytx/**'
426+
- '.github/workflows/publish.yml'
427+
```
428+
429+
The `paths:` filter prevents unnecessary builds when only documentation or other non-source files change. This saves CI/CD minutes and reduces feedback latency.
430+
431+
**Semantic Version Bumping with bash:**
432+
The version bump step demonstrates how to parse and manipulate semantic versions programmatically:
433+
434+
```bash
435+
IFS='.' read -r MAJ MIN PAT <<< "$CURR" # Parse 1.0.2 into components
436+
case "$BUMP" in
437+
major) MAJ=$((MAJ+1)); MIN=0; PAT=0 ;; # 1.0.2 → 2.0.0
438+
minor) MIN=$((MIN+1)); PAT=0 ;; # 1.0.2 → 1.1.0
439+
patch|*) PAT=$((PAT+1)) ;; # 1.0.2 → 1.0.3
440+
esac
441+
```
442+
443+
This approach ensures version consistency without manually editing `.csproj` files. The `echo "version=$NEW" >> "$GITHUB_OUTPUT"` sends the new version to subsequent steps—a key pattern in GitHub Actions workflows.
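
The parsing logic above can be pulled into a standalone function for local experimentation. This is a minimal sketch, not the workflow's actual step: `bump_version` is a hypothetical name, and unlike the workflow's `patch|*` fallback it assumes you would rather reject an unknown bump type or a version that is not plain `MAJOR.MINOR.PATCH` (such as one with a pre-release suffix) than silently treat it as a patch.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the original workflow):
# prints the next version for a given current version and bump type.
bump_version() {
  local curr="$1" bump="${2:-patch}"
  local re='^[0-9]+\.[0-9]+\.[0-9]+$'

  # Accept only plain MAJOR.MINOR.PATCH; 1.0.2-beta and 1.0 are rejected.
  if [[ ! "$curr" =~ $re ]]; then
    echo "unsupported version: $curr" >&2
    return 1
  fi

  local maj min pat
  IFS='.' read -r maj min pat <<< "$curr"

  # The 10# prefix forces base 10 so a component like 08 is not read as octal.
  case "$bump" in
    major) maj=$((10#$maj + 1)); min=0; pat=0 ;;
    minor) min=$((10#$min + 1)); pat=0 ;;
    patch) pat=$((10#$pat + 1)) ;;
    *) echo "unknown bump type: $bump" >&2; return 1 ;;
  esac

  echo "$maj.$min.$pat"
}
```

After sourcing the file, `bump_version 1.0.2 minor` prints `1.1.0` without touching any `.csproj`, which makes the rules easy to check before wiring them into CI.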
444+
445+
**Git Automation for Reproducible Releases:**
446+
```bash
447+
git config user.name "github-actions[bot]"
448+
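git add {% raw %}${{ env.CSPROJ }}{% endraw %}
449+
git commit -m "chore: bump version to {% raw %}${{ steps.bump.outputs.version }}{% endraw %}"
450+
git tag -a "v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}" -m "ytx v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}"  # annotated tag: --follow-tags does not push lightweight tags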
451+
git push --follow-tags
452+
```
453+
454+
This creates a traceable release history. Every NuGet release corresponds to:
455+
1. A specific git commit (with the bumped version)
456+
2. A git tag (for easy checkout: `git checkout v1.0.3`)
457+
3. A GitHub release (with release notes)
458+
459+
This traceability is essential for troubleshooting issues and understanding what code produced which package version.
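
One way to check that mapping before publishing is to compare a tag name against the project file. The sketch below assumes the `.csproj` holds its `<Version>` element on a single line, as the workflow's own `grep` already does; `csproj_version` and `tag_matches_csproj` are hypothetical helpers, not part of the original pipeline.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the first <Version> value found in a project file.
csproj_version() {
  sed -n 's|.*<Version>\([^<]*\)</Version>.*|\1|p' "$1" | head -n 1
}

# Hypothetical check: succeed only when the tag (for example v1.0.3)
# matches the version recorded in the project file.
tag_matches_csproj() {
  local tag="$1" csproj="$2"
  [ "${tag#v}" = "$(csproj_version "$csproj")" ]
}
```

A guard such as `tag_matches_csproj "v$NEW" "$CSPROJ" || exit 1` could then stop a release whose tag and package version have drifted apart.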
460+
461+
**Optimized Build Pipeline:**
462+
Notice the careful use of build flags:
463+
```bash
464+
dotnet restore $PROJECT_DIR # Explicit restore
465+
dotnet build $PROJECT_DIR -c Release --no-restore # Skip redundant restore
466+
dotnet pack $PROJECT_DIR -c Release --no-build # Skip redundant build
467+
```
468+
469+
The `--no-restore` and `--no-build` flags stop each step from repeating work the previous step already did. The workflow also installs both the 8.x and 9.x SDKs so the project can build for each supported .NET version. Note that it builds and packs but does not run a separate test step.
470+
471+
**NuGet Publishing with Idempotency:**
472+
```bash
473+
dotnet nuget push $NUPKG_DIR/*.nupkg \
474+
--skip-duplicate
475+
```
476+
477+
The `--skip-duplicate` flag means you can safely re-run the workflow without errors if a version was already published. This is crucial for reliability—sometimes you need to retry a build due to temporary network issues or API timeouts.
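
Because `--skip-duplicate` makes the push safe to repeat, a retry wrapper becomes an option for transient failures. This is only a sketch under that assumption: `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are illustrative values, not settings from the original workflow.

```bash
#!/usr/bin/env bash

# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Illustrative usage inside the publish step (values are assumptions):
#   retry 3 10 dotnet nuget push "$NUPKG_DIR"/*.nupkg \
#     --api-key "$NUGET_API_KEY" --source "$NUGET_SOURCE" --skip-duplicate
```

Wrapping a non-idempotent command this way would be risky; it is the `--skip-duplicate` flag that makes a repeated push harmless.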
478+
479+
**Automated GitHub Releases:**
480+
```yaml
481+
- name: Create GitHub Release
482+
uses: softprops/action-gh-release@v2
483+
with:
484+
tag_name: v{% raw %}${{ steps.bump.outputs.version }}{% endraw %}
485+
generate_release_notes: true
486+
```
487+
488+
This automatically creates a GitHub release whose notes GitHub generates from the merged pull requests and contributors since the previous release. Users see a clear changelog without manual effort, and the release tag matches the NuGet package version.
489+
357490
**Installation and Usage** 📦
358491

359492
Once published to NuGet, users can install the tool globally:
@@ -386,15 +519,16 @@ The tool handles various error scenarios gracefully:
386519

387520
This makes it suitable for use in scripts and automation pipelines.
388521

389-
**Key Learnings** 💡
522+
**Key Learnings & Best Practices** 💡
390523

391-
Building this tool taught me several valuable lessons:
524+
Building this .NET global tool taught me several valuable lessons applicable to any command-line tool project:
392525

393-
1. **YoutubeExplode Evolution**: The library has improved significantly - version 6.5.4 resolved transcript extraction issues that existed in earlier versions
394-
2. **Global Tool Packaging**: The `PackageReadmeFile` and proper NuGet metadata are crucial for discoverability
395-
3. **Multi-targeting**: Supporting both .NET 8 and 9 ensures broader compatibility
396-
4. **JSON Input/Output**: Supporting both CLI args and stdin makes the tool more versatile for automation
397-
5. **Caption Prioritization**: Smart ordering logic for captions improves user experience significantly
526+
1. **YoutubeExplode Library Maturity**: Version 6.5.4 resolved transcript extraction issues that plagued earlier versions. Always verify library versions match your use case requirements.
527+
2. **.NET Global Tool Packaging**: `PackAsTool` and `ToolCommandName` are what make the package installable as a tool and define its command name, while `PackageReadmeFile` and complete NuGet metadata help with discoverability. Missing the latter makes your tool harder to find.
528+
3. **Multi-targeting Strategy**: Supporting both .NET 8 and 9 simultaneously ensures broader compatibility across development environments and CI/CD pipelines.
529+
4. **Flexible Input/Output Design**: Supporting both command-line arguments and stdin (JSON) makes your tool more versatile for automation, scripting, and pipeline integration.
530+
5. **Intelligent Caption Selection**: Smart ordering logic (English preference → auto-generated fallback) dramatically improves user experience compared to simple "first available" approaches.
531+
6. **Semantic Versioning in CI/CD**: Automating patch/minor/major version bumps reduces manual work and ensures consistency across releases.
398532

399533
**Future Enhancements** 🔮
400534

@@ -406,36 +540,41 @@ Potential improvements for future versions:
406540
- Integration with subtitle file formats (SRT, VTT)
407541
- Translation support for non-English captions
408542

409-
**Modern Development Velocity**
543+
**Development Velocity with Modern AI Tooling** ⚡
410544

411-
What strikes me most about this project is the development speed enabled by modern AI tooling. From initial concept to published NuGet package took just a few hours - a timeframe that would have been unthinkable just a few years ago.
545+
What stands out about this project is the development speed enabled by modern AI assistance. From initial concept through architecture, implementation, testing, and NuGet publication took just a few hours - something that would have required days of work just five years ago.
412546

413-
**The AI-Assisted Workflow** 🤖
547+
**The AI-Assisted Development Workflow** 🤖
414548

415-
This project showcased the power of combining multiple AI tools:
549+
This .NET global tool project showcased the power of combining multiple AI tools effectively:
416550

417-
- **Claude Code**: Handled the core architecture decisions, error handling patterns, and CI/CD pipeline setup. Particularly valuable for getting the .csproj packaging configuration right on the first try.
418-
- **GitHub Copilot**: Excelled at generating repetitive code patterns, JSON serialization boilerplate, and regex text normalization functions.
551+
- **Claude Code**: Handled core architecture decisions, .NET-specific patterns, error handling strategies, and GitHub Actions CI/CD pipeline configuration. Particularly valuable for getting the `.csproj` packaging configuration correct on the first attempt.
552+
- **GitHub Copilot**: Excelled at generating repetitive code patterns, JSON serialization boilerplate, regex text normalization functions, and test scaffold code.
419553
- **MCPs (Model Context Protocol)**: Provided seamless integration between different AI tools and development contexts, making the workflow feel natural rather than fragmented.
420554

421-
**The Human-AI Partnership** 🤝
555+
**The Human-AI Partnership in Practice** 🤝
556+
557+
The most interesting insight wasn't that AI wrote the code, but how it transformed the development process itself:
422558

423-
The most interesting aspect wasn't that AI wrote the code, but how it changed the development process itself:
559+
1. **Design-First Development**: Instead of iterating through implementation details, focus shifted to user experience and clean data flow architecture.
560+
2. **Documentation-Driven Development**: Writing this technical blog post in parallel with coding helped clarify requirements and catch edge cases early.
561+
3. **Risk-Free Exploration**: AI assistance made it easy to try different architectural approaches without the usual "sunk cost" hesitation.
424562

425-
1. **Design-First Thinking**: Instead of iterating through implementation details, I could focus on the user experience and data flow
426-
2. **Documentation-Driven Development**: Writing this blog post in parallel with coding helped clarify requirements and catch edge cases early
427-
3. **Confidence in Exploration**: Having AI assistance made it easy to try different approaches without the usual "sunk cost" feeling
563+
**What This Means for .NET Developers** 🚀
428564

429-
**Looking Forward** 🔮
565+
This project represents a new normal in software development—where the bottleneck shifts from typing code to thinking through problems and user needs. The combination of AI coding assistants, intelligent build toolchains (GitHub Actions, NuGet), and human creativity is genuinely transformative.
430566

431-
This project represents a new normal in software development - where the bottleneck shifts from typing code to thinking through problems and user needs. The combination of AI coding assistants, intelligent toolchains, and human creativity is genuinely transformative.
567+
For developers hesitant about AI tools: they're not replacing you; they're amplifying your ability to solve meaningful problems quickly. The future belongs to developers who can effectively collaborate with AI to build better software faster.
432568

433-
For developers hesitant about AI tools: they're not replacing us, they're amplifying our ability to solve meaningful problems quickly. The future belongs to developers who can effectively collaborate with AI to build better software faster.
569+
**Get Started Building Your Own .NET Global Tool** 📦
434570

435-
Ready to extract some YouTube transcripts? 🎬
571+
Ready to create your own command-line tool and publish it to NuGet? Install ytx to see a working example:
436572

437573
```powershell
438574
dotnet tool install -g solrevdev.ytx
575+
ytx "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
439576
```
440577

578+
Or [explore the source code on GitHub](https://github.com/solrevdev/solrevdev.ytx) to see the complete implementation.
579+
441580
Success! 🎉
