Commit 5e35c74

Merge pull request #13 from solrevdev/feature/blog-post-winget-search
2 parents bc74c30 + fa6a577 commit 5e35c74

2 files changed

+303
-0
lines changed
Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
---
published: true
layout: post
title: Building Winget Search - A fast web interface for Windows Package Manager
description: >-
  How I built a GitHub Pages-hosted search interface for winget packages to solve my machine setup workflow
cover_image: /images/winget-search-cover.svg
tags:
  - python
  - javascript
  - winget
  - github-actions
  - github-pages
---

## Background

When setting up a new Windows machine, I used to rely on [Scoop](https://scoop.sh/) and [Chocolatey](https://chocolatey.org/) for package management. Both are excellent tools, but when Microsoft introduced [Windows Package Manager (winget)](https://learn.microsoft.com/en-us/windows/package-manager/), I decided to give it a try on my latest machine setup.

The problem? Finding winget package IDs was tedious. While `winget search` works, I wanted something faster - a web interface where I could quickly search, find packages, and copy installation commands. That's how [winget-search](https://github.com/solrevdev/winget-search) was born.

## The Challenge

Winget packages are stored in Microsoft's [winget-pkgs repository](https://github.com/microsoft/winget-pkgs) with over 30,000 YAML manifest files. The repository structure follows a pattern: `manifests/<letter>/<publisher>/<package_name>/<version>/`, where `<letter>` is the lowercase first letter of the publisher, with separate files for different locales and installers.

I needed to:
- Extract package metadata from thousands of YAML files
- Handle multiple versions and keep only the latest
- Filter for English descriptions only
- Build a fast, searchable web interface
- Keep data fresh with automated updates

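As a rough sketch of how such a tree can be traversed (an illustration, not the code from `extract_packages.py`), the hypothetical helper `find_version_dirs` below walks a manifests root and yields every directory that directly contains YAML files. It assumes manifests only live in the leaf (version) directories.

```python
import os


def find_version_dirs(manifests_root):
    """Yield directories that directly contain .yaml manifests.

    Hypothetical helper for illustration; assumes manifests only live in
    the leaf (version) directories of the tree.
    """
    for dirpath, _dirnames, filenames in os.walk(manifests_root):
        if any(name.endswith(".yaml") for name in filenames):
            yield dirpath
```

Each yielded path could then be handed to a per-directory extraction function, for example `for d in find_version_dirs("winget-pkgs/manifests"): ...`.
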
## The Solution

### Architecture Overview

I built a three-part system:

1. **Python extraction script** - Parses YAML manifests and generates JSON
2. **Static HTML search interface** - Client-side search with instant results
3. **GitHub Actions automation** - Daily updates and deployment

### Data Extraction

The Python script (`extract_packages.py`) does the heavy lifting:

```python
import os

import yaml  # PyYAML


def extract_package_info(manifest_dir):
    """Extract comprehensive package info from a manifest directory"""
    package_info = {
        "id": None,
        "name": None,
        "description": None,
        "publisher": None,
        "version": None,
        "tags": [],
        "homepage": None,
        "license": None
    }

    # Assumed setup (not shown in the original excerpt): list the YAML manifests here
    yaml_files = sorted(f for f in os.listdir(manifest_dir) if f.endswith(".yaml"))

    # Process version manifest first
    version_file = next((f for f in yaml_files if '.locale.' not in f), None)
    if version_file:
        with open(os.path.join(manifest_dir, version_file), encoding="utf-8") as f:
            doc = yaml.safe_load(f)
        package_info["id"] = doc.get("PackageIdentifier")
        package_info["version"] = doc.get("PackageVersion")

    # Look for English locale file
    locale_file = next((f for f in yaml_files if '.locale.en-US.' in f), None)
    if locale_file:
        with open(os.path.join(manifest_dir, locale_file), encoding="utf-8") as f:
            doc = yaml.safe_load(f)
        package_info["name"] = doc.get("PackageName")
        package_info["publisher"] = doc.get("Publisher")
        package_info["description"] = doc.get("Description")

    return package_info
```

The script handles version comparison using the `packaging` library to ensure we only keep the latest version of each package:

```python
from packaging import version


def parse_version(ver_str):
    """Parse version string for proper comparison"""
    try:
        # str() guards against YAML loading a version such as 1.0 as a float
        return version.parse(str(ver_str))
    except version.InvalidVersion:
        return version.parse("0.0.0")  # Fallback for non-standard versions
```

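To show how a version parser like this might be used to keep only the newest release, here is a minimal sketch. The `latest_per_package` helper and the shape of its input (dicts with `id` and `version` keys) are assumptions for illustration, not the script's actual code. To keep the example free of third-party imports it uses `numeric_key`, a simplified stand-in that only understands dotted integers; in the real script `parse_version` would be passed as `version_key` instead.

```python
def numeric_key(ver_str):
    """Stand-in key: '1.10.0' -> (1, 10, 0); anything non-numeric sorts lowest."""
    try:
        return tuple(int(part) for part in str(ver_str).split("."))
    except ValueError:
        return (0,)


def latest_per_package(packages, version_key):
    """Keep one entry per package id, preferring the highest version."""
    latest = {}
    for pkg in packages:
        current = latest.get(pkg["id"])
        if current is None or version_key(pkg["version"]) > version_key(current["version"]):
            latest[pkg["id"]] = pkg
    return list(latest.values())


# Made-up sample data, not real winget packages
found = [
    {"id": "Example.App", "version": "1.9.0"},
    {"id": "Example.App", "version": "1.10.0"},
    {"id": "Example.Tool", "version": "not-a-version"},
]
print(latest_per_package(found, numeric_key))
```

Comparing parsed versions rather than raw strings matters here: as plain strings, `"1.9.0"` would sort above `"1.10.0"`.
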
### Web Interface

The search interface is pure vanilla JavaScript - no frameworks needed. It loads a ~5MB JSON file containing all package data and performs client-side search for instant results:

```javascript
function showResults(query) {
  const q = query.trim().toLowerCase();

  let results;
  if (!q) {
    results = packages.slice(0, 50);
  } else {
    results = packages.filter(pkg => {
      const idMatch = pkg.id?.toLowerCase().includes(q);
      const nameMatch = pkg.name?.toLowerCase().includes(q);
      const descMatch = pkg.description?.toLowerCase().includes(q);
      const publisherMatch = pkg.publisher?.toLowerCase().includes(q);
      const tagMatch = pkg.tags?.some(tag =>
        tag && typeof tag === 'string' && tag.toLowerCase().includes(q)
      );

      return idMatch || nameMatch || descMatch || publisherMatch || tagMatch;
    }).slice(0, 100);
  }

  // Render results...
}
```

Each search result includes a one-click copy button for the winget install command:

```javascript
function copyCommand(button, cmd) {
  navigator.clipboard.writeText(cmd).then(() => {
    const originalHtml = button.innerHTML;
    button.classList.add('success');
    button.innerHTML = `
      <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
        <polyline points="20 6 9 17 4 12"/>
      </svg>
      Copied!
    `;

    setTimeout(() => {
      button.classList.remove('success');
      button.innerHTML = originalHtml;
    }, 2000);
  });
}
```

### Automation with GitHub Actions

The magic happens in the GitHub Actions workflow that runs daily at 2 AM UTC:

```yaml
name: Build and Deploy
on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC
  workflow_dispatch:

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache winget-pkgs
        uses: actions/cache@v4
        with:
          path: winget-pkgs
          key: winget-pkgs-${{ github.run_id }}
          restore-keys: winget-pkgs-

      - name: Clone winget-pkgs repository
        run: |
          if [ ! -d "winget-pkgs" ]; then
            git clone --depth 1 https://github.com/microsoft/winget-pkgs.git
          else
            # Update the branch checked out by the clone
            cd winget-pkgs && git pull
          fi

      - name: Extract packages
        run: python extract_packages.py winget-pkgs/manifests/ packages.json

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: .
```

## Technical Highlights

### Performance Optimizations

- **Client-side search**: No server required, instant results
- **Debounced input**: 300ms delay prevents excessive filtering
- **Result limiting**: Shows max 100 results for smooth scrolling
- **Caching**: GitHub Actions caches the large winget-pkgs repository

### User Experience Features

- **Keyboard shortcuts**: Press `/` to focus search, `Esc` to clear
- **Dark mode**: Automatic theme based on system preferences
- **Mobile-friendly**: Responsive design works on all devices
- **Copy feedback**: Visual confirmation when commands are copied

### Data Quality

- **English-only**: Only `.locale.en-US.yaml` files are read for names and descriptions
- **Latest versions**: Version-aware comparison via `packaging` keeps only the newest release of each package
- **Type safety**: Handles edge cases like non-string tags
- **Error resilience**: Continues processing even if individual manifests fail

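The last two data-quality points can be sketched roughly as follows. Both helpers are hypothetical (the names `clean_tags` and `extract_all` do not come from the project), and `extract` stands for any per-directory function such as `extract_package_info`.

```python
def clean_tags(raw_tags):
    """Return only non-empty string tags; YAML may yield None, numbers or other values."""
    if not isinstance(raw_tags, list):
        return []
    return [tag.strip() for tag in raw_tags if isinstance(tag, str) and tag.strip()]


def extract_all(manifest_dirs, extract):
    """Run `extract` on every directory, collecting failures instead of stopping."""
    packages, failures = [], []
    for manifest_dir in manifest_dirs:
        try:
            packages.append(extract(manifest_dir))
        except Exception as exc:  # broad on purpose: one bad manifest must not stop the run
            failures.append((manifest_dir, exc))
    return packages, failures
```

Returning the failures alongside the results, rather than silently dropping them, makes it possible to log how many manifests were skipped on each run.
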
## Deployment

The entire deployment is automated through GitHub Pages. The workflow:

1. Clones the microsoft/winget-pkgs repository (1GB+, hence the caching)
2. Extracts package data using the Python script
3. Generates a `packages.json` file with ~30,000 packages
4. Deploys everything to GitHub Pages using `peaceiris/actions-gh-pages`

No manual intervention needed - the scheduled run (or a manual `workflow_dispatch` trigger) handles the rest!

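Since the browser downloads the whole `packages.json` file, step 3 benefits from compact output. The following is a minimal sketch of how the file could be written; `write_packages` is a hypothetical helper, and the actual script may serialize its data differently.

```python
import json


def write_packages(packages, output_path):
    """Write the package list as compact UTF-8 JSON."""
    with open(output_path, "w", encoding="utf-8") as f:
        # No indentation and no padding after separators keeps the download small;
        # ensure_ascii=False stores non-ASCII text directly instead of \u escapes
        json.dump(packages, f, ensure_ascii=False, separators=(",", ":"))
```

A call such as `write_packages(packages, "packages.json")` would match the output path passed to the script in the workflow.
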
## Results

The end result is a fast, searchable interface hosted on GitHub Pages that:

- Loads 30,000+ packages in under 3 seconds
- Provides instant search results
- Generates copy-ready `winget install` commands
- Updates automatically every day
- Costs nothing to host

Perfect for when you need to quickly find that package ID for your setup scripts!

## Future Improvements

Some ideas I'm considering:

- **Package categories** - Group by software type
- **Fuzzy search** - Better matching for typos
- **PowerShell commands** - Alternative to cmd syntax
- **Package details modal** - Show more metadata
- **Search history** - Remember recent searches

## Live Demo

Check out the live site: [https://solrevdev.github.io/winget-search/](https://solrevdev.github.io/winget-search/)

The source code is available on [GitHub](https://github.com/solrevdev/winget-search) under the MIT license.

Success 🎉

images/winget-search-cover.svg

Lines changed: 52 additions & 0 deletions