Restructure benchmarks skill and rename to kaggle-benchmarks by nicholaskang-us · Pull Request #1012 · Kaggle/kaggle-cli

nicholaskang-us · 2026-05-15T16:10:53Z

This PR restructures the benchmarks skill to adhere to the Agent Skills standard and completes the truncated content.

What problem are we solving

The benchmarks skill was previously nested under references, which is not an auto-activatable location for Agent Skills-compliant agents (added a new folder)
The skill file was not properly formatted with the skills table (description/version) at the top
There were no references pointing agents to the official SDK repository or DeepWiki documentation

What changes are proposed

Renamed and relocated the skill directory to to match the folder-name standard for auto-activation
Updated the YAML frontmatter with proper name (), rich description with keywords, and metadata fields
Added links to official SDK resources so agents have a direct reference for writing benchmark task files
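The folder-name and frontmatter requirements above can be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes a skill lives at `<folder>/SKILL.md`, that the frontmatter is a leading `---` block of `key: value` lines, and that `name` must equal the folder name and use only lowercase letters, digits, and hyphens. The function names `read_frontmatter` and `check_skill` are hypothetical, and the parser is deliberately simpler than a real YAML loader.

```python
import re
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict:
    """Parse top-level `key: value` pairs from a leading `---` block.

    Deliberately minimal: nested mappings (such as `metadata`) are recorded
    with an empty string value rather than parsed.
    """
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1] in (" ", "\t", "#") or ":" not in line:
            continue  # skip nested keys, comments, and continuation lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("'\"")
    return {}  # no closing delimiter, so treat the frontmatter as missing


def check_skill(skill_dir: Path) -> list:
    """Return a list of problems found in skill_dir/SKILL.md (empty if none)."""
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        return ["missing SKILL.md"]
    fields = read_frontmatter(skill_md)
    problems = []
    if not fields:
        problems.append("missing or unterminated frontmatter")
    name = fields.get("name", "")
    if name != skill_dir.name:
        problems.append(f"name {name!r} does not match folder {skill_dir.name!r}")
    # Assumption: names are lowercase letters, digits, and single hyphens.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        problems.append("name is not lowercase letters, digits, and hyphens")
    if not fields.get("description"):
        problems.append("description is empty")
    return problems
```

Calling `check_skill(Path("skills/kaggle-benchmarks"))` from the repository root would return an empty list if the skill passes these assumed checks, or a list of human-readable problems otherwise.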

dolaameng · 2026-05-15T16:49:18Z

@nicholaskang-us I think @rosbo and @stevemessick have a plan to organize the skills. Let's consult them first.

stevemessick · 2026-05-15T17:52:51Z

+
+```
+kaggle benchmarks (alias: kaggle b)
+├── auth              — Fetch Model Proxy credentials


How well does this play with kaggle auth login? (Do we really need both?)

kaggle auth login authenticates you with Kaggle.

Then, using your Kaggle credentials, kaggle benchmarks auth fetches a model proxy token. You must be authenticated before you can fetch a model proxy token.

I'm assuming this means the current behavior is fine and we don't need to make any changes?
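The ordering described in this thread (log in first, then fetch the Model Proxy token) can be sketched as a small wrapper. This is an illustration only: the two commands are the ones quoted above with no extra flags, while the helper names `run_cli` and `authenticate`, and the assumption that a non-zero exit code signals failure, are not taken from the PR.

```python
import subprocess
from typing import Callable, Sequence

# Both commands are quoted from the review thread; no flags are assumed.
LOGIN_CMD = ("kaggle", "auth", "login")
PROXY_TOKEN_CMD = ("kaggle", "benchmarks", "auth")


def run_cli(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def authenticate(run: Callable[[Sequence[str]], int] = run_cli) -> bool:
    """Log in first, then fetch the Model Proxy token; stop at the first failure."""
    for cmd in (LOGIN_CMD, PROXY_TOKEN_CMD):
        if run(cmd) != 0:  # assumption: a non-zero exit code means failure
            return False
    return True
```

Passing a custom `run` callable makes the sequence testable without the CLI installed; with the default, `authenticate()` shells out to the real commands in order.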

rosbo · 2026-05-19T16:51:22Z

@@ -1,400 +0,0 @@
-# Kaggle Benchmarks CLI Reference


This file is referenced from the main SKILL: https://github.com/Kaggle/kaggle-cli/blob/main/skills/SKILL.md.

This will cause a broken link.

The idea is to have one skill with references to the different resources (e.g. kernels, models, datasets, benchmarks, etc).
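A broken link of the kind raised here can be caught before merging by scanning the main skill file for relative links whose targets no longer exist. The sketch below is a hedged illustration, not tooling from the repository: it assumes the main SKILL.md uses inline Markdown links with paths relative to its own folder, and `broken_relative_links` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target). Images and reference-style links are ignored.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def broken_relative_links(markdown_file: Path) -> list:
    """Return relative link targets in markdown_file that do not exist on disk."""
    text = markdown_file.read_text(encoding="utf-8")
    missing = []
    for target in LINK_RE.findall(text):
        if re.match(r"[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("#"):
            continue  # skip absolute URLs (http:, mailto:, ...) and in-page anchors
        path_part = target.split("#", 1)[0]
        if not (markdown_file.parent / path_part).exists():
            missing.append(target)
    return missing
```

Running `broken_relative_links(Path("skills/SKILL.md"))` after a file has been moved or deleted would list any relative targets that the main skill still points at.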

To clarify, you mean that we expect users to download ONLY the kaggle-cli skill? They would then say they want to write a benchmark task, which would invoke the main skill?

The current thinking follows the playwright-cli structure, where a SKILL.md explains cross-cutting CLI concerns (like auth/authz) and then links to reference files stored in the references/ folder, one per product: one explaining how to use benchmarks, one for kernels, etc.

Following this structure: https://agentskills.io/home

But maybe it is best to have a separate skill per concrete task (like creating a benchmark)... We can experiment and see how the agent performs.

I would suggest putting your skill, which focuses on creating a benchmark task e2e, in the kaggle-skills repo, given that it is cross-cutting (kaggle-cli & kaggle-benchmarks).

Restructure benchmarks skill and rename to kaggle-benchmarks

8e8fdbb

nicholaskang-us mentioned this pull request May 15, 2026

Update and rename benchmarks.md to SKILL.md #1011

Closed

nicholaskang-us self-assigned this May 15, 2026

nicholaskang-us requested review from andrewmwang and dolaameng May 15, 2026 16:12

dolaameng requested review from rosbo and stevemessick May 15, 2026 16:49

stevemessick approved these changes May 15, 2026

rosbo requested changes May 19, 2026

Removed python version installation

1919154

nicholaskang-us commented May 21, 2026

Comment thread skills/kaggle-benchmarks/SKILL.md Outdated

nicholaskang-us commented May 22, 2026

added intermediate step to run file locally to validate

9b0f739

nicholaskang-us wants to merge 3 commits into Kaggle:main from nicholaskang-us:restructure-benchmarks-skill

4 participants
