From 45d5717d7b6f17ddaa294c54747c59488879c3ba Mon Sep 17 00:00:00 2001
From: are-ces <195810094+are-ces@users.noreply.github.com>
Date: Wed, 20 May 2026 13:37:53 +0200
Subject: [PATCH 1/2] LCORE-2077: Document Agent Skills feature

Add user-facing documentation for the Agent Skills feature:
- Skills guide with configuration, SKILL.md authoring, and runtime behavior
- Example skills (openshift-troubleshooting with references, code-review minimal)
- Example lightspeed-stack-skills.yaml config file
- README updates: Agentic Capabilities table and Agent Skills config section

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 README.md                                     |  34 +++
 docs/skills_guide.md                          | 250 ++++++++++++++++++
 examples/lightspeed-stack-skills.yaml         |  32 +++
 examples/skills/code-review/SKILL.md          |  54 ++++
 .../skills/openshift-troubleshooting/SKILL.md |  69 +++++
 .../references/common-errors.md               |  59 +++++
 6 files changed, 498 insertions(+)
 create mode 100644 docs/skills_guide.md
 create mode 100644 examples/lightspeed-stack-skills.yaml
 create mode 100644 examples/skills/code-review/SKILL.md
 create mode 100644 examples/skills/openshift-troubleshooting/SKILL.md
 create mode 100644 examples/skills/openshift-troubleshooting/references/common-errors.md

diff --git a/README.md b/README.md
index ecf6aa7da..f176ae6c2 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,7 @@ The service includes comprehensive user data collection capabilities for various
 * [Installation](#installation)
 * [Run LCS locally](#run-lcs-locally)
 * [Configuration](#configuration)
+ * [Agentic Capabilities](#agentic-capabilities)
 * [LLM Compatibility](#llm-compatibility)
 * [Set LLM provider and model](#set-llm-provider-and-model)
 * [Selecting provider and model](#selecting-provider-and-model)
@@ -50,6 +51,7 @@ The service includes comprehensive user data collection capabilities for various
 * [System Prompt Literal](#system-prompt-literal)
 * [Custom Profile](#custom-profile)
 * [Control model/provider overrides via
authorization](#control-modelprovider-overrides-via-authorization)
+ * [Agent Skills](#agent-skills)
 * [Safety Shields](#safety-shields)
 * [Authentication](#authentication)
 * [CORS](#cors)
@@ -196,6 +198,19 @@ To quickly get hands on LCS, we can run it using the default configurations prov

 # Configuration

+## Agentic Capabilities
+
+Lightspeed Core Stack supports the following agentic features:
+
+| Capability | Status | Description |
+|------------|--------|-------------|
+| MCP Tools | Supported | External tool integration via [Model Context Protocol](https://modelcontextprotocol.io) servers |
+| RAG | Supported | Retrieval-Augmented Generation with vector stores ([RAG Guide](docs/rag_guide.md)) |
+| A2A Protocol (Client) | Supported | Agent-to-Agent communication as client ([A2A Protocol](docs/a2a_protocol.md)) |
+| Conversation History | Supported | Persistent conversation context across requests |
+| Human-in-the-Loop | Upcoming (Q2) | Interactive approval or confirmation steps |
+| Agent Skills | Upcoming (Q2) | Domain-specific instructions loaded on demand ([Agent Skills Guide](docs/skills_guide.md)) |
+
 ## LLM Compatibility

 Lightspeed Core Stack (LCS) provides support for Large Language Model providers. The models listed in the table below represent specific examples that have been tested within LCS.
@@ -712,6 +727,25 @@ customization:

 By default, clients may specify `model` and `provider` in `/v1/query` and `/v1/streaming_query`. Override is permitted only to callers granted the `MODEL_OVERRIDE` action via the authorization rules. Requests that include `model` or `provider` without this permission are rejected with HTTP 403.

+## Agent Skills
+
+Agent Skills allow product teams to extend Lightspeed Core with specialized instructions and domain knowledge that the LLM can load on demand. Skills follow the [Agent Skills open standard](https://agentskills.io) and are packaged as portable directories containing a `SKILL.md` file.
+
+Skills are configured by specifying paths to skill directories in `lightspeed-stack.yaml`:
+
+```yaml
+skills:
+  paths:
+    - "/var/skills/"  # Directory containing skill subdirectories
+```
+
+Each skill directory must contain a `SKILL.md` file with YAML frontmatter (`name` and `description`) followed by Markdown instructions. The LLM discovers available skills via tool calls and loads instructions on demand.
+
+> [!NOTE]
+> Skills are configured by product teams at deployment time. End users do not have the ability to add skills.
+
+For the full configuration guide, skill authoring instructions, and examples, see the [Agent Skills Guide](docs/skills_guide.md).
+
 ## Safety Shields

 A single Llama Stack configuration file can include multiple safety shields, which are utilized in agent
diff --git a/docs/skills_guide.md b/docs/skills_guide.md
new file mode 100644
index 000000000..e6801a8e9
--- /dev/null
+++ b/docs/skills_guide.md
@@ -0,0 +1,250 @@
+# Agent Skills Guide
+
+This guide covers how to configure Agent Skills in Lightspeed Core Stack and how to author your own skills.
+
+---
+
+- [Introduction](#introduction)
+- [Configuration](#configuration)
+  - [Option A: Directory of Skills](#option-a-directory-of-skills)
+  - [Option B: Individual Skill Paths](#option-b-individual-skill-paths)
+- [Skill Directory Structure](#skill-directory-structure)
+- [SKILL.md Format](#skillmd-format)
+  - [Frontmatter Fields](#frontmatter-fields)
+  - [Body Content](#body-content)
+- [Creating a Skill](#creating-a-skill)
+- [How Skills Work at Runtime](#how-skills-work-at-runtime)
+- [Limitations](#limitations)
+- [Error Reference](#error-reference)
+- [References](#references)
+
+---
+
+# Introduction
+
+Agent Skills allow product teams (e.g., RHEL Lightspeed, Ansible Lightspeed) to extend Lightspeed Core with specialized instructions and domain knowledge. Skills are packaged as portable directories following the [Agent Skills open standard](https://agentskills.io).
+
+A skill is a `SKILL.md` file containing metadata and instructions that the LLM can load on demand. For example, a troubleshooting skill might contain step-by-step diagnostic procedures for a specific product, while a code review skill might contain a checklist and best practices.
+
+> [!IMPORTANT]
+> Skills are configured by **product teams at deployment time**. End users of LS app products do not have the ability to add skills, similar to how they cannot add MCP servers.
+
+# Configuration
+
+Skills are configured in `lightspeed-stack.yaml` by specifying paths to skill directories. Two forms are supported.
+
+## Option A: Directory of Skills
+
+Point to a parent directory containing skill subdirectories. Each subdirectory with a `SKILL.md` file is loaded as a skill.
+
+```yaml
+skills:
+  paths:
+    - "/var/skills/"
+```
+
+This loads all skills found under `/var/skills/`:
+
+```
+/var/skills/
+├── openshift-troubleshooting/
+│   ├── SKILL.md
+│   └── references/
+│       └── common-errors.md
+├── code-review/
+│   └── SKILL.md
+└── ansible-playbooks/
+    ├── SKILL.md
+    └── references/
+        └── module-reference.md
+```
+
+## Option B: Individual Skill Paths
+
+Point directly to specific skill directories for fine-grained control over which skills are loaded.
+
+```yaml
+skills:
+  paths:
+    - "/var/skills/openshift-troubleshooting/"
+    - "/var/skills/code-review/"
+```
+
+> [!TIP]
+> Option A is recommended for most deployments. Use Option B when you need to selectively include specific skills from a larger collection.
+
+See [examples/lightspeed-stack-skills.yaml](../examples/lightspeed-stack-skills.yaml) for a complete configuration example.
+
+# Skill Directory Structure
+
+Each skill is a directory containing, at minimum, a `SKILL.md` file:
+
+```
+skill-name/
+├── SKILL.md           # Required: metadata + instructions
+└── references/        # Optional: additional documentation
+    ├── guide.md
+    └── troubleshooting.md
+```
+
+- **`SKILL.md`** (required): Contains YAML frontmatter with metadata and Markdown body with instructions.
+- **`references/`** (optional): Contains additional documentation files that the LLM can load on demand when the skill instructions reference them.
+
+> [!NOTE]
+> Script execution (`scripts/` subdirectory) is not supported. Only `SKILL.md` and `references/` files are used at runtime.
+
+# SKILL.md Format
+
+The `SKILL.md` file must contain YAML frontmatter (between `---` delimiters) followed by Markdown content.
+
+## Frontmatter Fields
+
+| Field           | Required | Description |
+|-----------------|----------|-------------|
+| `name`          | Yes      | Skill identifier. Max 64 characters. Lowercase letters, numbers, and hyphens only. Must match the parent directory name. |
+| `description`   | Yes      | What the skill does and when to use it. Max 1024 characters. |
+
+### `name` rules
+
+- 1-64 characters
+- Lowercase letters (`a-z`), numbers (`0-9`), and hyphens (`-`) only
+- Must not start or end with a hyphen
+- Must not contain consecutive hyphens (`--`)
+- Must match the parent directory name
+
+**Valid names**: `openshift-troubleshooting`, `code-review`, `data-analysis`
+
+**Invalid names**: `OpenShift-Troubleshooting` (uppercase), `-code-review` (starts with hyphen), `code--review` (consecutive hyphens)
+
+### `description` guidance
+
+The description should include both **what** the skill does and **when** to use it. Include specific keywords that help the LLM identify relevant tasks.
+
+```yaml
+# Good: specific about what and when
+description: Diagnose and fix common OpenShift deployment issues including pod failures, networking problems, and resource constraints.
Use when users report deployment failures or application issues on OpenShift.
+
+# Poor: too vague
+description: Helps with OpenShift.
+```
+
+## Body Content
+
+The Markdown body after the frontmatter contains the skill instructions. There are no format restrictions. Write whatever helps the LLM perform the task effectively.
+
+Recommended sections:
+- Step-by-step instructions
+- Examples of inputs and outputs
+- Common edge cases and how to handle them
+
+> [!TIP]
+> Keep `SKILL.md` under 500 lines. Move detailed reference material to files in the `references/` subdirectory and reference them from the main instructions.
+
+# Creating a Skill
+
+Follow these steps to create a new skill:
+
+**1. Create the skill directory**
+
+The directory name must match the `name` field in `SKILL.md`.
+
+```bash
+mkdir -p /var/skills/my-skill
+```
+
+**2. Create the `SKILL.md` file**
+
+```markdown
+---
+name: my-skill
+description: A brief description of what this skill does and when to use it.
+---
+
+# My Skill
+
+## When to use this skill
+
+Use this skill when:
+- Condition A applies
+- The user asks about topic B
+
+## Instructions
+
+1. First, do X
+2. Then check Y
+3. If Z occurs, see [the reference guide](references/guide.md)
+```
+
+**3. (Optional) Add reference files**
+
+```bash
+mkdir -p /var/skills/my-skill/references
+```
+
+Add documentation files that the skill instructions reference:
+
+```markdown
+# references/guide.md
+
+Detailed reference content goes here...
+```
+
+**4. Add the skill path to configuration**
+
+Add the path to your `lightspeed-stack.yaml`:
+
+```yaml
+skills:
+  paths:
+    - "/var/skills/"  # If using a directory of skills
+```
+
+**5. Restart the service**
+
+Skills are loaded at startup. Restart Lightspeed Core Stack to pick up new or modified skills.
+
+See [examples/skills/](../examples/skills/) for complete working examples.
+
+# How Skills Work at Runtime
+
+Skills use a progressive disclosure pattern with three LLM tools:
+
+1.
**`list_skills`** — The LLM calls this to discover available skills. Returns the name and description of each skill.
+2. **`activate_skill`** — When a task matches a skill's description, the LLM calls this to load the full instructions from `SKILL.md`.
+3. **`load_skill_resource`** — If the skill instructions reference files in `references/`, the LLM calls this to load them on demand.
+
+```
+User question
+    │
+    ▼
+LLM calls list_skills → sees skill catalog (name + description)
+    │
+    ▼ (if task matches a skill)
+LLM calls activate_skill → loads full SKILL.md instructions
+    │
+    ▼ (if instructions reference a file)
+LLM calls load_skill_resource → loads file from references/
+    │
+    ▼
+LLM follows skill instructions to answer
+```
+
+The system prompt contains behavioral instructions telling the LLM how to use these tools. When no skills are configured, the tools and instructions are omitted entirely.
+
+> [!NOTE]
+> Skills are tracked per conversation. If a skill is already loaded in a conversation, re-activating it returns a note instead of re-injecting the content.
+
+# Limitations
+
+- **No script execution**: The `scripts/` subdirectory from the agentskills.io spec is not supported. Skills provide instructions only; they do not execute code.
+- **Read-only**: Skills are loaded from the filesystem at startup and are read-only at runtime.
+- **No remote loading**: Skills must be present on the local filesystem. Loading from URLs or registries is not supported.
+- **Duplicate names**: Skill names must be unique across all configured paths. Duplicate names cause a startup error.
+
+# References
+
+- [Agent Skills Specification](https://agentskills.io/specification) — the open standard for skill format
+- [Agent Skills Implementation Guide](https://agentskills.io/client-implementation/adding-skills-support) — client implementation guidance
+- [Feature Design Document](design/agent-skills/agent-skills.md) — internal design spec for the Lightspeed Core implementation
+- [Example Skills](../examples/skills/) — working example skills
+- [Example Configuration](../examples/lightspeed-stack-skills.yaml) — example `lightspeed-stack.yaml` with skills configured
diff --git a/examples/lightspeed-stack-skills.yaml b/examples/lightspeed-stack-skills.yaml
new file mode 100644
index 000000000..ae0b905af
--- /dev/null
+++ b/examples/lightspeed-stack-skills.yaml
@@ -0,0 +1,32 @@
+name: Lightspeed Core Service (LCS)
+service:
+  host: localhost
+  port: 8080
+  auth_enabled: false
+  workers: 1
+  color_log: true
+  access_log: true
+llama_stack:
+  use_as_library_client: false
+  url: http://localhost:8321
+  api_key: xyzzy
+user_data_collection:
+  feedback_enabled: true
+  feedback_storage: "/tmp/data/feedback"
+  transcripts_enabled: true
+  transcripts_storage: "/tmp/data/transcripts"
+authentication:
+  module: "noop"
+# Agent Skills configuration
+# Skills provide domain-specific instructions that the LLM can load on demand.
+# Each path points to either:
+#   - A directory containing a SKILL.md file (single skill)
+#   - A directory containing subdirectories with SKILL.md files (multiple skills)
+skills:
+  paths:
+    - "/var/skills/"  # Directory containing skill subdirectories
+# Alternative: specify individual skill paths for fine-grained control
+# skills:
+#   paths:
+#     - "/var/skills/openshift-troubleshooting/"
+#     - "/var/skills/code-review/"
diff --git a/examples/skills/code-review/SKILL.md b/examples/skills/code-review/SKILL.md
new file mode 100644
index 000000000..0a7955c7a
--- /dev/null
+++ b/examples/skills/code-review/SKILL.md
@@ -0,0 +1,54 @@
+---
+name: code-review
+description: Review code changes for quality, security, and maintainability. Use when a user asks for a code review, wants feedback on their code, or asks about best practices for a code change.
+---
+
+# Code Review
+
+## When to use this skill
+
+Use this skill when:
+- A user asks you to review code or a diff
+- A user wants feedback on code quality
+- A user asks about best practices for a specific change
+
+## Review checklist
+
+### Correctness
+- Does the code do what it claims to do?
+- Are edge cases handled (empty inputs, null values, boundary conditions)?
+- Are error conditions handled appropriately?
+
+### Security
+- Is user input validated and sanitized?
+- Are secrets hardcoded or properly managed via environment variables?
+- Are SQL queries parameterized (no string concatenation)?
+- Are file paths validated to prevent directory traversal?
+
+### Maintainability
+- Are variable and function names descriptive?
+- Is the code structured for readability (appropriate function length, single responsibility)?
+- Are there comments explaining non-obvious logic?
+- Is there unnecessary complexity that could be simplified?
+
+### Performance
+- Are there obvious performance issues (N+1 queries, unnecessary loops, missing indexes)?
+- Are large data sets handled efficiently (pagination, streaming)?
+- Are expensive operations cached where appropriate?
+
+### Testing
+- Are there tests for new functionality?
+- Do tests cover edge cases and error conditions?
+- Are tests readable and well-structured?
+
+## Review format
+
+Structure your review as follows:
+
+1. **Summary**: One sentence describing the overall change
+2. **Strengths**: What the code does well (be specific)
+3. **Issues**: Problems that should be fixed, ordered by severity
+   - **Critical**: Bugs, security issues, data loss risks
+   - **Major**: Logic errors, missing error handling
+   - **Minor**: Style, naming, documentation
+4. **Suggestions**: Optional improvements that are not blocking
diff --git a/examples/skills/openshift-troubleshooting/SKILL.md b/examples/skills/openshift-troubleshooting/SKILL.md
new file mode 100644
index 000000000..997ef8901
--- /dev/null
+++ b/examples/skills/openshift-troubleshooting/SKILL.md
@@ -0,0 +1,69 @@
+---
+name: openshift-troubleshooting
+description: Diagnose and fix common OpenShift deployment issues including pod failures, networking problems, and resource constraints. Use when users report deployment failures or application issues on OpenShift.
+---
+
+# OpenShift Troubleshooting
+
+## When to use this skill
+
+Use this skill when:
+- A user reports pods not starting or crashing
+- Deployments are stuck in pending state
+- Services are unreachable
+- Resource quota issues are suspected
+
+## Diagnostic steps
+
+### 1. Check pod status
+
+First, identify the problematic pods:
+
+```
+oc get pods -n | grep -v Running
+```
+
+For each failing pod, get detailed status:
+
+```
+oc describe pod -n
+```
+
+Look for:
+- **Pending**: Usually resource constraints or scheduling issues
+- **CrashLoopBackOff**: Application crash, check logs
+- **ImagePullBackOff**: Image registry access issues
+- **ErrImagePull**: Image not found or auth failure
+
+### 2.
Check events
+
+```
+oc get events -n --sort-by='.lastTimestamp'
+```
+
+Events reveal scheduling failures, resource limits, and pull errors.
+
+### 3. Check logs
+
+```
+oc logs -n
+oc logs -n --previous  # For crashed pods
+```
+
+### 4. Check resource constraints
+
+```
+oc describe resourcequota -n
+oc describe limitrange -n
+```
+
+## Common issues and solutions
+
+See [references/common-errors.md](references/common-errors.md) for detailed solutions to frequently encountered errors.
+
+## Escalation
+
+If the issue cannot be resolved with the steps above:
+1. Collect the output from all diagnostic commands
+2. Check if the issue is cluster-wide or namespace-specific
+3. Review recent changes to deployments or cluster configuration
diff --git a/examples/skills/openshift-troubleshooting/references/common-errors.md b/examples/skills/openshift-troubleshooting/references/common-errors.md
new file mode 100644
index 000000000..75342f200
--- /dev/null
+++ b/examples/skills/openshift-troubleshooting/references/common-errors.md
@@ -0,0 +1,59 @@
+# Common OpenShift Errors
+
+## ImagePullBackOff
+
+**Symptoms**: Pod stuck in `ImagePullBackOff` or `ErrImagePull` status.
+
+**Causes**:
+- Image does not exist in the registry
+- Authentication failure (missing or expired pull secret)
+- Network connectivity issues to the registry
+
+**Resolution**:
+1. Verify the image name and tag: `oc get pod -o jsonpath='{.spec.containers[*].image}'`
+2. Check pull secrets: `oc get secrets -n | grep pull`
+3. Test registry access: `oc debug node/ -- chroot /host podman pull `
+
+## CrashLoopBackOff
+
+**Symptoms**: Pod repeatedly starts and crashes.
+
+**Causes**:
+- Application error on startup
+- Missing configuration (environment variables, config maps)
+- Insufficient memory causing OOM kills
+
+**Resolution**:
+1. Check logs: `oc logs --previous`
+2. Check for OOM: `oc get pod -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'`
+3.
Verify config: `oc get configmap -n ` and `oc get secret -n `
+
+## Pending Pods
+
+**Symptoms**: Pod stays in `Pending` state.
+
+**Causes**:
+- Insufficient cluster resources (CPU, memory)
+- Node selector or affinity rules that cannot be satisfied
+- PersistentVolumeClaim not bound
+
+**Resolution**:
+1. Check events: `oc describe pod | grep -A5 Events`
+2. Check node resources: `oc adm top nodes`
+3. Check PVC status: `oc get pvc -n `
+
+## Service Unreachable
+
+**Symptoms**: Cannot connect to a service from within or outside the cluster.
+
+**Causes**:
+- Service selector does not match pod labels
+- Pod is not in `Running` state
+- NetworkPolicy blocking traffic
+- Route not configured (for external access)
+
+**Resolution**:
+1. Verify service selector: `oc get svc -o jsonpath='{.spec.selector}'`
+2. Check matching pods: `oc get pods -l `
+3. Check network policies: `oc get networkpolicy -n `
+4. For external access, check routes: `oc get route -n `

From 6516e8857fe09e420eed16c5d844b6c291cf7fe8 Mon Sep 17 00:00:00 2001
From: are-ces <195810094+are-ces@users.noreply.github.com>
Date: Wed, 20 May 2026 13:55:35 +0200
Subject: [PATCH 2/2] LCORE-2077: Clarify Agent Skills is an upcoming feature

Add upcoming notices to README and skills guide since the feature is not yet implemented.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 README.md            | 20 +++++---------------
 docs/skills_guide.md |  4 +++-
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/README.md b/README.md
index f176ae6c2..3f4babfbe 100644
--- a/README.md
+++ b/README.md
@@ -727,24 +727,14 @@ customization:

 By default, clients may specify `model` and `provider` in `/v1/query` and `/v1/streaming_query`. Override is permitted only to callers granted the `MODEL_OVERRIDE` action via the authorization rules. Requests that include `model` or `provider` without this permission are rejected with HTTP 403.
-## Agent Skills
-
-Agent Skills allow product teams to extend Lightspeed Core with specialized instructions and domain knowledge that the LLM can load on demand. Skills follow the [Agent Skills open standard](https://agentskills.io) and are packaged as portable directories containing a `SKILL.md` file.
-
-Skills are configured by specifying paths to skill directories in `lightspeed-stack.yaml`:
-
-```yaml
-skills:
-  paths:
-    - "/var/skills/"  # Directory containing skill subdirectories
-```
-
-Each skill directory must contain a `SKILL.md` file with YAML frontmatter (`name` and `description`) followed by Markdown instructions. The LLM discovers available skills via tool calls and loads instructions on demand.
+## Agent Skills (Upcoming)

 > [!NOTE]
-> Skills are configured by product teams at deployment time. End users do not have the ability to add skills.
+> Agent Skills is an upcoming feature. The documentation below describes the planned design.
+
+Agent Skills will allow product teams to extend Lightspeed Core with specialized instructions and domain knowledge that the LLM can load on demand. Skills follow the [Agent Skills open standard](https://agentskills.io) and are packaged as portable directories containing a `SKILL.md` file.

-For the full configuration guide, skill authoring instructions, and examples, see the [Agent Skills Guide](docs/skills_guide.md).
+For the planned configuration guide, skill authoring instructions, and examples, see the [Agent Skills Guide](docs/skills_guide.md).

 ## Safety Shields

diff --git a/docs/skills_guide.md b/docs/skills_guide.md
index e6801a8e9..8a6bf8129 100644
--- a/docs/skills_guide.md
+++ b/docs/skills_guide.md
@@ -1,5 +1,8 @@
 # Agent Skills Guide

+> [!NOTE]
+> Agent Skills is an upcoming feature. This guide describes the planned design and will be updated when the feature is available.
+
 This guide covers how to configure Agent Skills in Lightspeed Core Stack and how to author your own skills.
 ---
@@ -15,7 +18,6 @@ This guide covers how to configure Agent Skills in Lightspeed Core Stack and how
 - [Creating a Skill](#creating-a-skill)
 - [How Skills Work at Runtime](#how-skills-work-at-runtime)
 - [Limitations](#limitations)
-- [Error Reference](#error-reference)
 - [References](#references)

 ---
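
A minimal sketch, outside the patch itself, of how the `name` rules documented in `docs/skills_guide.md` (1-64 characters, lowercase letters, digits and hyphens, no leading, trailing, or consecutive hyphens, and a match with the parent directory name) might be checked. The helper `validate_skill_name` and its regular expression are assumptions made for illustration. The patch describes the feature as not yet implemented and does not show how Lightspeed Core Stack will actually validate names.

```python
import re

# Hypothetical pattern for the documented rules: one or more runs of
# lowercase letters/digits, joined by single hyphens. This rejects
# uppercase, leading/trailing hyphens, and consecutive hyphens.
_NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_skill_name(name: str, directory_name: str) -> list[str]:
    """Return the documented rules that `name` violates (empty list if none).

    Illustrative only: not the project's real validator.
    """
    problems = []
    if not 1 <= len(name) <= 64:
        problems.append("name must be 1-64 characters long")
    if not _NAME_PATTERN.fullmatch(name):
        problems.append(
            "name must use lowercase letters, digits, and single hyphens, "
            "and must not start or end with a hyphen"
        )
    if name != directory_name:
        problems.append("name must match the parent directory name")
    return problems
```

Returning a list of violations rather than raising on the first one lets a caller report every problem with a skill in a single startup message, which seems a reasonable fit for configuration that is loaded once at startup.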