From 08e96369b0d75294205ed6582a7cee924bdfbf44 Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 19:43:53 +0200 Subject: [PATCH 1/6] Re-order readme --- README.md | 55 +++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 18 deletions(-) diff --git a/README.md b/README.md index b85d810..019e9c6 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,7 @@ [Main Features](#main-features) • [MCP Server](#mcp-server) • [CLI](#cli) • +[Python API](#python-api) • [How it works](#how-it-works) • [Benchmarks](#benchmarks) @@ -29,33 +30,26 @@ Semble is a code search library built for agents. It returns the exact code snip ## Quickstart +Install Semble: + ```bash pip install semble # Install with pip uv add semble # Install with uv ``` -```python -from semble import SembleIndex - -# Index a local directory -index = SembleIndex.from_path("./my-project") - -# Index a remote git repository -index = SembleIndex.from_git("https://github.com/MinishLab/model2vec") +Add Semble to Claude Code: -# Search the index with a natural-language or code query -results = index.search("save model to disk", top_k=3) +```bash +claude mcp add semble -s user -- uvx --from "semble[mcp]" semble +``` -# Find code similar to a specific result -related = index.find_related(results[0], top_k=3) +Then ask Claude Code to use Semble when navigating the codebase: -# Each result exposes the matched chunk -result = results[0] -result.chunk.file_path # "model2vec/model.py" -result.chunk.start_line # 127 -result.chunk.end_line # 150 -result.chunk.content # "def save_pretrained(self, path: PathLike, ..." ``` +Use Semble to find where authentication errors are handled. +``` + +Using another agent harness? See [MCP Server](#mcp-server) for setup instructions for Codex, OpenCode, Cursor, and other MCP clients. ## Main Features @@ -187,6 +181,31 @@ semble find-related src/auth.py 42 ./my-project If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its place. +## Python API + +```python +from semble import SembleIndex + +# Index a local directory +index = SembleIndex.from_path("./my-project") + +# Index a remote git repository +index = SembleIndex.from_git("https://github.com/MinishLab/model2vec") + +# Search the index with a natural-language or code query +results = index.search("save model to disk", top_k=3) + +# Find code similar to a specific result +related = index.find_related(results[0], top_k=3) + +# Each result exposes the matched chunk +result = results[0] +result.chunk.file_path # "model2vec/model.py" +result.chunk.start_line # 127 +result.chunk.end_line # 150 +result.chunk.content # "def save_pretrained(self, path: PathLike, ..." +``` + ## How it works Semble splits each file into code-aware chunks using [Chonkie](https://github.com/chonkie-inc/chonkie), then scores every query against the chunks with two complementary retrievers: static [Model2Vec](https://github.com/MinishLab/model2vec) embeddings using the code-specialized [potion-code-16M](https://huggingface.co/minishlab/potion-code-16M) model for semantic similarity, and [BM25](https://github.com/xhluca/bm25s) for lexical matches on identifiers and API names. The two score lists are fused with Reciprocal Rank Fusion (RRF). From 92fcce519fd6bc03fba3eb5420c669ec777b42d4 Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 19:46:21 +0200 Subject: [PATCH 2/6] Re-order readme --- README.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/README.md b/README.md index 019e9c6..fcc04c1 100644 --- a/README.md +++ b/README.md @@ -43,11 +43,7 @@ Add Semble to Claude Code: claude mcp add semble -s user -- uvx --from "semble[mcp]" semble ``` -Then ask Claude Code to use Semble when navigating the codebase: - -``` -Use Semble to find where authentication errors are handled. -``` +Then ask Claude Code questions about the codebase, e.g. `How is authentication handled in this project?`? Claude Code will automatically use Semble to find the relevant code and answer the question efficiently. Using another agent harness? See [MCP Server](#mcp-server) for setup instructions for Codex, OpenCode, Cursor, and other MCP clients. From a22d36fd693c167899636b5b343f1b867d403ad5 Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 19:49:15 +0200 Subject: [PATCH 3/6] Re-order readme --- README.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index fcc04c1..369955e 100644 --- a/README.md +++ b/README.md @@ -17,11 +17,7 @@ [Quickstart](#quickstart) • -[Main Features](#main-features) • [MCP Server](#mcp-server) • -[CLI](#cli) • -[Python API](#python-api) • -[How it works](#how-it-works) • [Benchmarks](#benchmarks) @@ -43,7 +39,7 @@ Add Semble to Claude Code: claude mcp add semble -s user -- uvx --from "semble[mcp]" semble ``` -Then ask Claude Code questions about the codebase, e.g. `How is authentication handled in this project?`? Claude Code will automatically use Semble to find the relevant code and answer the question efficiently. +Then ask Claude Code questions about the codebase, e.g. `How is authentication handled in this project?`. Claude Code will automatically use Semble to find the relevant code and answer the question. Using another agent harness? See [MCP Server](#mcp-server) for setup instructions for Codex, OpenCode, Cursor, and other MCP clients. @@ -179,6 +175,8 @@ If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its plac ## Python API +Semble can also be used as a Python library for programmatic access, useful when building custom tooling or integrating search directly into your own code. + ```python from semble import SembleIndex From 62840cfc9beb24e0917252a39bba295ef0586b09 Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 19:54:11 +0200 Subject: [PATCH 4/6] Re-order readme --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 369955e..49add7c 100644 --- a/README.md +++ b/README.md @@ -33,7 +33,7 @@ pip install semble # Install with pip uv add semble # Install with uv ``` -Add Semble to Claude Code: +Add Semble to Claude Code (requires [uv](https://docs.astral.sh/uv/getting-started/installation/)): ```bash claude mcp add semble -s user -- uvx --from "semble[mcp]" semble @@ -99,6 +99,8 @@ Add to `~/.cursor/mcp.json` (or `.cursor/mcp.json` in your project): } ``` +To upgrade to a newer version of Semble, run `uv cache clean semble` and restart your MCP client. + ### Tools | Tool | Description | From 1e0d7f30ac7ece77cf8b41dbf2f582de421cae4d Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 19:55:49 +0200 Subject: [PATCH 5/6] Re-order readme --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 49add7c..c9ba61a 100644 --- a/README.md +++ b/README.md @@ -47,7 +47,7 @@ Using another agent harness? See [MCP Server](#mcp-server) for setup instruction - **Fast**: indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU. - **Accurate**: NDCG@10 of 0.854 on our [benchmarks](#benchmarks), on par with code-specialized transformer models, at a fraction of the size and cost. -- **Token-efficient**: returns only the relevant chunks, using ~98% fewer tokens than grep+read. +- **Token-efficient**: returns only the relevant chunks, using [~98% fewer tokens than grep+read](#token-efficiency). - **Zero setup**: runs on CPU with no API keys, GPU, or external services required. - **MCP server**: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible agent. - **Local and remote**: pass a local path or a git URL. From 3d487b3ffdd9d12210c53c779ca1778ee23dacc5 Mon Sep 17 00:00:00 2001 From: Pringled Date: Mon, 4 May 2026 20:03:37 +0200 Subject: [PATCH 6/6] Re-order readme --- README.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index c9ba61a..66ee5d3 100644 --- a/README.md +++ b/README.md @@ -18,6 +18,8 @@ [Quickstart](#quickstart) • [MCP Server](#mcp-server) • +[CLI](#cli) • +[Python API](#python-api) • [Benchmarks](#benchmarks) @@ -26,13 +28,6 @@ Semble is a code search library built for agents. It returns the exact code snip ## Quickstart -Install Semble: - -```bash -pip install semble # Install with pip -uv add semble # Install with uv -``` - Add Semble to Claude Code (requires [uv](https://docs.astral.sh/uv/getting-started/installation/)): ```bash @@ -155,6 +150,13 @@ If `semble` is not on `$PATH`, use `uvx --from "semble[mcp]" semble` in its plac ## CLI +Install Semble: + +```bash +pip install semble # Install with pip +uv add semble # Install with uv +``` + Semble also ships as a standalone CLI for use outside of MCP. This is useful in scripts, sub-agents, or anywhere you want search results without an MCP session. ```bash