1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@ venv/
__pycache__
.pytest_cache/
examples/rag/README.md
site/
8 changes: 7 additions & 1 deletion Makefile
@@ -1,4 +1,4 @@
.PHONY: lint build publish clean
.PHONY: lint build publish clean docs docs-serve

lint:
	pycodestyle . --ignore=E501
@@ -11,3 +11,9 @@ publish: clean build

clean:
	rm -rf .pytest_cache dist pgvector.egg-info

docs:
	mkdocs build

docs-serve:
	mkdocs serve
Comment on lines +14 to +19, @jackrua (Owner), Jan 14, 2026:
I don't want makefile commands I will just use the python venv @copilot

17 changes: 17 additions & 0 deletions README.md
@@ -14,6 +14,23 @@ Run:
pip install pgvector
```

## Documentation

Full documentation is available at the [documentation site](https://pgvector.github.io/pgvector-python/).

To build the documentation locally:

```sh
pip install mkdocs mkdocs-material
make docs
```

To serve the documentation locally:

```sh
make docs-serve
```

And follow the instructions for your database library:

- [Django](#django)
42 changes: 42 additions & 0 deletions docs/README.md
@@ -0,0 +1,42 @@
# Documentation

This directory contains the documentation for pgvector-python, built with [MkDocs](https://www.mkdocs.org/) and the [Material theme](https://squidfunk.github.io/mkdocs-material/).

## Building the Documentation

To build the documentation locally:

```sh
pip install mkdocs mkdocs-material
make docs
```

The built documentation will be in the `site/` directory.

## Serving the Documentation

To serve the documentation locally for development:

```sh
make docs-serve
```

This will start a development server at `http://127.0.0.1:8000/`.

## Documentation Structure

- `docs/index.md` - Home page
- `docs/getting-started/` - Getting started guides
    - `installation.md` - Installation instructions for different database adapters
- `docs/examples/` - Example usage guides
    - `openai.md` - OpenAI embeddings example

## Adding New Pages

1. Create a new Markdown file in the appropriate directory under `docs/`
2. Add the page to the navigation in `mkdocs.yml`
3. Build and test locally with `make docs-serve`
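
For step 2, a `nav` entry in `mkdocs.yml` might look like the following sketch (the paths and section names are illustrative; match them to the files you actually create):

```yaml
nav:
  - Home: index.md
  - Getting Started:
      - Installation: getting-started/installation.md
  - Examples:
      - OpenAI: examples/openai.md
```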

## Configuration

The documentation configuration is in `mkdocs.yml` at the root of the repository.
177 changes: 177 additions & 0 deletions docs/examples/openai.md
@@ -0,0 +1,177 @@
# OpenAI Embeddings Example

This example demonstrates how to use pgvector with OpenAI's embedding API to store and search text embeddings.

## Overview

This example shows how to:

- Generate embeddings using OpenAI's API
- Store embeddings in PostgreSQL with pgvector
- Perform similarity search to find related documents

## Prerequisites

- OpenAI API key
- PostgreSQL with pgvector extension installed
- Python packages: `openai`, `pgvector`, and `psycopg` (or another supported database adapter)

## Installation

Install the required packages:

```sh
pip install pgvector openai psycopg[binary] numpy
```

## Basic Example

Here's a simple example using Psycopg 3:

```python
import numpy as np
import openai
import psycopg
from pgvector.psycopg import register_vector

# Set up OpenAI API
openai.api_key = 'your-api-key'

# Connect to the database and enable the extension
# (the extension must exist before register_vector is called)
conn = psycopg.connect(dbname='mydb')
conn.execute('CREATE EXTENSION IF NOT EXISTS vector')
register_vector(conn)

# Create a table
conn.execute('CREATE TABLE documents (id bigserial PRIMARY KEY, content text, embedding vector(1536))')

# Generate and store embeddings
def add_document(content):
    response = openai.embeddings.create(
        input=content,
        model="text-embedding-3-small"
    )
    embedding = response.data[0].embedding
    conn.execute('INSERT INTO documents (content, embedding) VALUES (%s, %s)', (content, np.array(embedding)))

# Add some documents
add_document('The cat sits on the mat')
add_document('A dog runs in the park')
add_document('Feline animals are independent')

conn.commit()

# Search for similar documents (<=> is cosine distance)
def search(query, limit=5):
    response = openai.embeddings.create(
        input=query,
        model="text-embedding-3-small"
    )
    embedding = response.data[0].embedding

    results = conn.execute(
        'SELECT content, embedding <=> %s AS distance FROM documents ORDER BY distance LIMIT %s',
        (np.array(embedding), limit)
    ).fetchall()

    return results

# Find documents similar to a query
results = search('cat')
for content, distance in results:
    print(f'{content}: {distance}')
```
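
As a sanity check on what the `<=>` operator returns, here is a minimal pure-Python version of cosine distance (an illustration of the formula, not part of pgvector's API):

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator computes 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical vectors: 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors: 1.0
```

Identical vectors have distance 0 and orthogonal vectors have distance 1, which is why `ORDER BY distance` puts the best matches first.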

## Using with SQLAlchemy

Here's the same example using SQLAlchemy:

```python
import openai
from pgvector.sqlalchemy import Vector
from sqlalchemy import create_engine, select, text
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

# Set up database
engine = create_engine('postgresql://user:password@localhost/dbname')

class Base(DeclarativeBase):
    pass

class Document(Base):
    __tablename__ = 'documents'

    id: Mapped[int] = mapped_column(primary_key=True)
    content: Mapped[str]
    embedding: Mapped[list] = mapped_column(Vector(1536))

# Enable the extension and create tables
with Session(engine) as session:
    session.execute(text('CREATE EXTENSION IF NOT EXISTS vector'))
    session.commit()

Base.metadata.create_all(engine)

# Generate and store embeddings
def add_document(content):
    response = openai.embeddings.create(
        input=content,
        model="text-embedding-3-small"
    )
    embedding = response.data[0].embedding

    with Session(engine) as session:
        doc = Document(content=content, embedding=embedding)
        session.add(doc)
        session.commit()

# Search for similar documents by Euclidean (L2) distance
def search(query, limit=5):
    response = openai.embeddings.create(
        input=query,
        model="text-embedding-3-small"
    )
    embedding = response.data[0].embedding

    with Session(engine) as session:
        results = session.scalars(
            select(Document)
            .order_by(Document.embedding.l2_distance(embedding))
            .limit(limit)
        ).all()

    return results
```

## Performance Tips

### Add an Index

For better performance on larger datasets, add an HNSW index. The index operator class must match the distance operator you query with: `vector_cosine_ops` for cosine distance (`<=>`) and `vector_l2_ops` for Euclidean distance (`<->`).

```python
# Matches cosine-distance (<=>) queries; use vector_l2_ops for <-> / l2_distance
conn.execute('CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)')
```

### Use Half-Precision Vectors

To save storage space, you can use half-precision vectors:

```python
# Create table with halfvec
conn.execute('CREATE TABLE documents (id bigserial PRIMARY KEY, content text, embedding halfvec(1536))')

# Index with half-precision
conn.execute('CREATE INDEX ON documents USING hnsw (embedding halfvec_l2_ops)')
```
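
`halfvec` stores each element as a 16-bit float instead of a 32-bit float, halving per-row embedding storage. A quick back-of-the-envelope check in Python (element sizes only; actual on-disk size includes row and page overhead):

```python
import struct

dim = 1536
vector_bytes = dim * struct.calcsize('f')   # float32: 4 bytes per element
halfvec_bytes = dim * struct.calcsize('e')  # float16: 2 bytes per element

print(vector_bytes, halfvec_bytes)  # 6144 3072
```

The trade-off is reduced precision, which is usually acceptable for similarity search over embedding vectors.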

## Complete Example

For a complete working example, see the [example.py](https://github.com/pgvector/pgvector-python/blob/master/examples/openai/example.py) file in the repository.

## Next Steps

- Learn about [hybrid search](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search/rrf.py) combining vector and keyword search
- Explore [RAG (Retrieval-Augmented Generation)](https://github.com/pgvector/pgvector-python/blob/master/examples/rag/example.py) patterns
- Try other embedding providers like [Cohere](https://github.com/pgvector/pgvector-python/blob/master/examples/cohere/example.py) or [SentenceTransformers](https://github.com/pgvector/pgvector-python/blob/master/examples/sentence_transformers/example.py)