62 commits
b632079
Add environment setup guide for CI/CD configuration
ijac13 Feb 27, 2026
6f406a3
Add dbt Cloud setup guide
ijac13 Feb 27, 2026
a2b7cad
Add setup choice callout for CI/CD section
ijac13 Feb 27, 2026
508bc0e
Update setup callout with per-PR explanation
ijac13 Feb 27, 2026
909b5d6
Add CI/CD platform as item 3 in setup callout
ijac13 Feb 27, 2026
0f54fe1
Move web agent intro above setup options
ijac13 Feb 27, 2026
679121f
Add 'Choose your setup' decision tree to CI/CD section
ijac13 Feb 27, 2026
99e1e4b
Fix nested list indentation in Choose your setup section
ijac13 Feb 27, 2026
94d4554
Add environment setup docs and move CI/CD setup pages
ijac13 Feb 27, 2026
7e5dec7
Add dbt-cloud-setup to nav and update setup-cd/ci links
ijac13 Feb 27, 2026
ac21260
Simplify environment-setup based on feedback
ijac13 Feb 27, 2026
65bf84c
Add environment strategy section to setup-cd and setup-ci
ijac13 Feb 27, 2026
0141423
Add environment and CI/CD pages to mkdocs.yml navigation
ijac13 Feb 27, 2026
cb8241b
Apply QA and AISEO fixes to PR2c docs
ijac13 Feb 27, 2026
4f8f80f
Apply QA and AISEO fixes to PR2b docs
ijac13 Feb 27, 2026
6876bd6
Add OSS setup guides and Cloud vs OSS comparison
ijac13 Mar 1, 2026
9179577
Add admin setup guide for team collaboration
ijac13 Mar 1, 2026
928fe1a
docs: refine OSS docs and remove installation.md
ijac13 Mar 2, 2026
35c70d7
docs: fix terminology - use PR instead of pull request
ijac13 Mar 2, 2026
428557b
docs: add expected outcomes to environment-setup steps
ijac13 Mar 2, 2026
31cdb15
docs: add redirects for renamed/moved pages
ijac13 Mar 2, 2026
9be3a91
docs: add redirect for invitation.md → admin-setup.md
ijac13 Mar 2, 2026
90c2bd4
docs: add redirects and fix broken link
ijac13 Mar 2, 2026
4744f2f
docs: fix navigation and broken links for moved files
ijac13 Mar 2, 2026
dda95f1
docs: fix prerequisites format to use [x] instead of [ ]
ijac13 Mar 2, 2026
6d8607c
docs: move rename project to step 4, add step 6 for invitee instructions
ijac13 Mar 2, 2026
7abd18a
Add setup choice callout for CI/CD section
ijac13 Feb 27, 2026
5f2b5b4
Update setup callout with per-PR explanation
ijac13 Feb 27, 2026
69f72ae
Add CI/CD platform as item 3 in setup callout
ijac13 Feb 27, 2026
d3ac5e0
Move web agent intro above setup options
ijac13 Feb 27, 2026
3afff2f
docs: simplify dbt Cloud setup - remove warehouse, use recce-cloud up…
ijac13 Mar 3, 2026
fab5b46
docs: add What the Agent Does section
ijac13 Mar 3, 2026
844f8b9
Add data developer and reviewer workflow guides
ijac13 Mar 3, 2026
e548ae4
Add redirects plugin and document cleanup tasks
ijac13 Mar 3, 2026
7cbadaf
Add Community section with support and changelog pages
ijac13 Mar 3, 2026
5fef6c4
Add Reference section with configuration, state file, and CLI documen…
ijac13 Mar 3, 2026
9f2d7b7
Merge pull request #72 from DataRecce/pr2c-dbt-cloud-setup
doriwilson Mar 6, 2026
52fbda7
Resolve merge conflicts with docs-v3 (PR #72 dbt-cloud-setup)
doriwilson Mar 6, 2026
c31c43f
Merge pull request #73 from DataRecce/pr2b-environment-setup
doriwilson Mar 6, 2026
f5fa265
Resolve merge conflicts with docs-v3 (PRs #72, #73)
doriwilson Mar 6, 2026
6935a9c
Merge pull request #75 from DataRecce/pr2d-oss-setup
doriwilson Mar 6, 2026
9b743fd
Resolve merge conflicts with docs-v3 (PRs #72, #73)
doriwilson Mar 6, 2026
98a2b06
Resolve merge conflicts with docs-v3 (PRs #72, #73, #75)
doriwilson Mar 6, 2026
b29ef81
Merge pull request #74 from DataRecce/pr3-admin-setup
doriwilson Mar 6, 2026
d8c7ec1
Resolve merge conflicts with docs-v3 (PR #74 admin-setup)
doriwilson Mar 6, 2026
cc02312
Merge pull request #79 from DataRecce/pr5-agent-docs
doriwilson Mar 6, 2026
9e99b9b
Resolve merge conflicts with docs-v3 (PRs #74, #75, #79)
doriwilson Mar 6, 2026
2afd55f
Merge pull request #80 from DataRecce/pr4-workflows
doriwilson Mar 6, 2026
5858d30
Merge remote-tracking branch 'origin/docs-v3' into pr8-reference
doriwilson Mar 6, 2026
2e3feb5
Merge pull request #82 from DataRecce/pr9-community
doriwilson Mar 6, 2026
24376d2
Resolve merge conflicts with docs-v3 (all prior PRs)
doriwilson Mar 6, 2026
a9b9692
Merge pull request #83 from DataRecce/pr8-reference
doriwilson Mar 6, 2026
309f6ad
Merge pull request #81 from DataRecce/pr10-cleanup
doriwilson Mar 6, 2026
0ed1dfe
docs(v3): Add What You Can Explore section
ijac13 Mar 7, 2026
c853766
fix: Update broken links in multi-models.md
ijac13 Mar 7, 2026
b2dbfaa
docs: Add When to Use and Related sections for AISEO compliance
ijac13 Mar 7, 2026
e8107af
Merge pull request #86 from DataRecce/pr6-what-you-can-explore
ijac13 Mar 7, 2026
d3f691a
Update collaboration pages with doc.md structure
ijac13 Mar 3, 2026
5a0237a
docs(v3): Update collaboration pages with cloud-first structure
ijac13 Mar 7, 2026
3e34715
docs: Add image placeholders for Recce Cloud preset checks
ijac13 Mar 7, 2026
88c0956
Merge pull request #84 from DataRecce/pr7-collaboration
ijac13 Mar 7, 2026
0a2e4fb
Rewrite start-free-with-cloud.md from mega-page to overview+router
doriwilson Mar 7, 2026
125 changes: 125 additions & 0 deletions docs/1-whats-recce/cloud-vs-oss.md
@@ -0,0 +1,125 @@
---
title: Cloud vs Open Source
---

# Cloud vs Open Source

Validating data changes manually takes time and slows PR review. Recce is a data validation agent. Open Source gives you the core validation engine to run yourself; Cloud gives you the full Agent experience, with automated validation on every PR.

```mermaid
flowchart LR
subgraph Cloud
direction LR
C1[You open PR] --> C2[Agent validates automatically]
C2 --> C3[Summary posted to PR]
end

subgraph OSS["Open Source"]
direction LR
O1[You open PR] --> O2[You run checks manually]
O2 --> O3[You copy results to PR]
end
```

## The Core Difference

| | Cloud | Open Source |
|--|-------|-------------|
| **Experience** | Recce Agent works alongside you | You run validation manually |
| **PR validation** | Agent validates automatically, posts summary | You run checks, copy results to PR |
| **During development** | CLI + Agent assistance | CLI tools only |
| **Learning curve** | Agent guides you through validation | Learn the tools, run them yourself |

## Cloud

Recce Cloud connects to your Git repository and data warehouse so the Recce Agent can validate your data changes automatically. When you open a PR, the Agent analyzes your changes, runs validation checks, and posts findings directly to your PR — no manual work required.

**On PRs:**

The Agent runs automatically when you open a PR. It:

- Analyzes your data model changes
- Runs relevant validation checks
- Posts a summary to your PR with findings
- Updates as you push new commits

**During development:**

The Agent works with your CLI through [Recce MCP](/5-data-diffing/mcp-server/) (Model Context Protocol):

- Answers questions about your changes
- Suggests validation approaches
- Helps interpret diff results

**For your team:**

- Define what "correct" means for your repo with preset checks that apply across all PRs
- Share validation standards as institutional knowledge — everyone validates the same way
- Developers and reviewers collaborate on validation, going back and forth until the change is verified

**Pricing:**

Recce Cloud is free to start. See [Pricing](https://www.reccehq.com/pricing) for plan details.

**Choose Cloud when:**

- You want automated validation on every PR
- You want Agent assistance during development
- Your team reviews data changes in PRs

## Open Source

Recce OSS is the core validation engine you run locally. You control when and how validation happens — run checks, explore results, and decide what to share. Everything stays on your machine unless you export it.

You get:

- Lineage Diff between branches
- Data comparison (row count, schema, profile, value, top-k, histogram diff)
- Query diff for custom validations
- Checklist to track your checks
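
To make the first comparison type concrete, here is a minimal sketch of what a row count diff computes. It is not Recce's implementation: it uses an in-memory SQLite database as a stand-in for a warehouse, and the table names `base_orders` and `current_orders` and the helper `row_count_diff` are hypothetical.

```python
import sqlite3


def row_count_diff(conn, base_table, current_table):
    """Return (base_count, current_count, delta) for two tables."""
    # Table names are interpolated directly, so only pass trusted identifiers.
    base = conn.execute(f"SELECT COUNT(*) FROM {base_table}").fetchone()[0]
    current = conn.execute(f"SELECT COUNT(*) FROM {current_table}").fetchone()[0]
    return base, current, current - base


# In-memory stand-in for a warehouse holding a base and a current build of one model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base_orders (id INTEGER)")
conn.execute("CREATE TABLE current_orders (id INTEGER)")
conn.executemany("INSERT INTO base_orders VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO current_orders VALUES (?)", [(1,), (2,), (3,), (4,)])

result = row_count_diff(conn, "base_orders", "current_orders")
print(result)  # (3, 4, 1): the current build has one more row than base
```

The other comparison types follow the same pattern: run the same measurement against the base and current environments, then compare the two results.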

**Choose OSS when:**

- Exploring Recce before adopting Cloud
- Working in environments without external connectivity
- Contributing to Recce development

## Feature Comparison

| Feature | Cloud | OSS |
|---------|-------|-----|
| Lineage Diff | :white_check_mark: | :white_check_mark: |
| Data diff<br> (row count, schema, profile, value, top-k, histogram diff) | :white_check_mark: | :white_check_mark: |
| Query diff | :white_check_mark: | :white_check_mark: |
| Checklist | :white_check_mark: | :white_check_mark: |
| Recce Agent on PRs | :white_check_mark: | :x: |
| Agent CLI assistance | :white_check_mark: | Manual |
| Preset checks across PRs | :white_check_mark: | Manual |
| Shared validation standards | :white_check_mark: | Manual |
| Developer-reviewer collaboration | :white_check_mark: | Manual |
| PR comments & summaries | :white_check_mark: | :x: |
| LLM-powered insights | :white_check_mark: | :x: |

## FAQ

**Can I start with OSS and upgrade to Cloud later?**

Yes. OSS and Cloud use the same validation engine. Your existing checklists and workflows carry over when you connect to Cloud.

**Does Cloud require a different setup than OSS?**

Cloud connects to your Git repository and data warehouse directly. You don't need to generate artifacts locally — the Agent handles that automatically.

**What data does Recce Cloud access?**

Recce Cloud accesses your dbt artifacts (manifest.json, catalog.json) and runs queries against your data warehouse to perform validation. Your data stays in your warehouse.
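
For a sense of what those artifacts contain, the sketch below reads a dbt manifest and lists its models. It assumes dbt's default behavior of writing `manifest.json` and `catalog.json` to the project's `target/` directory; the helper names and the sample manifest fragment are hypothetical, and this is not how Recce Cloud itself processes the files.

```python
import json
from pathlib import Path


def load_artifacts(target_dir: str) -> dict:
    """Load manifest.json and catalog.json from a dbt target directory."""
    target = Path(target_dir)
    return {
        name: json.loads((target / name).read_text())
        for name in ("manifest.json", "catalog.json")
    }


def list_models(manifest: dict) -> list[str]:
    """Return the sorted unique IDs of model nodes in a parsed manifest."""
    nodes = manifest.get("nodes", {})
    return sorted(
        unique_id
        for unique_id, node in nodes.items()
        if node.get("resource_type") == "model"
    )


# Hand-written fragment in the shape of a dbt manifest, for illustration only.
sample_manifest = {
    "nodes": {
        "model.shop.orders": {"resource_type": "model"},
        "test.shop.not_null_orders_id": {"resource_type": "test"},
        "model.shop.customers": {"resource_type": "model"},
    }
}

models = list_models(sample_manifest)
print(models)

# With a real project you would call, for example:
# artifacts = load_artifacts("target")
# print(list_models(artifacts["manifest.json"]))
```

The artifacts describe your project's structure and metadata; the row-level data itself is only reached through queries against the warehouse.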

## Getting Started

- **Cloud:** [Start Free with Cloud](../2-getting-started/start-free-with-cloud.md)
- **OSS:** [OSS Setup](../2-getting-started/oss-setup.md)

## Related

- [What the Agent Does](../5-what-the-agent-does/index.md) — How the Recce Agent validates your changes
- [Data Developer Workflow](../3-using-recce/data-developer.md) — Using Recce throughout development
85 changes: 85 additions & 0 deletions docs/2-getting-started/connect-git.md
@@ -0,0 +1,85 @@
# Connect Your Repository

**Goal:** Connect your GitHub or GitLab repository to Recce Cloud for automated PR data review.

Recce Cloud supports GitHub and GitLab. Using a different provider? Contact us at support@reccehq.com.

## Prerequisites

- [x] Recce Cloud account (free trial at cloud.reccehq.com)
- [x] Repository admin access (required to authorize app installation)
- [x] dbt project in the repository

## How It Works

When you connect a Git provider, Recce Cloud maps your setup:

| Git Provider | Recce Cloud |
|--------------|-------------|
| Organization | Organization |
| Repository | Project |

Every Recce Cloud account starts with one organization and one project. When you connect your Git provider, you select which organization and repository to link.

**Monorepo support:** If you have multiple dbt projects in one repository, you can create multiple Recce Cloud projects that connect to the same repo.
<!-- TODO: add link to monorepo section -->
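
One way to picture the mapping, including the monorepo case, is the sketch below. It is a mental model only, not Recce Cloud's actual data model: the class names, the `dbt_path` field, and the `acme` examples are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    repository: str  # linked Git repository, written here as "org/repo"
    dbt_path: str = "."  # hypothetical field: where the dbt project lives in the repo


@dataclass
class Organization:
    name: str
    projects: list[Project] = field(default_factory=list)

    def projects_for(self, repository: str) -> list[Project]:
        """Return every project linked to the given repository."""
        return [p for p in self.projects if p.repository == repository]


# A monorepo: two projects in one organization point at the same repository.
org = Organization("acme")
org.projects.append(Project("analytics", "acme/data-monorepo", "dbt/analytics"))
org.projects.append(Project("finance", "acme/data-monorepo", "dbt/finance"))

linked = [p.name for p in org.projects_for("acme/data-monorepo")]
print(linked)
```

The point of the sketch is the cardinality: one Git organization maps to one Recce Cloud organization, while a single repository may back more than one project.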

## Connect GitHub

### 1. Authorize the Recce GitHub App

Navigate to Settings → Git Provider in Recce Cloud. Click **Connect GitHub**.

**Expected result:** GitHub authorization page opens.

### 2. Select Organization and Repository

Choose which GitHub organization to connect. This becomes your Recce Cloud organization.

Then select the repository containing your dbt project. This becomes your Recce Cloud project.

**Expected result:** Repository connected. Your Recce Cloud project is ready to use.

![Connect a GitHub repository in Recce Cloud](../assets/images/2-getting-started/connect-github.png){: .shadow}

## Connect GitLab

GitLab uses Personal Access Tokens (PAT) instead of OAuth.

### 1. Create a Personal Access Token

In GitLab: User Settings → Access Tokens → Add new token.

**Token scopes (choose one):**

- `api` - Full access (required for PR comments)
- `read_api` - Read-only alternative (limited functionality; cannot post PR comments)

**Expected result:** Token string displayed (copy immediately).
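
If you want to confirm which scopes a token carries before pasting it into Recce Cloud, a sketch like the following may help. It assumes a GitLab version that exposes the `personal_access_tokens/self` API endpoint and uses only the Python standard library; the helper names and the "full" / "read-only" labels are hypothetical and simply mirror the two scopes listed above.

```python
import json
import urllib.request


def fetch_token_scopes(token: str, gitlab_url: str = "https://gitlab.com") -> list[str]:
    """Ask GitLab which scopes the given personal access token carries."""
    request = urllib.request.Request(
        f"{gitlab_url}/api/v4/personal_access_tokens/self",
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["scopes"]


def describe_access(scopes: list[str]) -> str:
    """Classify a scope list against the two scopes this guide mentions."""
    if "api" in scopes:
        return "full"  # can post PR comments
    if "read_api" in scopes:
        return "read-only"  # limited functionality, no PR comments
    return "insufficient"


# Example with a real token (not executed here):
# print(describe_access(fetch_token_scopes("<your-token>")))
print(describe_access(["read_api"]))
```

For a self-managed GitLab instance, pass its base URL as `gitlab_url`.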

### 2. Add Token to Recce Cloud

Navigate to Settings → Git Provider in Recce Cloud. Select **GitLab** and paste the token.

## Verify Success

In Recce Cloud, navigate to your repository. You should see:

- Connection status: "Connected"
- Your organization's project is linked to a Git repository

![GitLab connection in Recce Cloud](../assets/images/2-getting-started/connect-gitlab.png){: .shadow}
![Organization projects in Recce Cloud](../assets/images/2-getting-started/org-projects.png){: .shadow}

## Troubleshooting

| Issue | Solution |
| --- | --- |
| Repository not found | Ensure proper permissions are granted (GitLab: token access, GitHub: app authorized) |
| Invalid token (GitLab) | Generate new token with `api` scope |
| Cannot post PR comments (GitLab) | Regenerate token with `api` scope instead of `read_api` |

## Next Steps

- [Connect Data Warehouse](connect-to-warehouse.md)
- [Add Recce to CI/CD](../7-cicd/ci-cd-getting-started.md)
107 changes: 107 additions & 0 deletions docs/2-getting-started/connect-to-warehouse.md
@@ -0,0 +1,107 @@
# Connect Data Warehouse

**Goal:** Connect your data warehouse to Recce Cloud to enable data diffing on PRs.

Recce Cloud supports **[Snowflake](#connect-snowflake), [Databricks](#connect-databricks), [BigQuery](#connect-bigquery), and [Redshift](#connect-redshift)**. Using a different warehouse? Contact us at support@reccehq.com.

## Prerequisites

- [x] Warehouse credentials with read access
- [x] Network access configured (IP whitelisting if required)

## Security

Recce Cloud queries your warehouse directly to compare Base and Current environments. Recce encrypts and stores credentials securely. Read-only access is sufficient for all data diffing features.

## Connect Snowflake

### Option 1: Username/Password

| Field | Description | Example |
|-------|-------------|---------|
| Account | Snowflake account identifier | `xxxxxx.us-central1.gcp` |
| Username | Database username | `MY_USER` |
| Password | Database password | `my_password` |
| Role | Role with read access | `ANALYST_ROLE` |
| Warehouse | Compute warehouse name | `WH_LOAD` |

### Option 2: Key Pair Authentication

| Field | Description | Example |
|-------|-------------|---------|
| Account | Snowflake account identifier | `xxxxxx.us-central1.gcp` |
| Username | Service account username | `MY_USER` |
| Private Key | PEM-formatted private key | `-----BEGIN RSA PRIVATE KEY-----...` |
| Passphrase | Key passphrase (if encrypted) | `my_passphrase` |
| Role | Role with read access | `ANALYST_ROLE` |
| Warehouse | Compute warehouse name | `WH_LOAD` |
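
Before pasting a key, it can be useful to check that it is PEM-formatted and whether it is encrypted, since an encrypted key also needs the Passphrase field. The sketch below is a rough local check based on the PEM header lines only; it does not parse or validate the key material, and the function name is hypothetical.

```python
def inspect_private_key(pem: str) -> dict:
    """Report whether text looks like a PEM private key and whether it is encrypted."""
    lines = [line.strip() for line in pem.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        return {"pem": False, "encrypted": False}

    header, footer = lines[0], lines[-1]
    is_pem = (
        header.startswith("-----BEGIN ")
        and header.endswith("PRIVATE KEY-----")
        and footer.startswith("-----END ")
        and footer.endswith("PRIVATE KEY-----")
    )
    # PKCS#8 encrypted keys say ENCRYPTED in the header; traditional
    # OpenSSL keys carry a "Proc-Type: 4,ENCRYPTED" line instead.
    encrypted = is_pem and (
        "ENCRYPTED" in header
        or any(l.startswith("Proc-Type:") and "ENCRYPTED" in l for l in lines)
    )
    return {"pem": is_pem, "encrypted": encrypted}


# Placeholder bodies: these are not real keys.
plain = "-----BEGIN PRIVATE KEY-----\nPLACEHOLDER\n-----END PRIVATE KEY-----"
locked = "-----BEGIN ENCRYPTED PRIVATE KEY-----\nPLACEHOLDER\n-----END ENCRYPTED PRIVATE KEY-----"

print(inspect_private_key(plain))
print(inspect_private_key(locked))
```

If the check reports an encrypted key, fill in the Passphrase field; otherwise leave it empty.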

## Connect Databricks

### Option 1: Personal Access Token

| Field | Description | Example |
|-------|-------------|---------|
| Host | Workspace URL | `adb-1234567890123456.7.azuredatabricks.net` |
| HTTP Path | SQL warehouse path | `/sql/1.0/warehouses/abc123def456` |
| Token | Personal access token | `dapiXXXXXXXXXXXXXXXXXXXXXXX` |
| Catalog | Unity Catalog name (optional) | `my_catalog` |

### Option 2: OAuth (M2M)

| Field | Description | Example |
|-------|-------------|---------|
| Host | Workspace URL | `adb-1234567890123456.7.azuredatabricks.net` |
| HTTP Path | SQL warehouse path | `/sql/1.0/warehouses/abc123def456` |
| Client ID | Service principal client ID | `12345678-1234-1234-1234-123456789012` |
| Client Secret | Service principal secret | `dose1234567890abcdef` |
| Catalog | Unity Catalog name (optional) | `my_catalog` |


> **Note**: OAuth M2M is auto-enabled in Databricks accounts. For setup details, see [dbt Databricks setup](https://docs.getdbt.com/docs/core/connect-data-platform/databricks-setup#oauth-machine-to-machine-m2m-authentication).

## Connect BigQuery

| Field | Description | Example |
|-------|-------------|---------|
| Project | GCP project ID | `my-gcp-project-123456` |
| Service Account JSON | Full JSON key file contents | `{"type": "service_account", ...}` |


> **Note**: Recce Cloud currently supports service account JSON authentication only. For details, see [dbt BigQuery setup](https://docs.getdbt.com/docs/core/connect-data-platform/bigquery-setup#service-account-json).
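
Because the Service Account JSON field expects the full contents of the key file, a quick local check can catch a truncated paste or the wrong credential type. This is a sketch under assumptions: the required-field list reflects the usual layout of a Google service account key file, not a documented Recce Cloud requirement, and the function name is hypothetical.

```python
import json

# Fields normally present in a Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the key looks complete."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["expected a JSON object"]

    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']}")
    return problems


# Placeholder values only: this is not a working credential.
sample = json.dumps(
    {
        "type": "service_account",
        "project_id": "my-gcp-project-123456",
        "private_key": "PLACEHOLDER",
        "client_email": "recce@my-gcp-project-123456.iam.gserviceaccount.com",
    }
)

print(check_service_account_json(sample))
print(check_service_account_json('{"type": "authorized_user"}'))
```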

## Connect Redshift

| Field | Description | Example |
|-------|-------------|---------|
| Host | Cluster endpoint | `my-cluster.abc123xyz.us-west-2.redshift.amazonaws.com` |
| Port | Database port | `5439` (Default) |
| Database | Database name | `analytics_db` |
| Username | Database user | `admin_user` |
| Password | Database password | `my_password` |


> **Note**: Recce Cloud currently supports database (password-based) authentication only. For details, see [dbt Redshift setup](https://docs.getdbt.com/docs/core/connect-data-platform/redshift-setup#authentication-parameters).

## Save Connection

After entering your connection details, click **Save**. Recce Cloud runs a connection test automatically and displays "Connected" on success.

## Verify Success

Navigate to Organization Settings in Recce Cloud. Your data warehouse should appear.

![Data warehouse connection in Recce Cloud](../assets/images/2-getting-started/connect-dw.png){: .shadow}

## Troubleshooting

| Issue | Solution |
| --- | --- |
| Connection refused | Whitelist Recce Cloud IP ranges in your network configuration |
| Authentication failed | Verify credentials and regenerate if expired |
| Permission denied on table | Grant SELECT permissions on target schemas |
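
When diagnosing a "Connection refused" error, it can help to first rule out a wrong host or port. The sketch below attempts a plain TCP connection using Python's standard library. Note the limitation: it runs from your machine, while Recce Cloud connects from its own network, so a successful local check does not prove that Recce Cloud's IP ranges are allowed. The function name is hypothetical.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example with the Redshift defaults from the table above (not executed here):
# can_reach("my-cluster.abc123xyz.us-west-2.redshift.amazonaws.com", 5439)
```

A `False` result for a host and port you believe are correct usually points at a firewall or network rule rather than at credentials.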

## Next Steps

- [Add Recce to CI/CD](../7-cicd/setup-ci.md)
- [Run Your First Data Diff](../5-data-diffing/row-count-diff.md)