195 changes: 68 additions & 127 deletions docs/2-getting-started/start-free-with-cloud.md
@@ -4,179 +4,120 @@ title: Start free with Cloud

# Start Free with Recce Cloud

**Launch Recce in under 2 minutes**. Each following feature provides additional value. The progressive features help you get more value from Recce over time.
**Get started in 3 simple steps**. After signing up, you'll be guided through connecting your GitHub repository and automating your PR validation workflow.

👉 **[Start Free →](https://cloud.reccehq.com){target="_blank"}**
👉 **[Start Free →](https://cloud.reccehq.com){target="\_blank"}**

## Model Changes and Impact Analysis
## Step 1: Connect Your GitHub Repository {#github-integration}

Recce shows what changed between **base** and **current** environments and helps assess potential impact. The most common case is comparing your development branch against your production or main branch to see what your changes will impact.
After signing up, you'll enter your default project. The first step is connecting your GitHub organization and repository to enable PR tracking and validation.

You can:
### What You'll Get

- Explore with the pre-loaded Jaffle Shop data
- Upload your metadata (see below)
- **Skip manual upload and go directly to [CI/CD automation](#cicd-automation)**
Once connected, you can:

<!-- insert a video -->
- View all open and draft PRs
- See PR summaries

### Setup Requirements

- GitHub repository with dbt project
- Permissions for GitHub App installations
- (Optional) Active PRs with model changes

### Connection Steps

### Upload Metadata
- Web interface: Click "Update" on the session you want to update in Recce Cloud.
1. Click "Update" in base session to upload baseline metadata
2. Click "Update" in current session to upload comparison metadata
3. Click "Launch" to compare current against base
- CLI command:
```
recce upload-session --session-id <your_session_id>
```
Find your session ID in Recce Cloud web interface when clicking "Update" on any session.
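
The CLI upload above can be combined with artifact generation in a small helper. This is a minimal sketch, not an official workflow: the function name `upload_current_session` is hypothetical, and it assumes that `recce upload-session` picks up the artifacts that `dbt docs generate` writes to dbt's default `target/` directory when run from the project root.

```shell
#!/bin/sh
# Sketch: regenerate dbt artifacts for one environment, then upload them
# to an existing Recce Cloud session.
# Assumption: both commands are run from the dbt project root.

upload_current_session() {
  # $1: session ID copied from the Recce Cloud "Update" dialog
  # $2: dbt target name for the environment being uploaded
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "usage: upload_current_session <session_id> <dbt_target>" >&2
    return 2
  fi

  # Produce fresh manifest.json and catalog.json for the chosen target.
  dbt docs generate --target "$2" || return 1

  # Upload the generated artifacts to the given session.
  recce upload-session --session-id "$1"
}

# Example call (placeholders as in the commands above):
#   upload_current_session <your_session_id> <your_dev_target>
```

Stopping on a failed `dbt docs generate` avoids uploading stale artifacts from a previous run.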
Follow the guidance to connect your GitHub organization and link your repository:

### Required Files
![Connect GitHub](../assets/images/2-getting-started/connect_github.png){ width="49%" }

Review comment (Contributor):
@even-wei I thought we agreed on moving the Connect button to the top, above Step 1.

Reply (Contributor Author):
Oh, yes, I will update it.

![Link Repository](../assets/images/2-getting-started/link_repository.png){ width="49%" }

Recce needs `manifest.json` and `catalog.json` from both **base** and **current** environments for comparison.
Once connected, your PRs will appear in the dashboard with basic change summaries:

Review comment (Contributor):
"with basic change summaries:"
Do users see the summary automatically, or do they need to click "Summarize"?

Reply (Contributor Author):
They need to click "Summarize". How about:

Once connected, your PRs will appear in the dashboard, and you can summarize them:


#### Base Metadata
![Connected Projects](../assets/images/2-getting-started/connected_projects.png)

Production environment is commonly used as the baseline, but any environment can serve as the base.
## Step 2: Automate Metadata with CI/CD {#cicd-automation}

Choose one method:
To enable easy validation with Recce and enrich PR summaries, configure CI/CD automation to automatically prepare metadata for every PR.

**Method 1: Generate locally**
### What You'll Get

```
dbt docs generate --target-path target-base --target <your_prod_target>
```
With CI/CD configured, you get:

- Automatic metadata upload on every PR
- **One-click "Launch Recce"** to validate changes interactively
- **Enriched PR summaries** with comprehensive model impact analysis

Review comment (Contributor):
Suggest we remove the ** in line 48 and line 49.

Reply (Contributor Author):
OK


**Method 2: dbt Cloud**<br>
Deploy → Jobs → Production job → Recent run → Download artifacts
### Setup Requirements

**Method 3: dbt Docs server**<br>
Download the artifacts directly from dbt docs server:
- GitHub integration completed (Step 1)
- CI/CD jobs that generate dbt docs

- `<dbt_docs_url>/manifest.json`
- `<dbt_docs_url>/catalog.json`
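
For Method 3, the two downloads can be scripted. The following is a rough sketch under stated assumptions: the helper names `artifact_url` and `download_base_artifacts` are hypothetical, the docs server is assumed to serve both files without authentication, and `target-base` is used as the output directory to match the local-generation command above.

```shell
#!/bin/sh
# Sketch: fetch manifest.json and catalog.json from a dbt docs server.

artifact_url() {
  # $1: base URL of the dbt docs server, $2: artifact file name
  # Strip one trailing slash so the result has a single separator.
  printf '%s/%s\n' "${1%/}" "$2"
}

download_base_artifacts() {
  # $1: base URL of the dbt docs server
  # $2: output directory (defaults to target-base)
  out_dir="${2:-target-base}"
  mkdir -p "$out_dir" || return 1

  for name in manifest.json catalog.json; do
    # --fail makes curl return non-zero on HTTP errors such as 404.
    curl --fail --silent --show-error --location \
      --output "$out_dir/$name" "$(artifact_url "$1" "$name")" || return 1
  done
}

# Example call:
#   download_base_artifacts <dbt_docs_url>
```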
### How It Works

#### Current Metadata
The CI/CD integration automates the metadata upload process:

Use development environment or PR branch as current to compare against the base.
1. **Automatic upload**: Metadata is uploaded to Recce Cloud for both base and current environments

Review comment (Contributor):
Suggest we align the environment names with the UI. The UI has production metadata and PR sessions, but no "base" and "current" environments.

Reply (Contributor Author):
Good point, I will align it.

2. **Ready to validate**: PR appears in your dashboard with a "Launch Recce" button
3. **Enriched summaries**: Enriched PR summaries with detailed model impact analysis

Review comment (Contributor):
Line 62: does this mean we currently support Metadata?

Reply (Contributor Author):
Not yet. Let me remove it and add it back when we support it.


Choose one method:
### Setup Guides

**Method 1: Generate locally**
Follow the detailed setup guides for your CI/CD platform:

```
dbt docs generate --target <your_dev_target>
```
- [Setup CD](/7-cicd/setup-cd/) - Configure continuous deployment for base environment metadata
- [Setup CI](/7-cicd/setup-ci/) - Configure continuous integration for PR environment metadata

**Method 2: dbt Cloud**<br>
Deploy → Jobs → CI job → Recent run → Download artifacts
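
Whichever method is used, both environments need `manifest.json` and `catalog.json` before a comparison can run. A minimal sketch of a pre-upload check follows; the function name `check_artifacts` is hypothetical, and the directory names assume `target-base` for the base environment (as in the command above) and dbt's default `target` for the current environment.

```shell
#!/bin/sh
# Sketch: verify that a directory contains both dbt artifacts Recce needs.

check_artifacts() {
  # $1: directory expected to hold manifest.json and catalog.json
  missing=0
  for name in manifest.json catalog.json; do
    if [ ! -f "$1/$name" ]; then
      echo "missing: $1/$name" >&2
      missing=1
    fi
  done
  # Returns 0 when both files exist, 1 otherwise.
  return "$missing"
}

# Example call:
#   check_artifacts target-base && check_artifacts target
```

The check reports every missing file rather than stopping at the first, so a single run shows everything that still needs to be generated or downloaded.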
<!-- insert a video -->

## Advanced Features

Review comment (Contributor):
I think we can get rid of this line and replace it with something like "You have the metadata, the next is to have data diffing. See Data Diffing", and then make Data Diffing a standalone page.

Reply (Contributor Author):
OK!

Review comment (Contributor):
Sorry, I think we can remove everything after line 73, since there is https://docs.reccehq.com/5-data-diffing/connect-to-warehouse/


## Data Warehouse Diffing {#data-diffing}
### Data Warehouse Diffing {#data-diffing}

Go beyond metadata to see how changes affect your actual data. Configure your data warehouse connection to compare query results and catch issues before they impact production.

### Setup Requirements
#### Why Use Data Diffing?

While metadata comparison shows structural changes, data diffing reveals:

- Unexpected changes in row counts or data distributions
- Breaking changes in downstream transformations
- Data quality issues before production deployment

#### Setup Requirements

- Data warehouse credentials with read access
- Network connectivity to your warehouse
- Base and current environments configured in previous step
- CI/CD automation configured (Step 2 recommended)

### Supported Warehouses
#### Supported Warehouses

- Snowflake
- Databricks
- Others coming in future releases

### Warehouse Connection

Configure connection to your data warehouse to enable query result comparisons. For detailed connection settings, see [Connect to Warehouse](../5-data-diffing/connect-to-warehouse.md).
#### Configure Warehouse Connection

**Connection setup:**

1. Navigate to [settings](https://cloud.reccehq.com/settings#organization){target="_blank"}
2. Add Connection
3. Navigate to your [project home](https://cloud.datarecce.io/) and open the project settings by clicking the gear icon
1. Navigate to [organization settings](https://cloud.reccehq.com/settings#organization){target="\_blank"}
2. Click "Add Connection" and enter your warehouse credentials
3. Go to your [project home](https://cloud.reccehq.com/) and click the gear icon
4. Link the newly added connection to your project

Your connection credentials are secure. See our [security practices](https://reccehq.com/security/){target="_blank"} for details.

<!-- insert a video -->

### How to Use Data Diffing

Recce supports several data diffing methods. See the Data Diffing sections for details:

- [Row Count Diff](/5-data-diffing/row-count-diff)
- [Profile Diff](/5-data-diffing/profile-diff/)
- [Value Diff](/5-data-diffing/value-diff/)
- [Top-K Diff](/5-data-diffing/topK-diff/)
- [Histogram Diff](/5-data-diffing/histogram-diff/)
- [Query](/5-data-diffing/query/)

## GitHub Integration {#github-integration}

Connect your GitHub repo to see all PRs in one place and validate changes before they hit production.

### Setup Requirements

- GitHub repository with dbt project
- Repository admin access for initial setup
- Active PRs with model changes

!!!Note
You'll need administrative access to the GitHub organization you want to connect. Please ensure you have the necessary permissions for **GitHub App installations**.

### GitHub Connection

Connect your repository to track pull requests and validate changes.

**Connection setup:**
Your connection credentials are secure. See our [security practices](https://reccehq.com/security/){target="\_blank"} for details.

1. Navigate to settings
2. Connect GitHub repository
3. Authorize Recce access
4. Select repository
For detailed connection settings, see [Connect to Warehouse](../5-data-diffing/connect-to-warehouse.md).

<!-- insert a video -->

### How to Use PR Tracking

Once connected, Recce displays all open and draft PRs in your dashboard.

### PR Validation Workflow

- View open/draft PRs in dashboard
- Select PR to validate
- Upload PR metadata (until CI/CD is configured)
- Launch Recce to analyze changes


## CI/CD Automation {#cicd-automation}

Set up CI/CD to automatically upload metadata and run validation checks on every PR.

!!!Note
Available with Team plan (free trial included).

### Setup Requirements
See the CI/CD sections for complete setup guides:

- [Setup CD](/7-cicd/setup-cd/)
- [Setup CI](/7-cicd/setup-ci/)

- GitHub integration configured
- Team plan subscription or free trial

### Automation Benefits

- Automatic metadata upload on every PR
- Consistent validation across all PRs
- Reduced manual setup steps
- Integrated PR status checks
- Validation results directly in PR


#### Available Diffing Methods

Recce supports multiple data diffing approaches:

- [Row Count Diff](/5-data-diffing/row-count-diff) - Compare total row counts
- [Profile Diff](/5-data-diffing/profile-diff/) - Statistical profiling of columns
- [Value Diff](/5-data-diffing/value-diff/) - Detailed value-level comparison
- [Top-K Diff](/5-data-diffing/topK-diff/) - Compare top values and frequencies
- [Histogram Diff](/5-data-diffing/histogram-diff/) - Distribution analysis
- [Query](/5-data-diffing/query/) - Custom SQL queries for validation