Migration docs restructure #21245
Open: bsanchez-the-roach wants to merge 63 commits into main from molt-redux
+10,731
−3,011
3521d6f
Migration docs restructure
bsanchez-the-roach c392ac8
Merge branch 'main' into molt-redux
bsanchez-the-roach d2a9b90
worked out the IA for migration variables, basically
bsanchez-the-roach 3b21e54
reverting molt-fetch to main
bsanchez-the-roach 91df8d5
moved the splitting up of Fetch into a separate PR, fixed links for t…
bsanchez-the-roach b5ec9c6
moved the splitting up of Fetch into a separate PR, fixed links for t…
bsanchez-the-roach c57d19b
Merge branch 'main' into molt-redux
bsanchez-the-roach 258ed81
Merge branch 'main' into molt-redux
bsanchez-the-roach d1196eb
more progress on considerations: granularity, rollback, replication
bsanchez-the-roach a9dc413
Merge branch 'main' into molt-redux
bsanchez-the-roach d7722fc
added validation strategy consideration
bsanchez-the-roach d3ffe00
removed dead links
bsanchez-the-roach 95de160
added data transformation strategy
bsanchez-the-roach 8b21e2c
Merge branch 'main' into molt-redux
bsanchez-the-roach d5b0a56
Merge branch 'main' into molt-redux
bsanchez-the-roach 323ec2e
did main splitting of Fetch docs
bsanchez-the-roach 8facfdb
Merge branch 'main' into molt-redux
bsanchez-the-roach f698b63
merged in recent changes to replicator docs
bsanchez-the-roach b52e1a5
merging in recent replicator changes to phase 2
bsanchez-the-roach 122a74d
Update pr-reviews.yml to allow deployment of draft
bsanchez-the-roach b05a46a
Update pr-reviews.yml, returning to previous
bsanchez-the-roach 97dd735
separated Replicator to match Fetch, fixed links
bsanchez-the-roach b91f400
Merge branch 'molt-redux-phase-2' of github.com:cockroachdb/docs into…
bsanchez-the-roach 4a52019
Made Fetch docs much more compact and clean, improved linking, added …
bsanchez-the-roach c6e8a4c
Merge pull request #21819 from cockroachdb/molt-redux-phase-2
bsanchez-the-roach 011cf41
added small intro paragraphs to pages
bsanchez-the-roach 7af2c98
Merge pull request #22071 from cockroachdb/molt-redux-phase-2
bsanchez-the-roach 3200a6a
restarting build
bsanchez-the-roach a1da992
Merge branch 'main' into molt-redux
bsanchez-the-roach 19da8e9
WIP on splitting pages by source db type
bsanchez-the-roach dfc0035
merged in main mostly userscripts
bsanchez-the-roach fdab650
classic bulk load split up by source db type
bsanchez-the-roach 237d128
fixed two includes to pass linkcheck
bsanchez-the-roach ad802e7
added phased bulk load per source type
bsanchez-the-roach 7761ea0
added delta migration per source type
bsanchez-the-roach 20e5a8d
Merge branch 'main' into molt-redux
bsanchez-the-roach d1acbb2
fixed sidebar, updated diagrams, added start of phased delta migration
bsanchez-the-roach 327104e
Merge branch 'molt-redux' of github.com:cockroachdb/docs into molt-redux
bsanchez-the-roach 725f4e2
added phased delta with failback
bsanchez-the-roach e38a5c7
Merge branch 'main' into molt-redux
bsanchez-the-roach 5dcc410
removed original Migration Flows pages and all references/links to them
bsanchez-the-roach 3a07cf0
Merge branch 'main' into molt-redux
bsanchez-the-roach 30a1824
added new diagrams, moved type mapping, removed duplicate info in Con…
bsanchez-the-roach 259e680
removed molt-setup.md
bsanchez-the-roach 5cf7b8c
Merge branch 'main' into molt-redux
bsanchez-the-roach 1178142
rebuilding deploy preview
bsanchez-the-roach 649a761
Merge branch 'molt-redux' of github.com:cockroachdb/docs into molt-redux
bsanchez-the-roach b1a5528
Merge branch 'main' into molt-redux
bsanchez-the-roach 899e502
line edits on the migration considerations section
bsanchez-the-roach 7335c98
Merge branch 'main' into molt-redux
bsanchez-the-roach 60da1f0
fixed broken link
bsanchez-the-roach 00b29b3
Merge branch 'molt-redux' of github.com:cockroachdb/docs into molt-redux
bsanchez-the-roach 32cab1b
removed draft fetch flow image
bsanchez-the-roach 15daa0e
merged from main, resolved conflicts from metrics snapshot PR
bsanchez-the-roach 0cae90e
moved crdb-to-crdb callout
bsanchez-the-roach 81bd421
round one of changes based on Ryan Luu, Tuan, and Steven's feedback
bsanchez-the-roach 75501d8
added rollback details in migration walkthrough descriptions
bsanchez-the-roach d6e6648
updated metrics
bsanchez-the-roach d0a99b7
Merge branch 'main' into molt-redux
bsanchez-the-roach ae59e85
made changes from Ryan Luu's feedback
bsanchez-the-roach fc64b65
Merge branch 'molt-redux' of github.com:cockroachdb/docs into molt-redux
bsanchez-the-roach 68a29ce
improved limitation visibility
bsanchez-the-roach d6ea525
added limitation and fixed links
bsanchez-the-roach
254 changes: 254 additions & 0 deletions
src/current/_includes/molt/classic-bulk-load-all-sources.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,254 @@ | ||
| A [*Classic Bulk Load Migration*]({% link molt/migration-approach-classic-bulk-load.md %}) is the simplest way of [migrating data to CockroachDB]({% link molt/migration-overview.md %}). In this approach, you stop application traffic to the source database and migrate data to the target cluster using [MOLT Fetch]({% link molt/molt-fetch.md %}) during a **significant downtime window**. Application traffic is then cut over to the target after schema finalization and data verification. | ||
|
|
||
| - All source data is migrated to the target [at once]({% link molt/migration-considerations-granularity.md %}). | ||
|
|
||
| - This approach does not utilize [continuous replication]({% link molt/migration-considerations-replication.md %}). | ||
|
|
||
| - [Rollback]({% link molt/migration-considerations-rollback.md %}) is manual, but in most cases it's simple, as the source database is preserved and write traffic begins on the target all at once. If you wish to roll back before the target has received any writes that are not present on the source database, nothing needs to be done. If you wish to roll back after the target has received writes that are not present on the source database, you must manually replicate these new rows on the source. | ||
|
|
||
| This approach is best for small databases (<100 GB), internal tools, dev/staging environments, and production environments that can handle business disruption. It's a simple approach that guarantees full data consistency and is easy to execute with limited resources, but it can only be performed if your system can handle significant downtime. | ||
|
|
||
| This page describes an example scenario. While the commands provided can be copied and pasted, you may need to adapt them to suit your specific environment. | ||
|
|
||
| <div style="text-align: center;"> | ||
| <img src="{{ 'images/molt/molt_classic_bulk_load_flow.svg' | relative_url }}" alt="Classic Bulk Load Migration flow" style="max-width:100%" /> | ||
| </div> | ||
|
|
||
| ## Example scenario | ||
|
|
||
| You have a small (50 GB) database that provides the data store for a web application. You want to migrate the entirety of this database to a new CockroachDB cluster. You schedule a maintenance window for Saturday from 2 AM to 6 AM, and announce it to your users several weeks in advance. | ||
|
|
||
| The application runs on a Kubernetes cluster. | ||
|
|
||
| **Estimated system downtime:** 4 hours. | ||
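As a rough sanity check on the window, the sketch below computes the minimum sustained transfer rate needed to move a database of a given size in a given number of hours. The helper name `min_throughput_mb_s` and the 2-hour budget are illustrative assumptions, not part of the MOLT tools. The window also has to cover verification, schema finalization, and cutover, so the time available for the load itself is shorter than the full 4 hours.

```shell
# min_throughput_mb_s SIZE_GB WINDOW_HOURS
# Hypothetical helper: prints the minimum sustained rate in MB/s
# (1 GB = 1024 MB) needed to move SIZE_GB within WINDOW_HOURS.
min_throughput_mb_s() {
  awk -v gb="$1" -v hours="$2" \
    'BEGIN { printf "%.2f\n", (gb * 1024) / (hours * 3600) }'
}

# The scenario's 50 GB spread over the whole 4-hour window.
min_throughput_mb_s 50 4

# The same data if only 2 hours (an assumed budget) are reserved for the load.
min_throughput_mb_s 50 2
```

A dry run in a test environment gives the measured rate to compare against these figures.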
|
|
||
| ## Before the migration | ||
|
|
||
| - Install the [MOLT (Migrate Off Legacy Technology)]({% link molt/molt-fetch-installation.md %}#installation) tools. | ||
| - Review the [MOLT Fetch]({% link molt/molt-fetch-best-practices.md %}) documentation. | ||
| - [Develop a migration plan]({% link molt/migration-strategy.md %}#develop-a-migration-plan) and [prepare for the migration]({% link molt/migration-strategy.md %}#prepare-for-migration). | ||
| - **Recommended:** Perform a dry run of this full set of instructions in a development environment that closely resembles your production environment. This can help you get a realistic sense of the time and complexity it requires. | ||
| - Announce the maintenance window to your users. | ||
| - Understand the prerequisites and limitations of the MOLT tools: | ||
|
|
||
| <section class="filter-content" markdown="1" data-scope="oracle"> | ||
| {% include molt/oracle-migration-prerequisites.md %} | ||
| </section> | ||
|
|
||
| {% include molt/molt-limitations.md %} | ||
|
|
||
| ## Step 1: Prepare the source database | ||
|
|
||
| In this step, you will: | ||
|
|
||
| - [Create a dedicated migration user on your source database](#create-migration-user-on-source-database). | ||
|
|
||
| {% include molt/migration-prepare-database.md %} | ||
|
|
||
| ## Step 2: Prepare the target database | ||
|
|
||
| In this step, you will: | ||
|
|
||
| - [Provision and run a new CockroachDB cluster](#provision-a-cockroachdb-cluster). | ||
| - [Define the tables on the target cluster](#define-the-target-tables) to match those on the source. | ||
| - [Create a SQL user on the target cluster](#create-the-sql-user) with the necessary write permissions. | ||
|
|
||
| ### Provision a CockroachDB cluster | ||
|
|
||
| Use one of the following options to create and run a new CockroachDB cluster. This is your migration **target**. | ||
|
|
||
| #### Option 1: Create a secure cluster locally | ||
|
|
||
| If you have the CockroachDB binary installed locally, you can manually deploy a multi-node, self-hosted CockroachDB cluster on your local machine. | ||
|
|
||
| Learn how to [deploy a CockroachDB cluster locally]({% link {{ site.versions["stable"] }}/secure-a-cluster.md %}). | ||
|
|
||
| #### Option 2: Create a CockroachDB Self-Hosted cluster on AWS | ||
|
|
||
| You can manually deploy a multi-node, self-hosted CockroachDB cluster on Amazon's AWS EC2 platform, using AWS's managed load-balancing service to distribute client traffic. | ||
|
|
||
| Learn how to [deploy a CockroachDB cluster on AWS]({% link {{ site.versions["stable"] }}/deploy-cockroachdb-on-aws.md %}). | ||
|
|
||
| #### Option 3: Create a CockroachDB Cloud cluster | ||
|
|
||
| CockroachDB Cloud is a fully-managed service run by Cockroach Labs, which simplifies the deployment and management of CockroachDB. | ||
|
|
||
| [Sign up for a CockroachDB Cloud account](https://cockroachlabs.cloud) and [create a cluster]({% link cockroachcloud/create-your-cluster.md %}) using [trial credits]({% link cockroachcloud/free-trial.md %}). | ||
|
|
||
| ### Define the target tables | ||
|
|
||
| {% include molt/migration-prepare-schema.md %} | ||
|
|
||
| ### Create the SQL user | ||
|
|
||
| {% include molt/migration-create-sql-user.md %} | ||
|
|
||
| ## Step 3: Stop application traffic | ||
|
|
||
| With both the source and target databases prepared for the data load, it's time to stop application traffic to the source. At the start of the maintenance window, scale down the application's Kubernetes deployment to zero replicas. | ||
|
|
||
| {% include_cached copy-clipboard.html %} | ||
| ~~~shell | ||
| kubectl scale deployment app --replicas=0 | ||
| ~~~ | ||
|
|
||
| {{ site.data.alerts.callout_danger }} | ||
| Application downtime begins now. | ||
|
|
||
| It is strongly recommended that you perform a dry run of this migration in a test environment. This will allow you to practice using the MOLT tools in real time, and it will give you an accurate sense of how long application downtime might last. | ||
| {{ site.data.alerts.end }} | ||
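Before starting the load, it can help to confirm that no application pods are still running and writing to the source. The following is a minimal sketch, assuming the deployment is named `app` as in the command above. The helper `wait_for_zero_replicas` is hypothetical, and it assumes that `.status.replicas` is empty or `0` once the deployment has no pods.

```shell
# wait_for_zero_replicas DEPLOYMENT [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Hypothetical helper: polls the deployment until it reports no pods.
wait_for_zero_replicas() {
  deployment="$1"
  interval="${2:-5}"
  attempts="${3:-60}"
  while [ "$attempts" -gt 0 ]; do
    # Assumption: the field is omitted (empty output) when no pods remain.
    count="$(kubectl get deployment "$deployment" -o jsonpath='{.status.replicas}')"
    if [ -z "$count" ] || [ "$count" = "0" ]; then
      echo "deployment/$deployment has no running pods"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "timed out waiting for deployment/$deployment to scale down" >&2
  return 1
}

# Usage (requires access to the cluster):
#   wait_for_zero_replicas app
```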
|
|
||
| ## Step 4: Load data into CockroachDB | ||
|
||
|
|
||
| In this step, you will: | ||
|
|
||
| - [Configure MOLT Fetch with the flags needed for your migration](#configure-molt-fetch). | ||
| - [Run MOLT Fetch](#run-molt-fetch). | ||
| - [Understand how to continue a load after an interruption](#continue-molt-fetch-after-an-interruption). | ||
|
|
||
| ### Configure MOLT Fetch | ||
|
|
||
| The [MOLT Fetch documentation]({% link molt/molt-fetch.md %}) includes detailed information about how to [configure MOLT Fetch]({% link molt/molt-fetch.md %}#run-molt-fetch), and how to [monitor MOLT Fetch metrics]({% link molt/molt-fetch-monitoring.md %}). | ||
|
|
||
| When you run `molt fetch`, you can configure the following options for data load: | ||
|
|
||
| <a id="schema-and-table-filtering"></a> | ||
| <a id="source-connection-string"></a> | ||
| <a id="table-handling-mode"></a> | ||
| <a id="target-connection-string"></a> | ||
| <a id="cloud-storage-authentication"></a> | ||
| <a id="secure-connections"></a> | ||
| <a id="intermediate-file-storage"></a> | ||
| <a id="data-load-mode"></a> | ||
| <a id="connection-strings"></a> | ||
|
|
||
| - [Specify source and target databases]({% link molt/molt-fetch.md %}#specify-source-and-target-databases): Specify URL‑encoded source and target connections. | ||
| - [Select data to migrate]({% link molt/molt-fetch.md %}#select-data-to-migrate): Specify schema and table names to migrate. | ||
| - [Define intermediate file storage]({% link molt/molt-fetch.md %}#define-intermediate-storage): Export data to cloud storage or a local file server. | ||
| - [Define fetch mode]({% link molt/molt-fetch.md %}#define-fetch-mode): Specify whether data is only exported to, or only imported from, intermediate storage. | ||
| - [Shard tables]({% link molt/molt-fetch.md %}#shard-tables-for-concurrent-export): Divide larger tables into multiple shards during data export. | ||
| - [Data load mode]({% link molt/molt-fetch.md %}#import-into-vs-copy-from): Choose between `IMPORT INTO` and `COPY FROM`. | ||
| - [Table handling mode]({% link molt/molt-fetch.md %}#handle-target-tables): Determine how existing target tables are initialized before load. | ||
| - [Define data transformations]({% link molt/molt-fetch.md %}#define-transformations): Define any row-level transformations to apply to the data before it reaches the target. | ||
| - [Monitor fetch metrics]({% link molt/molt-fetch-monitoring.md %}): Configure metrics collection during initial data load. | ||
|
|
||
| Read through the documentation to understand how to configure your `molt fetch` command and its flags. Follow [best practices]({% link molt/molt-fetch-best-practices.md %}), especially those related to security. | ||
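Because the connection strings must be URL-encoded, special characters in a password need to be percent-encoded before the string is exported. The sketch below shows one way to do that in portable shell. The `urlencode` helper, the user name, host, database name, and password are all placeholders for illustration, and the helper assumes single-line ASCII input.

```shell
# urlencode STRING
# Hypothetical helper: percent-encodes every byte except the unreserved
# characters A-Z, a-z, 0-9, '.', '_', '~', and '-'.
urlencode() {
  printf '%s' "$1" | LC_ALL=C awk '
    BEGIN { for (i = 1; i < 256; i++) ord[sprintf("%c", i)] = i }
    {
      out = ""
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        if (c ~ /[A-Za-z0-9._~-]/) out = out c
        else out = out sprintf("%%%02X", ord[c])
      }
      print out
    }'
}

# Placeholder credentials; in practice, read the password from a secret store.
MIGRATION_PASSWORD='p@ss:w/rd'
export SOURCE="postgres://migration_user:$(urlencode "$MIGRATION_PASSWORD")@source.example.com:5432/appdb?sslmode=verify-full"
```

Exporting the encoded string once keeps the password out of the `molt fetch` command line itself.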
|
|
||
| At minimum, the `molt fetch` command should include the source, target, data path, and [`--ignore-replication-check`]({% link molt/molt-fetch-commands-and-flags.md %}#ignore-replication-check) flags: | ||
|
|
||
| {% include_cached copy-clipboard.html %} | ||
| ~~~ shell | ||
| molt fetch \ | ||
| --source $SOURCE \ | ||
| --target $TARGET \ | ||
| --bucket-path 's3://bucket/path' \ | ||
| --ignore-replication-check | ||
| ~~~ | ||
|
|
||
| However, depending on the needs of your migration, you may need to set many more flags and prepare accompanying JSON files. | ||
|
|
||
| ### Run MOLT Fetch | ||
|
|
||
| Perform the bulk load of the source data. | ||
|
|
||
| 1. Run the [MOLT Fetch]({% link molt/molt-fetch.md %}) command to move the source data into CockroachDB. This example command passes the source and target connection strings [as environment variables](#secure-connections), writes [intermediate files](#intermediate-file-storage) to S3 storage, and uses the `truncate-if-exists` [table handling mode](#table-handling-mode) to truncate the target tables before loading data. It limits the migration to a single schema and filters for three specific tables. The [data load mode]({% link molt/molt-fetch.md %}#import-into-vs-copy-from) defaults to `IMPORT INTO`. Include the `--ignore-replication-check` flag to skip replication checkpoint queries, which eliminates the need to configure the source database for logical replication. | ||
|
|
||
| <section class="filter-content" markdown="1" data-scope="postgres"> | ||
| {% include_cached copy-clipboard.html %} | ||
| ~~~ shell | ||
| molt fetch \ | ||
| --source $SOURCE \ | ||
| --target $TARGET \ | ||
| --schema-filter 'migration_schema' \ | ||
| --table-filter 'employees|payments|orders' \ | ||
| --bucket-path 's3://migration/data/cockroach' \ | ||
| --table-handling truncate-if-exists \ | ||
| --ignore-replication-check | ||
| ~~~ | ||
| </section> | ||
|
|
||
| <section class="filter-content" markdown="1" data-scope="mysql"> | ||
| {% include_cached copy-clipboard.html %} | ||
| ~~~ shell | ||
| molt fetch \ | ||
| --source $SOURCE \ | ||
| --target $TARGET \ | ||
| --table-filter 'employees|payments|orders' \ | ||
| --bucket-path 's3://migration/data/cockroach' \ | ||
| --table-handling truncate-if-exists \ | ||
| --ignore-replication-check | ||
| ~~~ | ||
| </section> | ||
|
|
||
| <section class="filter-content" markdown="1" data-scope="oracle"> | ||
| The command assumes an Oracle Multitenant (CDB/PDB) source. [`--source-cdb`]({% link molt/molt-fetch-commands-and-flags.md %}#source-cdb) specifies the container database (CDB) connection string. | ||
|
|
||
| {% include_cached copy-clipboard.html %} | ||
| ~~~ shell | ||
| molt fetch \ | ||
| --source $SOURCE \ | ||
| --source-cdb $SOURCE_CDB \ | ||
| --target $TARGET \ | ||
| --schema-filter 'migration_schema' \ | ||
| --table-filter 'employees|payments|orders' \ | ||
| --bucket-path 's3://migration/data/cockroach' \ | ||
| --table-handling truncate-if-exists \ | ||
| --ignore-replication-check | ||
| ~~~ | ||
| </section> | ||
|
|
||
| {% include molt/fetch-data-load-output.md %} | ||
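The `--table-filter` values in the commands above list table names separated by `|`. When the list of tables is long or kept in a script, a small helper can build that value so that it stays in sync across repeated runs. This is a sketch only; `table_filter` is a hypothetical helper, not a MOLT feature.

```shell
# table_filter NAME...
# Hypothetical helper: joins table names with "|" into a single pattern.
table_filter() {
  # "$*" joins arguments with the first character of IFS; the subshell
  # keeps the IFS change from leaking into the caller.
  ( IFS='|'; printf '%s\n' "$*" )
}

TABLE_FILTER="$(table_filter employees payments orders)"
echo "--table-filter '$TABLE_FILTER'"
```

The resulting `$TABLE_FILTER` can then be passed as the flag's value in place of the literal string.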
|
|
||
| ### Continue MOLT Fetch after an interruption | ||
|
|
||
| {% include molt/fetch-continue-after-interruption.md %} | ||
|
|
||
| ## Step 5: Verify the data | ||
|
|
||
| In this step, you will use [MOLT Verify]({% link molt/molt-verify.md %}) to confirm that the source and target data are consistent. This ensures that the data load was successful. | ||
|
|
||
| ### Run MOLT Verify | ||
|
|
||
| {% include molt/verify-output.md %} | ||
|
|
||
| ## Step 6: Finalize the target schema | ||
|
|
||
| ### Add constraints and indexes | ||
|
|
||
| {% include molt/migration-modify-target-schema.md %} | ||
|
|
||
| ## Step 7: Cut over application traffic | ||
|
|
||
| With the target cluster verified and finalized, it's time to resume application traffic. | ||
|
|
||
| ### Modify application code | ||
|
|
||
| In the application back end, make sure that the application now directs traffic to the CockroachDB cluster. For example: | ||
|
|
||
| ~~~yml | ||
| env: | ||
| - name: DATABASE_URL | ||
| value: postgres://root@localhost:26257/defaultdb?sslmode=verify-full | ||
| ~~~ | ||
|
|
||
| ### Resume application traffic | ||
|
|
||
| Scale up the Kubernetes deployment to the original number of replicas: | ||
|
|
||
| {% include_cached copy-clipboard.html %} | ||
| ~~~shell | ||
| kubectl scale deployment app --replicas=3 | ||
| ~~~ | ||
|
|
||
| This ends downtime. | ||
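Scaling up only requests the pods; downtime effectively ends once they are ready. As a minimal sketch, assuming the same `app` deployment, the hypothetical helper below wraps `kubectl rollout status` to wait for the rollout and report the result. The 120-second default timeout is an arbitrary illustrative value.

```shell
# confirm_rollout DEPLOYMENT [TIMEOUT]
# Hypothetical helper: waits for the deployment's rollout to complete.
confirm_rollout() {
  if kubectl rollout status "deployment/$1" --timeout="${2:-120s}"; then
    echo "deployment/$1 rollout complete"
  else
    echo "deployment/$1 did not become ready; investigate before closing the maintenance window" >&2
    return 1
  fi
}

# Usage (requires access to the cluster):
#   confirm_rollout app
```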
|
|
||
| ## Troubleshooting | ||
|
|
||
| {% include molt/molt-troubleshooting-fetch.md %} | ||
|
|
||
| ## See also | ||
|
|
||
| - [MOLT Fetch]({% link molt/molt-fetch.md %}) | ||
| - [MOLT Verify]({% link molt/molt-verify.md %}) | ||
| - [Migration Overview]({% link molt/migration-overview.md %}) | ||
| - [MOLT Schema Conversion Tool]({% link cockroachcloud/migrations-page.md %}) | ||