docs: document phone number normalization migration #2510
base: master
Changes from all commits
b681f9b
fc9869f
ca76591
1db2440
e4eb1f1
11a798b
bcd209a
7d5df3d
52f7095
0d7af17
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,126 @@ | ||||||
| --- | ||||||
| id: normalize-phone-numbers | ||||||
| title: Normalize phone numbers to E.164 | ||||||
| sidebar_label: Normalize phone numbers | ||||||
| --- | ||||||
|
|
||||||
| Ory Kratos normalizes phone numbers to [E.164 format](https://en.wikipedia.org/wiki/E.164) when they're used as identifiers, | ||||||
| verifiable addresses, or recovery addresses. New data is normalized on write. Existing data continues to work through a | ||||||
| backward-compatible lookup, but you should run the `normalize-phone-numbers` migration command after upgrading to converge all | ||||||
| rows to E.164. | ||||||
|
|
||||||
| This guide is for self-hosted Kratos administrators (OSS and OEL). Ory Network customers don't need to take any action. | ||||||
|
|
||||||
| :::important | ||||||
|
|
||||||
| Back up your database before running the migration. The migration doesn't store the original values, so there's no automatic | ||||||
| rollback. To revert, restore the database from that backup. | ||||||
|
|
||||||
| ::: | ||||||
|
|
||||||
| ## Why normalize | ||||||
|
|
||||||
| Before this change, Kratos stored phone numbers exactly as users entered them. A user who registered with `+49 176 671 11 638` and | ||||||
|
|
||||||
| another who registered with `+4917667111638` would create two separate identities for the same phone number. Lookups, recovery, | ||||||
| and verification could behave inconsistently depending on the input format. | ||||||
|
|
||||||
| After normalization, all phone numbers are stored in E.164 format (for example, `+4917667111638`). Lookups match regardless of how | ||||||
| the user formatted the input. | ||||||
|
|
||||||
| ## Rollout sequence | ||||||
|
|
||||||
| :::caution | ||||||
|
|
||||||
| Don't run the migration before deploying the new Kratos version. The previous version does exact-string matching on identifiers. | ||||||
| If you normalize the database first, users who type their phone number in the original (non-E.164) format won't be able to log in | ||||||
| until the new code is deployed. | ||||||
|
|
||||||
| ::: | ||||||
|
|
||||||
| Run the steps in this exact order: | ||||||
|
|
||||||
| 1. **Deploy the new Kratos version.** | ||||||
|
Member (review comment): Please check the style guide in README.md, or give it to your LLM. No bolding.
|
||||||
| The new code normalizes phone numbers on write and uses a backward-compatible lookup that matches both E.164 and legacy | ||||||
| formats. Existing users can still log in with whatever format they originally registered with. | ||||||
|
|
||||||
| 2. **Run the migration command.** | ||||||
|
Member (review comment): See above.
|
||||||
| After the deploy completes and traffic is stable, run: | ||||||
|
|
||||||
| ``` | ||||||
| kratos migrate normalize-phone-numbers <database-url> | ||||||
| ``` | ||||||
|
|
||||||
| Or with the DSN from the environment: | ||||||
|
|
||||||
| ``` | ||||||
| export DSN=... | ||||||
| kratos migrate normalize-phone-numbers -e | ||||||
| ``` | ||||||
|
|
||||||
| The command iterates over `identity_credential_identifiers`, `identity_verifiable_addresses`, and `identity_recovery_addresses` | ||||||
| and rewrites any non-E.164 phone numbers in place. | ||||||
|
|
||||||
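Once the new version is deployed, step 2 can be wrapped in a small shell function that previews the changes before applying them. This is only a sketch, not part of Kratos: `run_phone_migration` is a hypothetical name, and it assumes that `kratos` exits with a non-zero status when a run fails. It uses only the `--dry-run` and `-e` flags described under Flags.

```shell
# Hypothetical wrapper: preview the migration, ask for confirmation, then apply it.
# Assumes the new Kratos version is already deployed and DSN is exported.
run_phone_migration() {
  if [ -z "${DSN:-}" ]; then
    echo "DSN is not set" >&2
    return 1
  fi

  # Preview only: nothing is written in dry-run mode.
  kratos migrate normalize-phone-numbers --dry-run -e || return 1

  # Ask before touching data; anything other than "y" aborts.
  printf 'Apply these changes? [y/N] ' >&2
  read -r answer
  if [ "$answer" != "y" ]; then
    echo "Aborted." >&2
    return 1
  fi

  kratos migrate normalize-phone-numbers -e
}
```

Take the database backup before calling the function. The confirmation prompt gives you a chance to read the dry-run output first.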
| ## What the command does | ||||||
|
|
||||||
| The command uses keyset pagination to scan three tables in batches: | ||||||
|
|
||||||
| | Table | Column | Filter | | ||||||
| | --------------------------------- | ------------ | ---------------------- | | ||||||
| | `identity_credential_identifiers` | `identifier` | `identifier LIKE '+%'` | | ||||||
| | `identity_verifiable_addresses` | `value` | `via = 'sms'` | | ||||||
| | `identity_recovery_addresses` | `value` | `via = 'sms'` | | ||||||
|
|
||||||
| For each row, the command parses the value with the [`nyaruka/phonenumbers`](https://github.com/nyaruka/phonenumbers) library and | ||||||
| rewrites it to E.164 if parsing succeeds. Rows that fail to parse (for example, an OIDC subject that happens to start with `+`) | ||||||
|
|
||||||
| are left untouched and counted as skipped. | ||||||
|
|
||||||
| The command is **idempotent**: running it twice is safe. The second run only reports skipped rows. | ||||||
|
Member (review comment): See above.
|
||||||
|
|
||||||
| ## Flags | ||||||
|
|
||||||
| | Flag | Default | Description | | ||||||
| | ----------------------- | ------- | ------------------------------------------------------------------------ | | ||||||
| | `-e`, `--read-from-env` | `false` | Read the database connection string from the `DSN` environment variable. | | ||||||
| | `-b`, `--batch-size` | `1000` | Number of rows to process per batch. | | ||||||
| | `--dry-run` | `false` | Report what would change without writing. | | ||||||
|
|
||||||
| Use `--dry-run` first to preview the changes: | ||||||
|
|
||||||
| ``` | ||||||
| kratos migrate normalize-phone-numbers --dry-run -e | ||||||
| ``` | ||||||
|
|
||||||
| Each row that would be updated is printed in the form: | ||||||
|
|
||||||
| ``` | ||||||
| [dry-run] identity_credential_identifiers <id>: "+49 176 671 11 638" -> "+4917667111638" | ||||||
| ``` | ||||||
|
|
||||||
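On a large database the dry-run listing can be long, so a per-table count may be easier to review. The helper below is a minimal sketch that relies only on the `[dry-run] <table> <id>: ...` line shape shown above. `count_dry_run_changes` is a hypothetical name, and the sketch assumes the dry-run lines are written to stdout.

```shell
# Count, per table, the rows a dry run reports it would rewrite.
# Reads the dry-run output on stdin and prints "<table> <count>" lines.
count_dry_run_changes() {
  awk '$1 == "[dry-run]" { count[$2]++ }
       END { for (t in count) print t, count[t] }' | sort
}
```

For example: `kratos migrate normalize-phone-numbers --dry-run -e | count_dry_run_changes`.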
| ## Output | ||||||
|
|
||||||
| After processing all three tables, the command prints a summary: | ||||||
|
|
||||||
| ``` | ||||||
| === Summary === | ||||||
| identity_credential_identifiers: scanned=1234 updated=42 skipped=1192 errors=0 | ||||||
| identity_verifiable_addresses: scanned=987 updated=15 skipped=972 errors=0 | ||||||
| identity_recovery_addresses: scanned=987 updated=15 skipped=972 errors=0 | ||||||
| ``` | ||||||
|
|
||||||
| - `scanned`: rows examined. | ||||||
| - `updated`: rows rewritten to E.164 (or rows that _would_ be rewritten in dry-run mode). | ||||||
| - `skipped`: rows already in E.164 format, or values that aren't valid phone numbers. | ||||||
| - `errors`: rows that failed to update. Errors are logged to stderr with the row ID and source value. | ||||||
|
|
||||||
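If you run the migration from a script, you may want the script to stop when any table reports errors. The helper below is a sketch that relies only on the `errors=<n>` field in the summary format shown above. `summary_has_no_errors` is a hypothetical name, and the sketch assumes the summary is written to stdout.

```shell
# Succeed only if every "errors=<n>" field in the input is 0.
# Reads the command output on stdin.
summary_has_no_errors() {
  awk '{
         for (i = 1; i <= NF; i++) {
           if ($i ~ /^errors=[0-9]+$/) {
             split($i, kv, "=")
             if (kv[2] + 0 > 0) failed = 1
           }
         }
       }
       END { exit (failed ? 1 : 0) }'
}
```

For example: `kratos migrate normalize-phone-numbers -e | summary_has_no_errors || echo "check stderr for failed rows"`.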
| ## Duplicate handling | ||||||
|
|
||||||
| If the migration finds two rows that normalize to the same E.164 value (for example, `+49 176 671 11 638` and `+4917667111638` for | ||||||
| the same user), the update fails on the second row with a unique constraint violation, which the command logs as an error and | ||||||
| skips. You can resolve the duplicate manually and re-run the command. | ||||||
|
|
||||||
| In practice, duplicates are rare. Most identities have only one phone identifier per credential type. | ||||||
|
|
||||||
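To get a rough idea of whether duplicates exist before you run the migration, you can check an exported list of phone values for entries that collapse to the same string. The sketch below is an approximation, not what the command does: it strips everything except `+` and digits instead of parsing the number, and it ignores which identity or table a value belongs to. Treat its output as candidates to inspect. `find_candidate_duplicates` is a hypothetical name, and how you export the values is up to you.

```shell
# Print values that become identical once every character other than "+" and digits is removed.
# Reads one phone value per line on stdin. Rough pre-check only; this is not E.164 parsing.
find_candidate_duplicates() {
  tr -cd '+0-9\n' | sort | uniq -d
}
```

Each printed line is a stripped value that appeared more than once in the input.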
| ## Rolling back | ||||||
|
|
||||||
| The migration only converts non-E.164 values to E.164. It doesn't store the original value, so there's no automatic rollback. If | ||||||
| you need to revert, restore from the backup you took before running the command. | ||||||