Add a new column that enumerates rows with incremental or uuid identifiers. Can also be used to copy a column or fill a new column with a constant value.
Table of Contents | Source: src/cmd/enumerate.rs
Description | Examples | Usage | Enum Options | Common Options
Description
Add a new column enumerating the lines of a CSV file. This can be useful to keep track of a specific line order, give a unique identifier to each line or even make a copy of the contents of a column.
The enum function has six modes of operation:
- INCREMENT. Add an incremental identifier to each of the lines:
$ qsv enum file.csv
- UUID4. Add a uuid v4 to each of the lines:
$ qsv enum --uuid4 file.csv
- UUID7. Add a uuid v7 to each of the lines:
$ qsv enum --uuid7 file.csv
- CONSTANT. Create a new column filled with a given value:
$ qsv enum --constant 0
- COPY. Copy the contents of a column to a new one:
$ qsv enum --copy names
- HASH. Create a new column with the deterministic hash of the given column/s. The hash uses the xxHash algorithm and is platform-agnostic (see https://github.com/DoumanAsh/xxhash-rust for more information):
$ qsv enum --hash 1- // hash all columns, auto-ignores existing "hash" column
$ qsv enum --hash col2,col3,col4 // hash specific columns
$ qsv enum --hash col2 // hash a single column
$ qsv enum --hash /record_id|name|address/ // hash columns that match a regex
$ qsv enum --hash !/record_id/ // hash all columns except the record_id column
Finally, you should also be able to shuffle the lines of a CSV file by sorting on the generated uuid4s:
$ qsv enum --uuid4 file.csv | qsv sort -s uuid4 > shuffled.csv
This will shuffle the lines of the file.csv file as uuids generated using the v4 specification are random and for practical purposes, are unique (1 in 2^122). See https://en.wikipedia.org/wiki/Universally_unique_identifier#Collisions
However, sorting on uuid7 identifiers will not work as they are time-based and monotonically increasing, and will not shuffle the lines.
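To illustrate why sorting on random identifiers shuffles rows, here is a minimal Python sketch of the same idea. It is not how qsv is implemented; the function name `shuffle_csv` and the sample data are hypothetical, and it assumes the first row is a header and that the new column is appended as the last column.

```python
import csv
import io
import uuid


def shuffle_csv(text: str) -> str:
    """Shuffle data rows by tagging each with a random uuid4 and sorting on it."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    # Tag every data row with a random v4 uuid (the role of the --uuid4 column).
    tagged = [(str(uuid.uuid4()), row) for row in data]
    # Sorting on a random key leaves the rows in an arbitrary order.
    tagged.sort(key=lambda pair: pair[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumption: the identifier column goes last.
    writer.writerow(header + ["uuid4"])
    for key, row in tagged:
        writer.writerow(row + [key])
    return out.getvalue()


sample = "name,age\nalice,30\nbob,25\ncarol,41\n"
shuffled = shuffle_csv(sample)
print(shuffled)
```

The output order differs from run to run because the sort key is random. A time-ordered key such as a uuid v7 would sort back into insertion order, which is why it cannot be used for shuffling.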
Examples
Add an incremental index column starting from 0 (default)
qsv enum data.csv
Add an incremental index column starting from 100 and incrementing by 10
qsv enum --start 100 --increment 10 data.csv
Add a uuid v4 column
qsv enum --uuid4 data.csv
Add a uuid v7 column
qsv enum --uuid7 data.csv
Add a constant column with the value "active"
qsv enum --constant active data.csv
Add a constant column with null values
qsv enum --constant "<NULL>" data.csv
Add a copy of the "username" column as "username_copy"
qsv enum --copy username data.csv
Add a hash column with the hash of columns "first_name" and "last_name"
qsv enum --hash first_name,last_name data.csv
Add a hash column with the hash of all columns except an existing "hash" column
qsv enum --hash 1- data.csv
Add a hash column with the hash of all columns except "id" and "uuid" columns
qsv enum --hash "!id,!uuid" data.csv
Add a hash column with the hash of all columns that match the regex "record|name|address"
qsv enum --hash "/record|name|address/" data.csv
For more examples, see tests.
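As a rough illustration of what the `--start` and `--increment` example produces, the following Python sketch mimics Increment mode on an in-memory CSV. It is a sketch under stated assumptions, not qsv's implementation: the function name `enumerate_rows` is hypothetical, the first row is treated as a header, and the new column is assumed to be appended as the last column.

```python
import csv
import io


def enumerate_rows(text: str, start: int = 0, increment: int = 1,
                   new_column: str = "index") -> str:
    """Append a column whose value is start + row_number * increment."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Assumption: the identifier column goes last.
    writer.writerow(header + [new_column])
    for i, row in enumerate(data):
        writer.writerow(row + [start + i * increment])
    return out.getvalue()


result = enumerate_rows("name\nalice\nbob\ncarol\n", start=100, increment=10)
print(result)
# name,index
# alice,100
# bob,110
# carol,120
```

With the defaults (`start=0`, `increment=1`) the same rows would be numbered 0, 1, 2.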
Usage
qsv enum [options] [<input>]
qsv enum --help
Enum Options
| Option | Type | Description | Default |
|---|---|---|---|
| -c, --new-column | string | Name of the column to create. | index |
| --start | string | The value to start the enumeration from. Only applies in Increment mode. | 0 |
| --increment | string | The value to increment the enumeration by. Only applies in Increment mode. | 1 |
| --constant | string | Fill a new column with the given value. Changes the default column name to "constant" unless overridden by --new-column. To specify a null value, pass the literal "<NULL>". | |
| --copy | string | Name of a column to copy. Changes the default column name to "{column}_copy" unless overridden by --new-column. | |
| --uuid4 | flag | When set, the column will be populated with uuids (v4) instead of the incremental identifier. Changes the default column name to "uuid4" unless overridden by --new-column. | |
| --uuid7 | flag | When set, the column will be populated with uuids (v7) instead of the incremental identifier. uuid v7 is a time-based uuid and is monotonically increasing (see https://buildkite.com/blog/goodbye-integers-hello-uuids). Changes the default column name to "uuid7" unless overridden by --new-column. | |
| --hash | string | Create a new column filled with the hash of the given column/s. Use "1-" to hash all columns. Changes the default column name to "hash" unless overridden by --new-column. Will remove an existing "hash" column if it exists. | |
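The default column names described in the table can be summarized in a small Python sketch. This only restates the documented naming rules; the function `default_column_name` and its mode strings are hypothetical and do not correspond to any qsv API.

```python
from typing import Optional


def default_column_name(mode: str, copy_column: Optional[str] = None,
                        new_column: Optional[str] = None) -> str:
    """Return the name of the new column for a given mode."""
    # --new-column always overrides the per-mode default.
    if new_column is not None:
        return new_column
    # --copy derives the name from the copied column.
    if mode == "copy":
        return f"{copy_column}_copy"
    # Every other mode has a fixed default name.
    defaults = {
        "increment": "index",
        "constant": "constant",
        "uuid4": "uuid4",
        "uuid7": "uuid7",
        "hash": "hash",
    }
    return defaults[mode]


print(default_column_name("increment"))                 # index
print(default_column_name("copy", copy_column="names"))  # names_copy
print(default_column_name("uuid4", new_column="id"))     # id
```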
Common Options
| Option | Type | Description | Default |
|---|---|---|---|
| -h, --help | flag | Display this message | |
| -o, --output | string | Write output to <file> instead of stdout. | |
| -n, --no-headers | flag | When set, the first row will not be interpreted as headers. | |
| -d, --delimiter | string | The field delimiter for reading CSV data. Must be a single character. | , |