update Bytehouse result #905
Open
gaoyuanning wants to merge 1 commit into ClickHouse:main from gaoyuanning:clickbench_bytehouse
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,73 +1,18 @@ | ||
| Bytehouse is a derivative of ClickHouse. | ||
| It is based on very old ClickHouse version (20.4.54418) and many features are unsupported. | ||
| # ByteHouse ClickBench Reproduction | ||
|
|
||
| ## Status | ||
| ## Reproduce The Result | ||
|
|
||
| ByteHouse's international cloud (bytehouse.cloud) is no longer reachable | ||
| from outside the China region. The service still operates within China | ||
| via Volcengine. All existing results in this directory were collected | ||
| against the international cloud and have been re-tagged with | ||
| `"historical"`. Future submissions running against a self-managed | ||
| ByteHouse instance (or via Volcengine) should not be tagged historical. | ||
| If you want to reproduce the benchmark result, please send an email to [gaoyuanning@bytedance.com](mailto:gaoyuanning@bytedance.com) to get the EC2 login information. | ||
|
|
||
| https://bytehouse.cloud/signup | ||
| After logging in to the EC2 instance: | ||
|
|
||
| Sign Up. Only Asia-Pacific South-East 1 AWS region is available. Verify email. | ||
|
|
||
| Create virtual warehouse. Size L. | ||
|
|
||
| Go to "Databases" and create database "test". | ||
|
|
||
| Go to "SQL Worksheet" and copy-paste create.sql query there. | ||
|
|
||
| Note: S3 import does not support public buckets. And it requires pasting secret access key, which we are not going to do. So, switch to using CLI. | ||
|
|
||
| Create a machine in ap-southeast-1 region and install Bytehouse CLI: | ||
|
|
||
| ``` | ||
| wget --continue --progress=dot:giga https://github.com/bytehouse-cloud/cli/releases/download/v1.5.34/bytehouse-cli_1.5.34_Linux_x86_64.tar.gz | ||
| tar xvf bytehouse-cli_1.5.34_Linux_x86_64.tar.gz | ||
| ``` | ||
|
|
||
| ``` | ||
| export user='...' | ||
| export password='...' | ||
| export account='AWS...' | ||
| export warehouse='test' | ||
| ``` | ||
|
|
||
| ``` | ||
| wget --continue --progress=dot:giga 'https://datasets.clickhouse.com/hits_compatible/hits.csv.gz' | ||
| gzip -d -f hits.csv.gz | ||
| ```bash | ||
| git clone <github_repo_url> | ||
| cd ClickBench/bytehouse | ||
| ./benchmark.sh | ||
| ``` | ||
|
|
||
| Load the data: | ||
|
|
||
| ``` | ||
| echo -n "Load time: " | ||
| command time -f '%e' ./bytehouse-cli --user "$user" --account "$account" --password "$password" --region ap-southeast-1 --secure --warehouse "$warehouse" --query "INSERT INTO test.hits FORMAT CSV" < hits.csv | ||
| ``` | ||
|
|
||
| ``` | ||
| 99,997,497 total rows sent, 0 rows/s (81.14 GB, 0.00 B/s) | ||
| total rows sent: 99,997,497, average speed = 134,320 rows/s | ||
| Elapsed: 12m24.754608947s. 81.14 GB (108.94 MB/s). | ||
| ─── End of Execution ─── | ||
|
|
||
| real 12m25.310s | ||
| ``` | ||
|
|
||
| Run the benchmark: | ||
|
|
||
| ``` | ||
| ./run.sh 2>&1 | tee log.txt | ||
|
|
||
| cat log.txt | grep --text -F 'Elapsed' | | ||
| grep --text -oP 'Elapsed: [\d\.]+(ms|s)\. Processed: \d+ row' | | ||
| sed -r -e 's/Elapsed: ([0-9\.]+)(ms|s)\. Processed: ([0-9]+) row/\1 \2 \3/' | | ||
| awk '{ if ($3 == 0) { print "null" } else if ($2 == "ms") { print $1 / 1000 } else { print $1 } }' | | ||
| awk '{ if (i % 3 == 0) { printf "[" }; printf $1; if (i % 3 != 2) { printf "," } else { print "]," }; ++i; }' | ||
| ``` | ||
| ## Notes | ||
|
|
||
| Note: cluster size L is the maximum that can be created. | ||
| An attempt to create XL gives "Failed AWAITING RESOURCES". | ||
| - Please use the EC2 environment provided through email for reproduction. | ||
| - Run the benchmark inside the `bytehouse` directory. | ||
This violates the reproducibility requirement.
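
For context on what the diff drops: the old README's `grep`/`sed`/`awk` pipeline turned the `run.sh` log into result rows, three timings per query, in seconds. A minimal sketch of that step as one function follows. It is not the repository's script. The name `extract_timings` is hypothetical, and the log format (`Elapsed: <number>(ms|s). Processed: <n> row...`) is assumed from the regular expression in the removed lines, not confirmed against real ByteHouse CLI output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): reads a benchmark log on
# stdin and prints one "[t1,t2,t3]," row per query, with times in seconds.
# Assumed log format: "Elapsed: <number>(ms|s). Processed: <n> row..."
extract_timings() {
    awk '
    # Only lines carrying both an elapsed time and a processed-row count.
    match($0, /Elapsed: [0-9.]+(ms|s)\. Processed: [0-9]+ row/) {
        # f[2] is the time with its unit and a trailing period, f[4] the row count.
        split(substr($0, RSTART, RLENGTH), f, " ")
        t = f[2]
        sub(/\.$/, "", t)
        if (f[4] + 0 == 0) {
            v = "null"            # zero rows processed is reported as null
        } else if (t ~ /ms$/) {
            sub(/ms$/, "", t)
            v = t / 1000          # milliseconds to seconds
        } else {
            sub(/s$/, "", t)
            v = t                 # already in seconds
        }
        # Group every three timings into one bracketed row.
        if (i % 3 == 0) { printf "[" }
        printf "%s", v
        if (i % 3 != 2) { printf "," } else { print "]," }
        ++i
    }'
}
```

Assuming the log was saved with `./run.sh 2>&1 | tee log.txt` as in the removed instructions, it would be used as `extract_timings < log.txt`. Unlike the original final `awk` stage, the sketch passes each value to `printf` through an explicit `"%s"` format, so the data is never interpreted as a format string.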