fix: leaking password in logs#75

Merged
yyzxw merged 2 commits into BaizeAI:main from usernameisnull:fix/leaking-password-in-log on Apr 13, 2026

@usernameisnull
Contributor

Sanitize sensitive credentials across all data-source loaders and the shared command utility to prevent passwords, tokens, and keys from appearing in logs and error messages.

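For example, when a Git data source is configured with a fake username and password, the logs expose the credentialed URL https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git. The `alteredFromURI` field is already masked, but the `command` field and the final error message are not: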

➜ kl dataset-git0401-round-1-6sfdk|grep fake
time="2026-04-01T06:34:18Z" level=debug msg="executing command" func="utils.ExecuteCommandWithAllOutput()" file="command.go:16" alteredFromURI="https://******:******@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" applicationWorkingDirectory=/workspace branch= cloneToPath=. command="/usr/bin/git clone https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v" commit=HEAD depth= fromURI="https://gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" path=. root=/baize/dataset/data submodules= type=GIT workingDirectory=/baize/dataset/data
time="2026-04-01T06:34:18Z" level=debug msg="command output: " func="utils.ExecuteCommandWithAllOutput()" file="command.go:43" alteredFromURI="https://******:******@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" applicationWorkingDirectory=/workspace branch= cloneToPath=. command="/usr/bin/git clone https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v" commit=HEAD depth= fromURI="https://gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" path=. root=/baize/dataset/data submodules= type=GIT workingDirectory=/baize/dataset/data
time="2026-04-01T06:34:18Z" level=error msg="command failed to execute, error: Cloning into '.'...\nremote: HTTP Basic: Access denied\nfatal: Authentication failed for 'https://gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git/'\n" func="utils.ExecuteCommandWithAllOutput()" file="command.go:45" alteredFromURI="https://******:******@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" applicationWorkingDirectory=/workspace branch= cloneToPath=. command="/usr/bin/git clone https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v" commit=HEAD depth= fromURI="https://gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git" path=. root=/baize/dataset/data submodules= type=GIT workingDirectory=/baize/dataset/data
failed to load data: failed to execute command /usr/bin/git clone https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v, err: exit status 128
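
To illustrate the kind of masking the log lines above call for, here is a minimal sketch that replaces the userinfo part of any URL embedded in a command string before it is logged. The function name `maskURLCredentials` and the regular expression are assumptions made for illustration; they are not taken from this PR's diff, and the actual change may work differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// userinfoPattern matches "scheme://" followed by a userinfo section
// ("user:password@" or "token@"). The userinfo may not contain '/',
// whitespace, or '@', so URLs without credentials are left untouched.
var userinfoPattern = regexp.MustCompile(`([a-zA-Z][a-zA-Z0-9+.-]*://)[^/\s@]+@`)

// maskURLCredentials is a hypothetical helper: it returns s with every
// embedded URL userinfo section replaced by a fixed placeholder.
func maskURLCredentials(s string) string {
	return userinfoPattern.ReplaceAllString(s, "${1}******:******@")
}

func main() {
	cmd := "/usr/bin/git clone https://fakeuser:fakepass%2F123@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v"
	// Prints:
	// /usr/bin/git clone https://******:******@gitlab.daocloud.cn/ndx/mcamel/mcamel-mysql.git . -v
	fmt.Println(maskURLCredentials(cmd))
}
```

Applying a helper like this to both the logged `command` field and the returned error string would cover the two places where the password still appears in the output above.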

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request upgrades the project to Go 1.26.1 and Python 3.14, while also updating several core dependencies. The primary functional change is the implementation of more robust secret masking across various data sources (Git, HTTP, HuggingFace, ModelScope, and S3) to prevent sensitive credentials from leaking into logs and error messages. The feedback focuses on extending this masking to URIs that might contain embedded credentials and optimizing the secret obscuration logic to handle large keys more efficiently and avoid redundant processing.
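
One way to read the point about avoiding redundant processing is to build the masker once from the known secret values and reuse it for every log line. The sketch below is an assumption about how that could look, not the code in this PR: `newSecretMasker` is a hypothetical name, and it relies only on the standard library's `strings.NewReplacer`. Empty values are skipped because an empty search string would match at every position, and duplicates are dropped so each secret is registered once.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// newSecretMasker is a hypothetical helper that builds a reusable replacer
// which substitutes every known secret value with a fixed placeholder.
func newSecretMasker(secrets ...string) *strings.Replacer {
	seen := make(map[string]struct{}, len(secrets))
	uniq := make([]string, 0, len(secrets))
	for _, s := range secrets {
		if s == "" {
			continue // an empty pattern would match everywhere
		}
		if _, ok := seen[s]; ok {
			continue // already registered
		}
		seen[s] = struct{}{}
		uniq = append(uniq, s)
	}
	// Longest first: strings.Replacer compares patterns in argument order,
	// so a secret that contains a shorter one is masked as a whole.
	sort.SliceStable(uniq, func(i, j int) bool { return len(uniq[i]) > len(uniq[j]) })

	pairs := make([]string, 0, 2*len(uniq))
	for _, s := range uniq {
		pairs = append(pairs, s, "******")
	}
	return strings.NewReplacer(pairs...)
}

func main() {
	// The secret values here are made up for the example.
	masker := newSecretMasker("AKIAEXAMPLE", "", "s3cr3t", "s3cr3t")
	// Prints: key=****** secret=******
	fmt.Println(masker.Replace("key=AKIAEXAMPLE secret=s3cr3t"))
}
```

A `*strings.Replacer` is safe for concurrent use, so a single instance can be shared by all loaders.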

Comment thread internal/pkg/datasources/datasource_http.go
Comment thread internal/pkg/datasources/datasource_git.go
Comment thread internal/pkg/datasources/datasource_s3.go
@yyzxw
Contributor

yyzxw commented Apr 13, 2026

/lgtm

@yyzxw yyzxw merged commit 5fe78c8 into BaizeAI:main Apr 13, 2026
3 checks passed