Backport(v1.19): benchmark: Use IO#size instead of File.size to speed up file generation on Windows (#5277) #5290

Open
github-actions[bot] wants to merge 1 commit into v1.19 from backport-to-v1.19/pr5277

Which issue(s) this PR fixes:
Backport #5277
Fixes #

What this PR does / why we need it:
Using File.size(path) in a tight loop is very slow, especially on Windows,
because it resolves the path and calls stat every time.

  • benchmark code
require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'benchmark-ips'
end

FILE_PATH = "dummy_file.txt"

File.open(FILE_PATH, "w") do |f|
  Benchmark.ips do |x|
    x.time = 10

    x.report("File.size(path)") { File.size(FILE_PATH) }
    x.report("File#size") { f.size }
  end
end
  • result
ruby 4.0.1 (2026-01-13 revision e04267a14b) +PRISM [x64-mingw-ucrt]
Warming up --------------------------------------
     File.size(path)     3.104k i/100ms
           File#size    17.874k i/100ms
Calculating -------------------------------------
     File.size(path)     31.099k (± 1.5%) i/s   (32.16 μs/i) -    313.504k in  10.083209s
           File#size    182.821k (± 1.4%) i/s    (5.47 μs/i) -      1.841M in  10.072168s
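
As a quick sanity check on the result above, the per-call speedup can be computed directly from the reported throughput. The figures are copied from the benchmark output; the variable names are only illustrative.

```ruby
# Iterations per second, copied from the benchmark result above.
path_ips   = 31_099.0   # File.size(path)
handle_ips = 182_821.0  # File#size

# Ratio of the two throughputs gives the per-call speedup.
speedup = handle_ips / path_ips
puts format("File#size is about %.1fx faster per call", speedup)
```

On these numbers the ratio comes out a little under 6x.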

Using the instance method IO#size (f.size) avoids this overhead and
speeds up the 1GB test file generation by about 5x (from ~45s to ~9s) in
the following task from tasks/benchmark.rb:

task :prepare_1GB do
  FileUtils.mkdir_p(File.dirname(BENCHMARK_FILE_PATH))
  File.open(BENCHMARK_FILE_PATH, "w") do |f|
    data = { "message": "a" * 1024 }.to_json
    loop do
      f.puts data
      break if File.size(BENCHMARK_FILE_PATH) > BENCHMARK_FILE_SIZE
    end
  end
end
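
The task as quoted above still calls `File.size` with the path on every iteration. A minimal sketch of the faster loop follows, assuming the change simply swaps that call for `f.size` on the already-open handle (the actual diff is not shown here). `generate_file` and its parameters are hypothetical names standing in for the rake task and its constants.

```ruby
require 'json'
require 'fileutils'

# Hypothetical helper: writes JSON lines to `path` until the file exceeds
# `target_size` bytes. Assumes the fix is limited to the size check.
def generate_file(path, target_size)
  FileUtils.mkdir_p(File.dirname(path))
  File.open(path, "w") do |f|
    data = { "message": "a" * 1024 }.to_json
    loop do
      f.puts data
      # f.size asks the open handle for its size instead of
      # looking the file up by path on every iteration.
      break if f.size > target_size
    end
  end
end
```

For example, `generate_file("tmp/sample.log", 10_000)` (an illustrative path and size) would stop after the first write that pushes the file past 10,000 bytes.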

Docs Changes:
N/A

Release Note:
N/A

…on on Windows (#5277)

**Which issue(s) this PR fixes**:
Fixes #

**What this PR does / why we need it**:
Using `File.size(path)` in a tight loop is very slow, especially on Windows,
because it resolves the path and calls `stat` every time.

* benchmark code
```ruby
require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'benchmark-ips'
end

FILE_PATH = "dummy_file.txt"

File.open(FILE_PATH, "w") do |f|
  Benchmark.ips do |x|
    x.time = 10

    x.report("File.size(path)") { File.size(FILE_PATH) }
    x.report("File#size") { f.size }
  end
end
```

* result
```
ruby 4.0.1 (2026-01-13 revision e04267a14b) +PRISM [x64-mingw-ucrt]
Warming up --------------------------------------
     File.size(path)     3.104k i/100ms
           File#size    17.874k i/100ms
Calculating -------------------------------------
     File.size(path)     31.099k (± 1.5%) i/s   (32.16 μs/i) -    313.504k in  10.083209s
           File#size    182.821k (± 1.4%) i/s    (5.47 μs/i) -      1.841M in  10.072168s
```

Using the instance method `IO#size` (`f.size`) avoids this overhead and
speeds up the 1GB test file generation by about 5x (from ~45s to ~9s) in

https://github.com/fluent/fluentd/blob/b819ccf772e1036adb17f49682c6b7053713066d/tasks/benchmark.rb#L13-L23

**Docs Changes**:
N/A

**Release Note**:
N/A

Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>