49 changes: 34 additions & 15 deletions CONTRIBUTING.md
@@ -1,36 +1,57 @@
# Contributing guide

## Ruby toolchain

This fork uses MRI Ruby `4.0.1` for development and CI.

To install dependencies, execute the following commands:
```bash
mise trust
mise use ruby@4.0.1
bundle install
```

## Installing optional development dependencies

`nmatrix` and `rb-gsl` are optional acceleration backends. They are not required
for the default test suite.

Some integration suites depend on external services and native/system packages:

- SQL and ActiveRecord integration specs require a compatible sqlite stack.
- DBI integration specs require DBI + sqlite adapter compatibility.
- Rserve integration specs require an available Rserve daemon.
- Gruff specs require ImageMagick/rmagick dependencies.

Example Linux setup for the optional stacks:

```bash
sudo apt-get update -qq
sudo apt-get install -y libgsl0-dev r-base r-base-dev
sudo Rscript -e "install.packages(c('Rserve','irr'),,'http://cran.us.r-project.org')"
sudo apt-get install libmagickwand-dev imagemagick
export DARU_TEST_NMATRIX=1 # for running nmatrix tests.
export DARU_TEST_GSL=1 # for running rb-GSL tests.
bundle install
```
You don't need `DARU_TEST_NMATRIX` or `DARU_TEST_GSL` unless you are changing those
parts of the code. Both are set in CI, so a regression in either backend will still
fail the build.

Run the default suite:

`bundle exec rspec`

Run optional suites explicitly:

```bash
DARU_TEST_SQL=1 bundle exec rspec --tag sql
DARU_TEST_DBI=1 bundle exec rspec --tag dbi
DARU_TEST_RSERVE=1 bundle exec rspec --tag rserve
DARU_TEST_NMATRIX=1 bundle exec rspec --tag nmatrix
DARU_TEST_GSL=1 bundle exec rspec --tag gsl
DARU_TEST_GRUFF=1 bundle exec rspec --tag gruff
```

If you have problems installing nmatrix, please consult the [nmatrix installation wiki](https://github.com/SciRuby/nmatrix/wiki/Installation) or the [mailing list](https://groups.google.com/forum/#!forum/sciruby-dev).


While preparing your pull requests, don't forget to check your code with Rubocop:

`bundle exec rubocop`



## Basic Development Flow
@@ -41,8 +62,6 @@ While preparing your pull requests, don't forget to check your code with Rubocop
4. Run the test suite with `rake spec`. (Alternatively, you can use `guard` as described [here](https://github.com/SciRuby/daru/blob/master/CONTRIBUTING.md#testing).) Also check the coding style guidelines with `rake cop`.
5. Commit the changes with `git commit -am "briefly describe what you did"` and submit pull request.



## Testing

16 changes: 16 additions & 0 deletions History.md
@@ -1,3 +1,19 @@
# Unreleased
* Major Enhancements
- Port development baseline to MRI Ruby 4.0.1.
- Add `mise.toml` toolchain configuration for reproducible local setup.
- Add runtime stdlib dependencies (`matrix`, `csv`) required on modern Ruby.
- Add missing development dependencies used by specs (`prime`, `mutex_m`, `benchmark`).
* Fixes
- Restore compatibility for CSV keyword arguments and URL reading via `URI.open`.
- Add `GroupBy#[]` for scalar and tuple-style group access.
- Fix `DataFrame` and `Vector` behavior regressions around mixed indexes and row/vector mutation.
- Add `DateTimeIndex.format` support for explicit parsing format.
- Improve SQL file source handling by supporting `sqlite3` connections directly.
* Testing
- Remove remaining pending examples from the default suite.
- Make optional integration suites (`sql`, `dbi`, `rserve`, `gsl`, `nmatrix`, `gruff`) opt-in and capability-aware.

# 0.3 (30 May 2020)
* Major Enhancements
- Remove official support for Ruby < 2.5.1. Now we only test with 2.5.1 and 2.7.1. (@v0dro)
29 changes: 28 additions & 1 deletion README.md
@@ -11,7 +11,7 @@ daru (Data Analysis in RUby) is a library for storage, analysis, manipulation an

daru makes it easy and intuitive to process data predominantly through 2 data structures:
`Daru::DataFrame` and `Daru::Vector`. Written in pure Ruby, it works with all Ruby implementations.
Current development and CI baseline in this fork is MRI 4.0.1.

## daru plugin gems

@@ -53,6 +53,33 @@ This gem extends support for many Import and Export methods of `Daru::DataFrame`
$ gem install daru
```

## Development Setup

This fork is tested on Ruby `4.0.1` and includes a `mise.toml` toolchain file.

```console
$ mise trust
$ mise use ruby@4.0.1
$ bundle install
$ bundle exec rspec
```

Optional integration specs are excluded by default and can be enabled explicitly:

```console
$ DARU_TEST_SQL=1 bundle exec rspec --tag sql
$ DARU_TEST_DBI=1 bundle exec rspec --tag dbi
$ DARU_TEST_RSERVE=1 bundle exec rspec --tag rserve
```

Optional native backends are also opt-in:

```console
$ DARU_TEST_GSL=1 bundle exec rspec --tag gsl
$ DARU_TEST_NMATRIX=1 bundle exec rspec --tag nmatrix
$ DARU_TEST_GRUFF=1 bundle exec rspec --tag gruff
```

## Notebooks

#### Notebooks on most use cases
11 changes: 8 additions & 3 deletions daru.gemspec
@@ -29,6 +29,8 @@ Gem::Specification.new do |spec|

# it is required by NMatrix, yet we want to specify clearly which minimal version is OK
spec.add_runtime_dependency 'packable', '~> 1.3.13'
spec.add_runtime_dependency 'matrix'
spec.add_runtime_dependency 'csv'

spec.add_development_dependency 'spreadsheet', '~> 1.1.1'
spec.add_development_dependency 'bundler', '>= 1.10'
@@ -42,18 +44,21 @@ Gem::Specification.new do |spec|
spec.add_development_dependency 'nyaplot', '~> 0.1.5'
spec.add_development_dependency 'nmatrix', '~> 0.2.1' if ENV['DARU_TEST_NMATRIX']
spec.add_development_dependency 'distribution', '~> 0.7'
spec.add_development_dependency 'prime'
spec.add_development_dependency 'gsl', '~>2.1.0.2' if ENV['DARU_TEST_GSL']
spec.add_development_dependency 'dbd-sqlite3'
spec.add_development_dependency 'dbi'
spec.add_development_dependency 'activerecord', '~> 6.0'
spec.add_development_dependency 'mutex_m'
spec.add_development_dependency 'benchmark'
spec.add_development_dependency 'mechanize'
# issue: https://github.com/SciRuby/daru/issues/493 occurred
# with latest version of sqlite3
spec.add_development_dependency 'sqlite3'
spec.add_development_dependency 'rubocop', '~> 0.49.0'
spec.add_development_dependency 'ruby-prof'
spec.add_development_dependency 'simplecov'
# Gruff pulls native ImageMagick dependencies through rmagick.
# Keep it opt-in for environments that explicitly test plotting via Gruff.
spec.add_development_dependency 'gruff' if ENV['DARU_TEST_GRUFF']
spec.add_development_dependency 'webmock'

spec.add_development_dependency 'nokogiri'
9 changes: 9 additions & 0 deletions lib/daru/core/group_by.rb
@@ -273,6 +273,15 @@ def get_group group
)
end

# Returns a group as a DataFrame. Accepts scalar keys for single-level
# groups and tuple-like keys for multi-level groups.
def [](*group)
group = group.first if group.size == 1 && group.first.is_a?(Array)
group = [group] unless group.is_a?(Array)

get_group(group)
end
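The argument normalization at the top of `#[]` can be illustrated standalone. `normalize_group` below is a hypothetical helper, not part of daru, that mirrors those two lines:

```ruby
# Hypothetical sketch of the key normalization performed by GroupBy#[].
def normalize_group(*group)
  group = group.first if group.size == 1 && group.first.is_a?(Array)
  group = [group] unless group.is_a?(Array)
  group
end

normalize_group(:a)        # scalar key for a single-level group => [:a]
normalize_group(:a, :b)    # splatted tuple key                  => [:a, :b]
normalize_group([:a, :b])  # array-style tuple key               => [:a, :b]
```

All three call styles collapse to the array form that `get_group` expects.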

# Iteratively applies a function to the values in a group and accumulates the result.
# @param init (nil) The initial value of the accumulator.
# @yieldparam block [Proc] A proc or lambda that accepts two arguments. The first argument
18 changes: 14 additions & 4 deletions lib/daru/dataframe.rb
@@ -2468,6 +2468,10 @@ def aggregate(options={}, multi_index_level=-1)
end

def group_by_and_aggregate(*group_by_keys, **aggregation_map)
if aggregation_map.empty? && group_by_keys.last.is_a?(Hash)
aggregation_map = group_by_keys.pop
end

group_by(*group_by_keys).aggregate(aggregation_map)
end
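The `pop` exists because on Ruby 3 a braced hash literal is a positional argument, so it lands in `*group_by_keys` instead of `**aggregation_map`. A minimal standalone sketch (`split_group_args` is a hypothetical stand-in that returns both parts for inspection):

```ruby
# Hypothetical sketch of the trailing-hash normalization above.
def split_group_args(*group_by_keys, **aggregation_map)
  if aggregation_map.empty? && group_by_keys.last.is_a?(Hash)
    aggregation_map = group_by_keys.pop
  end
  [group_by_keys, aggregation_map]
end

# A braced hash is positional on Ruby 3 and must be popped back out:
split_group_args(:city, {age: :mean})  # => [[:city], {age: :mean}]
# Bare key-value pairs are captured as keywords directly:
split_group_args(:city, age: :mean)    # => [[:city], {age: :mean}]
```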

@@ -2863,9 +2867,12 @@ def deduce_index index, source, vectors_have_same_index
elsif vectors_have_same_index
source.values[0].index.dup
else
all_indexes = source.values.flat_map { |v| v.index.to_a }.uniq
begin
all_indexes = all_indexes.sort
rescue ArgumentError
# Mixed / non-comparable index types: preserve insertion order.
end

Daru::Index.new all_indexes
end
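The rescue is needed because mixed index types are not mutually comparable in Ruby, so `sort` raises rather than returning a partial order. A minimal standalone illustration with assumed sample values:

```ruby
# Symbols and Integers cannot be compared with <=>, so sorting a mixed
# index raises ArgumentError and the insertion order is preserved instead.
indexes = [:b, 2, :a, 1].uniq
begin
  indexes = indexes.sort
rescue ArgumentError
  # keep insertion order
end
indexes  # => [:b, 2, :a, 1]
```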
@@ -3055,7 +3062,10 @@ def coerce_vector vector

def update_data source, vectors
@data = @vectors.each_with_index.map do |_vec, idx|
vec_source = source[idx]
vec_source = vec_source.dup if vec_source.respond_to?(:dup)

Daru::Vector.new(vec_source, index: @index, name: vectors[idx])
end
end
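The defensive `dup` prevents two vectors from wrapping the same underlying array, where a mutation through one would silently leak into the other. The aliasing it guards against, in plain Ruby:

```ruby
# Without a copy, both names point at one array object and mutation leaks:
shared = [1, 2, 3]
aliased = shared
aliased << 4
shared  # => [1, 2, 3, 4]

# With a defensive dup, mutation stays local to the copy:
fresh = [1, 2, 3]
copy = fresh.dup
copy << 4
fresh   # => [1, 2, 3]
```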

11 changes: 10 additions & 1 deletion lib/daru/date_time/index.rb
@@ -124,7 +124,12 @@ def date_time_from date_string, date_precision
date_string.match(/\-\d?\d/).to_s.delete('-').to_i
)
else
# Keep backward-compatible configurable parsing when format is set.
if Daru::DateTimeIndex.format
DateTime.strptime(date_string, Daru::DateTimeIndex.format)
else
DateTime.parse(date_string)
end
end
end

@@ -215,6 +220,10 @@ class DateTimeIndex < Index
include Enumerable
Helper = DateTimeIndexHelper

class << self
attr_accessor :format
end
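The accessor lets callers pin an explicit parsing format (e.g. `Daru::DateTimeIndex.format = '%m/%d/%Y'`) instead of relying on `DateTime.parse`'s heuristics, which matters for ambiguous dates. A stdlib-only sketch of the difference (sample date is hypothetical):

```ruby
require 'date'

s = '02/03/2001'
# Heuristic parsing assumes day/month/year for slash-separated dates:
DateTime.parse(s).to_date.to_s                 # => "2001-03-02"
# An explicit strptime format removes the ambiguity:
DateTime.strptime(s, '%m/%d/%Y').to_date.to_s  # => "2001-02-03"
```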

def self.try_create(source)
if source && ArrayHelper.array_of?(source, ::DateTime)
new(source, freq: :infer)
56 changes: 48 additions & 8 deletions lib/daru/io/io.rb
@@ -1,4 +1,5 @@
module Daru
require 'open-uri'
require_relative 'csv/converters.rb'
module IOHelpers
class << self
@@ -16,6 +17,24 @@ def process_row(row,empty)
end
end

def process_fixed_width_row(line, ranges)
ranges.map do |range|
cell = line[range].to_s.strip
cell.empty? ? nil : try_string_to_number(cell)
end
end

def fixed_width_ranges(line, expected_columns=nil)
starts = line.to_enum(:scan, /\S+/).map { Regexp.last_match.begin(0) }
return [] if starts.empty?

starts = starts.first(expected_columns) if expected_columns
starts.each_with_index.map do |start_at, idx|
end_at = starts[idx + 1] || line.length
(start_at...end_at)
end
end
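Column boundaries are inferred from the whitespace layout of one sample line: each column starts where a token starts and ends where the next token starts. A self-contained restatement of the helper with a worked example (the data line is hypothetical):

```ruby
# Standalone sketch of the column-detection logic in fixed_width_ranges.
def fixed_width_ranges(line, expected_columns = nil)
  # Start offset of every whitespace-delimited token in the sample line.
  starts = line.to_enum(:scan, /\S+/).map { Regexp.last_match.begin(0) }
  return [] if starts.empty?

  starts = starts.first(expected_columns) if expected_columns
  starts.each_with_index.map do |start_at, idx|
    (start_at...(starts[idx + 1] || line.length))
  end
end

fixed_width_ranges('alice  30  NYC')  # => [0...7, 7...11, 11...14]
```

Each range then slices every subsequent line at the same offsets, which keeps empty cells aligned to their columns.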

private

INT_PATTERN = /^[-+]?\d+$/
@@ -103,7 +122,7 @@ def dataframe_write_csv dataframe, path, opts={}
converters: :numeric
}.merge(opts)

writer = ::CSV.open(path, 'w', **options)
writer << dataframe.vectors.to_a unless options[:headers] == false

dataframe.each_row do |row|
@@ -153,10 +172,21 @@ def from_activerecord(relation, *fields)

def from_plaintext filename, fields
ds = Daru::DataFrame.new({}, order: fields)
lines = File.readlines(filename)
first_data_line = lines.find { |line| !line.strip.empty? && line.strip != "\x1A" }
ranges = Daru::IOHelpers.fixed_width_ranges(first_data_line.to_s, fields.size)

lines.each do |line|
next if line.strip == "\x1A"

row =
if ranges.size == fields.size && !ranges.empty?
Daru::IOHelpers.process_fixed_width_row(line, ranges)
else
Daru::IOHelpers.process_row(line.strip.split(/\s+/), [''])
end

row.concat([nil] * (fields.size - row.size)) if row.size < fields.size
ds.add_row(row)
end
ds.update
@@ -182,7 +212,7 @@ def load filename
end

def from_html path, opts
optional_gem 'mechanize', '>=2.7.5'
page = Mechanize.new.get(path)
page.search('table').map { |table| html_parse_table table }
.keep_if { |table| html_search table, opts[:match] }
@@ -231,22 +261,32 @@ def from_csv_prepare_converters(converters)
def from_csv_hash_with_headers(path, opts)
opts[:header_converters] ||= :symbol
::CSV
.parse(read_csv_source(path), **opts)
.tap { |c| yield c if block_given? }
.by_col.map { |col_name, values| [col_name, values] }.to_h
end

def from_csv_hash(path, opts)
csv_as_arrays =
::CSV
.parse(read_csv_source(path), **opts)
.tap { |c| yield c if block_given? }
.to_a
headers = ArrayHelper.recode_repeated(csv_as_arrays.shift)
csv_as_arrays = csv_as_arrays.transpose
headers.each_with_index.map { |h, i| [h, csv_as_arrays[i]] }.to_h
end

def read_csv_source(path)
path = path.to_s

if path.match?(%r{\Ahttps?://}i)
URI.open(path, &:read)
else
File.read(path)
end
end
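The dispatch reduces to a case-insensitive scheme check, needed because on modern Ruby `Kernel#open` no longer opens URLs. A minimal sketch of just the predicate (paths are hypothetical):

```ruby
# Case-insensitive check for an http(s) URL, as used by read_csv_source.
def url?(path)
  path.to_s.match?(%r{\Ahttps?://}i)
end

url?('HTTPS://example.com/data.csv')    # => true
url?('spec/fixtures/local_data.csv')    # => false
```

URL sources go through `URI.open(path, &:read)`; everything else is read with `File.read`.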

def html_parse_table(table)
headers, headers_size = html_scrape_tag(table,'th')
data, size = html_scrape_tag(table, 'td')