Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion matrix-bom/bom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@
<matrixLoggingVersion>0.1.0-SNAPSHOT</matrixLoggingVersion>
<matrixParquetVersion>0.5.0</matrixParquetVersion>
<matrixSmileVersion>0.1.1-SNAPSHOT</matrixSmileVersion>
<matrixSpreadsheetVersion>2.3.1-SNAPSHOT</matrixSpreadsheetVersion>
<matrixSpreadsheetVersion>2.4.0</matrixSpreadsheetVersion>
<matrixSqlVersion>2.4.0</matrixSqlVersion>
<matrixStatsVersion>2.4.0</matrixStatsVersion>
<matrixTablesawVersion>0.3.0</matrixTablesawVersion>
Expand Down
19 changes: 9 additions & 10 deletions matrix-spreadsheet/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,27 +9,27 @@ for [Renjin R](https://github.com/bedatadriven/renjin).

To use it, add the following to your gradle build script:
```groovy
implementation 'org.apache.groovy:groovy:5.0.4'
implementation 'se.alipsa.matrix:matrix-core:3.6.0'
implementation 'se.alipsa.matrix:matrix-spreadsheet:2.3.0'
implementation 'org.apache.groovy:groovy:5.0.5'
implementation 'se.alipsa.matrix:matrix-core:3.7.1'
implementation 'se.alipsa.matrix:matrix-spreadsheet:2.4.0'
```
or if you use maven:
```xml
<dependencies>
<dependency>
<groupId>org.apache.groovy</groupId>
<artifactId>groovy</artifactId>
<version>5.0.4</version>
<version>5.0.5</version>
</dependency>
<dependency>
<groupId>se.alipsa.matrix</groupId>
<artifactId>matrix-core</artifactId>
<version>3.6.0</version>
<version>3.7.1</version>
</dependency>
<dependency>
<groupId>se.alipsa.matrix</groupId>
<artifactId>matrix-spreadsheet</artifactId>
<version>2.3.0</version>
<version>2.4.0</version>
</dependency>
</dependencies>
```
Expand Down Expand Up @@ -210,6 +210,7 @@ The following table illustrates the version compatibility of the matrix-spreadsh

| Matrix spreadsheet | Matrix core |
|-------------------:|---------------:|
| 2.4.0 | 3.7.1 |
| 2.3.0 | 3.4.1 |
| 2.2.0 | 3.2.0 -> 3.3.0 |
| 2.1.0 | 3.1.0 |
Expand Down Expand Up @@ -245,7 +246,5 @@ Used to define the data format i.e. the result from an import or the data to exp
- URL: https://github.com/Alipsa/matrix
- License: MIT

### Log4j
Used to handle logging
- URL: https://logging.apache.org/log4j/2.x/
- License: Apache 2.0
### matrix-core Logger
Uses the lightweight JDK System.Logger facade via matrix-core; no direct logging framework dependency.
2 changes: 1 addition & 1 deletion matrix-spreadsheet/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ plugins {
}

group = 'se.alipsa.matrix'
version = '2.3.1-SNAPSHOT'
version = '2.4.0'
description = 'Matrix spreadsheet import/export'

codenarc {
Expand Down
29 changes: 29 additions & 0 deletions matrix-spreadsheet/release.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,34 @@
# Release history

## v2.4.0, 2026-05-05
**Refactoring, code quality, and usability improvements**

### Breaking Changes
- `Sheet.name` field type tightened from `Object` to `String`
- `Importer` interface: `startColumn`/`endColumn` parameters renamed to `startCol`/`endCol` for consistency

### New Features
- add whole-sheet convenience imports with auto-detected dimensions: `SpreadsheetImporter.importSpreadsheet(file, sheetNumber)` and `importSpreadsheet(file, sheetName)`
- add `File`-accepting overloads to `SpreadsheetImporter` matching `SpreadsheetWriter`'s API shape

### Bug Fixes
- remove dead I/O and wasted workbook allocation in `FExcelExporter.exportExcel` when target file already exists
- replace undeclared `commons-io` transitive dependency in `FOdsImporter` with Groovy's native `is.bytes`
- improve error messages when a named sheet does not exist (names the missing sheet instead of generic "No value present")
- standardise rounding mode in `TableUtil.round` to `HALF_EVEN` (banker's rounding) across all numeric column types

### Code Quality
- add class-level GroovyDoc and fix `@param` mismatches in `SpreadsheetImporter`
- add GroovyDoc to `SpreadsheetWriter.writeSheets(Map)`
- fix stale/misleading GroovyDoc on `FExcelExporter.exportExcel` overloads
- make `SpreadsheetImporter.validateNotNull` private
- add `@CompileStatic` to `ValueExtractor`
- replace generic `Exception` throws in `FileUtil` with `SpreadsheetImportException`
- document why `FExcelExporter` cannot use `@CompileStatic` (fastexcel internal `GenericStyleSetter` access)

### Test Coverage
- 124 tests passing (up from 105 in v2.3.0)

## v2.3.0, 2026-01-31
**Major architectural refactoring with significant performance improvements**

Expand Down
155 changes: 155 additions & 0 deletions matrix-spreadsheet/req/v2.4.0-roadmap.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
# Matrix Spreadsheet v2.4.0 Release Roadmap

This document outlines the planned work for matrix-spreadsheet v2.4.0.

## Summary

v2.4.0 focuses on targeted refactoring, stronger test coverage (>= 70%),
performance benchmarking, release hygiene, and usability improvements identified
in the v2.3.1 code review.

---

## Priority 1: Testing & Performance (Must Do)

### 1.1 [x] Test coverage plan (target >= 70%)
- ✅ Completed in v2.3.0: Coverage achieved 79.74% (exceeds 70% target)
- See the detailed plan in v2.3.0 roadmap under "Test Coverage & Performance Plan".

### 1.2 [x] Add performance benchmarks for XLSX/ODS read/write
- ✅ Completed in v2.3.0: SpreadsheetBenchmark added with 50k x 12 and Crime_Data tests
- Baseline read/write timing for realistic datasets documented in v2.3.0 roadmap.

### 1.3 [x] Fix any performance regressions found in benchmarks
- ✅ Completed in v2.3.0: ODS performance improved 65-79% via Aalto parser and optimizations
- Before/after metrics captured: 4.86s → 1.70s (medium), 262s → 53s (large)

---

## Priority 2: Refactoring (Should Do)

### 2.1 [ ] Consider splitting FExcelAppender
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelAppender.groovy`
- **Issue**: ~750 lines with 11 inner classes
- **Consideration**: Could extract `WorkbookState`, `WorkbookPlan`, `SheetTemplate` to separate files
- **Note**: The class is well-structured internally; this is optional cleanup

### 2.2 [ ] Consider using Optional<Integer> in FExcelReader
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelReader.groovy`
- **Issue**: Returns -1 when sheet/content not found (lines 85, 97, 146)
- **Consideration**: Modern Java practice would use Optional<Integer>
- **Note**: This is a breaking API change; defer to v3.0 if needed

### 2.3 [x] Consider more specific exception types in FileUtil
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/FileUtil.groovy`
- **Issue**: Throws generic `Exception` in several places
- **Fix**: Replaced with `SpreadsheetImportException` (see Priority 4 below)

---

## Priority 3: Bug Fixes & Code Quality (from v2.3.1 review)

### 3.1 [x] Remove dead code and wasted I/O in `FExcelExporter.exportExcel`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelExporter.groovy` lines 83–97
- **Issue**: When the target file already exists, the method opens a `ReadableWorkbook` (allocating I/O resources), runs sheet-collision avoidance logic, and then unconditionally throws `IllegalArgumentException`. The collision-avoidance `if/while` block is completely dead — the throw always fires regardless. The workbook is opened and closed for nothing.
- **Fix**: Remove the entire `try` block. Replace the `if (file.exists() && file.length() > 0)` branch with a direct throw:
```groovy
if (file.exists() && file.length() > 0) {
throw new IllegalArgumentException("Appending to an existing Excel file is not supported by FExcelExporter. Use FExcelAppender or SpreadsheetWriter for append/replace operations.")
}
```

### 3.2 [x] Replace undeclared `commons-io` transitive dependency in `FOdsImporter`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastods/FOdsImporter.groovy` line 5, 197
- **Issue**: `IOUtils.toByteArray(is)` is used via `org.apache.commons.io.IOUtils`, which is only available as a transitive dependency from `fast.excel.reader`. It is not declared in `build.gradle`. If the transitive chain changes this dependency will break silently.
- **Fix**: Replace with Groovy's built-in `is.bytes` and remove the import:
```groovy
// Before
import org.apache.commons.io.IOUtils
byte[] bytes = IOUtils.toByteArray(is)
// After
byte[] bytes = is.bytes
```

### 3.3 [x] Make `SpreadsheetImporter.validateNotNull` private
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetImporter.groovy` line 291
- **Issue**: `static void validateNotNull(Object paramVal, String paramName)` has default (public) Groovy visibility and is therefore part of the public API surface, which is unintentional.
- **Fix**: Add `private` modifier.

### 3.4 [x] Add `@CompileStatic` to `FExcelExporter` or document why it is absent
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelExporter.groovy` lines 21–22
- **Issue**: `@CompileStatic` is commented out with only a brief note about a module access error on `GenericStyleSetter`. The entire exporter runs with dynamic dispatch as a result.
- **Fix**: Investigate whether the access error is avoidable (e.g. `--add-opens` JVM arg, or avoiding the offending call). If unavoidable, expand the comment to explain the root cause and link to the upstream fastexcel issue so the limitation is tracked rather than forgotten.

### 3.5 [x] Add class-level GroovyDoc to `SpreadsheetImporter`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetImporter.groovy`
- **Issue**: `SpreadsheetImporter` is the primary import facade but has no class-level documentation. `SpreadsheetWriter` has thorough class doc including a usage example; `SpreadsheetImporter` should follow the same pattern.

### 3.6 [x] Fix `@param` mismatches in `SpreadsheetImporter.importOds` / `importExcel`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetImporter.groovy` lines 41–123
- **Issue**: The `importOds(URL url, ...)` and `importExcel(URL url, ...)` overloads copy-paste the doc block from the file-path variants and say `@param file the ods or excel file to import`, but the parameter is `url`, not `file`. Several overloads also have no GroovyDoc at all.
- **Fix**: Correct the `@param` tag to `@param url` with an appropriate description; add missing GroovyDoc to undocumented overloads.

### 3.7 [x] Add GroovyDoc to `SpreadsheetWriter.writeSheets(Map params)`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetWriter.groovy` line 210
- **Issue**: The map-params overload is undocumented — which keys are valid, what types, which are required?
- **Fix**: Add a `@param params` block listing `file` (File), `data` (List&lt;Matrix&gt;), and `sheetNames` (List&lt;String&gt;), matching the style used for the other overloads.

### 3.8 [x] Standardise `startCol`/`endCol` parameter naming across `Importer` and `SpreadsheetImporter`
- **Files**: `src/main/groovy/se/alipsa/matrix/spreadsheet/Importer.groovy`, `SpreadsheetImporter.groovy`
- **Issue**: Some overloads use `startColumn`/`endColumn` (e.g. `importSpreadsheet(String, int, int, int, String, String, boolean)`) while the majority use `startCol`/`endCol`. The inconsistency appears in both the interface and the facade.
- **Fix**: Standardise on `startCol`/`endCol` throughout.

### 3.9 [x] Fix misleading GroovyDoc on `FExcelExporter.exportExcel` overloads
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelExporter.groovy` lines 29–78
- **Issue**: All four `exportExcel` overloads claim "if the excel file exists and is not empty, a new sheet will be added to the excel". Since v2.3.0 the method unconditionally throws `IllegalArgumentException` when the file already exists. The GroovyDoc is stale and misleading.
- **Fix**: Update the GroovyDoc to state that existing files cause an `IllegalArgumentException` and direct users to `SpreadsheetWriter` or `FExcelAppender` for append/replace behavior.

### 3.10 [x] Add `@CompileStatic` to `ValueExtractor`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/ValueExtractor.groovy`
- **Issue**: The abstract utility class lacks `@CompileStatic`, running with dynamic dispatch despite being pure type-conversion logic. This is inconsistent with the project's static-compilation preference.
- **Fix**: Add `@CompileStatic` and resolve any typing issues that surface.

### 3.11 [x] Tighten `Sheet.name` field type in `fastods`
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastods/Sheet.groovy`
- **Issue**: The `name` field and `getName()` return type are `Object`, but the class only ever stores `String` or `Integer` (via the typed `setName` overloads). Using `Object` hides the actual domain from callers and skips IDE assistance.
- **Fix**: Store name as `String` internally ( callers can pass `Integer` and it is converted immediately in `setName(Number)` ), or split into `sheetName` / `sheetNumber` fields.

### 3.12 [x] Improve error messages when a named sheet does not exist
- **Files**: `src/main/groovy/se/alipsa/matrix/spreadsheet/fastexcel/FExcelImporter.groovy` lines 160, 172; `fastods/FOdsImporter.groovy` equivalent paths
- **Issue**: `workbook.findSheet(sheetName).orElseThrow()` and `odsDataReader.readOds(..., sheetName, ...)` produce generic `NoSuchElementException: No value present` (or null-sheet errors) instead of naming the missing sheet. This compounds the `'Sheet1'` default problem (4.3) by giving users no clue which sheet name was sought.
- **Fix**: Supply explicit error messages such as `"Sheet '${sheetName}' does not exist in the workbook"`.

---

## Priority 4: Usability Improvements

### 4.1 [x] Add `File`-based overloads to `SpreadsheetImporter`
- **Files**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetImporter.groovy`, `Importer.groovy`
- **Issue**: `SpreadsheetWriter.write` accepts `File` as its primary target type, but every `SpreadsheetImporter.importSpreadsheet` overload takes `String` file paths. Users with a `File` object must call `.absolutePath` at every call site.
- **Fix**: Add `File`-accepting overloads that delegate via `file.absolutePath`, matching the writer's API shape.

### 4.2 [x] Add whole-sheet convenience import (auto-detect dimensions)
- **File**: `src/main/groovy/se/alipsa/matrix/spreadsheet/SpreadsheetImporter.groovy`
- **Issue**: Every import method requires explicit `endRow` and `endCol`. The `SpreadsheetFormatProvider` already auto-detects these via `SpreadsheetReader.findLastRow`/`findLastCol`, but this is not exposed through `SpreadsheetImporter`. Importing a whole sheet currently requires 3–4 extra method calls (open reader, find last row, find last col, close, then import).
- **Fix**: Add convenience overloads that use `SpreadsheetReader` internally to detect dimensions:
```groovy
static Matrix importSpreadsheet(File file, int sheetNumber = 1, boolean firstRowAsColNames = true)
static Matrix importSpreadsheet(String file, int sheetNumber = 1, boolean firstRowAsColNames = true)
static Matrix importSpreadsheet(String file, String sheetName, boolean firstRowAsColNames = true)
```

### 4.3 [N/A] Fix `'Sheet1'` hardcoded default in static overloads
- **Decision**: `'Sheet1'` is the universal default sheet name in both Excel and Calc; kept as-is.

### 4.4 [x] Refresh README dependency versions and remove stale Log4j claim
- **File**: `matrix-spreadsheet/README.md`
- **Issue**: The README references outdated versions (`matrix-core:3.6.0`, `groovy:5.0.4`, `matrix-spreadsheet:2.3.0`). It also lists **Log4j** as a third-party library (line 248), but the module migrated to `matrix-core` Logger in v2.3.0 and has no direct Log4j dependency.
- **Fix**: Update version references to current aligned versions and remove the Log4j section.

---

## Priority 5: Release Checklist

- [x] Remove SNAPSHOT suffix from build.gradle version when cutting release
- [x] Update `release.md` v2.4.0 notes with current test count (124 tests) and completed fixes
Original file line number Diff line number Diff line change
Expand Up @@ -18,21 +18,21 @@ class FileUtil {
* Verify that the filePath exists and is reachable
* @param filePath the path + file name of the resource to find
* @return a File of found
* @throws Exception if the filePath cannot be found
* @throws SpreadsheetImportException if the filePath cannot be found
*/
static File checkFilePath(String filePath) throws Exception {
static File checkFilePath(String filePath) throws SpreadsheetImportException {
File excelFile
URL url = getResourceUrl(filePath)
if (url == null) {
throw new Exception(filePath + " does not exist")
throw new SpreadsheetImportException("'${filePath}' does not exist")
}
try {
excelFile = Paths.get(url.toURI()).toFile()
} catch (URISyntaxException | RuntimeException e) {
throw new Exception(filePath + " does not exist", e)
throw new SpreadsheetImportException("'${filePath}' does not exist", e)
}
if (!excelFile.exists()) {
throw new Exception(filePath + " does not exist")
throw new SpreadsheetImportException("'${filePath}' does not exist")
}
return excelFile
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ interface Importer {

Matrix importSpreadsheet(String file, int sheetNumber,
int startRow, int endRow,
String startColumn, String endColumn,
String startCol, String endCol,
boolean firstRowAsColNames)

Matrix importSpreadsheet(String file, String sheetName,
Expand Down
Loading