Phase 3: Command Refactoring - Detailed Specification

Created: 2026-01-16 Status: Planning Phase Priority: HIGH Estimated Effort: 15-20 hours Target: Reduce all command files to <100 lines

Executive Summary
Current State Analysis
Operation-to-Command Mapping
Refactoring Strategy
Refactoring Patterns
Implementation Roadmap
Testing Strategy
Risk Assessment
Code Examples
Success Metrics

Executive Summary

Objective

Refactor all CLI command files to use the newly implemented Operations Layer, achieving:

Clean separation of concerns: CLI handles I/O and user interaction only
Reduced complexity: Each command file <100 lines
Improved testability: Commands can be tested with mocked operations
Code reuse: Business logic available for external use

Current Situation

23 command files total
5,915 lines of code across all commands
Average file size: 257 lines (largest: validate.py at 497 lines)
Business logic duplicated across multiple commands
Hard to test: Business logic embedded in CLI code

Target State

23 command files refactored
Target: ~2,300 lines (avg 100 lines per file)
~60% code reduction (~3,600 lines removed)
All business logic in operations layer
Commands as thin wrappers around operations

Business Value

Maintainability: Changes to business logic in one place
Testability: Unit test operations without CLI overhead
Reusability: Operations can be imported by external packages
Clarity: Commands focused on CLI concerns only
Safety: Type-safe error handling with Result types

Current State Analysis

Code Metrics by Command

Command	Lines	Operations Available	Reduction Potential	Priority
validate.py	497	✅ validation.py (6 functions)	~400 lines (80%)	HIGH
stats.py	401	❌ None (needs new)	~300 lines (75%)	MEDIUM
filter.py	314	✅ filtering.py (4 functions)	~220 lines (70%)	HIGH
compare.py	323	✅ comparing.py (5 functions)	~230 lines (71%)	HIGH
clean.py	264	✅ cleaning.py (5 functions)	~170 lines (64%)	HIGH
transform.py	228	✅ transforming.py (3 functions)	~140 lines (61%)	HIGH
fill.py	230	✅ cleaning.py (fill_missing_values)	~140 lines (61%)	HIGH
select.py	240	❌ Simple (df[column])	~150 lines (63%)	LOW
group.py	226	✅ aggregating.py (3 functions)	~140 lines (62%)	HIGH
join.py	224	✅ joining.py (3 functions)	~130 lines (58%)	HIGH
pivot.py	219	✅ pivoting.py (5 functions)	~130 lines (59%)	HIGH
aggregate.py	210	✅ aggregating.py (3 functions)	~120 lines (57%)	HIGH
sort.py	214	✅ sorting.py (2 functions)	~120 lines (56%)	HIGH
dedupe.py	181	✅ cleaning.py (remove_duplicates)	~90 lines (50%)	MEDIUM
append.py	185	✅ joining.py (merge_dataframes)	~100 lines (54%)	MEDIUM
merge.py	141	✅ joining.py (join_dataframes)	~60 lines (43%)	MEDIUM
search.py	186	❌ None (uses filter)	~120 lines (65%)	MEDIUM
rename.py	171	❌ Simple (df.rename)	~90 lines (53%)	LOW
strip.py	148	✅ cleaning.py (trim_whitespace)	~70 lines (47%)	MEDIUM
tail.py	156	❌ Simple (df.tail)	~80 lines (51%)	LOW
head.py	148	❌ Simple (df.head)	~70 lines (47%)	LOW
export.py	153	❌ Core functionality	~80 lines (52%)	LOW
count.py	164	❌ Simple (len(df))	~90 lines (55%)	LOW
convert.py	107	❌ Core functionality	~50 lines (47%)	LOW
info.py	204	❌ Core/metadata	~120 lines (59%)	MEDIUM
unique.py	155	❌ Simple (df.unique)	~80 lines (52%)	LOW

Summary:

15 commands have operations available → ~2,180 lines reduction potential
8 commands need new operations or are simple → ~690 lines reduction potential
Total reduction potential: ~2,870 lines (48%)

Code Duplication Analysis

Pattern 1: File Reading Logic (Present in ~20 commands)

# Repeated in filter.py, sort.py, clean.py, etc.
if isinstance(handler, ExcelHandler):
    sheet_name = sheet
    kwargs = {"sheet_name": sheet_name} if sheet_name else {}
    read_result = handler.read(path, **kwargs)
elif isinstance(handler, CSVHandler):
    encoding_result = handler.detect_encoding(path)
    encoding = unwrap(encoding_result) if is_ok(encoding_result) else "utf-8"
    delimiter_result = handler.detect_delimiter(path, encoding)
    delimiter = unwrap(delimiter_result) if is_ok(delimiter_result) else ","
    read_result = handler.read(path, encoding=encoding, delimiter=delimiter)

Impact: ~15 lines × 20 commands = 300 lines of duplicated code

Pattern 2: Filter Validation Logic (Present in filter.py, sort.py, group.py, etc.)

# Repeated validation logic
DANGEROUS_PATTERNS = ["import", "exec", "eval", "__", ...]
def _validate_condition(condition: str) -> Result[str, str]:
    # Check for dangerous patterns
    # Check length
    # Check balanced parentheses/brackets/quotes

Impact: Security validation duplicated in 5+ commands (~80 lines)

Pattern 3: Display/Output Logic (Present in all commands)

# Repeated in all commands
if output:
    output_path = Path(output)
    write_result = factory.write_file(df_processed, output_path)
    if is_err(write_result):
        error = unwrap_err(write_result)
        typer.echo(f"Error writing file: {error}", err=True)
        raise typer.Exit(1)
    typer.echo(f"Written to: {output}")
else:
    if format == "table":
        display_table(df_processed)
    elif format == "csv":
        display_csv(df_processed)
    elif format == "json":
        display_json(df_processed)

Impact: ~15 lines × 23 commands = 345 lines of duplicated code

Pattern 4: Error Handling (Present in all commands)

# Repeated Result unwrapping
if is_err(some_result):
    error = unwrap_err(some_result)
    typer.echo(f"Error: {error}", err=True)
    raise typer.Exit(1)
value = unwrap(some_result)

Impact: ~4 lines × 50 occurrences = 200 lines of duplicated code

Architecture Issues

Issue 1: Mixed Concerns

Commands currently handle:

❌ File I/O (core responsibility)
❌ User interaction (core responsibility)
❌ Business logic (should be in operations)
❌ Validation (should be in operations)
❌ Data transformation (should be in operations)

Issue 2: Tight Coupling

Commands directly import and use pandas operations
Business logic embedded in command flow
Cannot reuse logic without CLI dependencies

Issue 3: Testing Challenges

Business logic requires CLI setup to test
Cannot mock operations easily
Tests are slower and more complex

Issue 4: Error Handling Inconsistency

Some commands use Result types
Some use try/except with pandas exceptions
Error messages inconsistent across commands

Operation-to-Command Mapping

Phase 1: Core Operations (HIGH Priority)

1. filter.py → operations/filtering.py

Current Functions:

_validate_condition() → validate_condition() ✅
_normalize_condition() → normalize_condition() ✅
Filter logic with df.query() → apply_filter() ✅
Column selection → apply_filter(columns=...) ✅
Row limiting → apply_filter(limit=...) ✅

Lines to Remove: ~220 lines Remaining CLI Code: ~95 lines

Migration Path:

# BEFORE (filter.py - 314 lines)
# Lots of validation and filter logic here
# ...

# AFTER (filter.py - ~95 lines)
from excel_toolkit.operations.filtering import validate_condition, normalize_condition, apply_filter

def filter(file_path: str, condition: str, ...):
    # 1. Read file (keep in CLI)
    df = read_file(path, sheet)

    # 2. Validate condition (use operation)
    validation = validate_condition(condition)
    if is_err(validation):
        handle_error(unwrap_err(validation))

    # 3. Normalize condition (use operation)
    normalized = unwrap(normalize_condition(condition))

    # 4. Apply filter (use operation)
    result = apply_filter(df, normalized, columns=col_list, limit=rows)
    if is_err(result):
        handle_error(unwrap_err(result))

    df_filtered = unwrap(result)

    # 5. Write/display output (keep in CLI)
    write_or_display(df_filtered, output, format)

2. sort.py → operations/sorting.py

Current Functions:

Column validation → validate_sort_columns() ✅
Sort logic → sort_dataframe() ✅
Row limiting → sort_dataframe(limit=...) ✅

Lines to Remove: ~120 lines Remaining CLI Code: ~95 lines

Migration Path:

# BEFORE: Manual column validation and sorting
missing_cols = [c for c in column_list if c not in df.columns]
if missing_cols:
    typer.echo(f"Error: Columns not found: {', '.join(missing_cols)}", err=True)
    raise typer.Exit(1)
df_sorted = df.sort_values(by=column_list, ascending=not desc, na_position=na_placement)

# AFTER: Use operations
from excel_toolkit.operations.sorting import validate_sort_columns, sort_dataframe

# Validate
validation = validate_sort_columns(df, column_list)
if is_err(validation):
    handle_error(unwrap_err(validation))

# Sort
sort_columns = [{"column": col, "ascending": not desc} for col in column_list]
result = sort_dataframe(df, sort_columns, na_placement=na_placement, limit=rows)
df_sorted = unwrap(result)

3. pivot.py → operations/pivoting.py

Current Functions:

Aggregation validation → validate_aggregation_function() ✅
Column validation → validate_pivot_columns() ✅
Fill value parsing → parse_fill_value() ✅
Pivot logic → create_pivot_table() ✅

Lines to Remove: ~130 lines Remaining CLI Code: ~90 lines

4. aggregate.py → operations/aggregating.py

Current Functions:

Spec parsing → parse_aggregation_specs() ✅
Column validation → validate_aggregation_columns() ✅
Aggregation logic → aggregate_groups() ✅

Lines to Remove: ~120 lines Remaining CLI Code: ~90 lines

5. compare.py → operations/comparing.py

Current Functions:

Key validation → validate_key_columns() ✅
Comparison logic → compare_dataframes() ✅

Lines to Remove: ~230 lines Remaining CLI Code: ~95 lines

Phase 2: Support Operations (HIGH Priority)

6. clean.py → operations/cleaning.py

Current Functions:

Trim whitespace → trim_whitespace() ✅
Remove duplicates → remove_duplicates() ✅
Fill missing → fill_missing_values() ✅
Standardize columns → standardize_columns() ✅
Combined cleaning → clean_dataframe() ✅

Lines to Remove: ~170 lines Remaining CLI Code: ~95 lines

Migration Path:

# BEFORE: Individual operations with lots of validation
if trim:
    for col in df.columns if columns is None else columns:
        if df[col].dtype == 'object':
            df[col] = df[col].str.strip()
# ... more individual operations

# AFTER: Use clean_dataframe operation
from excel_toolkit.operations.cleaning import clean_dataframe

operations = []
if trim: operations.append({"type": "trim", "columns": col_list})
if lowercase: operations.append({"type": "standardize", "case": "lower", "columns": col_list})

result = clean_dataframe(df, trim=trim, standardize=case_op, ...)
df_clean = unwrap(result)

7. transform.py → operations/transforming.py

Current Functions:

Expression validation → Built into apply_expression() ✅
Expression logic → apply_expression() ✅
Type casting → cast_columns() ✅
Transformations → transform_column() ✅

Lines to Remove: ~140 lines Remaining CLI Code: ~90 lines

8. fill.py → operations/cleaning.py

Current Functions:

Fill logic → fill_missing_values() ✅

Lines to Remove: ~140 lines Remaining CLI Code: ~90 lines

9. join.py → operations/joining.py

Current Functions:

Join validation → validate_join_columns() ✅
Join logic → join_dataframes() ✅

Lines to Remove: ~130 lines Remaining CLI Code: ~95 lines

10. dedupe.py → operations/cleaning.py

Current Functions:

Dedupe logic → remove_duplicates() ✅

Lines to Remove: ~90 lines Remaining CLI Code: ~90 lines

11. strip.py → operations/cleaning.py

Current Functions:

Strip logic → trim_whitespace() ✅

Lines to Remove: ~70 lines Remaining CLI Code: ~80 lines

12. append.py → operations/joining.py

Current Functions:

Append logic → merge_dataframes() ✅

Lines to Remove: ~100 lines Remaining CLI Code: ~85 lines

13. validate.py → operations/validation.py

Current Functions:

Column exists → validate_column_exists() ✅
Type check → validate_column_type() ✅
Range check → validate_value_range() ✅
Null check → check_null_values() ✅
Unique check → validate_unique() ✅
Rule validation → validate_dataframe() ✅

Lines to Remove: ~400 lines (BIGGEST WIN!) Remaining CLI Code: ~100 lines

Migration Path:

# BEFORE: 497 lines with custom validation logic
# Each validation type implemented separately
# Lots of error handling and reporting logic

# AFTER: ~100 lines using operations
from excel_toolkit.operations.validation import validate_dataframe

rules = []
if columns:
    rules.append({"type": "column_exists", "columns": column_list})
if types:
    rules.append({"type": "column_type", "column_types": type_dict})
if range:
    rules.append({"type": "value_range", "column": range_col, "min": min_val, "max": max_val})

result = validate_dataframe(df, rules)
if is_err(result):
    handle_error(unwrap_err(result))

report = unwrap(result)
display_validation_report(report)

Phase 3: Simple Operations (MEDIUM Priority)

Commands That Don't Need New Operations

These commands are simple pandas operations that don't warrant separate operation modules:

14. select.py - Simple column selection (df[col_list]) 15. search.py - Can use filter operation 16. rename.py - Simple df.rename() 17. head.py - Simple df.head() 18. tail.py - Simple df.tail() 19. count.py - Simple len(df) 20. unique.py - Simple df.unique() 21. export.py - Uses core write functionality 22. convert.py - Uses core format conversion 23. info.py - Uses metadata operations

Strategy for Simple Commands:

Keep minimal logic in command
Extract common patterns to helper functions in commands/common.py
Focus on consistency and error handling

Lines to Remove: ~690 lines across these commands Remaining CLI Code: ~1,200 lines (avg ~70 lines per command)

Commands Needing New Operations

stats.py (401 lines) - MEDIUM Priority

Current Functionality:

Column statistics (mean, median, std, min, max, quartiles)
Data type analysis
Null value analysis
Unique value counts

Recommendation: Create operations/statistics.py with:

compute_column_statistics() - Calculate stats for each column
analyze_data_types() - Type distribution
analyze_null_values() - Null value patterns
compute_value_counts() - Unique value analysis

Lines to Remove: ~300 lines Remaining CLI Code: ~100 lines

Refactoring Strategy

Three-Phase Approach

Phase 3.1: High-Impact Commands (HIGH Priority)

Target: Commands with operations already available Commands: filter, sort, pivot, aggregate, compare, clean, transform, fill, join, validate Estimated Time: 10-12 hours Code Reduction: ~2,180 lines (~70% of total reduction)

Phase 3.2: Medium-Impact Commands (MEDIUM Priority)

Target: Commands that need new operations or moderate refactoring Commands: dedupe, strip, append, stats, search Estimated Time: 4-5 hours Code Reduction: ~500 lines

Phase 3.3: Low-Impact Commands (LOW Priority)

Target: Simple commands with minimal refactoring needed Commands: select, rename, head, tail, count, unique, export, convert, info Estimated Time: 1-2 hours Code Reduction: ~190 lines

Refactoring Principles

1. Single Responsibility Principle

Each command should have ONE responsibility:

CLI concerns: Argument parsing, file I/O, user interaction, output formatting
NOT: Business logic, data transformation, validation

2. Don't Repeat Yourself (DRY)

Extract common patterns to commands/common.py
Use operations for reusable business logic
Create helper functions for repeated CLI patterns

3. Explicit Error Handling

Always use Result types from operations
Unwrap results with proper error handling
Provide user-friendly error messages

4. Type Safety

Preserve type hints from operations
Use Result types consistently
Don't unwrap without checking

5. Backward Compatibility

Maintain same CLI interface
Keep all command-line options
Preserve error messages where possible

Refactoring Patterns

Pattern 1: Basic Command Refactoring

Applicability: Commands with single operation call

Structure:

"""Command brief description."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.{module} import {operation}
from excel_toolkit.commands.common import display_table, display_csv, display_json

def command_name(
    # CLI arguments
) -> None:
    \"\"\"Command help text.\"\"\\"\

    # 1. Read input file (20-30 lines)
    path = Path(file_path)
    factory = HandlerFactory()

    if not path.exists():
        typer.echo(f"File not found: {file_path}", err=True)
        raise typer.Exit(1)

    handler_result = factory.get_handler(path)
    if is_err(handler_result):
        error = unwrap_err(handler_result)
        typer.echo(f"{error}", err=True)
        raise typer.Exit(1)

    handler = unwrap(handler_result)
    read_result = handler.read(path, ...)
    df = unwrap(read_result)

    # 2. Call operation (5-15 lines)
    result = {operation}(df, ...)
    if is_err(result):
        error = unwrap_err(result)
        typer.echo(f"Error: {error}", err=True)
        raise typer.Exit(1)

    df_processed = unwrap(result)

    # 3. Write/display output (20-30 lines)
    if output:
        write_result = factory.write_file(df_processed, Path(output))
        if is_err(write_result):
            error = unwrap_err(write_result)
            typer.echo(f"Error writing file: {error}", err=True)
            raise typer.Exit(1)
        typer.echo(f"Written to: {output}")
    else:
        if format == "table":
            display_table(df_processed)
        elif format == "csv":
            display_csv(df_processed)
        elif format == "json":
            display_json(df_processed)

# Create CLI app
app = typer.Typer(help="Command help")
app.command()(command_name)

Total Lines: ~70-90 lines

Pattern 2: Command with Multiple Operations

Applicability: Commands that chain multiple operations

Example: clean.py

"""Clean command implementation."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.cleaning import (
    trim_whitespace,
    remove_duplicates,
    fill_missing_values,
    standardize_columns,
)
from excel_toolkit.commands.common import display_table, display_csv, display_json

def clean(
    file_path: str,
    trim: bool = False,
    lowercase: bool = False,
    # ... other options
) -> None:
    \"\"\"Clean data by applying various operations.\"\"\"

    # 1. Read file (Pattern 1, step 1)
    df = read_file(file_path, sheet)

    # 2. Apply operations sequentially (30-40 lines)
    if trim:
        result = trim_whitespace(df, columns=col_list, side="both")
        if is_err(result):
            handle_error(unwrap_err(result))
        df = unwrap(result)

    if lowercase or uppercase or titlecase:
        case = "lower" if lowercase else "upper" if uppercase else "title"
        result = standardize_columns(df, case=case, columns=col_list)
        if is_err(result):
            handle_error(unwrap_err(result))
        df = unwrap(result)

    if remove_dup:
        result = remove_duplicates(df, subset=col_list, keep=keep_strategy)
        if is_err(result):
            handle_error(unwrap_err(result))
        df = unwrap(result)

    # 3. Write/display output (Pattern 1, step 3)
    write_or_display(df, output, format)

app = typer.Typer(help="Clean data files")
app.command()(clean)

Total Lines: ~90-100 lines

Pattern 3: Command with Complex Error Handling

Applicability: Commands that need to handle multiple error types

Example: validate.py

"""Validate command implementation."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.validation import validate_dataframe, ValidationReport
from excel_toolkit.commands.common import display_table, display_csv, display_json

def validate(
    file_path: str,
    columns: str | None = None,
    types: str | None = None,
    # ... other options
) -> None:
    \"\"\"Validate data against quality rules.\"\"\"

    # 1. Read file
    df = read_file(file_path, sheet)

    # 2. Build validation rules (20-30 lines)
    rules = []

    if columns:
        col_list = [c.strip() for c in columns.split(",")]
        rules.append({"type": "column_exists", "columns": col_list})

    if types:
        # Parse type specifications
        type_dict = parse_types(types)
        rules.append({"type": "column_type", "column_types": type_dict})

    if range_spec:
        # Parse range specification
        rules.append({"type": "value_range", "column": range_col, "min": min_val, "max": max_val})

    if unique:
        rules.append({"type": "unique", "columns": unique_cols})

    if null_threshold is not None:
        rules.append({"type": "null_threshold", "columns": col_list, "threshold": null_threshold})

    # 3. Run validation (10-15 lines)
    result = validate_dataframe(df, rules)
    if is_err(result):
        error = unwrap_err(result)
        typer.echo(f"Validation error: {error}", err=True)
        raise typer.Exit(1)

    report: ValidationReport = unwrap(result)

    # 4. Display results (30-40 lines)
    display_validation_report(report, verbose, fail_fast)

app = typer.Typer(help="Validate data quality")
app.command()(validate)

def display_validation_report(report: ValidationReport, verbose: bool, fail_fast: bool):
    \"\"\"Display validation report in user-friendly format.\"\"\"
    typer.echo(f"Validation Results: {report.passed} passed, {report.failed} failed")

    if report.errors:
        typer.echo("\n❌ Errors:", err=True)
        for error in report.errors:
            typer.echo(f"  - {error}", err=True)

    if report.warnings and verbose:
        typer.echo("\n⚠️  Warnings:")
        for warning in report.warnings:
            typer.echo(f"  - {warning}")

    if report.failed > 0 and fail_fast:
        raise typer.Exit(1)

Total Lines: ~100-110 lines

Pattern 4: Command Helper Functions

Extract repeated CLI patterns to commands/common.py:

"""Common utilities for command refactoring."""

from pathlib import Path
from excel_toolkit.core import HandlerFactory, ExcelHandler, CSVHandler
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
import typer
import pandas as pd

def read_data_file(
    file_path: str,
    sheet: str | None = None,
) -> pd.DataFrame:
    \"\"\"Read a data file (Excel or CSV) with auto-detection.

    Args:
        file_path: Path to input file
        sheet: Sheet name for Excel files

    Returns:
        DataFrame with file contents

    Raises:
        typer.Exit: If file cannot be read
    \"\"\"
    path = Path(file_path)

    if not path.exists():
        typer.echo(f"File not found: {file_path}", err=True)
        raise typer.Exit(1)

    factory = HandlerFactory()

    # Get handler
    handler_result = factory.get_handler(path)
    if is_err(handler_result):
        error = unwrap_err(handler_result)
        typer.echo(f"{error}", err=True)
        raise typer.Exit(1)

    handler = unwrap(handler_result)

    # Read file
    if isinstance(handler, ExcelHandler):
        kwargs = {"sheet_name": sheet} if sheet else {}
        read_result = handler.read(path, **kwargs)
    elif isinstance(handler, CSVHandler):
        encoding_result = handler.detect_encoding(path)
        encoding = unwrap(encoding_result) if is_ok(encoding_result) else "utf-8"

        delimiter_result = handler.detect_delimiter(path, encoding)
        delimiter = unwrap(delimiter_result) if is_ok(delimiter_result) else ","

        read_result = handler.read(path, encoding=encoding, delimiter=delimiter)
    else:
        typer.echo("Unsupported file type", err=True)
        raise typer.Exit(1)

    if is_err(read_result):
        error = unwrap_err(read_result)
        typer.echo(f"Error reading file: {error}", err=True)
        raise typer.Exit(1)

    return unwrap(read_result)


def write_or_display(
    df: pd.DataFrame,
    factory: HandlerFactory,
    output: str | None,
    format: str,
) -> None:
    \"\"\"Write DataFrame to file or display to console.

    Args:
        df: DataFrame to write/display
        factory: HandlerFactory for writing files
        output: Output file path (None = display)
        format: Display format (table, csv, json)

    Raises:
        typer.Exit: If write operation fails
    \"\"\"
    if output:
        output_path = Path(output)
        write_result = factory.write_file(df, output_path)
        if is_err(write_result):
            error = unwrap_err(write_result)
            typer.echo(f"Error writing file: {error}", err=True)
            raise typer.Exit(1)
        typer.echo(f"Written to: {output}")
    else:
        from excel_toolkit.commands.common import display_table, display_csv, display_json

        if format == "table":
            display_table(df)
        elif format == "csv":
            display_csv(df)
        elif format == "json":
            display_json(df)
        else:
            typer.echo(f"Unknown format: {format}", err=True)
            raise typer.Exit(1)


def handle_operation_error(error: Exception) -> None:
    \"\"\"Handle operation errors with user-friendly messages.

    Args:
        error: Error from operation

    Raises:
        typer.Exit: Always exits with error code 1
    \"\"\"
    error_msg = str(error)

    # Map error types to user-friendly messages
    if "ColumnNotFoundError" in type(error).__name__:
        typer.echo(f"Error: {error_msg}", err=True)
    elif "TypeMismatchError" in type(error).__name__:
        typer.echo(f"Type mismatch: {error_msg}", err=True)
    elif "ValueOutOfRangeError" in type(error).__name__:
        typer.echo(f"Value out of range: {error_msg}", err=True)
    else:
        typer.echo(f"Error: {error_msg}", err=True)

    raise typer.Exit(1)

With these helpers, commands become even simpler:

"""Simplified filter command using helpers."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.filtering import validate_condition, normalize_condition, apply_filter
from excel_toolkit.commands.common import read_data_file, write_or_display, handle_operation_error

def filter(
    file_path: str,
    condition: str,
    output: str | None = None,
    # ... other options
) -> None:
    \"\"\"Filter rows from a data file.\"\"\"

    # 1. Read file (1 line!)
    df = read_data_file(file_path, sheet)

    # 2. Validate and normalize (5 lines)
    validation = validate_condition(condition)
    if is_err(validation):
        handle_operation_error(unwrap_err(validation))

    normalized = unwrap(normalize_condition(condition))

    # 3. Apply filter (10 lines)
    col_list = [c.strip() for c in columns.split(",")] if columns else None
    result = apply_filter(df, normalized, columns=col_list, limit=rows)
    if is_err(result):
        handle_operation_error(unwrap_err(result))

    df_filtered = unwrap(result)

    # 4. Write/display (3 lines)
    factory = HandlerFactory()
    write_or_display(df_filtered, factory, output, format)

app = typer.Typer(help="Filter rows from data files")
app.command()(filter)

Total Lines: ~50-60 lines!

Implementation Roadmap

Week 1: High-Impact Commands

Day 1: Setup & Foundation

Add helper functions to commands/common.py
- read_data_file()
- write_or_display()
- handle_operation_error()
Update imports in existing commands
Create refactoring checklist template
Set up branch for refactoring: feature/command-refactoring

Deliverable: Helper functions ready, all tests passing

Day 2-3: Filter & Sort Commands

Refactor filter.py (314 → ~60 lines)
- Use validate_condition()
- Use normalize_condition()
- Use apply_filter()
- Test all filter scenarios
Refactor sort.py (214 → ~70 lines)
- Use validate_sort_columns()
- Use sort_dataframe()
- Test all sort scenarios

Deliverable: filter.py and sort.py refactored, tests passing

Day 4-5: Pivot, Aggregate, Compare Commands

Refactor pivot.py (219 → ~90 lines)
- Use all 5 pivot operations
- Test multi-dimensional pivots
Refactor aggregate.py (210 → ~90 lines)
- Use parse_aggregation_specs()
- Use validate_aggregation_columns()
- Use aggregate_groups()
- Test groupby scenarios
Refactor compare.py (323 → ~95 lines)
- Use validate_key_columns()
- Use compare_dataframes()
- Test difference detection

Deliverable: 3 commands refactored, all tests passing

Week 2: Support Operations

Day 6-7: Clean & Transform Commands

Refactor clean.py (264 → ~95 lines)
- Use all 5 cleaning operations
- Test combined cleaning scenarios
Refactor transform.py (228 → ~90 lines)
- Use apply_expression()
- Use cast_columns()
- Use transform_column()
- Test transformation scenarios

Deliverable: 2 commands refactored, tests passing

Day 8: Fill, Join, Dedupe Commands

Refactor fill.py (230 → ~90 lines)
- Use fill_missing_values()
- Test all fill strategies
Refactor join.py (224 → ~95 lines)
- Use validate_join_columns()
- Use join_dataframes()
- Test all join types
Refactor dedupe.py (181 → ~90 lines)
- Use remove_duplicates()
- Test dedupe scenarios

Deliverable: 3 commands refactored, tests passing

Day 9: Strip, Append, Validate Commands

Refactor strip.py (148 → ~80 lines)
- Use trim_whitespace()
- Test trim scenarios
Refactor append.py (185 → ~85 lines)
- Use merge_dataframes()
- Test append scenarios
Refactor validate.py (497 → ~100 lines) MAJOR WIN!
- Use all 6 validation operations
- Test comprehensive validation

Deliverable: 3 commands refactored, including the largest one!

Week 3: Final Commands & Polish

Day 10-11: Stats Command (New Operation Needed)

Create operations/statistics.py
- compute_column_statistics()
- analyze_data_types()
- analyze_null_values()
- compute_value_counts()
Write tests for statistics operations
Refactor stats.py (401 → ~100 lines)
- Use new statistics operations
- Test statistics scenarios

Deliverable: New operation module + stats.py refactored

Day 12: Simple Commands

Refactor select.py (240 → ~70 lines)
Refactor rename.py (171 → ~70 lines)
Refactor head.py, tail.py, count.py, unique.py
Refactor export.py, convert.py, info.py
Refactor search.py (can use filter operation)

Deliverable: All remaining commands refactored

Day 13-14: Testing & Documentation

Deliverable: All tests passing, documentation updated

Day 15: Code Review & Merge

Final code review
Performance testing
Merge to main branch
Tag release v0.3.0

Deliverable: Phase 3 complete, ready for release

Testing Strategy

Testing Levels

Level 1: Operation Tests (Already Complete ✅)

441 tests covering all operations
Test business logic independently
No CLI dependencies
Fast execution

Level 2: Integration Tests (New)

Test command → operation integration:

"""Integration tests for filter command."""

import pytest
from typer.testing import CliRunner
from excel_toolkit.commands.filter import app

runner = CliRunner()

def test_filter_command_with_valid_condition():
    \"\"\"Test filter command works end-to-end.\"\"\"
    result = runner.invoke(app, ["test_data.csv", "age > 30"])
    assert result.exit_code == 0
    assert "Filtered" in result.stdout

def test_filter_command_with_invalid_column():
    \"\"\"Test filter command handles invalid column.\"\"\"
    result = runner.invoke(app, ["test_data.csv", "nonexistent > 30"])
    assert result.exit_code == 1
    assert "not found" in result.stdout

def test_filter_command_with_security_violation():
    \"\"\"Test filter command rejects dangerous patterns.\"\"\"
    result = runner.invoke(app, ["test_data.csv", "__import__('os')"])
    assert result.exit_code == 1
    assert "Unsafe" in result.stdout

Estimated: ~150 integration tests (5-8 per command)

Level 3: End-to-End Tests (New)

Test complete workflows:

"""End-to-end tests for command chains."""

def test_filter_then_sort_workflow():
    \"\"\"Test filtering then sorting data.\"\"\"
    # Filter data
    filter_result = runner.invoke(filter_app, ["data.csv", "age > 30", "-o", "filtered.csv"])
    assert filter_result.exit_code == 0

    # Sort filtered data
    sort_result = runner.invoke(sort_app, ["filtered.csv", "--columns", "age", "-o", "sorted.csv"])
    assert sort_result.exit_code == 0

    # Verify final output
    df = pd.read_csv("sorted.csv")
    assert all(df["age"] > 30)
    assert list(df["age"]) == sorted(df["age"])

Estimated: ~30 end-to-end tests

Test Data Fixtures

Create comprehensive test data:

"""Test fixtures for command testing."""

@pytest.fixture
def sample_csv(tmp_path):
    \"\"\"Create sample CSV file for testing.\"\"\"
    data = pd.DataFrame({
        'id': [1, 2, 3, 4, 5],
        'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'age': [25, 30, 35, 40, 45],
        'city': ['Paris', 'London', 'Paris', 'Berlin', 'London'],
        'salary': [50000, 60000, 70000, 80000, 90000]
    })
    path = tmp_path / "test_data.csv"
    data.to_csv(path, index=False)
    return str(path)

@pytest.fixture
def sample_excel(tmp_path):
    \"\"\"Create sample Excel file for testing.\"\"\"
    data = pd.DataFrame({
        'id': [1, 2, 3],
        'value': [100, 200, 300]
    })
    path = tmp_path / "test_data.xlsx"
    data.to_excel(path, index=False)
    return str(path)

@pytest.fixture
def sample_with_nulls(tmp_path):
    \"\"\"Create CSV with null values for testing.\"\"\"
    data = pd.DataFrame({
        'id': [1, 2, 3, 4, 5],
        'name': ['Alice', None, 'Charlie', None, 'Eve'],
        'age': [25, 30, None, 40, 45]
    })
    path = tmp_path / "test_nulls.csv"
    data.to_csv(path, index=False)
    return str(path)

@pytest.fixture
def sample_with_duplicates(tmp_path):
    \"\"\"Create CSV with duplicates for testing.\"\"\"
    data = pd.DataFrame({
        'id': [1, 2, 1, 3, 2],
        'name': ['Alice', 'Bob', 'Alice', 'Charlie', 'Bob']
    })
    path = tmp_path / "test_duplicates.csv"
    data.to_csv(path, index=False)
    return str(path)

Regression Testing

Ensure refactored commands maintain exact same behavior:

"""Regression tests for command refactoring."""

def test_filter_output_matches_before_refactor():
    \"\"\"Ensure refactored filter produces same output.\"\"\"
    # Run old command (backup before refactoring)
    old_output = subprocess_capture(
        ["xl", "filter", "data.csv", "age > 30"]
    )

    # Run new command
    new_output = runner.invoke(filter_app, ["data.csv", "age > 30"])

    # Compare outputs
    assert old_output == new_output.stdout

Risk Assessment

High Severity Risks

Risk 1: Breaking Changes in CLI Interface

Probability: MEDIUM (30%) Impact: HIGH Description: Refactoring could inadvertently change CLI behavior

Mitigation:

✅ Maintain exact same function signatures
✅ Preserve all command-line options
✅ Keep same error messages
✅ Comprehensive regression testing
✅ Beta testing period before merge

Contingency Plan:

Keep backup of original commands
Feature flags for gradual rollout
Quick revert capability

Risk 2: Performance Degradation

Probability: LOW (10%) Impact: MEDIUM Description: Additional function call overhead could slow commands

Mitigation:

✅ Operations are already optimized
✅ Minimal overhead from Result types
✅ Performance benchmarking before/after
✅ Profile critical paths

Acceptable Performance:

<5% overhead is acceptable
Operations are I/O bound anyway
User experience should be unchanged

Risk 3: Test Failures

Probability: MEDIUM (40%) Impact: MEDIUM Description: Existing tests may fail after refactoring

Mitigation:

✅ Update tests to use operations
✅ Add integration tests
✅ Run tests frequently during refactoring
✅ Fix tests immediately

Strategy:

Test-first refactoring (write new tests, then refactor)
Keep tests green at all times
One command at a time

Medium Severity Risks

Risk 4: Incomplete Error Handling

Probability: MEDIUM (35%) Impact: MEDIUM Description: New error types from operations may not be handled

Mitigation:

✅ Comprehensive error type mapping
✅ Generic error handler as fallback
✅ Test all error paths

Risk 5: Loss of Functionality

Probability: LOW (15%) Impact: HIGH Description: Some edge case handling may be lost

Mitigation:

✅ Document all edge cases
✅ Verify edge case handling
✅ Add tests for edge cases

Low Severity Risks

Risk 6: Code Review Overhead

Probability: HIGH (70%) Impact: LOW Description: Large diff may be hard to review

Mitigation:

✅ One command per commit
✅ Clear commit messages
✅ Reference this specification

Risk 7: Documentation Lag

Probability: MEDIUM (50%) Impact: LOW Description: Documentation may not be updated

Mitigation:

✅ Update docs as part of refactoring
✅ Include docstring updates
✅ Update README examples

Code Examples

Example 1: filter.py - Complete Refactoring

BEFORE (314 lines):

"""Filter command implementation."""

from pathlib import Path
from typing import Any
import re
import typer
import pandas as pd
from excel_toolkit.core import HandlerFactory, ExcelHandler, CSVHandler
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err, ok, err
from excel_toolkit.fp._result import Result
from excel_toolkit.commands.common import (
    display_table, display_csv, display_json, format_file_info,
)

# Security: allowed patterns in conditions
ALLOWED_PATTERNS = [
    r"\w+\s*[=!<>]+\s*[\w'\"]+",
    r"\w+\s+in\s+\[[^\]]+\]",
    r"\w+\.isna\(\)",
    r"\w+\.notna\(\)",
    # ... more patterns
]

DANGEROUS_PATTERNS = [
    "import", "exec", "eval", "__",
    "open(", "file(", "os.", "sys.",
    "subprocess", "pickle",
]

def filter(
    file_path: str = typer.Argument(..., help="Path to input file"),
    condition: str = typer.Argument(..., help="Filter condition"),
    output: str | None = typer.Option(None, "--output", "-o"),
    rows: int | None = typer.Option(None, "--rows", "-n"),
    columns: str | None = typer.Option(None, "--columns", "-c"),
    format: str = typer.Option("table", "--format", "-f"),
    sheet: str | None = typer.Option(None, "--sheet", "-s"),
    dry_run: bool = typer.Option(False, "--dry-run"),
) -> None:
    \"\"\"Filter rows from a data file based on a condition.\"\"\"
    path = Path(file_path)
    factory = HandlerFactory()

    # Step 1: Validate file exists
    if not path.exists():
        typer.echo(f"File not found: {file_path}", err=True)
        raise typer.Exit(1)

    # Step 2: Validate condition
    validation_result = _validate_condition(condition)
    if is_err(validation_result):
        error = unwrap_err(validation_result)
        typer.echo(f"Invalid condition: {error}", err=True)
        raise typer.Exit(1)

    # Step 3: Get handler
    handler_result = factory.get_handler(path)
    if is_err(handler_result):
        error = unwrap_err(handler_result)
        typer.echo(f"{error}", err=True)
        raise typer.Exit(1)

    handler = unwrap(handler_result)

    # Step 4: Read file
    if isinstance(handler, ExcelHandler):
        sheet_name = sheet
        kwargs = {"sheet_name": sheet_name} if sheet_name else {}
        read_result = handler.read(path, **kwargs)
    elif isinstance(handler, CSVHandler):
        encoding_result = handler.detect_encoding(path)
        encoding = unwrap(encoding_result) if is_ok(encoding_result) else "utf-8"
        delimiter_result = handler.detect_delimiter(path, encoding)
        delimiter = unwrap(delimiter_result) if is_ok(delimiter_result) else ","
        read_result = handler.read(path, encoding=encoding, delimiter=delimiter)
    else:
        typer.echo("Unsupported handler type", err=True)
        raise typer.Exit(1)

    if is_err(read_result):
        error = unwrap_err(read_result)
        typer.echo(f"Error reading file: {error}", err=True)
        raise typer.Exit(1)

    df = unwrap(read_result)
    original_count = len(df)

    # Step 5: Handle empty file
    if df.empty:
        typer.echo("File is empty (no data rows)")
        raise typer.Exit(0)

    # Step 6: Normalize condition
    normalized_condition = _normalize_condition(condition)

    # Step 7: Apply filter
    try:
        df_filtered = df.query(normalized_condition)
    except pd.errors.UndefinedVariableError as e:
        error_str = str(e)
        col_match = re.search(r"'([^']+)'", error_str)
        if col_match:
            col = col_match.group(1)
            typer.echo(f"Error: Column '{col}' not found", err=True)
            typer.echo(f"Available columns: {', '.join(df.columns)}")
        else:
            typer.echo(f"Error: {error_str}", err=True)
        raise typer.Exit(1)
    except Exception as e:
        error_msg = str(e)
        if "could not convert" in error_msg:
            typer.echo("Error: Type mismatch in condition", err=True)
        else:
            typer.echo(f"Error filtering data: {error_msg}", err=True)
        raise typer.Exit(1)

    filtered_count = len(df_filtered)

    # Step 8: Select columns
    if columns:
        try:
            col_list = [c.strip() for c in columns.split(",")]
            missing_cols = [c for c in col_list if c not in df_filtered.columns]
            if missing_cols:
                typer.echo(f"Error: Columns not found: {', '.join(missing_cols)}", err=True)
                raise typer.Exit(1)
            df_filtered = df_filtered[col_list]
        except Exception as e:
            typer.echo(f"Error selecting columns: {str(e)}", err=True)
            raise typer.Exit(1)

    # Step 9: Limit rows
    if rows is not None:
        df_filtered = df_filtered.head(rows)

    # Step 10-13: Handle dry-run, empty result, output
    # ... (more code)

def _validate_condition(condition: str) -> Result[str, str]:
    \"\"\"Validate filter condition for security and syntax.\"\"\"
    # Check for dangerous patterns
    condition_lower = condition.lower()
    for pattern in DANGEROUS_PATTERNS:
        if pattern in condition_lower:
            return err(f"Unsafe pattern detected: {pattern}")

    # Check length
    if len(condition) > 1000:
        return err("Condition too long (max 1000 characters)")

    # Basic syntax validation
    if condition.count("(") != condition.count(")"):
        return err("Unbalanced parentheses")
    # ... more validation

    return ok(condition)

def _normalize_condition(condition: str) -> str:
    \"\"\"Normalize condition syntax for pandas.query().\"\"\"
    # Convert 'value is None' to 'value.isna()'
    condition = re.sub(r"(\w+)\s+is\s+None\b", r"\1.isna()", condition)
    # ... more normalizations
    return condition

app = typer.Typer(help="Filter rows from data files")
app.command()(filter)

AFTER (~60 lines):

"""Filter command implementation."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.filtering import (
    validate_condition,
    normalize_condition,
    apply_filter,
)
from excel_toolkit.commands.common import (
    read_data_file,
    write_or_display,
)

def filter(
    file_path: str = typer.Argument(..., help="Path to input file"),
    condition: str = typer.Argument(..., help="Filter condition (e.g., 'age > 30')"),
    output: str | None = typer.Option(None, "--output", "-o", help="Output file path"),
    rows: int | None = typer.Option(None, "--rows", "-n", help="Limit number of results"),
    columns: str | None = typer.Option(None, "--columns", "-c", help="Comma-separated columns to keep"),
    format: str = typer.Option("table", "--format", "-f", help="Output format (table, csv, json)"),
    sheet: str | None = typer.Option(None, "--sheet", "-s", help="Sheet name for Excel files"),
    dry_run: bool = typer.Option(False, "--dry-run", help="Show preview without writing"),
) -> None:
    """Filter rows from a data file based on a condition.

    Uses pandas query syntax for conditions:
    - Numeric: age > 30, price >= 100
    - String: name == 'Alice', category in ['A', 'B']
    - Logical: age > 25 and city == 'Paris'
    - Null: value.isna(), value.notna()

    Examples:
        xl filter data.xlsx "age > 30"
        xl filter data.csv "price > 100" --output filtered.xlsx
        xl filter data.xlsx "city == 'Paris'" --columns name,age
        xl filter data.csv "status == 'active'" --dry-run
    """
    # 1. Read file
    df = read_data_file(file_path, sheet)
    original_count = len(df)

    if df.empty:
        typer.echo("File is empty (no data rows)")
        raise typer.Exit(0)

    # 2. Validate condition
    validation = validate_condition(condition)
    if is_err(validation):
        error = unwrap_err(validation)
        typer.echo(f"Invalid condition: {error}", err=True)
        raise typer.Exit(1)

    # 3. Normalize condition
    normalized = unwrap(normalize_condition(condition))

    # 4. Parse columns
    col_list = None
    if columns:
        col_list = [c.strip() for c in columns.split(",")]

    # 5. Apply filter
    result = apply_filter(df, normalized, columns=col_list, limit=rows)
    if is_err(result):
        error = unwrap_err(result)
        typer.echo(f"Error filtering data: {error}", err=True)
        raise typer.Exit(1)

    df_filtered = unwrap(result)
    filtered_count = len(df_filtered)

    # 6. Handle dry-run
    if dry_run:
        percentage = (filtered_count / original_count * 100) if original_count > 0 else 0
        typer.echo(f"Would filter {filtered_count} of {original_count} rows ({percentage:.1f}%)")
        typer.echo(f"Condition: {condition}")
        typer.echo("")
        if filtered_count > 0:
            from excel_toolkit.commands.common import display_table
            preview_rows = min(5, filtered_count)
            typer.echo("Preview of first matches:")
            display_table(df_filtered.head(preview_rows))
        else:
            typer.echo("No rows match the condition")
        raise typer.Exit(0)

    # 7. Handle empty result
    if filtered_count == 0:
        typer.echo("No rows match the filter condition")
        typer.echo(f"Condition: {condition}")
        if output:
            factory = HandlerFactory()
            write_or_display(df_filtered, factory, output, format)
        raise typer.Exit(0)

    # 8. Display summary
    percentage = (filtered_count / original_count * 100) if original_count > 0 else 0
    typer.echo(f"Filtered {filtered_count} of {original_count} rows ({percentage:.1f}%)")
    typer.echo(f"Condition: {condition}")

    if filtered_count == original_count:
        typer.echo("Warning: All rows match the condition", err=True)

    typer.echo("")

    # 9. Write or display
    factory = HandlerFactory()
    write_or_display(df_filtered, factory, output, format)

# Create CLI app
app = typer.Typer(help="Filter rows from data files")
app.command()(filter)

Reduction: 314 → 60 lines (81% reduction)

Example 2: validate.py - Largest Command

BEFORE (497 lines):

"""Validate command implementation."""

from pathlib import Path
import typer
import pandas as pd
from excel_toolkit.core import HandlerFactory, ExcelHandler, CSVHandler
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.commands.common import display_table

def validate(
    file_path: str,
    columns: str | None = None,
    types: str | None = None,
    range: str | None = None,
    unique: str | None = None,
    null_threshold: float | None = None,
    # ... many more options
) -> None:
    \"\"\"Validate data quality.\"\"\"

    # Read file (20-30 lines)
    # ...

    # Column validation (40-50 lines)
    if columns:
        col_list = [c.strip() for c in columns.split(",")]
        missing_cols = [c for c in col_list if c not in df.columns]
        if missing_cols:
            typer.echo(f"❌ Columns not found: {missing_cols}", err=True)
            raise typer.Exit(1)

    # Type validation (60-70 lines)
    if types:
        # Parse types
        # Check each column type
        # Report mismatches

    # Range validation (60-70 lines)
    if range:
        # Parse range spec
        # Check values in range
        # Report violations

    # Unique validation (50-60 lines)
    if unique:
        # Check unique values
        # Report duplicates

    # Null threshold validation (40-50 lines)
    if null_threshold:
        # Check null percentages
        # Report violations

    # Generate report (40-50 lines)
    # Display results
    # ...

    # Total: ~497 lines

AFTER (~100 lines):

"""Validate command implementation."""

from pathlib import Path
import typer
from excel_toolkit.core import HandlerFactory
from excel_toolkit.fp import is_ok, is_err, unwrap, unwrap_err
from excel_toolkit.operations.validation import (
    validate_dataframe,
    ValidationReport,
)
from excel_toolkit.commands.common import read_data_file

def validate(
    file_path: str = typer.Argument(..., help="Path to input file"),
    columns: str | None = typer.Option(None, "--columns", "-c", help="Comma-separated columns to check"),
    types: str | None = typer.Option(None, "--types", "-t", help="Type checks (format: col:type,col:type)"),
    range: str | None = typer.Option(None, "--range", "-r", help="Range check (format: col:min:max)"),
    unique: str | None = typer.Option(None, "--unique", "-u", help="Check uniqueness of column(s)"),
    null_threshold: float | None = typer.Option(None, "--null-threshold", help="Max null percentage (0-1)"),
    min_value: float | None = typer.Option(None, "--min", help="Minimum value for range check"),
    max_value: float | None = typer.Option(None, "--max", help="Maximum value for range check"),
    verbose: bool = typer.Option(False, "--verbose", "-v", help="Show detailed validation info"),
    fail_fast: bool = typer.Option(False, "--fail-fast", help="Stop on first validation failure"),
    sheet: str | None = typer.Option(None, "--sheet", "-s", help="Sheet name for Excel files"),
) -> None:
    """Validate data quality against various rules.

    Performs comprehensive validation checks:
    - Column existence: Verify columns exist
    - Type checking: Validate data types (int, float, str, bool, datetime, numeric)
    - Range validation: Ensure values within specified range
    - Uniqueness: Check for duplicate values
    - Null threshold: Verify null values don't exceed threshold

    Examples:
        xl validate data.xlsx --columns id,name,email
        xl validate data.csv --types "age:int,salary:float"
        xl validate data.xlsx --range "age:0:120"
        xl validate data.csv --unique id --null-threshold 0.1
        xl validate data.xlsx --columns id --types "id:int" --unique id --verbose
    """
    # 1. Read file
    df = read_data_file(file_path, sheet)

    # 2. Build validation rules
    rules = []

    # Column existence rule
    if columns:
        col_list = [c.strip() for c in columns.split(",")]
        rules.append({
            "type": "column_exists",
            "columns": col_list
        })

    # Type validation rule
    if types:
        type_dict = {}
        for spec in types.split(","):
            col, col_type = spec.split(":")
            type_dict[col.strip()] = col_type.strip()
        rules.append({
            "type": "column_type",
            "column_types": type_dict
        })

    # Range validation rule
    if range or (min_value is not None or max_value is not None):
        if range:
            # Parse range spec "col:min:max"
            col_name, min_val, max_val = range.split(":")
            range_col = col_name.strip()
            range_min = float(min_val)
            range_max = float(max_val)
        else:
            # Use --min and --max options (need a column)
            if not columns:
                typer.echo("Error: Must specify --columns with --min/--max", err=True)
                raise typer.Exit(1)
            range_col = columns.split(",")[0].strip()
            range_min = min_value
            range_max = max_value

        rules.append({
            "type": "value_range",
            "column": range_col,
            "min": range_min,
            "max": range_max
        })

    # Uniqueness rule
    if unique:
        unique_cols = [c.strip() for c in unique.split(",")]
        rules.append({
            "type": "unique",
            "columns": unique_cols
        })

    # Null threshold rule
    if null_threshold is not None:
        cols_to_check = [c.strip() for c in columns.split(",")] if columns else None
        rules.append({
            "type": "null_threshold",
            "columns": cols_to_check,
            "threshold": null_threshold
        })

    # 3. Run validation
    result = validate_dataframe(df, rules)
    if is_err(result):
        error = unwrap_err(result)
        typer.echo(f"Validation error: {error}", err=True)
        raise typer.Exit(1)

    report: ValidationReport = unwrap(result)

    # 4. Display results
    _display_validation_report(report, verbose)

    # 5. Exit with error if failures
    if report.failed > 0 and fail_fast:
        raise typer.Exit(1)

def _display_validation_report(report: ValidationReport, verbose: bool) -> None:
    \"\"\"Display validation report in user-friendly format.

    Args:
        report: Validation report from validate_dataframe
        verbose: Whether to show detailed warnings
    \"\"\"
    # Summary
    typer.echo(f"✅ Passed: {report.passed}")
    if report.failed > 0:
        typer.echo(f"❌ Failed: {report.failed}", err=True)
    else:
        typer.echo("❌ Failed: 0")

    typer.echo("")

    # Errors
    if report.errors:
        typer.echo("Errors:", err=True)
        for i, error in enumerate(report.errors, 1):
            rule_num = error.get("rule", "?")
            error_type = error.get("type", "unknown")
            error_msg = error.get("error", str(error))
            typer.echo(f"  {i}. Rule #{rule_num} ({error_type}): {error_msg}", err=True)
        typer.echo("")

    # Warnings (only if verbose)
    if report.warnings and verbose:
        typer.echo("Warnings:")
        for i, warning in enumerate(report.warnings, 1):
            col = warning.get("column", "?")
            null_count = warning.get("null_count", 0)
            null_percent = warning.get("null_percent", 0.0)
            typer.echo(f"  {i}. Column '{col}': {null_count} nulls ({null_percent:.1%})")
        typer.echo("")

# Create CLI app
app = typer.Typer(help="Validate data quality")
app.command()(validate)

Reduction: 497 → 100 lines (80% reduction)

Success Metrics

Quantitative Metrics

Metric	Current	Target	How to Measure
Total lines of code	5,915	~2,300	`wc -l excel_toolkit/commands/*.py`
Average file size	257 lines	<100 lines	Average across all commands
Largest file	497 lines (validate.py)	<110 lines	Max file size
Code duplication	~800 lines	<100 lines	Duplicate pattern analysis
Test coverage	~40%	>80%	pytest-cov
Operation usage	0%	100%	All commands use operations
Commands refactored	0/23	23/23	Count completed commands

Qualitative Metrics

✅ Separation of Concerns

Commands handle CLI only
Operations handle business logic
Clear boundaries

✅ Testability

Operations tested independently (441 tests)
Commands tested with mocked operations
Integration tests for workflows

✅ Maintainability

Changes in one place
Clear code ownership
Easy to understand

✅ Reusability

Operations can be imported
No CLI dependencies in operations
External package friendly

✅ Consistency

Same error handling patterns
Same Result type usage
Same code structure

Performance Metrics

Operation	Before (avg)	After (target)	Acceptance
Filter command	100ms	<105ms	<5% overhead
Sort command	150ms	<155ms	<5% overhead
Pivot command	200ms	<210ms	<5% overhead
Validation command	50ms	<52ms	<5% overhead

Benchmark Strategy:

Use large test files (10K+ rows)
Measure wall-clock time
Compare before/after refactoring
Acceptable overhead: <5%

Summary

What We'll Achieve

Clean Architecture: Complete separation of CLI and business logic
Reduced Complexity: 60% code reduction (~3,600 lines removed)
Improved Testability: 441 operation tests + ~150 integration tests
Better Maintainability: Changes in one place
Enhanced Reusability: Operations available for external use

Effort Estimate

Total Time: 15-20 hours
Phases: 3 (High-Impact, Medium-Impact, Low-Impact)
Commands: 23
New Operations: 1 (statistics.py)
New Tests: ~180 (integration + e2e)

Risk Level: MEDIUM

Main Risks:

Breaking changes (mitigated by regression tests)
Performance degradation (mitigated by benchmarking)
Test failures (mitigated by test-first approach)

Confidence Level: HIGH (85%)

Reasons for Confidence:

Operations already battle-tested (441 tests)
Clear refactoring patterns defined
Comprehensive testing strategy
Incremental approach (one command at a time)

Next Steps

Review this specification with team
Set up feature branch: feature/command-refactoring
Begin with helper functions (Day 1)
Start high-impact commands (Day 2)
Track progress in ROADMAP.md

Appendix

A. Command Priority Matrix

HIGH PRIORITY (Operations Available):
1. validate.py   - 497 lines → ~100 lines (80% reduction) ⭐⭐⭐
2. filter.py     - 314 lines → ~60 lines  (81% reduction) ⭐⭐⭐
3. compare.py    - 323 lines → ~95 lines  (71% reduction) ⭐⭐⭐
4. clean.py      - 264 lines → ~95 lines  (64% reduction) ⭐⭐⭐
5. transform.py  - 228 lines → ~90 lines  (61% reduction) ⭐⭐
6. fill.py       - 230 lines → ~90 lines  (61% reduction) ⭐⭐
7. pivot.py      - 219 lines → ~90 lines  (59% reduction) ⭐⭐
8. aggregate.py  - 210 lines → ~90 lines  (57% reduction) ⭐⭐
9. sort.py       - 214 lines → ~70 lines  (67% reduction) ⭐⭐
10. group.py     - 226 lines → ~90 lines  (60% reduction) ⭐⭐
11. join.py      - 224 lines → ~95 lines  (58% reduction) ⭐⭐

MEDIUM PRIORITY (Need Work):
12. stats.py     - 401 lines → ~100 lines (75% reduction) - NEW OP NEEDED ⭐⭐
13. dedupe.py    - 181 lines → ~90 lines  (50% reduction) ⭐
14. append.py    - 185 lines → ~85 lines  (54% reduction) ⭐
15. strip.py     - 148 lines → ~80 lines  (47% reduction) ⭐
16. search.py    - 186 lines → ~100 lines (46% reduction) ⭐
17. info.py      - 204 lines → ~120 lines (41% reduction) ⭐

LOW PRIORITY (Simple):
18. select.py    - 240 lines → ~70 lines  (71% reduction)
19. rename.py    - 171 lines → ~70 lines  (59% reduction)
20. count.py     - 164 lines → ~70 lines  (57% reduction)
21. export.py    - 153 lines → ~70 lines  (54% reduction)
22. tail.py      - 156 lines → ~70 lines  (55% reduction)
23. head.py      - 148 lines → ~70 lines  (53% reduction)
24. unique.py    - 155 lines → ~70 lines  (55% reduction)
25. convert.py   - 107 lines → ~60 lines  (44% reduction)

B. Quick Reference: Operation Imports

# Filtering
from excel_toolkit.operations.filtering import (
    validate_condition,
    normalize_condition,
    apply_filter,
)

# Sorting
from excel_toolkit.operations.sorting import (
    validate_sort_columns,
    sort_dataframe,
)

# Pivoting
from excel_toolkit.operations.pivoting import (
    validate_aggregation_function,
    validate_pivot_columns,
    parse_fill_value,
    create_pivot_table,
)

# Aggregating
from excel_toolkit.operations.aggregating import (
    parse_aggregation_specs,
    validate_aggregation_columns,
    aggregate_groups,
)

# Comparing
from excel_toolkit.operations.comparing import (
    validate_key_columns,
    compare_dataframes,
)

# Cleaning
from excel_toolkit.operations.cleaning import (
    trim_whitespace,
    remove_duplicates,
    fill_missing_values,
    standardize_columns,
    clean_dataframe,
)

# Transforming
from excel_toolkit.operations.transforming import (
    apply_expression,
    cast_columns,
    transform_column,
)

# Joining
from excel_toolkit.operations.joining import (
    validate_join_columns,
    join_dataframes,
    merge_dataframes,
)

# Validation
from excel_toolkit.operations.validation import (
    validate_column_exists,
    validate_column_type,
    validate_value_range,
    check_null_values,
    validate_unique,
    validate_dataframe,
)

C. Error Type Reference

# All errors inherit from excel_toolkit.models.error_types
# Use isinstance() to check error types:

from excel_toolkit.fp import is_err, unwrap_err
from excel_toolkit.models.error_types import (
    ColumnNotFoundError,
    TypeMismatchError,
    ValueOutOfRangeError,
    # ... etc
)

if is_err(result):
    error = unwrap_err(result)
    if isinstance(error, ColumnNotFoundError):
        # Handle missing column
    elif isinstance(error, TypeMismatchError):
        # Handle type mismatch
    # ... etc

Document Status: ✅ COMPLETE Next Action: Review and approve specification Owner: Development Team Last Updated: 2026-01-16

FilesExpand file tree

PHASE_3_DETAILED_SPEC.md

Latest commit

History

PHASE_3_DETAILED_SPEC.md

File metadata and controls

Phase 3: Command Refactoring - Detailed Specification

Table of Contents

Executive Summary

Objective

Current Situation

Target State

Business Value

Current State Analysis

Code Metrics by Command

Code Duplication Analysis

Pattern 1: File Reading Logic (Present in ~20 commands)

Pattern 2: Filter Validation Logic (Present in filter.py, sort.py, group.py, etc.)

Pattern 3: Display/Output Logic (Present in all commands)

Pattern 4: Error Handling (Present in all commands)

Architecture Issues

Issue 1: Mixed Concerns

Issue 2: Tight Coupling

Issue 3: Testing Challenges

Issue 4: Error Handling Inconsistency

Operation-to-Command Mapping

Phase 1: Core Operations (HIGH Priority)

1. filter.py → operations/filtering.py

2. sort.py → operations/sorting.py

3. pivot.py → operations/pivoting.py

4. aggregate.py → operations/aggregating.py

5. compare.py → operations/comparing.py

Phase 2: Support Operations (HIGH Priority)

6. clean.py → operations/cleaning.py

7. transform.py → operations/transforming.py

8. fill.py → operations/cleaning.py

9. join.py → operations/joining.py

10. dedupe.py → operations/cleaning.py

11. strip.py → operations/cleaning.py

12. append.py → operations/joining.py

13. validate.py → operations/validation.py

Phase 3: Simple Operations (MEDIUM Priority)

Commands That Don't Need New Operations

Commands Needing New Operations

stats.py (401 lines) - MEDIUM Priority

Refactoring Strategy

Three-Phase Approach

Phase 3.1: High-Impact Commands (HIGH Priority)

Phase 3.2: Medium-Impact Commands (MEDIUM Priority)

Phase 3.3: Low-Impact Commands (LOW Priority)

Refactoring Principles

1. Single Responsibility Principle

2. Don't Repeat Yourself (DRY)

3. Explicit Error Handling

4. Type Safety

5. Backward Compatibility

Refactoring Patterns

Pattern 1: Basic Command Refactoring

Pattern 2: Command with Multiple Operations

Pattern 3: Command with Complex Error Handling

Pattern 4: Command Helper Functions

Implementation Roadmap

Week 1: High-Impact Commands

Day 1: Setup & Foundation

Day 2-3: Filter & Sort Commands

Day 4-5: Pivot, Aggregate, Compare Commands

Week 2: Support Operations

Day 6-7: Clean & Transform Commands

Day 8: Fill, Join, Dedupe Commands

Day 9: Strip, Append, Validate Commands

Week 3: Final Commands & Polish

Day 10-11: Stats Command (New Operation Needed)

Day 12: Simple Commands

Day 13-14: Testing & Documentation

Day 15: Code Review & Merge

Testing Strategy

Testing Levels

Level 1: Operation Tests (Already Complete ✅)

Level 2: Integration Tests (New)

Level 3: End-to-End Tests (New)