Fix critical bugs, optimize performance, add test suite #1

Open
ghostyappzeta[bot] wants to merge 3 commits into master from feat/huffman-gaps-and-optimizations

ghostyappzeta[bot] commented Mar 5, 2026

This pull request was generated by @kiro-agent 👻

Comment with /kiro fix to address specific feedback or /kiro all to address everything.

Summary

A thorough exploration of the Huffman Coding codebase identified several gaps and optimization opportunities. This PR addresses them across three areas:

1. Critical Bug Fixes

  • Decoder padding corruption: The encoder now writes the total symbol count as the first line of code_table.txt; the decoder stops after decoding exactly that many symbols instead of emitting garbage from the padding bits
  • Single-symbol crash: All three build_tree_* methods detect single-symbol input and wrap the leaf in an internal node
  • Empty input crash: Both Encoder and Gen_huffman_code now detect empty input and return early with a friendly message
  • Static count accumulation: count is now reset to 0 at the start of build_freq_table()
  • Resource leaks: All BufferedReader instances wrapped in try-with-resources
  • Input validation: NumberFormatException caught with line-number context; negative integers detected and reported
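
The padding fix above can be illustrated with a minimal sketch. It is not the PR's code: the class name `CountBoundedDecoder`, the in-memory code map, and the use of a `'0'`/`'1'` string in place of the real binary file are all assumptions made to keep the example self-contained. The point is only that decoding is bounded by the stored symbol count, so trailing padding bits are never interpreted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and data layout are illustrative, not taken from the PR.
final class CountBoundedDecoder {

    // Decodes at most symbolCount symbols from a string of '0'/'1' characters.
    // Bits left over after the last symbol are treated as padding and ignored.
    static List<Integer> decode(String bits, Map<String, Integer> codeToSymbol, int symbolCount) {
        List<Integer> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < bits.length() && out.size() < symbolCount; i++) {
            current.append(bits.charAt(i));
            Integer symbol = codeToSymbol.get(current.toString());
            if (symbol != null) {          // a complete code word has been read
                out.add(symbol);
                current.setLength(0);      // start collecting the next code word
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> table = new HashMap<>();
        table.put("0", 7);
        table.put("10", 3);
        table.put("11", 9);
        // Three real symbols occupy 5 bits; three zero bits pad the byte.
        String bits = "01011" + "000";
        System.out.println(decode(bits, table, 3));
    }
}
```

Without the `out.size() < symbolCount` bound, the three padding bits in this example would each match the code `"0"` and be decoded as extra symbols.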

2. Performance and Code Quality

  • Wasteful allocations removed: Eliminated new Node(0,-1) temp variables that were immediately overwritten in all heap swap/del_min operations
  • Recursive to iterative: manage_heap_upwards and manage_heap_downwards in BinaryHeap and D_aryHeap converted from tail recursion to while loops (prevents StackOverflowError on large heaps)
  • Comparison operators: Changed >= to > in manage_heap_upwards to avoid unnecessary swaps on equal frequencies
  • String concatenation optimized: gen_codes() and fillNwrite_code_table() now use StringBuilder with backtracking instead of code+"0" / code+"1"
  • Dead code removal: Removed stub() methods and commented-out debug code
  • Java naming conventions: Renamed encoder to Encoder, decoder to Decoder
  • Encapsulation: heap_size made private with getSize() accessor
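
As a rough illustration of the recursive-to-iterative change and the strict comparison, here is a sketch of an array-backed min-heap. It assumes plain `int` keys and a hypothetical class name `MinHeapSketch`; the PR's BinaryHeap and D_aryHeap store Node objects and use different method names, so treat this as an outline of the technique only.

```java
import java.util.Arrays;

// Hypothetical sketch of an array-backed binary min-heap of int keys.
final class MinHeapSketch {
    private int[] keys = new int[16];
    private int size = 0;              // private, exposed through getSize()

    int getSize() { return size; }

    void insert(int key) {
        if (size == keys.length) keys = Arrays.copyOf(keys, size * 2);
        keys[size] = key;
        siftUp(size);
        size++;
    }

    int delMin() {
        if (size == 0) throw new IllegalStateException("heap is empty");
        int min = keys[0];
        size--;
        keys[0] = keys[size];          // move the last key to the root
        siftDown(0);
        return min;
    }

    // Iterative upward fix: a while loop instead of a tail-recursive call.
    // Strict '>' leaves a child in place when it equals its parent.
    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (!(keys[parent] > keys[i])) break;
            swap(parent, i);
            i = parent;
        }
    }

    // Iterative downward fix: swap with the smaller child until order holds.
    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < size && keys[left] < keys[smallest]) smallest = left;
            if (right < size && keys[right] < keys[smallest]) smallest = right;
            if (smallest == i) return;
            swap(i, smallest);
            i = smallest;
        }
    }

    // Swaps in place with a primitive temporary; no placeholder object is allocated.
    private void swap(int a, int b) {
        int tmp = keys[a];
        keys[a] = keys[b];
        keys[b] = tmp;
    }
}
```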

3. Automated Test Suite (41 JUnit 5 tests, previously zero)

| Test Class | Tests | Coverage |
|---|---|---|
| BinaryHeapTest | 8 | Insert/extract ordering, empty state, single element, equal frequencies, 100 random elements |
| D_aryHeapTest | 9 | Same patterns + d=2, d=4, d=8 variants |
| PairingHeapTest | 8 | Insert/extract ordering, all heap operations |
| HuffmanTreeTest | 8 | Prefix-free property (all 3 builders), frequency-code-length relationship, single/two-symbol trees |
| RoundtripTest | 4 | Full encode/decode pipeline for various inputs |
| EdgeCaseTest | 4 | Empty input, single line, large numbers, all-same-symbol |
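
To make the prefix-free and single-symbol properties concrete, the sketch below generates codes with a shared StringBuilder and backtracking, wraps a lone leaf so it receives the code "0", and checks that no code is a prefix of another. Everything here is hypothetical: the class `CodeGenSketch`, its `Node` type, and the choice to attach the lone leaf as a left child are assumptions, and the PR's actual tests use JUnit 5 rather than a hand-written check.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the PR's gen_codes() or its test classes.
final class CodeGenSketch {

    static final class Node {
        final int symbol;              // meaningful only for leaves
        final Node left, right;
        Node(int symbol) { this.symbol = symbol; this.left = null; this.right = null; }
        Node(Node left, Node right) { this.symbol = -1; this.left = left; this.right = right; }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Wraps a lone leaf in an internal node so the symbol gets the code "0"
    // (assumption: the leaf becomes the left child).
    static Node wrapIfSingleLeaf(Node root) {
        return root.isLeaf() ? new Node(root, null) : root;
    }

    static Map<Integer, String> genCodes(Node root) {
        Map<Integer, String> codes = new LinkedHashMap<>();
        walk(wrapIfSingleLeaf(root), new StringBuilder(), codes);
        return codes;
    }

    // One StringBuilder is shared by the whole traversal: a bit is appended
    // before descending and removed afterwards (backtracking).
    private static void walk(Node node, StringBuilder path, Map<Integer, String> codes) {
        if (node == null) return;
        if (node.isLeaf()) {
            codes.put(node.symbol, path.toString());
            return;
        }
        path.append('0');
        walk(node.left, path, codes);
        path.setCharAt(path.length() - 1, '1');
        walk(node.right, path, codes);
        path.setLength(path.length() - 1);
    }

    // Returns true if no code is a prefix of a code at a different position.
    // Duplicate codes also count as a violation.
    static boolean isPrefixFree(List<String> codes) {
        for (int i = 0; i < codes.size(); i++) {
            for (int j = 0; j < codes.size(); j++) {
                if (i != j && codes.get(j).startsWith(codes.get(i))) return false;
            }
        }
        return true;
    }
}
```

The pairwise check is quadratic in the number of codes, which is acceptable for a test helper over small alphabets.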

Testing

  • Build compiles cleanly with make
  • All 41 JUnit 5 tests pass
  • Encode/decode roundtrip verified for small, large (10M lines), single-symbol, two-symbol, and multi-symbol cases
  • Edge cases (empty file, non-integer input, negative numbers) produce user-friendly messages
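
The edge-case handling listed above could look roughly like the following sketch, which reads one non-negative integer per line inside try-with-resources and reports the offending line number on bad input. The class `InputValidationSketch`, the method `readSymbols`, the exception types, the message wording, and the decision to skip blank lines are all assumptions; the PR's build_freq_table() may behave differently.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and messages are illustrative, not taken from the PR.
final class InputValidationSketch {

    // Reads one non-negative integer per line. Throws IllegalArgumentException
    // naming the line number for non-integer or negative input.
    // An empty source yields an empty list, which the caller can report.
    static List<Integer> readSymbols(Reader source) {
        List<Integer> symbols = new ArrayList<>();
        // try-with-resources closes the reader even if validation throws.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                String trimmed = line.trim();
                if (trimmed.isEmpty()) continue;   // assumption: blank lines are skipped
                int value;
                try {
                    value = Integer.parseInt(trimmed);
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException(
                        "Line " + lineNumber + ": not an integer: '" + trimmed + "'", e);
                }
                if (value < 0) {
                    throw new IllegalArgumentException(
                        "Line " + lineNumber + ": negative value " + value);
                }
                symbols.add(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return symbols;
    }
}
```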

Files Changed

16 files changed, 1,060 insertions, 232 deletions

kiro-agent and others added 3 commits March 5, 2026 23:22
- Fix decoder padding bit corruption: encoder writes total symbol count
  as first line of code_table.txt, decoder stops after that many symbols
- Fix single-symbol input crash: wrap lone leaf in internal node so
  gen_codes produces code '0', handle in encoder fillNwrite_code_table
- Fix BitSet.toByteArray() returning empty array for all-zero bits
- Fix empty input crash: guard against empty freq_data before
  Collections.max() call, return early with friendly message
- Fix static count accumulation: reset count=0 at start of build_freq_table
- Fix resource leaks: use try-with-resources for BufferedReader instances,
  remove @SuppressWarnings annotations
- Add input validation: catch NumberFormatException with line number context,
  reject negative values with helpful error message

Co-authored-by: Vishal Gupta <vishal.gupta4081@gmail.com>

- Remove unnecessary Node/PairNode allocations in heap swap operations
- Convert recursive manage_heap_upwards/downwards to iterative while loops
- Fix comparison operators (>= to >) in heap upwards to avoid unnecessary swaps
- Optimize Huffman code generation: StringBuilder with backtracking instead of String concatenation
- Remove dead code: stub() methods and commented-out debug prints
- Rename encoder/decoder classes to Encoder/Decoder (Java naming conventions)
- Make heap_size private with getSize() accessor in BinaryHeap and D_aryHeap

Co-authored-by: Vishal Gupta <vishal.gupta4081@gmail.com>

- Set up JUnit 5 test infrastructure with standalone console launcher JAR
- Add BinaryHeap tests (8 tests): insert/del_min ordering, is_empty, single
  element, equal frequencies, get_root, del_min on empty, many elements
- Add D_aryHeap tests (9 tests): 4-way, 2-way, 8-way variants plus same
  patterns as BinaryHeap
- Add PairingHeap tests (8 tests): same patterns as other heaps
- Add HuffmanTreeTest (8 tests): prefix-free property for all 3 heap
  builders, frequency-code-length relationship, single/two symbol trees,
  build_freq_table correctness
- Add RoundtripTest (4 tests): full encode/decode pipeline verification
  for small input, two symbols, single symbol, many symbols
- Add EdgeCaseTest (4 tests): empty input, single line, large numbers,
  all same symbol
- Update Makefile with test target and .gitignore with lib/ exclusion
- 41 total test methods, all passing

Co-authored-by: Vishal Gupta <vishal.gupta4081@gmail.com>