feat: add subdirectory .gitignore support for monorepos #144
Add Subdirectory .gitignore Support to RepoMapper and CodeSearcher
What problem(s) was I solving?
Kit-dev only loaded the root `.gitignore` file, completely ignoring subdirectory `.gitignore` files. This caused massive file count inflation on monorepos, leading to token overflow errors in MCP tools.
Concrete Example: The humanlayer repository has 13 `.gitignore` files total:
- Root `.gitignore` (no `node_modules` pattern)
- `humanlayer-wui/.gitignore` (contains `node_modules`)
- Other subdirectory `.gitignore` files
Because only the root `.gitignore` was loaded, all 205 `node_modules` directories (88,522 files) were included in results, causing token overflow errors in the `get_file_tree` MCP tool.
Related issues:
What user-facing changes did I ship?
RepoMapper & CodeSearcher Behavior
- `get_file_tree()` now respects ALL `.gitignore` files in the repository tree
- Pattern precedence is preserved (deeper `.gitignore` files override shallower ones)
Performance Impact
Backwards Compatibility
✅ Fully backwards compatible:
- Repositories with only a root `.gitignore`: Identical behavior
- Repositories with no `.gitignore`: Identical behavior
- Repositories with subdirectory `.gitignore` files: Now works correctly
How I implemented it
Phase 1: Update `_load_gitignore()` in RepoMapper
File: `src/kit/repo_mapper.py`
- Replace single-file `.gitignore` loading with recursive tree walking
- Use `os.walk()` to find all `.gitignore` files in the repository
- Skip the `.git` directory to avoid performance issues
- Sort `.gitignore` files by depth (deepest first) for correct precedence
Pattern Processing:
- Read each `.gitignore` file and process patterns line-by-line
- Skip comment lines (`#`)
- Root `.gitignore` patterns: use as-is
- Subdirectory patterns (including `/pattern` forms): make relative to repo root from subdirectory
Example: `frontend/.gitignore` containing `node_modules/` becomes `frontend/node_modules/` in the merged spec
- Use `pathspec.PathSpec` for efficient matching
- Return `None` if no `.gitignore` files exist (graceful degradation)
Phase 2: Update `_load_gitignore()` in CodeSearcher
File: `src/kit/code_searcher.py`
- Add `import logging` and `import os` for error handling and filesystem walking
- Reuse the same `.gitignore` loading logic
Phase 3: Comprehensive Testing
Unit Tests (`tests/test_gitignore.py`):
- `test_root_gitignore_only()`: Baseline behavior unchanged
- `test_subdirectory_gitignore()`: Subdirectory patterns respected
- `test_nested_gitignore_precedence()`: Negation patterns work correctly
- `test_multiple_subdirectory_gitignores()`: Multiple subdirs each with own `.gitignore`
- `test_no_gitignore_files()`: Graceful handling of repos without `.gitignore`
Integration Test (`tests/integration/test_humanlayer_repo.py`):
- File count checked against the `git ls-files` count
- No `node_modules` files included
How to verify it
I have ensured tests pass
Manual Testing
Test on small repository (baseline verification):
Test on large monorepo (fix verification):
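One way to check the fix on a large monorepo is to count how many returned entries still sit under a `node_modules` directory. The sketch below is illustrative only: `count_node_modules_entries` is a hypothetical helper, and it assumes each file-tree entry is a dict with a repo-relative `"path"` string, which may not match the actual shape returned by `get_file_tree()`. The sample paths are made up.

```python
from pathlib import PurePosixPath


def count_node_modules_entries(entries):
    """Count entries whose path contains a node_modules component.

    Assumption: each entry is a dict with a repo-relative "path" string.
    """
    count = 0
    for entry in entries:
        parts = PurePosixPath(entry["path"]).parts
        if "node_modules" in parts:
            count += 1
    return count


# Made-up sample data standing in for a real file tree.
sample_entries = [
    {"path": "README.md"},
    {"path": "humanlayer-wui/src/main.ts"},
    {"path": "humanlayer-wui/node_modules/react/index.js"},
]
leaked = count_node_modules_entries(sample_entries)
```

If subdirectory `.gitignore` files are respected, running the same count over the real file tree should presumably give zero.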
Expected results:
Test subdirectory patterns:
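A throwaway directory tree is enough to exercise subdirectory patterns without a real monorepo. The sketch below follows the steps listed under Phase 1 (walk the tree, skip `.git`, order files deepest first, prefix subdirectory patterns), but it is not the PR's `_load_gitignore()`: `collect_gitignore_patterns` and `demo` are hypothetical names, only the standard library is used, and the final step of building a `pathspec.PathSpec` from the merged patterns is left out. Keeping a leading `!` in front of the prefixed pattern is a choice made for this sketch.

```python
import os
import tempfile
from pathlib import Path


def collect_gitignore_patterns(repo_root):
    """Return merged ignore patterns, rewritten relative to repo_root."""
    root = Path(repo_root)
    gitignore_files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk never descends into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        if ".gitignore" in filenames:
            gitignore_files.append(Path(dirpath) / ".gitignore")

    # Deepest files first, as the description states; ties broken by path.
    gitignore_files.sort(
        key=lambda p: (-len(p.relative_to(root).parts), p.as_posix())
    )

    merged = []
    for gitignore in gitignore_files:
        rel_dir = gitignore.parent.relative_to(root).as_posix()
        for raw in gitignore.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            if rel_dir == ".":
                merged.append(line)  # root patterns are used as-is
                continue
            # Sketch-level choice: keep a leading "!" ahead of the prefix.
            negated = line.startswith("!")
            body = line[1:] if negated else line
            prefixed = f"{rel_dir}/{body.lstrip('/')}"
            merged.append(("!" if negated else "") + prefixed)
    return merged


def demo():
    """Build a tiny fake repo and return its merged patterns."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "frontend").mkdir()
        (root / ".gitignore").write_text("# build output\n*.log\n")
        (root / "frontend" / ".gitignore").write_text("node_modules/\n/dist\n")
        return collect_gitignore_patterns(root)
```

Calling `demo()` returns `["frontend/node_modules/", "frontend/dist", "*.log"]`, which matches the `frontend/node_modules/` rewrite given as the example in Phase 1.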
Test MCP integration (if kit-dev MCP server is available):
Description for the changelog
Fixed `.gitignore` handling to respect subdirectory `.gitignore` files (previously only root was loaded). RepoMapper and CodeSearcher now recursively load all `.gitignore` files with proper pattern precedence, eliminating token overflow on large monorepos with multiple `.gitignore` files.