Git Interface Performance #1

@ThomasBrierley

Description

Performance is currently dominated by the git interface. Pegit queries git repos through git CLI plumbing commands rather than manipulating them directly, because that is the safest and simplest approach. In moderately sized repos this can add hundreds of ms to a single Pegit command invocation on Linux, and tens of thousands of ms on Windows Subsystem for Linux.

This makes Pegit command responsiveness less than ideal on Linux (queries ought to feel perceptibly instant), and almost unusable on Windows.

Git commands are called via child_process.execSync, which spawns a new process for each command. Each Pegit command issues roughly treePaths * (4 + treeRefs * 4) calls to git; depending on the repo this can mean anywhere from tens to thousands of calls. Almost all of the time is spent setting up child processes and the shell; the git commands themselves execute in negligible time.
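To make the scaling concrete, here is the formula above with hypothetical figures (treePaths and treeRefs vary per repo; 50 and 3 are illustrative, not measured values):

```javascript
// Hypothetical repo shape; both values vary per repo.
const treePaths = 50;
const treeRefs = 3;

// Rough git call count per Pegit invocation, per the formula above.
const gitCalls = treePaths * (4 + treeRefs * 4);
console.log(gitCalls); // → 800
```

At even a few ms of spawn overhead per call, 800 calls comfortably reaches the hundreds-of-ms range described above.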

Observations

Dominated by child process overhead

For large numbers of git commands, the overhead of setting up a child process in nodejs is the dominant factor. This is easy to demonstrate by comparing a loop of 1k execSync('echo') calls against a single execSync running 1k chained echos ('echo && echo && …'). The execution time of the git commands themselves is likewise negligible.

Synchronous calls not significant factor

All git commands are called synchronously, even though some portion could be made asynchronous with some added complexity. Because the dominant factor is child process overhead, this doesn't actually help; in fact async appears to incur greater overhead in my tests.

Possible Solutions

Nodejs bindings for git

This is the conceptually ideal solution: a solid and reliable library like libgit2 should be used, dispensing with the git CLI altogether. Unfortunately all of the current npm offerings for nodejs bindings are in various states of disarray. The official nodegit doesn't even install cleanly; frankly I don't trust any of them enough to depend on right now, even if I could get them working.

Bundle commands

This is the conceptually worst solution. Independent commands could be bundled into a dynamically generated shell script, with a JS interface that separates the stdout of each command and invokes a callback for each. This wouldn't cover all commands (some depend on JS logic running between them), and it may make errors hard to handle properly; overall it just feels a bit hacky.

Persistent process

This would be a decent drop-in with minimal refactoring necessary, or none if commands can remain synchronous. If a persistent process with a persistent shell can be created, all commands can be fed to it one at a time (the security implications of a shared shell are not an issue for this use case). The one example I can find of this is stateful-process-command-proxy, although it has an excessively layered source (to be kind) and I'd prefer to extract the meat of the implementation rather than depend on it.

Metadata

Labels: enhancement (New feature or request)
