refactor: speed up docstring checker #45009

Merged
tarekziade merged 5 commits into main from speedup-check-docstrings-2 on Mar 27, 2026

@tarekziade
Collaborator

What does this PR do?

This patch speeds up the docstring checker by removing redundant AST walks and adding a cache.

For the AST changes, check_docstrings.py --check_all runs about 2.3x faster on my M1:

  • before: 29.3s
  • after: 12.6s

@tarekziade tarekziade requested a review from ydshieh March 26, 2026 05:31
@tarekziade tarekziade self-assigned this Mar 26, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ydshieh ydshieh left a comment

I have a few (rather minor) questions and suggestions.

But overall LGTM and cache is always nice!

  # First, identify processor classes to track method context (only top-level classes)
  processor_classes: set[str] = set()
- for node in ast.walk(tree):
+ for node in tree.body:
Collaborator

Is this change because of the "(only top-level classes)" part?

Collaborator

(maybe explain why only top-level now, in PR page or within code)

Collaborator Author

Yes, we narrow the AST walk to module-level model classes.
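
To illustrate the difference between the two loops, here is a small standalone sketch (the sample source and class names are made up for the example and are not from the repository). `ast.walk` yields every node in the tree, including nested and function-local classes, while `tree.body` contains only the module's top-level statements, so iterating over it is both cheaper and limited to module-level classes.

```python
import ast

# Made-up sample module with one top-level class and two nested ones.
SOURCE = '''
class FooProcessor:
    def __call__(self):
        class Inner:
            pass

def helper():
    class LocalProcessor:
        pass
'''

tree = ast.parse(SOURCE)

# ast.walk visits every node, so nested and function-local classes are found too.
all_classes = sorted(
    node.name for node in ast.walk(tree) if isinstance(node, ast.ClassDef)
)

# tree.body holds only the module's top-level statements.
top_level_classes = [
    node.name for node in tree.body if isinstance(node, ast.ClassDef)
]

print(all_classes)        # ['FooProcessor', 'Inner', 'LocalProcessor']
print(top_level_classes)  # ['FooProcessor']
```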

Collaborator

@ydshieh ydshieh left a comment

🚀

Go go go

@tarekziade tarekziade added this pull request to the merge queue Mar 27, 2026
Merged via the queue into main with commit 23773e7 Mar 27, 2026
20 checks passed
@tarekziade tarekziade deleted the speedup-check-docstrings-2 branch March 27, 2026 07:21
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Mar 27, 2026
* speed up docstring checker

* add docstrings and improve test readability

* fmt

* refactor cache so it's owned by a single function and the flow is clearer

* adopt test_fetcher style for cache