Why use PyPDF2 to split the PDF first? #3

@bryanyzhu

Hi, thanks for the code, it works perfectly! I have a quick question: why use PyPDF2 to split the PDF first? I think pdfminer can handle multi-page PDFs and extract their content directly. Won't the additional dependency, plus the PDF splitting, PDF writing, and text merging steps, make the pipeline more complicated? I'm curious to understand the design. Thank you.
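
To make the question concrete, here is a minimal sketch of the single-pass approach I have in mind. It assumes the pdfminer.six fork and its `extract_pages` helper; `extract_text_per_page` and `join_pages` are hypothetical names for illustration, not functions from this repository.

```python
from typing import List


def extract_text_per_page(pdf_path: str) -> List[str]:
    """Return one text string per page, reading the PDF in a single pass."""
    # Assumption: pdfminer.six is installed. Imported lazily so the helper
    # below can be used without it.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    pages = []
    for page_layout in extract_pages(pdf_path):
        # Keep only layout elements that carry text (text boxes and lines).
        chunks = [
            element.get_text()
            for element in page_layout
            if isinstance(element, LTTextContainer)
        ]
        pages.append("".join(chunks))
    return pages


def join_pages(pages: List[str], separator: str = "\f") -> str:
    """Merge per-page text into one string, trimming whitespace on each page."""
    return separator.join(page.strip() for page in pages)
```

With something like this, `join_pages(extract_text_per_page("paper.pdf"))` would still keep page boundaries (as form feeds) without writing intermediate single-page PDFs. There may be a reason for the split that this sketch does not cover, which is what I'd like to understand.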
