WHO-BCN Bulk-Load sheet generation scripts#2

Open
nshandra wants to merge 55 commits into development from who_bcn_data_scripts


@nshandra nshandra commented Jun 6, 2023

📌 References

📝 Implementation

  • Imported the existing script from the previous repository
  • Added currency adjustment option to make_quantitative_bulk_load_file.py
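For reviewers, a minimal sketch of what a currency adjustment step can look like. This is not the code in this PR: the function name `adjust_currency` and its parameters `rate` and `places` are hypothetical, and the real option in make_quantitative_bulk_load_file.py may work differently.

```python
from decimal import Decimal, ROUND_HALF_UP


def adjust_currency(values, rate, places=2):
    """Multiply each monetary value by a conversion rate and round it.

    All names here are illustrative assumptions, not the script's API.
    """
    factor = Decimal(str(rate))
    # Smallest unit to round to, e.g. places=2 gives Decimal("0.01").
    quantum = Decimal(1).scaleb(-places)
    return [
        (Decimal(str(value)) * factor).quantize(quantum, rounding=ROUND_HALF_UP)
        for value in values
    ]
```

Going through `Decimal(str(...))` avoids binary floating-point artifacts when the values come from CSV text.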

@nshandra nshandra requested review from ifoche and saragilcas June 6, 2023 10:27
@nshandra nshandra self-assigned this Jun 6, 2023

reviewpad bot commented Jun 6, 2023

Thank you @nshandra for this first contribution!

@reviewpad reviewpad bot added the large Pull request is large label Jun 6, 2023

reviewpad bot commented Jun 7, 2023

AI-Generated Summary: This pull request introduces two new scripts: make_quantitative_bulk_load_file.py for processing CSV files from the "Data Extraction Tool" into "Bulk Load" Excel files (XLSX), and make_qualitative_bulk_load_file.py for automating the conversion of data from DOCX files into "Bulk Load" XLSX files used for qualitative data reviews. Both scripts accept various command-line options, including template files and debugging flags.

Additionally, new requirements.txt files are added to both the quantitative_data_script and qualitative_data_script directories to manage dependencies, .vscode is added to the .gitignore file, and the README.md file is significantly updated with installation instructions, examples, and script descriptions.
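The summary above mentions command-line options for template files and debugging flags. A rough sketch of such an interface with `argparse` follows; the option names `--template` and `--debug` and the positional `csv_file` are assumptions for illustration and may not match the flags the scripts define.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical CSV-to-Bulk-Load command line."""
    # Every argument name below is an assumption, not the script's real CLI.
    parser = argparse.ArgumentParser(
        description="Turn a Data Extraction Tool CSV into a Bulk Load XLSX."
    )
    parser.add_argument("csv_file", help="input CSV file")
    parser.add_argument("--template", help="Bulk Load XLSX template to fill in")
    parser.add_argument("--debug", action="store_true", help="verbose logging")
    return parser
```

Calling `build_parser().parse_args()` in the script's entry point would then read the options from the command line.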


reviewpad bot commented Sep 11, 2023

AI-Generated Summary: This pull request introduces various changes across different aspects of the project:

  1. A new "requirements.txt" file has been added to each of the directories "WHO-BCN-data_scripts/qualitative_data_script" and "WHO-BCN-data_scripts/quantitative_data_script". These files contain the version specifications for the required Python packages.

  2. The README.md file has been updated to provide a comprehensive overview of the 'Bulk Load Pytools' Python tools. It now explains the installation of dependencies, as well as the usage and examples of the 'make_quantitative_bulk_load_file.py' and 'make_qualitative_bulk_load_file.py' scripts.

  3. A new Python script 'make_qualitative_bulk_load_file.py' has been introduced. It turns DOCX files into "Bulk Load" XLSX files and includes error checking, debugging, and logging features.

  4. The script that processes CSV files produced by the "Data Extraction Tool" and loads the data into a Bulk Load XLSX file has been significantly updated. It includes exception handling and logging mechanisms.

  5. Lastly, the ".gitignore" file has been updated to ignore the ".vscode" directory, ensuring that local VS Code settings are not tracked or shared via the repository.


reviewpad bot commented Sep 19, 2023

AI-Generated Summary: This pull request includes significant updates to the project documentation and introduces several Python scripts, along with other auxiliary files. The README.md file was updated with comprehensive documentation that outlines the installation and usage instructions for the scripts.

Four new Python scripts have been introduced, each with its unique functionality:

  • Two of these scripts are used for processing CSV and DOCX files, respectively, converting them into a specific format (.xlsx files in this case). They employ multiple standard and third-party Python libraries for their operations. Both scripts handle command-line arguments that define the processing parameters. These scripts are accompanied by two new requirements.txt files located in their respective directories, specifying the Python package dependencies.
  • The other two scripts seem to be used for data extraction, conversion, matching, and writing.

The .gitignore file has been updated to exclude '.vscode' directory, which means that the changes in Visual Studio Code settings won't be tracked or committed anymore.

Given the size and complexity of the newly created files, it's recommended to conduct a comprehensive code review, which may require several iterations and potentially some testing.


reviewpad bot commented Sep 21, 2023

AI-Generated Summary: This pull request introduces several significant changes involving the addition of new scripts and updates to project documentation and configuration. The added scripts include make_quantitative_bulk_load_file.py and make_qualitative_bulk_load_file.py, both of which are responsible for handling and processing data from various file types. The quantitative script processes CSV files from a Data Extraction Tool into "Bulk Load" XLSX files, adjusting values based on command-line flags and utilizing online currency conversion. The qualitative script, on the other hand, processes DOCX files into Excel files, with functions for extracting specific data structures and handling command-line arguments.

Two new requirements.txt files were added in the 'WHO-BCN-data_scripts/quantitative_data_script' and 'WHO-BCN-data_scripts/qualitative_data_script' directories, specifying dependencies such as 'openpyxl', 'pandas', and 'python_docx' at precise versions. The .gitignore file was updated to ignore editor-specific settings from VS Code. Additionally, the README.md file was substantially updated with detailed descriptions of the newly added scripts and installation instructions.

…dataElement match (to deal with double spaces and such)
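One common way to make a name match tolerant of double spaces is to normalize whitespace on both sides before comparing. The sketch below illustrates that idea only; `normalize_name` and `find_data_element` are hypothetical helpers, and the commit may implement the match differently.

```python
import re


def normalize_name(name):
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", name).strip()


def find_data_element(name, known_names):
    """Return the known name that equals `name` after normalization, else None."""
    lookup = {normalize_name(known): known for known in known_names}
    return lookup.get(normalize_name(name))
```

Returning the original spelling from `known_names` keeps the output consistent with the template even when the input is untidy.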

reviewpad bot commented Sep 25, 2023

AI-Generated Summary: This pull request introduces several changes primarily focused on data transformation and dependency management.

A new Python file (make_quantitative_bulk_load_file.py) has been added to the WHO-BCN-data_scripts/quantitative_data_script/ directory. This script automates the complex process of transforming CSV data into an XLSX file based on certain rules and adjustments.

The .gitignore file now ignores the .vscode directory, so user-specific VS Code settings are not tracked.

Additionally, new requirements.txt files were created in the quantitative_data_script and qualitative_data_script directories. These files specify the Python package dependencies for the project.

The README.md documentation has been significantly enhanced with comprehensive details and instructions for using the new Python scripts, installation process, and usage guidance involving various arguments and options.

Finally, a Python script for processing DOCX files into XLSX format was added. This script extracts various data from a given DOCX file and writes it into an Excel sheet. Helper functions are utilized throughout the data extraction and writing process, enhancing the script's efficiency and functionality. Command-line arguments, error handling, and logging have also been addressed in the script.

Overall, the pull request significantly enhances data management, extraction, and transformation with new scripts and appropriate changes.

If quintile is Total and no service use default combo.
Add a list of DEs ignoring the quintile to determine the combo.
Added a count message detailing the number of entries matched from the CSV and the XLSX.
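The three commit messages above can be read as the following rules. This is a sketch under assumptions: the names `DEFAULT_COMBO`, `IGNORE_QUINTILE_DES`, `pick_combo`, and `count_message`, the combo label format, and the example data element are all invented for illustration and are not taken from the script.

```python
DEFAULT_COMBO = "default"
# Hypothetical list of data elements whose combo does not depend on the quintile.
IGNORE_QUINTILE_DES = {"Example DE"}


def pick_combo(data_element, quintile, service=None):
    """Choose a combo label for one CSV entry (assumed label format)."""
    if data_element in IGNORE_QUINTILE_DES:
        # The quintile is ignored for these data elements.
        return service or DEFAULT_COMBO
    if quintile == "Total" and not service:
        # Total quintile with no service falls back to the default combo.
        return DEFAULT_COMBO
    return ", ".join(part for part in (quintile, service) if part)


def count_message(csv_entries, matched_entries):
    """Summarize how many CSV entries were matched in the XLSX."""
    return f"Matched {matched_entries} of {csv_entries} CSV entries in the XLSX."
```

A count like this makes it easy to spot a run where many entries silently failed to match.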

reviewpad bot commented Sep 25, 2023

AI-Generated Summary: This pull request includes several updates and additions related to the 'Bulk Load Pytools' project. New requirements.txt files have been added in both the WHO-BCN-data_scripts/qualitative_data_script and WHO-BCN-data_scripts/quantitative_data_script directories, specifying dependencies for the newly added Python scripts. The .gitignore file has been updated to exclude the Visual Studio Code settings folder. Two new Python scripts, make_quantitative_bulk_load_file.py and make_qualitative_bulk_load_file.py, have been added. These scripts process CSV files and DOCX files, respectively, into bulk load XLSX files and handle various special cases. The README.md file has been substantially updated and now covers installation instructions, usage of the new scripts, and debugging options.


reviewpad bot commented Dec 7, 2023

AI-Generated Summary: This pull request introduces a python script that converts .docx files to an Excel workbook, designed specifically for health care policy documentation conversion. The script parses the input arguments, verifies the files, extracts necessary information, validates it, and writes it into the Excel workbook. Numerous helper functions are included for smoother operations, adhering to a defined .docx table format.

The diff also includes updates and additions to requirements.txt in the WHO-BCN-data_scripts/qualitative_data_script and WHO-BCN-data_scripts/quantitative_data_script directories, specifying necessary packages and versions like openpyxl, python_docx, and pandas.

Edits to '.gitignore' were made to prevent tracking changes for the .vscode directory, and a potential need for a newline character at the end of the file was flagged.

Lastly, significant expansions were made in the README file for the "Bulk Load Pytools" project, offering comprehensive execution instructions, improved header formatting, and additional resources.


reviewpad bot commented Dec 11, 2023

AI-Generated Summary: This pull request includes an update to the .gitignore file to exclude Visual Studio Code settings and introduces a "requirements.txt" file in the directories "WHO-BCN-data_scripts/qualitative_data_script" and "WHO-BCN-data_scripts/quantitative_data_script" to specify python package dependencies. There is also a new Python script named make_qualitative_bulk_load_file.py added to the directory WHO-BCN-data_scripts/qualitative_data_script/ for processing DOCX files into "Bulk Load" XLSX files. Substantial enhancements have been made to the project documentation in the README.md file, detailing the installation, usage instructions, and providing examples for the "Bulk Load Pytools" project. These changes also include correcting the project's title and referencing two python scripts (make_quantitative_bulk_load_file.py and make_qualitative_bulk_load_file.py). These scripts are integral to the project's utility as they process various file types into "Bulk Load" XLSX files. Lastly, this pull request involves the active use of helper functions and the main function in association with command-line arguments in the newly added python script, which ultimately contributes to writing data updates to the XLSX files.

@nshandra nshandra marked this pull request as ready for review April 8, 2024 09:52
@nshandra nshandra requested review from Ramon-Jimenez and removed request for saragilcas July 4, 2024 08:43
nshandra and others added 20 commits July 4, 2024 12:41
…heckbox DEs from id list to avoid incorrect assignment.
…e_update

Feature: Qualitative template update
…cators that need to duplicate the total value.
…tal_value

Store the total value for by consumption quintile indicators in 'Total' combo
…ter_values

Fix: Handle empty currency converter values
…ntry codes

Note: old non-standard entries kept to keep backwards compatibility
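For the "Handle empty currency converter values" fix listed above, a defensive parse along these lines is one possible approach. The helper name `parse_rate` is hypothetical, and the actual fix in the script may differ.

```python
def parse_rate(raw):
    """Return a conversion rate as a float, or None if no usable value was given."""
    if raw is None:
        return None
    text = str(raw).strip()
    if not text:
        # Empty or whitespace-only value from the converter.
        return None
    try:
        return float(text)
    except ValueError:
        # Non-numeric text is treated the same as a missing value.
        return None
```

Callers can then skip the adjustment or log a warning when the result is `None` instead of failing on the conversion.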
@nshandra nshandra force-pushed the who_bcn_data_scripts branch from e764207 to 6d72715 Compare February 24, 2026 09:24
2 participants