Extraction and validation of Brazilian municipal data through the official IBGE API, integrated with internal databases for data quality assurance and demographic analysis at FGV IBRE.
FGV IBRE - Instituto Brasileiro de Economia
Made with ☕ by Isabel Cruz | in Google Colab | in Brazil | Data from IBGE
Medium: https://belgon.medium.com/
LinkedIn: http://www.linkedin.com/in/belcruz
This repository contains scripts and notebooks for:
- Extracting official municipal data via the IBGE public API
- Validating municipality identifiers (ID IBGE) against internal records
- Reconciling demographic data for analysis
- Generating data quality reports
- IBGE API: https://servicodados.ibge.gov.br/api/v1/localidades/
- Internal Database: Municipal reference tables with custom identifiers
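
To illustrate how the IBGE endpoint above might be queried, here is a minimal sketch. It assumes the `estados` and `estados/{id}/municipios` routes under that base URL and that each returned item carries `id`, `nome` and (for states) `sigla` fields. The function names, the `get_json` parameter and the output column names (`id_ibge`, `nome`, `uf`) are hypothetical and not taken from the repository's scripts.

```python
from typing import Callable, Dict, List

BASE_URL = "https://servicodados.ibge.gov.br/api/v1/localidades"


def http_get_json(url: str):
    """Default fetcher: GET the URL and decode the JSON body."""
    import requests  # imported lazily so the helper below can be used without it

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_municipalities(get_json: Callable[[str], list] = http_get_json) -> List[Dict]:
    """Return one row per municipality: IBGE id, name and state abbreviation."""
    rows = []
    states = get_json(f"{BASE_URL}/estados")
    # Sort by abbreviation so the output order is stable between runs.
    for state in sorted(states, key=lambda s: s["sigla"]):
        url = f"{BASE_URL}/estados/{state['id']}/municipios"
        for city in get_json(url):
            rows.append(
                {"id_ibge": city["id"], "nome": city["nome"], "uf": state["sigla"]}
            )
    return rows
```

Calling `fetch_municipalities()` with no arguments performs one request for the state list and one per state. Passing a custom `get_json` makes it possible to replay saved responses instead of hitting the network.
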
```
├── notebooks/ # Colab notebooks for data extraction
├── scripts/ # Python scripts for validation and processing
├── data/ # Input and output data files
├── reports/ # Generated validation reports (Excel)
```
- Automated data extraction from IBGE's public API (all 27 Brazilian states)
- Municipal name normalization and standardized matching
- ID reconciliation: internal identifiers vs. official IBGE codes
- Data quality metrics (match rates, missing values, collisions)
- Excel reports with validation summaries
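
The name normalization and ID reconciliation features listed above could look roughly like the sketch below. It is not the repository's implementation. It assumes that both the IBGE rows and the internal rows expose `id_ibge`, `nome` and `uf` keys (hypothetical names), that matching is done on state plus normalized name, and that a "collision" means one normalized key mapping to more than one IBGE code.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and turn punctuation into single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.upper())
    return " ".join(cleaned.split())


def build_index(ibge_rows):
    """Map (uf, normalized name) to the list of IBGE ids sharing that key."""
    index = defaultdict(list)
    for row in ibge_rows:
        index[(row["uf"], normalize_name(row["nome"]))].append(row["id_ibge"])
    return index


def reconcile(internal_rows, index):
    """Label each internal record as ok, mismatch, collision or not_found."""
    results = []
    for row in internal_rows:
        candidates = index.get((row["uf"], normalize_name(row["nome"])), [])
        if not candidates:
            status = "not_found"
        elif len(candidates) > 1:
            status = "collision"  # ambiguous key, needs manual review
        elif candidates[0] == row["id_ibge"]:
            status = "ok"
        else:
            status = "mismatch"  # name matches but the stored id differs
        official = candidates[0] if len(candidates) == 1 else None
        results.append({**row, "id_ibge_oficial": official, "status": status})
    return results
```

Replacing punctuation with spaces rather than deleting it is a design choice: "Guajará-Mirim" and "Guajara Mirim" then normalize to the same key.
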

`pip install pandas requests openpyxl`

- Upload script: validacao_ibge_corrigido.py
- Upload dataset: validacao_colisoes_IBGE - Copia.xlsx
- Execute script to generate: validacao_ibge_resultado.xlsx

`python validacao_ibge_corrigido.py`

- Input: validacao_colisoes_IBGE - Copia.xlsx
- Output: validacao_ibge_resultado.xlsx

Excel workbook with three sheets:
| Sheet | Content |
|---|---|
| Validacao_Completa | Full dataset with IBGE ID validation |
| Nao_Encontrados | Records not matched in IBGE database |
| Resumo | Metrics: total records, match rate, gaps |
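
One way to produce a workbook with the three sheets in the table is sketched here. The sheet names come from the table. The `encontrado` flag, the `metrica`/`valor` column names and both function names are assumptions made for illustration. The actual script may structure its output differently.

```python
def build_sheets(results, summary):
    """Return {sheet name: list of row dicts} for the three-sheet workbook."""
    return {
        "Validacao_Completa": list(results),
        "Nao_Encontrados": [r for r in results if not r["encontrado"]],
        "Resumo": [{"metrica": k, "valor": v} for k, v in summary.items()],
    }


def write_workbook(sheets, path):
    """Write each sheet with pandas; openpyxl is required for .xlsx output."""
    import pandas as pd  # imported lazily so build_sheets works without it

    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for name, rows in sheets.items():
            pd.DataFrame(rows).to_excel(writer, sheet_name=name, index=False)
```

A call such as `write_workbook(build_sheets(results, summary), "validacao_ibge_resultado.xlsx")` would then write the file, where `results` is a list of validated records and `summary` is a dict of metric names to values.
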
- Total records processed
- Records matched with IBGE database
- Unmatched records requiring review
- Match rate percentage
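
The four metrics above reduce to simple counts. A minimal sketch, assuming each validated record carries a boolean `encontrado` flag (a hypothetical field name) and that the match rate is reported as a percentage rounded to two decimals:

```python
def summarize(results):
    """Compute total, matched, unmatched and match rate (%) for a validation run."""
    total = len(results)
    matched = sum(1 for r in results if r["encontrado"])
    # Guard against division by zero when the input is empty.
    rate = round(100 * matched / total, 2) if total else 0.0
    return {
        "total_records": total,
        "matched": matched,
        "unmatched": total - matched,
        "match_rate_pct": rate,
    }
```

For example, three matched records out of four give a match rate of 75.0.
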
- Python 3.7+
- pandas: Data manipulation
- requests: API calls
- openpyxl: Excel file generation