webtextclassifier

A generic classifier for texts from the web for official statistics

WEB-FOSS-NL

This repo is part of the WEB-FOSS-NL project on statistical scraping. More info on statistical scraping here

Getting started

  • Install all required packages using

    pip install -r requirements.txt

  • Create a config.yaml file by copying config_template.yaml and adjusting it
  • Place your input file with URLs at the location set in your input configuration
  • Configure your variables in that input file
  • Start the script with

    python src/main.py

  • Find your output in the configured output directory
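
The config step above can be scripted. The following is a minimal sketch, not part of the repository: it assumes config_template.yaml sits in the repository root, and the helper name ensure_config is hypothetical. It copies the template to config.yaml only when no config.yaml exists yet, so an edited configuration is never overwritten.

```python
import shutil
from pathlib import Path


def ensure_config(repo_root: Path) -> Path:
    """Create config.yaml from config_template.yaml if it does not exist yet.

    Hypothetical helper; assumes both files live directly in repo_root.
    """
    template = repo_root / "config_template.yaml"
    config = repo_root / "config.yaml"
    if not config.exists():
        if not template.exists():
            raise FileNotFoundError(f"Template not found: {template}")
        # Copy the template verbatim; the user still has to edit the values.
        shutil.copyfile(template, config)
    return config
```

Calling ensure_config(Path.cwd()) from the repository root would leave a config.yaml ready for editing before running python src/main.py.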

Other info

This repository builds on the prototype created during a meeting in Lisbon for AIML4OS WP12.