Streamlit to Signpost Genuine Financial Institutions via Financial Conduct Authority

Important

If the FCA blocks web scraping, this project becomes pointless.

The project is incomplete.

Warning

I’m testing Microsoft Copilot to document the functions, so treat the docstrings and this README with caution.

This project provides tools to scrape and extract information from the Financial Conduct Authority (FCA) Register. It includes Python scripts for automated data collection and a simple Streamlit app for user interaction.

Project Structure

fca_scrape.py
Core scraping logic using Selenium and BeautifulSoup. Functions include:
- get_soup_of_fca_search(search_term): Fetch FCA search results page for a given institution.
- get_soup_without_cookie_notice(url_link): Load FCA profile page without cookie notice.
- get_ref_links_from_soup(soup): Extract firm reference links.
- get_reference_number(soup): Extract firm reference numbers.
- get_dict_of_ordering_firm_listings(soup): Return a DataFrame of firm reference numbers and links.
- get_name_from_fca_profile(soup), get_phone_number_from_fca_profile(soup), get_website_from_fca_profile(soup): Extract firm details.
- get_general_info_for_a_profile(url_link): Get name, phone, and website from a firm profile.
- get_general_info_for_a_profile_via_soup(soup): Same as above but from an existing soup object.
fca_webscaper.py
Example usage of the scraper functions:
- Searches for a firm (e.g., Leeds Building Society).
- Retrieves the first firm link and extracts general info.
- Demonstrates scraping via URL and via soup.
app_maker.py
A Streamlit app for interactive scraping:
- User inputs an institution name.
- Displays the first firm's name, website, and phone number.
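
The search and link-extraction steps described for fca_scrape.py can be sketched with the standard library alone. This is a rough illustration, not the project's code: it swaps in `re` for the Selenium and BeautifulSoup logic, and the base URL, query layout, function names, and sample markup are all assumptions made for the example.

```python
import re
from typing import List
from urllib.parse import quote, urljoin

# Assumption: this base URL and query layout are illustrative only;
# the real register may use a different structure.
FCA_BASE_URL = "https://register.fca.org.uk"


def build_search_url(search_term: str) -> str:
    """Percent-encode the institution name and place it in a search URL."""
    return f"{FCA_BASE_URL}/s/search?q={quote(search_term)}&type=Companies"


def extract_profile_links(html: str) -> List[str]:
    """Collect firm-profile links (hypothetical href pattern) as absolute URLs."""
    relative_links = re.findall(r'href="(/s/firm\?id=[^"]+)"', html)
    return [urljoin(FCA_BASE_URL, link) for link in relative_links]


def extract_reference_numbers(html: str) -> List[str]:
    """Pull the digits that follow a 'Reference number:' label (hypothetical wording)."""
    return re.findall(r"Reference number:\s*(\d+)", html)


# Made-up markup standing in for a search results page.
SAMPLE_HTML = """
<a href="/s/firm?id=EXAMPLE1">Example Firm One</a> <span>Reference number: 111111</span>
<a href="/s/firm?id=EXAMPLE2">Example Firm Two</a> <span>Reference number: 222222</span>
"""

if __name__ == "__main__":
    print(build_search_url("Leeds Building Society"))
    print(extract_profile_links(SAMPLE_HTML))
    print(extract_reference_numbers(SAMPLE_HTML))
```

Keeping URL construction and parsing in small pure functions like these makes them easy to check against saved HTML, without launching a browser.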

Requirements

Python 3.8+
Libraries:
- selenium
- beautifulsoup4
- pandas
- streamlit
- re (standard library, no installation needed)
- urllib (standard library, no installation needed)
Browser Driver:
- Firefox with GeckoDriver (ensure it’s installed and in PATH).
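
One way to confirm the driver requirement before running anything is to look the executables up on PATH. The sketch below uses `shutil.which`. The helper name `missing_tools` is made up for this example, and the executable names `firefox` and `geckodriver` are assumptions that may differ by platform.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust for your platform.
    missing = missing_tools(["firefox", "geckodriver"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Firefox and GeckoDriver were both found on PATH.")
```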

Usage

Run Streamlit App

streamlit run app_maker.py

Run Example Script

python fca_webscaper.py

Features

✔ Scrapes FCA register for firm details
✔ Handles cookie notices automatically
✔ Outputs firm name, phone number, and website
✔ Interactive UI via Streamlit

Notes

FCA pages may change; selectors might need updates.
Ensure Firefox and GeckoDriver are properly configured.
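
Because page layouts can change, one defensive approach is to try several patterns in order and return nothing rather than fail when none matches. This is a small sketch of that idea using `re`; the helper name, the patterns, and the page wording are hypothetical and are not taken from the project's selectors.

```python
import re
from typing import Iterable, Optional


def first_match(patterns: Iterable[str], text: str) -> Optional[str]:
    """Return group 1 of the first pattern that matches, or None if none match."""
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return None


# Hypothetical patterns covering two possible wordings of a phone field.
PHONE_PATTERNS = [
    r"Phone:\s*([+\d][\d\s]+\d)",
    r"Telephone\s+([+\d][\d\s]+\d)",
]

if __name__ == "__main__":
    print(first_match(PHONE_PATTERNS, "Telephone +44 113 000 0000"))
    print(first_match(PHONE_PATTERNS, "no number on this page"))
```

When the page wording changes, adding a pattern to the list is then a one-line update, and callers can treat `None` as "field not found".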
