llm-extraction

Here are 11 public repositories matching this topic...

shcherbak-ai / contextgem

ContextGem: Effortless LLM extraction from documents

nlp ai text-analysis docx data-extraction contract-analysis legaltech docx2txt unstructured-data document-intelligence llm docx2md prompt-engineering llms generative-ai llm-framework llm-pipeline llm-extraction

Updated Mar 16, 2026
Python

lightfeed / extractor

Use LLMs to robustly extract web data

Updated Apr 8, 2026
TypeScript

msoedov / validex

Simplifies the retrieval, extraction, and training of structured data from various unstructured sources.

structured-output structured-data-extraction llm-extraction

Updated Oct 22, 2025
Python

BjornMelin / ai-job-scraper

🕵️‍♂️ Privacy-focused AI job scraper, local storage, and interactive dashboard. Auto-scrapes AI/ML roles from top companies using ScrapeGraph-AI + LLM and LangGraph Agents, filters for relevance, and provides a Streamlit UI for tracking applications. Built for developers seeking AI careers.

Updated Aug 28, 2025
Python

lightfeed / scrapedown

HTML to Markdown with CSS selector and XPath annotations

nlp markdown crawler html-to-markdown html-parser webscraping data-pipeline web-data-extraction llm llm-scraper llm-extraction

Updated Apr 7, 2026
TypeScript

lightfeed / sdk

Lightfeed SDK to search and filter web data

Updated Jun 7, 2025
Python

jolovicdev / sourcery

Schema-first LLM extraction framework with entity grounding, multi-pass extraction, and deterministic post-processing

ai document-processing document-ai llm-extraction

Updated Mar 1, 2026
Python

mikayelgr / corsa

CORSA is a Python tool for scraping, cleaning, and analyzing AUA course data from SONIS and GenEd sources.

python scraping openai data-analysis feature-engineering aua jenzabar beautifulsoup4 llm-extraction llm-classification

Updated Dec 20, 2025
Jupyter Notebook

Luigina2001 / StartupScoutingAI

Automated pipeline for finding accelerators in Europe and analyzing their startup portfolios.

google-apps-script web-scraping gemini-api llm-extraction google-sheets-automation

Updated Feb 1, 2026
JavaScript

nchourrout / faq-generator

Generates FAQs for any website using Firecrawl

scraping crawling llm firecrawl llm-extraction

Updated Apr 14, 2026
TypeScript

citation-cosmograph / citation-astrolabe

AI-agent-driven venue governance database. Extracts editorial boards and program committees from journal websites using local LLMs, with entity resolution against OpenAlex.

entity-resolution web-scraping knowledge-base scientometrics structured-extraction editorial-board ai-agent openalex llm-extraction venue-governance bibliometric-infrastructure

Updated Mar 29, 2026
