Skip to content

cosmez/sqlite_urlparser

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQLite URL Parser Extension

License

A C SQLite extension providing a suite of functions to parse and extract components from URL strings directly within your SQL queries. This extension leverages the furi library for URL parsing.

Features

  • Extract various components of a URL:
    • Scheme (e.g., http, https)
    • Authority (e.g., user:pass@example.com:8080)
    • Userinfo (e.g., user:pass)
    • Host (e.g., example.com)
    • Port (e.g., 8080)
    • Path (e.g., /path/to/resource)
    • Query (e.g., key1=value1&key2=value2)
    • Fragment (e.g., section1)

Prerequisites

  • SQLite (version 3.x.x with extension loading enabled)
  • SQLite Development Headers (e.g., sqlite3.h, sqlite3ext.h)
  • C Compiler (e.g., GCC, Clang)

Building the Extension

TODO: Cmake instructons per OS

Installation / Loading

Once you have compiled libsqlite_urlparser.so (or .dylib/.dll), you can load it into SQLite:

  1. Using the SQLite CLI:

    sqlite> .load ./libsqlite_urlparser.dll
    sqlite> SELECT url_host('https://www.example.com/path');
    www.example.com
  2. Programmatically (e.g., in C, Python, etc.): Most SQLite client libraries provide a way to load extensions.

    • C API: sqlite3_load_extension(db, "./libsqlite_urlparser.so", 0, 0);
    • Python (sqlite3 module):
      import sqlite3
      conn = sqlite3.connect(':memory:')
      conn.enable_load_extension(True)
      conn.load_extension('./library.so') # Adjust path and extension name
      conn.enable_load_extension(False) # Optional: disable after loading
      
      cursor = conn.cursor()
      for row in cursor.execute("SELECT url_scheme('ftp://files.server.net/data.zip')"):
          print(row[0]) # ftp
      conn.close()

Usage

After loading the extension, the following SQL functions become available:

Function Description Example
url_scheme Extracts the scheme from a URL. SELECT url_scheme('https://ex.com'); -> https
url_authority Extracts the authority part. SELECT url_authority('https://u:p@ex.com:80'); -> u:p@ex.com:80
url_userinfo Extracts user information from the authority. SELECT url_userinfo('https://u:p@ex.com'); -> u:p
url_host Extracts the host from the authority. SELECT url_host('https://ex.com:8080/p'); -> ex.com
url_port Extracts the port from the authority. SELECT url_port('https://ex.com:8080'); -> 8080
url_path Extracts the path component. SELECT url_path('https://ex.com/a/b?q'); -> /a/b
url_query Extracts the query string (without '?'). SELECT url_query('https://ex.com/p?k=v#f'); -> k=v
url_fragment Extracts the fragment (without '#'). SELECT url_fragment('https://ex.com/p#sec'); -> sec

Example Queries:

-- Load the extension (if not already loaded)
-- .load ./library.so

-- Basic usage
SELECT url_scheme('http://example.com/path?query=value#fragment');
-- Result: http

SELECT url_host('http://user:password@example.com:8080/path');
-- Result: example.com

SELECT url_path('http://example.com/some/resource.html');
-- Result: /some/resource.html

SELECT url_query('http://example.com/search?q=sqlite&lang=en');
-- Result: q=sqlite&lang=en

SELECT url_fragment('http://example.com/page#section-2');
-- Result: section-2

-- Using with table data
CREATE TABLE weblogs (id INTEGER PRIMARY KEY, referrer_url TEXT);
INSERT INTO weblogs (referrer_url) VALUES
  ('https://www.google.com/search?q=sqlite+extensions'),
  ('http://internal.app/dashboard#overview'),
  (NULL),
  ('https://blog.example.org/article/123');

SELECT
  id,
  referrer_url,
  url_host(referrer_url) AS referring_host,
  url_path(referrer_url) AS referring_path
FROM weblogs;

About

SQLite URL parsing extension

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors