Skip to content

efixler/headless

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

30 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

headless

Go Reference Go Report Card License MPL 2.0

headless scrapes html for a target url using a headless Chrome browser. The included headless and headless-proxy apps provide headless scraping functionality from the shell and as an HTTP proxy server.

Table of Contents

Usage As a CLI Application

Installation

go install github.com/efixler/headless

Usage

headless % ./build/headless -h
Usage: 
        headless [flags] :url
 
  -h
        Show this help message
  -H    Show browser window (don't run in headless mode)
        Environment: HEADLESS_NO_HEADLESS
  -log-level value
        Log level
        Environment: HEADLESS_LOG_LEVEL
  -user-agent value
        User agent to use (omit for browser default)
        Environment: HEADLESS_USER_AGENT

Usage As a Proxy Server

headless-proxy is currently experimental. It's functional as a proof-of-concept but not ready for usage in production environments.

Installation

go install github.com/efixler/headless-proxy

Usage

headless % ./build/headless-proxy -h
Usage: 
        headless-proxy [flags] :url
 
  -h
        Show this help message
  -default-user-agent value
        Default user agent string (empty for browser default)
        Environment: HEADLESS_PROXY_DEFAULT_USER_AGENT
  -inbound-idle-timeout value
        Inbound connection keepalive idle timeout
        Environment: HEADLESS_PROXY_IDLE_TIMEOUT (default 2m0s)
  -inbound-read-timeout value
        Inbound connection read timeout
        Environment: HEADLESS_PROXY_READ_TIMEOUT (default 5s)
  -inbound-write-timeout value
        Inbound connection write timeout
        Environment: HEADLESS_PROXY_WRITE_TIMEOUT (default 30s)
  -log-level value
        Set the log level [debug|error|info|warn]
        Environment: HEADLESS_PROXY_LOG_LEVEL
  -max-concurrent value
        Maximum concurrent connections
        Environment: HEADLESS_PROXY_MAX_CONCURRENT (default 6)
  -port value
        Port to listen on
        Environment: HEADLESS_PROXY_PORT (default 8008)

Roadmap

  • Implemenent Proxy Authorization
  • Build docker container
  • Add https support
  • Document https usage in perimeter-http environments
  • Implement better header checking, url verification, etc.
  • Proxy inbound user agent to outbound connection

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published