HtmlUnit-CSSParser is a CSS parser for Java that reads CSS source text and produces a DOM-style object tree. It is the CSS parser powering HtmlUnit since version 1.30. The project originated as a fork of CSSParser 0.9.25, with the SAC (org.w3c.css.sac) dependency removed and a more flexible object model introduced.
- Group/Artifact: `org.htmlunit:htmlunit-cssparser`
- License: Apache License 2.0
- Default branch: `master`
- Java version: JDK 17+ (version 5.x, current development); JDK 8+ for 4.x releases
- Build system: Maven
htmlunit-cssparser/
├── pom.xml                          # Maven build configuration
├── checkstyle.xml                   # Checkstyle rules (enforced on build)
├── checkstyle_suppressions.xml      # Checkstyle suppression rules
├── README.md
├── LICENSE                          # Apache 2.0
├── .github/
│   ├── workflows/
│   │   └── codeql.yml               # CodeQL security scanning (Java)
│   ├── dependabot.yml               # Dependabot dependency updates
│   └── FUNDING.yml                  # Sponsorship info
├── src/
│   ├── main/
│   │   ├── java/org/htmlunit/cssparser/
│   │   │   ├── dom/                 # CSS DOM implementation classes
│   │   │   ├── parser/              # Core parser classes
│   │   │   │   ├── condition/       # CSS selector conditions
│   │   │   │   ├── selector/        # CSS selector model
│   │   │   │   └── media/           # Media query support
│   │   │   └── util/                # Utility classes
│   │   └── javacc/
│   │       └── CSS3Parser.jj        # JavaCC grammar file (generates the parser)
│   └── test/
│       ├── java/                    # JUnit 5 test classes
│       └── resources/               # CSS test fixture files
└── target/                          # Build output (not committed)
- Maven 3.6.3+
- JDK 17+ (for current master / version 5.x)
# Compile (this also runs JavaCC to generate the parser from CSS3Parser.jj)
mvn compile
# Run all tests
mvn test
# Full build with checkstyle verification
mvn -U clean test
# Check for dependency/plugin updates
mvn versions:display-plugin-updates
mvn versions:display-dependency-updates

The CSS parser is generated from a JavaCC grammar file at `src/main/javacc/CSS3Parser.jj`. During the `generate-sources` phase, the `ph-javacc-maven-plugin` generates Java source files into `target/generated-sources/javacc/org/htmlunit/cssparser/parser/javacc/`. A post-processing step using the `maven-replacer-plugin` then cleans up the generated code by removing dead-code patterns produced by JavaCC.
Do not manually edit files in target/generated-sources/ — they are regenerated on every build. If parser behavior needs to change, edit src/main/javacc/CSS3Parser.jj.
The `parser/` package (`org.htmlunit.cssparser.parser`) is the main entry point for users. Key classes:
| Class | Purpose |
|---|---|
| `CSSOMParser` | High-level parser that produces a DOM-style tree from CSS input. Main public API. |
| `AbstractCSSParser` | Base class with shared parsing logic; `CSS3Parser` (generated) extends this. |
| `InputSource` | Wraps a `Reader` to feed CSS text to the parser. Replaces the old SAC `InputSource`. |
| `LexicalUnit` / `LexicalUnitImpl` | Represents CSS values (lengths, colors, functions, etc.) as a linked list of lexical tokens. |
| `CSSErrorHandler` | Interface for custom error handling during parsing. Replaces the old SAC `ErrorHandler`. |
| `CSSException` / `CSSParseException` | Exception types for parse errors. |
| `DocumentHandler` / `HandlerBase` | Event-based (SAX-like) callback interface for streaming CSS parsing. |
| `Locator` / `Locatable` | Source location tracking (line/column numbers). |
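
The table describes `LexicalUnit` as a linked list of lexical tokens. The standalone sketch below illustrates that data structure only. `TokenNode` and its methods are hypothetical names invented for this example; it does not use or reproduce the library's `LexicalUnit` API.

```java
/**
 * Standalone illustration of a CSS value stored as a linked list of tokens.
 * TokenNode is a hypothetical class, not part of htmlunit-cssparser.
 */
final class TokenNode {
    private final String text_;
    private TokenNode next_;

    TokenNode(final String text) {
        text_ = text;
    }

    /** Links a new token directly after this one and returns it for chaining. */
    TokenNode append(final String text) {
        final TokenNode node = new TokenNode(text);
        next_ = node;
        return node;
    }

    /** Walks the chain from this node and joins the token texts with spaces. */
    String toCssText() {
        final StringBuilder sb = new StringBuilder();
        for (TokenNode n = this; n != null; n = n.next_) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n.text_);
        }
        return sb.toString();
    }

    /** Counts the tokens reachable from this node, including itself. */
    int length() {
        int count = 0;
        for (TokenNode n = this; n != null; n = n.next_) {
            count++;
        }
        return count;
    }
}
```

Under these assumptions, a value such as `1px solid red` would be built as `new TokenNode("1px")` followed by `append("solid").append("red")`, and consumers walk the chain from the head node.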
The `parser/selector/` package represents CSS selectors as an object model:
- `Selector`, `SimpleSelector`: base types
- `ElementSelector`: type selectors (`h1`, `div`, `*`)
- `DescendantSelector`, `ChildSelector`: combinators (whitespace, `>`)
- `DirectAdjacentSelector`, `GeneralAdjacentSelector`: combinators (`+`, `~`)
- `PseudoElementSelector`: pseudo-elements (`::before`, `::after`)
- `RelativeSelector`: for `:has()` relative selectors
- `SelectorList` / `SelectorListImpl`: ordered list of selectors
- `SelectorSpecificity`: calculates selector specificity
- `Combinator`: enum of CSS combinator types
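
To make the idea of selector specificity concrete, here is a minimal standalone sketch of the usual CSS ordering: a triple of (ID count, class/attribute/pseudo-class count, type/pseudo-element count) compared component by component. `SpecificitySketch` is a hypothetical class written for this example; how the library's `SelectorSpecificity` is actually implemented is not shown here and may differ.

```java
/**
 * Standalone sketch of CSS selector specificity as an (ids, classes, types) triple.
 * SpecificitySketch is a hypothetical class, not the library's SelectorSpecificity.
 */
final class SpecificitySketch implements Comparable<SpecificitySketch> {
    private final int ids_;
    private final int classes_;
    private final int types_;

    SpecificitySketch(final int ids, final int classes, final int types) {
        ids_ = ids;
        classes_ = classes;
        types_ = types;
    }

    /** Compares component by component: ids first, then classes, then types. */
    @Override
    public int compareTo(final SpecificitySketch other) {
        if (ids_ != other.ids_) {
            return Integer.compare(ids_, other.ids_);
        }
        if (classes_ != other.classes_) {
            return Integer.compare(classes_, other.classes_);
        }
        return Integer.compare(types_, other.types_);
    }

    /** Renders the triple in the common "a,b,c" notation. */
    @Override
    public String toString() {
        return ids_ + "," + classes_ + "," + types_;
    }
}
```

With this ordering, a single ID selector such as `#nav` (1,0,0) outranks `ul.menu li.item a.link` (0,3,3), because the ID component is compared first.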
Conditions attached to selectors (class, id, attribute, pseudo-class matching):
- `ClassCondition` (`.foo`), `IdCondition` (`#bar`)
- `AttributeCondition` (`[attr=val]`), `PrefixAttributeCondition` (`[attr^=val]`), `SuffixAttributeCondition` (`[attr$=val]`), `SubstringAttributeCondition` (`[attr*=val]`), `OneOfAttributeCondition` (`[attr~=val]`), `BeginHyphenAttributeCondition` (`[attr|=val]`)
- `PseudoClassCondition` (`:hover`, `:nth-child()`, etc.)
- `NotPseudoClassCondition` (`:not()`), `IsPseudoClassCondition` (`:is()`), `HasPseudoClassCondition` (`:has()`), `WherePseudoClassCondition` (`:where()`)
- `LangCondition` (`:lang()`)
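
The six attribute operators listed above differ only in how the attribute value is compared. The following standalone sketch spells out those comparisons as defined by the CSS Selectors specification, using case-sensitive matching. `AttributeMatchSketch` and its `matches` method are hypothetical and exist only for this illustration; the library models each operator as its own condition class, and this sketch makes no claim about how those classes are used for matching.

```java
import java.util.Arrays;

/**
 * Standalone sketch of the CSS attribute selector operators.
 * Hypothetical helper, not part of htmlunit-cssparser.
 */
final class AttributeMatchSketch {
    private AttributeMatchSketch() {
    }

    /**
     * @param operator one of "=", "^=", "$=", "*=", "~=", "|="
     * @param actual the attribute value on the element, or null if absent
     * @param expected the value written in the selector
     * @return true if the attribute value satisfies the operator
     */
    static boolean matches(final String operator, final String actual, final String expected) {
        if (actual == null) {
            return false;
        }
        switch (operator) {
            case "=":
                return actual.equals(expected);
            case "^=":
                // an empty expected value never matches for ^=, $=, *= and ~=
                return !expected.isEmpty() && actual.startsWith(expected);
            case "$=":
                return !expected.isEmpty() && actual.endsWith(expected);
            case "*=":
                return !expected.isEmpty() && actual.contains(expected);
            case "~=":
                // whitespace-separated word list containing the exact word
                return !expected.isEmpty()
                        && Arrays.asList(actual.trim().split("\\s+")).contains(expected);
            case "|=":
                // exact value, or value followed by a hyphen (language subcodes)
                return actual.equals(expected) || actual.startsWith(expected + "-");
            default:
                throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }
}
```

For example, `matches("~=", "btn btn-primary", "btn")` is true because `btn` is one of the whitespace-separated words, while `matches("|=", "english", "en")` is false because the value is neither `en` nor prefixed with `en-`.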
Media query support (`parser/media/`):
- `MediaQuery`: a single media query (`screen and (min-width: 768px)`)
- `MediaQueryList`: a list of media queries
The `dom/` package implements a CSS object model (style sheets, rules, values):
- `CSSStyleSheetImpl`: represents a complete stylesheet
- `CSSStyleRuleImpl`: a style rule (`selector { declarations }`)
- `CSSStyleDeclarationImpl`: a set of property declarations
- `CSSMediaRuleImpl`, `CSSImportRuleImpl`, `CSSPageRuleImpl`, `CSSFontFaceRuleImpl`, `CSSCharsetRuleImpl`, `CSSUnknownRuleImpl`: at-rule implementations
- `CSSRuleListImpl`: ordered list of rules
- `CSSValueImpl`: wraps parsed CSS values
- `Property`: a single CSS property with name, value, and priority
- Color classes: `RGBColorImpl`, `HSLColorImpl`, `HWBColorImpl`, `LABColorImpl`, `LCHColorImpl` (plus `AbstractColor` base)
- `RectImpl`, `CounterImpl`: CSS `rect()` and `counter()` value types
- `MediaListImpl`, `CSSStyleSheetListImpl`: list types
- `DOMExceptionImpl`: DOM exception handling
Utility classes (`util/`):
- `ParserUtils`: string processing helpers used by the generated parser (trimming, unescaping)
Checkstyle is strictly enforced via checkstyle.xml and runs during the build. Key rules:
- Line length: 120 characters max
- Indentation: 4 spaces per level
- Braces: opening brace on same line (`eol`), closing brace on its own line (`alone`)
- Naming conventions:
  - Member fields: `camelCase_` (trailing underscore)
  - Static fields: `CamelCase_` (capital start, trailing underscore)
  - Constants: `UPPER_SNAKE_CASE` (exception: `log`)
  - Methods: `camelCase` (test methods may use underscores: `test[A-Z][a-zA-Z0-9_]+`)
  - Catch parameters: `e`, `ex`, `ignored`, or `expected`
- Javadoc: Required on all public/protected methods, types, and packages. Author tag format: `@author Firstname Lastname`
- Imports: No star imports, no unused imports, no redundant imports
- License header: Required on every source file: `/* * Copyright (c) 2019-2026 Ronald Brill. * * Licensed under the Apache License, Version 2.0 ... */`
- No `serialVersionUID` fields
- No `@version` tags
- No `System.out`/`System.err` in production code
- Final local variables and parameters are enforced
- No trailing whitespace, no tab characters, no double blank lines
- Single empty line after package declaration, none before it
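
As a rough illustration of how these rules combine in one file, here is a small hypothetical class. It is a sketch based only on the rules listed above and has not been checked against the actual `checkstyle.xml`. The class name and its members are invented. The package declaration is omitted and the class is package-private so the snippet compiles on its own. The license header is reproduced in the abbreviated form given above.

```java
/*
 * Copyright (c) 2019-2026 Ronald Brill.
 *
 * Licensed under the Apache License, Version 2.0 ...
 */

/**
 * Hypothetical class illustrating the naming rules; not part of the library.
 *
 * @author Firstname Lastname
 */
final class NamingConventionExample {

    /** Constant: UPPER_SNAKE_CASE. */
    private static final int MAX_LINE_LENGTH = 120;

    /** Static, non-constant field: capital start plus trailing underscore. */
    private static int InstanceCount_;

    /** Member field: camelCase plus trailing underscore. */
    private final String propertyName_;

    /**
     * Creates an instance; parameters are declared final.
     *
     * @param propertyName the CSS property name to hold
     */
    NamingConventionExample(final String propertyName) {
        propertyName_ = propertyName;
        InstanceCount_++;
    }

    /** @return the property name passed to the constructor */
    String getPropertyName() {
        return propertyName_;
    }

    /** @return how many instances have been created so far */
    static int getInstanceCount() {
        return InstanceCount_;
    }

    /**
     * @param line the source line to check
     * @return true if the line fits the 120-character limit
     */
    static boolean fitsLineLength(final String line) {
        final int length = line.length();
        return length <= MAX_LINE_LENGTH;
    }

    /**
     * @param text the text to parse as an integer
     * @param fallback the value to return if parsing fails
     * @return the parsed integer or the fallback
     */
    static int parseOrDefault(final String text, final int fallback) {
        try {
            return Integer.parseInt(text.trim());
        }
        catch (final NumberFormatException e) {
            // catch parameter uses one of the allowed names
            return fallback;
        }
    }
}
```

Note the closing brace on its own line before `catch`, the `final` modifier on every parameter and local variable, and the absence of `System.out` calls.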
Checkstyle suppressions (`checkstyle_suppressions.xml`):
- Test files are exempt from `JavadocPackage`, `JavadocMethod`, and `LineLength`
- Generated files in `target/generated-sources/javacc` are fully exempt
- `CssCharStream.java` is fully exempt (special character stream handling)
- Framework: JUnit Jupiter (JUnit 5), version 6.x
- Test dependency: `commons-io` (test scope only)
- Test resources: CSS fixture files in `src/test/resources/`
- Run tests: `mvn test` (uses `maven-surefire-plugin`)
- CodeQL: GitHub Actions workflow (`.github/workflows/codeql.yml`) runs security analysis on pushes/PRs to `master` and weekly (Mondays 23:34 UTC). Analyzes Java code only.
- Dependabot: Configured via `.github/dependabot.yml` for automated dependency update PRs.
- Jenkins: Primary CI runs on an external Jenkins server at https://jenkins.wetator.org/job/HtmlUnit%20-%20CSS%20Parser/.
- Edit the JavaCC grammar: `src/main/javacc/CSS3Parser.jj`
- Run `mvn compile` to regenerate and compile
- Add/update tests to cover the change
- Run `mvn test` to verify
New CSS features typically require changes in multiple layers:
- Grammar (`CSS3Parser.jj`): add token definitions and production rules
- Lexical units (`LexicalUnit.java`, `LexicalUnitImpl.java`): add new `LexicalUnitType` enum values if needed
- Conditions (`parser/condition/`): for new pseudo-classes or attribute selectors
- Selectors (`parser/selector/`): for new selector types or combinators
- DOM (`dom/`): for new at-rule types or value types
- Tests: comprehensive tests for parsing, serialization, and error handling
- Run `mvn -U clean test` and ensure all tests pass
- Fix all checkstyle violations; checkstyle runs as part of the build
- Follow the naming conventions (especially trailing underscores on fields)
- Add Javadoc to all new public/protected API
- Keep the license header on all new files
- Do not modify generated files in `target/`
- Current development: 5.0.0-SNAPSHOT (requires JDK 17+)
- Latest stable: 4.21.0 (December 2025, JDK 8+)
- Artifacts: Published to Maven Central via Sonatype Central Publishing
- Release process (from README):
  - Ensure all tests pass
  - Update version in `pom.xml` and `README.md`
  - Commit, build, and deploy: `mvn -up clean deploy`
  - Publish on Maven Central Portal
  - Create GitHub release with signed JARs
  - Bump to next SNAPSHOT version
Runtime dependencies: none. The library is completely self-contained.

Test dependencies:
- `org.junit.jupiter:junit-jupiter-engine`
- `org.junit.platform:junit-platform-launcher`
- `commons-io:commons-io`
- No SAC dependency: The `org.w3c.css.sac` API (stalled since 2008) was removed. All interfaces are built-in, giving the project full control over the object model.
- JavaCC-based parser: The CSS grammar is defined in `CSS3Parser.jj` and compiled by JavaCC. This provides robust, specification-aligned tokenization and parsing.
- Event-based + DOM-based API: The parser supports both SAX-like streaming (`DocumentHandler`) and tree-building (`CSSOMParser`) usage patterns.
- Zero runtime dependencies: Makes the library safe to embed anywhere without dependency conflicts.
- Repository: https://github.com/HtmlUnit/htmlunit-cssparser
- Maven Central: https://central.sonatype.com/artifact/org.htmlunit/htmlunit-cssparser
- HtmlUnit: https://www.htmlunit.org/
- Developer Blog: https://htmlunit.github.io/htmlunit-blog/
- CI: https://jenkins.wetator.org/job/HtmlUnit%20-%20CSS%20Parser/
- Sponsor: https://github.com/sponsors/rbri
- Predecessor: http://cssparser.sourceforge.net/