Generates ASTs for Java files in a format compatible with the 150k Python Dataset and the 150k JavaScript Dataset.
Files are parsed using JavaParser.
The only dependency is JDK >= 8. The project can be built with the following command.
./gradlew build
./bin/bigcode-astgen-java [options] <input>
<input> should be a file, or a glob expression to files.
In normal mode, <input> is interpreted as a filename and the resulting AST
is written to <output> if provided, otherwise printed to stdout.
In batch mode, <input> is interpreted as a glob, and all matching files
are parsed. <output> is a prefix and <output>.json, <output>.txt and
<output>_failed.txt files will be created.
<output>.json - contains a JSON-formatted AST per line
<output>.txt - contains a filename per line, in the same order as <output>.json
<output>_failed.txt - contains a filename per line, with the reason why it could not be parsed
The glob expression should be quoted so that it is not expanded by the shell.
bigcode-astgen-java src/main/java/com/tuvistavie/bigcode/astgen/AstGenerator.java
Parses src/main/java/com/tuvistavie/bigcode/astgen/AstGenerator.java and prints the result to stdout.
bigcode-astgen-java --batch -o result/asts "src/**/*.java"
Parses all .java files in the src directory and writes the results to the result directory
with the prefix asts.
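After a batch run like the one above, the filename list and the AST list can be read side by side, since line N of the .txt file names the source of line N of the .json file. The following is a minimal sketch of that pairing, not part of the project: the class name BatchOutputReader and the method pairFilesWithAsts are hypothetical, and each AST is kept as a raw JSON string so that no JSON library is assumed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: pairs each parsed filename with its AST line.
public class BatchOutputReader {

    // Reads both files in lockstep; stops when either file runs out of lines.
    // Keys are filenames, values are the unparsed JSON text of the AST.
    public static Map<String, String> pairFilesWithAsts(Path namesFile, Path astsFile)
            throws IOException {
        Map<String, String> result = new LinkedHashMap<>(); // preserves file order
        try (BufferedReader names = Files.newBufferedReader(namesFile, StandardCharsets.UTF_8);
             BufferedReader asts = Files.newBufferedReader(astsFile, StandardCharsets.UTF_8)) {
            String name;
            String ast;
            while ((name = names.readLine()) != null && (ast = asts.readLine()) != null) {
                result.put(name, ast);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumes the prefix used in the batch example; override with the first argument.
        String prefix = args.length > 0 ? args[0] : "result/asts";
        Map<String, String> pairs =
                pairFilesWithAsts(Paths.get(prefix + ".txt"), Paths.get(prefix + ".json"));
        for (Map.Entry<String, String> entry : pairs.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().length()
                    + " characters of JSON");
        }
    }
}
```

Each value can then be handed to whichever JSON parser the consuming project already uses.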
The project is available on JCenter as com.tuvistavie.bigcode:astgen:0.1.1.
Javadoc can be generated to build/docs by running
./gradlew javadoc