Parsequery is a parser combinator library. It uses staging to remove composition overhead at compile time to produce efficient, fast parsers. Its interface is almost similar to Scala's parser combinators.
To benefit from staging, you wrap your parser declarations in the optimise
scope. Here is an example JSON parser:
sealed abstract class JSValue
case class JSObject(dict: List[(String, JSValue)]) extends JSValue
case class JSArray(arr: List[JSValue]) extends JSValue
case class JSDouble(d: Double) extends JSValue
case class JSString(s: String) extends JSValue
case class JSBool(b: Boolean) extends JSValue
case object JSNull extends JSValue
val jsonParser = optimise {
def value: Parser[JSValue] = (
obj |
arr |
stringLiteral.map(x => JSString(x)) |
double.map(x => JSDouble(x)) |
accept("null").map(_ => JSNull) |
accept("true").map(_ => JSBool(true)) |
accept("false").map(_ => JSBool(false))
)
def obj: Parser[JSValue] = (skipWs(accept('{')) ~>
repsep(member, skipWs(accept(',')))
<~ skipWs(accept('}'))) map { x => JSObject(x) }
def arr: Parser[JSValue] = (skipWs(accept('[')) ~>
repsep(value, skipWs(accept(',')))
<~ skipWs(accept(']'))) map { x => JSArray(x) }
def member: Parser[(String, JSValue)] =
stringLiteral ~ (skipWs(accept(':')) ~> value)
value
}
You run the parser as follows:
val myReader = CharReader("""{"libname": "parsequery"}""".toArray)
jsonParser(myReader) match {
case Success(res, rest) => println(res)
case Failure(err, _) => println(err)
}
You can find more examples in the test files here
A snapshot is available on Sonatype. To use it with SBT, add the following lines to your build:
libraryDependencies += "com.github.manojo" %% "parsequery" % "0.1-SNAPSHOT"
resolvers += Resolver.sonatypeRepo("snapshots")
Check for yourself, by running the following two benchmarks, sources here and here:
> bench:testOnly parsequery.BooleansBenchmark
> bench:testOnly parsequery.JSONBenchmark
On these benchmarks we are currently about 2-3x faster than FastParse.
The main goal of parsequery
is to systematically eliminate all intermediate
data structures that are creating when running a traditional parser combinator
program. Typically, parser combinators interleave static composition of
parsers with the dynamic act of parsing itself, at runtime. The key insight
is that we can fully decouple the static parts from the dynamic ones. This is
done by leveraging a technique known as partial
evaluation. To know more
about the ideas check out the these blog posts on optimising list
pipelines and parser
combinators. A
more detailed blog post catered to the current implementation is to appear soon.
You can find a talk about this here and the accompanying slides here.
This implementation draws inspiration from previous implementations in the LMS framework, found [here](https://github.com/ma nojo/functadelic/tree/master/src/main/scala/stagedparsec) and a macro implementation by the man called Eric Béguet, found here.