Microformats 2 parser for Haskell! #IndieWeb
- parses items, rels, rel-urls
- resolves relative URLs (with support for the <base> tag), including inside of html for e-* properties
- parses the value-class-pattern, including date and time normalization
- handles malformed HTML (the actual HTML parser is tagstream-conduit)
- can also convert to JF2
- high performance
- extensively tested
Also check out http-link-header because you often need to read links from the Link header!
Look at the API docs on Hackage for more info. Here's a quick overview:
{-# LANGUAGE OverloadedStrings #-}
import Data.Microformats2.Parser
import Data.Default
import Network.URI
parseMf2 def $ documentRoot $ parseLBS "<body><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"
parseMf2 (def { baseUri = parseURI "https://where.i.got/that/page/from/" }) $ documentRoot $ parseLBS "<body><base href=\"base/\"><link rel=micropub href='micropub'><p class=h-entry><h1 class=p-name>Yay!</h1></p></body>"
The def is the default configuration.
The configuration includes:
- htmlMode, an HTML parsing mode (Unsafe | Escape | Sanitize)
- baseUri, the Maybe URI that represents the address you retrieved the HTML from, used for resolving relative addresses; you should set it
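To illustrate why baseUri matters, here is a minimal sketch of relative URL resolution using only Network.URI. It is not the parser's own code; resolveAgainst is a hypothetical helper, and the URLs are the ones from the overview above.

```haskell
import Network.URI (URI, parseURI, parseURIReference, relativeTo)

-- Hypothetical helper: resolve a possibly relative reference against an
-- absolute base URI. Returns Nothing when the reference does not parse.
resolveAgainst :: URI -> String -> Maybe URI
resolveAgainst base ref = (`relativeTo` base) <$> parseURIReference ref

main :: IO ()
main = do
  let resolved = do
        page    <- parseURI "https://where.i.got/that/page/from/" -- the baseUri setting
        docBase <- resolveAgainst page "base/"                    -- <base href="base/">
        resolveAgainst docBase "micropub"                         -- <link href='micropub'>
  putStrLn (maybe "could not resolve" show resolved)
```

Under RFC 3986 resolution rules the result should be https://where.i.got/that/page/from/base/micropub. Without a baseUri there is nothing absolute to resolve "base/" against, so relative addresses cannot be made absolute.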
parseMf2 will return an Aeson Value structured like canonical microformats2 JSON.
lens-aeson is a good way to navigate it.
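As a rough sketch of what that navigation can look like, the example below runs lens-aeson traversals over a hand-written Value shaped like canonical microformats2 JSON. The sample value stands in for a real parseMf2 result, and entryNames and firstType are hypothetical helper names, not part of this library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Lens ((^..), (^?))
import Data.Aeson (Value (Null), decode)
import Data.Aeson.Lens (key, nth, values, _String)
import Data.Maybe (fromMaybe)
import Data.Text (Text)

-- Hand-written stand-in for a parseMf2 result (assumed shape:
-- canonical microformats2 JSON with items, rels and rel-urls).
sample :: Value
sample = fromMaybe Null $ decode
  "{\"items\":[{\"type\":[\"h-entry\"],\"properties\":{\"name\":[\"Yay!\"]}}],\"rels\":{},\"rel-urls\":{}}"

-- Collect every p-name value of every top-level item.
entryNames :: Value -> [Text]
entryNames v =
  v ^.. key "items" . values . key "properties" . key "name" . values . _String

-- Look up the first type of the first item, if there is one.
firstType :: Value -> Maybe Text
firstType v = v ^? key "items" . nth 0 . key "type" . nth 0 . _String

main :: IO ()
main = do
  print (entryNames sample)
  print (firstType sample)
```

The traversals return an empty list or Nothing when a key is missing, which suits microformats data where most properties are optional.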
Use stack to build.
Use ghci to run tests quickly with :test (see the .ghci file).
$ stack build
$ stack test
$ stack ghci
This is free and unencumbered software released into the public domain.
For more information, please refer to the UNLICENSE file or unlicense.org.