Data.HTML · SnowWalkerJ/UseLess Wiki · GitHub

Data.HTML

Chen Bingxuan edited this page Aug 19, 2016 · 1 revision

This module provides a structure that contains the DOM tree of HTML documents.


Data.HTML data

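data Tag
    -- An HTML element: name, attributes and child nodes
    = Tag {
        tagname :: String,
        attrs :: Attrs,
        children :: [Tag]
    }
    -- A plain text node
    | Text {
        text :: String
    }
    -- An HTML comment
    | Comment {
        text :: String
    }
    deriving Eq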

A Tag value is one of three things: an HTML element, a plain text node, or a comment. An HTML element carries its tag name, its attributes and its child nodes.

type Attr = (String, String)
type Attrs = [Attr]

Attrs is a list of key-value pairs that stores the attributes of a tag.
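
As an illustration of how these types fit together, here is a minimal sketch that builds a small tree by hand. The type declarations are copied from this page so the snippet loads on its own; the value homeLink and its contents are hypothetical and not part of the module.

```haskell
-- Types copied from this page so the sketch compiles on its own.
type Attr  = (String, String)
type Attrs = [Attr]

data Tag
    = Tag     { tagname :: String, attrs :: Attrs, children :: [Tag] }
    | Text    { text :: String }
    | Comment { text :: String }
    deriving Eq

-- Hypothetical example value (not part of Data.HTML), modelling
-- <a href="/home" class="nav">Home<!-- main link --></a>
homeLink :: Tag
homeLink = Tag
    { tagname  = "a"
    , attrs    = [("href", "/home"), ("class", "nav")]
    , children = [Text "Home", Comment " main link "]
    }
```

Because Tag only derives Eq, values can be compared with == but not printed with show unless a Show instance is added.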


Data.HTML extractors

An extractor is a function of the form

Tag -> Maybe String

getTagName :: Tag -> Maybe String

Returns the name of the tag. If the tag is a Text or Comment, returns Nothing.

getAttr :: String -> Tag -> Maybe String

Returns the value of the named attribute of a tag. If the tag is a Text or Comment, or the tag doesn't have that attribute, returns Nothing. Partially applied to an attribute name, getAttr is itself an extractor.
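
The behaviour described for getTagName and getAttr can be sketched as below. This is one possible implementation written from the descriptions on this page, not the module's actual source; the example value img is hypothetical.

```haskell
-- Types copied from this page so the sketch compiles on its own.
type Attr  = (String, String)
type Attrs = [Attr]

data Tag
    = Tag     { tagname :: String, attrs :: Attrs, children :: [Tag] }
    | Text    { text :: String }
    | Comment { text :: String }
    deriving Eq

-- Assumed implementation: only element nodes have a name.
getTagName :: Tag -> Maybe String
getTagName (Tag name _ _) = Just name
getTagName _              = Nothing

-- Assumed implementation: look the key up in the attribute list;
-- Text and Comment nodes carry no attributes.
getAttr :: String -> Tag -> Maybe String
getAttr key (Tag _ as _) = lookup key as
getAttr _   _            = Nothing

-- Hypothetical example value: <img src="logo.png">
img :: Tag
img = Tag "img" [("src", "logo.png")] []
```

With these definitions, getAttr "src" img evaluates to Just "logo.png", while getAttr "alt" img and getTagName (Text "hi") both evaluate to Nothing.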

getContent :: Tag -> Maybe String

Returns the text content of a tag.
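
This page does not say what getContent returns for each constructor, so the following is only a sketch under stated assumptions: a Text node yields its string, a Comment yields Nothing, and an element yields the concatenated text of its descendants. The module's real behaviour may differ.

```haskell
import Data.Maybe (mapMaybe)

-- Types copied from this page so the sketch compiles on its own.
type Attr  = (String, String)
type Attrs = [Attr]

data Tag
    = Tag     { tagname :: String, attrs :: Attrs, children :: [Tag] }
    | Text    { text :: String }
    | Comment { text :: String }
    deriving Eq

-- ASSUMPTION: the semantics below are a guess, not the documented behaviour.
getContent :: Tag -> Maybe String
getContent (Text s)       = Just s       -- a text node is its own content
getContent (Comment _)    = Nothing      -- comments contribute no content
getContent (Tag _ _ kids) = Just (concat (mapMaybe getContent kids))

-- Hypothetical example value: <p>Hello <b>world</b><!-- x --></p>
para :: Tag
para = Tag "p" [] [Text "Hello ", Tag "b" [] [Text "world"], Comment " x "]
```

Under these assumptions, getContent para evaluates to Just "Hello world".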


Data.HTML operators

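(?>>) :: [Tag] -> (Tag -> Maybe String) -> [String]

Applies an extractor to each tag in a list and collects the results. Since the result type is [String] rather than [Maybe String], tags for which the extractor returns Nothing are presumably left out.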
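
One way to realise this operator is with mapMaybe from Data.Maybe, which drops every Nothing. The sketch below assumes that dropping is the intended behaviour; the definition of getTagName and the list nodes are illustrative only.

```haskell
import Data.Maybe (mapMaybe)

-- Types copied from this page so the sketch compiles on its own.
type Attr  = (String, String)
type Attrs = [Attr]

data Tag
    = Tag     { tagname :: String, attrs :: Attrs, children :: [Tag] }
    | Text    { text :: String }
    | Comment { text :: String }
    deriving Eq

-- Assumed implementation of the extractor used in the example.
getTagName :: Tag -> Maybe String
getTagName (Tag name _ _) = Just name
getTagName _              = Nothing

-- ASSUMPTION: results of Nothing are discarded rather than causing an error.
(?>>) :: [Tag] -> (Tag -> Maybe String) -> [String]
tags ?>> extractor = mapMaybe extractor tags

-- Hypothetical example list mixing elements and a text node.
nodes :: [Tag]
nodes = [Tag "p" [] [], Text "hi", Tag "a" [] []]

names :: [String]
names = nodes ?>> getTagName   -- ["p", "a"]
```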

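(!) :: Tag -> String -> Maybe String
(!) = flip getAttr

Infix form of getAttr: returns the value of the named attribute of a tag, or Nothing.

(!!) :: [Tag] -> String -> [String]

Returns the attribute values of a list of tags. Note that this operator has the same name as the Prelude's list-indexing operator (!!).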
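
The two attribute operators can be sketched as follows. The definition of (!) is the one given on this page; the bodies of getAttr and (!!) are assumptions written from the descriptions, and the Prelude's own (!!) is hidden to avoid the name clash. The list links is a hypothetical example.

```haskell
import Prelude hiding ((!!))   -- avoid the clash with list indexing
import Data.Maybe (mapMaybe)

-- Types copied from this page so the sketch compiles on its own.
type Attr  = (String, String)
type Attrs = [Attr]

data Tag
    = Tag     { tagname :: String, attrs :: Attrs, children :: [Tag] }
    | Text    { text :: String }
    | Comment { text :: String }
    deriving Eq

-- Assumed implementation, based on the description of getAttr.
getAttr :: String -> Tag -> Maybe String
getAttr key (Tag _ as _) = lookup key as
getAttr _   _            = Nothing

-- As defined on this page.
(!) :: Tag -> String -> Maybe String
(!) = flip getAttr

-- ASSUMPTION: tags without the attribute are skipped.
(!!) :: [Tag] -> String -> [String]
tags !! key = mapMaybe (! key) tags

-- Hypothetical example list: two links with href, one without, one text node.
links :: [Tag]
links =
    [ Tag "a" [("href", "/a")] []
    , Text "x"
    , Tag "a" [] []
    , Tag "a" [("href", "/b")] []
    ]

hrefs :: [String]
hrefs = links !! "href"   -- ["/a", "/b"]
```

Client code that imports both the Prelude and this module would need a similar hiding clause or a qualified import to use (!!) unambiguously.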
