A small HTML parser written in C. The parser repairs broken HTML, such as missing end tags. It performs no dynamic (heap) allocation and does not copy any text. Instead of building a DOM tree, it reports the parsing result through a callback function.
This HTML parser can also be used to parse XML files with namespaces, such as RSS feeds.
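
To illustrate the callback-driven, zero-copy design described above, here is a minimal sketch. It is not this parser's actual API: the names `ev_kind`, `ev_cb`, `scan`, `counts` and `count_cb` are hypothetical, and the sketch does not repair missing end tags or handle attributes and namespaces. It only shows how events can be reported as pointer and length pairs into the original input, without heap allocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; the real parser's names may differ. */
enum ev_kind { EV_OPEN, EV_CLOSE, EV_TEXT };

/* Hypothetical callback type: `s` points into the input buffer and is
 * NOT NUL-terminated; `len` gives the number of valid bytes. */
typedef void (*ev_cb)(enum ev_kind kind, const char *s, size_t len, void *user);

/* Scan a NUL-terminated document and report tag names and text runs.
 * Nothing is copied or allocated; attributes are skipped. */
static void scan(const char *html, ev_cb cb, void *user)
{
    const char *p = html;

    assert(html != NULL && cb != NULL);
    while (*p) {
        if (*p == '<') {
            const char *name = p + 1;
            const char *end;
            enum ev_kind kind = EV_OPEN;

            if (*name == '/') {          /* end tag such as </p> */
                kind = EV_CLOSE;
                name++;
            }
            end = name;
            while (*end && *end != '>' && *end != ' ' && *end != '/')
                end++;
            cb(kind, name, (size_t)(end - name), user);
            while (*end && *end != '>')  /* skip attributes, if any */
                end++;
            p = *end ? end + 1 : end;
        } else {
            const char *end = p;

            while (*end && *end != '<')
                end++;
            cb(EV_TEXT, p, (size_t)(end - p), user);
            p = end;
        }
    }
}

/* Example consumer: counts events instead of building a tree. */
struct counts { int open, close, text; };

static void count_cb(enum ev_kind kind, const char *s, size_t len, void *user)
{
    struct counts *c = user;

    (void)s;
    (void)len;
    if (kind == EV_OPEN)
        c->open++;
    else if (kind == EV_CLOSE)
        c->close++;
    else
        c->text++;
}
```

A caller would pass its own callback and state, for example `scan(doc, count_cb, &c)` with a zero-initialized `struct counts c`. Because every event refers to the input buffer directly, that buffer has to stay valid for as long as the callback uses the pointers.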
# TODO
Unescape HTML escape sequences in text and attribute values
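
One possible shape for this, sketched under assumptions: the function name `unescape` and the small entity table below are hypothetical and not part of the parser. Since a decoded string is never longer than its escaped form, the caller can supply a fixed buffer of `len + 1` bytes, which keeps the no-heap property. Numeric references such as `&#39;` are not handled in this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Decode a few named entities from src[0..len) into dst.
 * dst must have room for at least len + 1 bytes.
 * Returns the decoded length (excluding the terminating NUL). */
static size_t unescape(const char *src, size_t len, char *dst)
{
    static const struct { const char *name; char ch; } ents[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
        { "&quot;", '"' }, { "&apos;", '\'' },
    };
    const size_t n = sizeof ents / sizeof ents[0];
    size_t i = 0, o = 0;

    while (i < len) {
        size_t k;

        for (k = 0; k < n; k++) {
            size_t el = strlen(ents[k].name);

            if (len - i >= el && memcmp(src + i, ents[k].name, el) == 0) {
                dst[o++] = ents[k].ch;   /* emit the decoded character */
                i += el;
                break;
            }
        }
        if (k == n)                      /* no entity matched: copy as-is */
            dst[o++] = src[i++];
    }
    dst[o] = '\0';
    return o;
}
```

Unknown sequences are passed through unchanged, so malformed input is preserved rather than rejected.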