BracketPipe
BracketPipe is a .NET library for building parsing and processing pipelines for web languages like HTML, CSS, JavaScript, SVG, and MathML. The parser is built on the official W3C specification. It differs from other libraries such as AngleSharp (on which it is based) and HTML Agility Pack in that it does not build an in-memory representation of the DOM. Instead, it focuses on providing a convenient streaming interface for fast processing of HTML documents. This makes the library ideal for:
- minifying HTML
- sanitizing HTML to prevent XSS attacks
- converting HTML to text
- crawling hyperlinks from HTML documents
- cleaning up MS Word HTML
- ... and any other task that only requires a single traversal of the HTML document
It can also be viewed as a modern update to earlier projects such as SgmlReader and the Majestic-12 HTML Parser.
Usage
Use a pipeline to parse, sanitize, and minify HTML:
var html = @"";
using (var reader = new HtmlReader(html))
{
    var result = (string)reader.Sanitize().Minify().ToHtml();
    Assert.AreEqual(@"", result);
}
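To illustrate why a chain of stages like this can run in a single pass without holding the whole document in memory, here is a minimal, self-contained sketch built on plain C# iterators. It is not BracketPipe's implementation: the `StreamingSketch` class and the `Tokenize`, `DropEmpty`, `Lower`, and `Render` names are hypothetical stand-ins, and the "tokens" are simple words rather than HTML nodes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; none of these names are part of BracketPipe.
public static class StreamingSketch
{
    // Source stage: lazily yields one token at a time instead of building a tree.
    public static IEnumerable<string> Tokenize(string text)
    {
        foreach (var word in text.Split(' '))
            yield return word;
    }

    // Filter stage: drops empty tokens (a stand-in for a minify-style step).
    public static IEnumerable<string> DropEmpty(this IEnumerable<string> tokens)
    {
        foreach (var token in tokens)
            if (token.Length > 0)
                yield return token;
    }

    // Transform stage: rewrites each token (a stand-in for a sanitize-style step).
    public static IEnumerable<string> Lower(this IEnumerable<string> tokens)
    {
        foreach (var token in tokens)
            yield return token.ToLowerInvariant();
    }

    // Terminal stage: enumerating here pulls each token through every stage once.
    public static string Render(this IEnumerable<string> tokens)
    {
        return string.Join(" ", tokens);
    }

    public static void Main()
    {
        var result = Tokenize("Hello   World").DropEmpty().Lower().Render();
        Console.WriteLine(result); // prints "hello world"
    }
}
```

Because each stage uses `yield return`, nothing is processed until `Render` enumerates the sequence, and each token flows through all stages before the next one is read. That deferred, one-token-at-a-time behavior is the general idea behind a streaming pipeline.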