Html Agility Pack (HAP) is an HTML parser that builds a read/write DOM and supports plain XPath or XSLT. In other words, it is a .NET code library that lets developers parse HTML files outside of a web browser.
According to the developer, the tool can work with all sorts of HTML files, including malformed ones. The resulting object model is very similar to the one in System.Xml, the main difference being that it addresses HTML documents or streams.
Generally speaking, a parser of this kind is a typical building block of web scraping tools, where its role is to automate data processing. In this case, the embedded parser lets developers parse the HTML and get it back as an HtmlDocument. The HTML can be loaded from a file, a string, an Internet source or directly from the web browser.
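The loading options can be sketched as follows. This is a minimal illustration rather than official sample code: it assumes the HtmlAgilityPack NuGet package is referenced and a compiler that accepts top-level statements (C# 9 or later). The markup, the temporary file and the example.com address are placeholders made up for the sketch.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

// Sample markup (made up for this sketch); the closing </body> and </html>
// tags are left out on purpose, since the parser tolerates imperfect input.
const string html = "<html><body><h1>Title</h1><p>First</p><p>Second</p>";

// 1. Load from a string held in memory.
var fromString = new HtmlDocument();
fromString.LoadHtml(html);

// 2. Load from a file on disk. A temporary file keeps the sketch self-contained.
string path = Path.GetTempFileName();
File.WriteAllText(path, html);
var fromFile = new HtmlDocument();
fromFile.Load(path);
File.Delete(path);

// 3. Load from an Internet source. Left commented out because it needs
//    network access; the URL is a placeholder.
// HtmlDocument fromWeb = new HtmlWeb().Load("https://example.com/");

// Both documents expose the same node tree through DocumentNode.
Console.WriteLine(fromString.DocumentNode.SelectSingleNode("//h1").InnerText);
Console.WriteLine(fromFile.DocumentNode.SelectNodes("//p").Count);
```

Whatever the source, the result is the same HtmlDocument type, so the code that processes the nodes does not need to know where the markup came from.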
The utility comes with selectors that pick the HTML nodes to be processed. There are two selection methods: one returns the first node that matches an XPath expression, the other returns the list of all matching nodes. After selection, the HTML writer lets users write out a node or save the whole HtmlDocument to various destinations, such as a StreamWriter, a Stream, a TextWriter or a file.
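A short sketch of the select-then-save flow, under the same assumptions as before (the HtmlAgilityPack NuGet package and top-level statements). The markup and the attribute values are invented for illustration.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

var doc = new HtmlDocument();
doc.LoadHtml(
    "<html><body><ul>" +
    "<li><a href='/a'>A</a></li>" +
    "<li><a href='/b'>B</a></li>" +
    "</ul></body></html>");

// First node that matches the XPath expression (null when nothing matches).
var first = doc.DocumentNode.SelectSingleNode("//a[@href]");

// Every node that matches. By default this is null, not an empty
// collection, when nothing matches, so check before iterating.
var links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (var link in links)
    {
        Console.WriteLine(link.GetAttributeValue("href", ""));
    }
}

// The DOM is read/write: change an attribute on the first match.
first.SetAttributeValue("href", "/changed");

// Save the whole document to a TextWriter. Save also has overloads
// that take a file name, a Stream or a StreamWriter.
using var writer = new StringWriter();
doc.Save(writer);
string output = writer.ToString();
Console.WriteLine(output);
```

Using a StringWriter here keeps the example free of file system side effects; passing a path to Save instead would write the modified markup to disk.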