Project: fb55/htmlparser2
Repository: https://github.com/fb55/htmlparser2
Language: TypeScript 98.8%

htmlparser2

The fast & forgiving HTML/XML parser.

htmlparser2 is the fastest HTML parser, and takes some shortcuts to get there. If you need strict HTML spec compliance, have a look at parse5.

Installation

npm install htmlparser2

A live demo of htmlparser2 is available.

Usage
import * as htmlparser2 from "htmlparser2";

const parser = new htmlparser2.Parser({
    onopentag(name, attributes) {
        /*
         * This fires when a new tag is opened.
         *
         * If you don't need an aggregated `attributes` object,
         * have a look at the `onopentagname` and `onattribute` events.
         */
        if (name === "script" && attributes.type === "text/javascript") {
            console.log("JS! Hooray!");
        }
    },
    ontext(text) {
        /*
         * Fires whenever a section of text was processed.
         *
         * Note that this can fire at any point within text and you might
         * have to stitch together multiple pieces.
         */
        console.log("-->", text);
    },
    onclosetag(tagname) {
        /*
         * Fires when a tag is closed.
         *
         * You can rely on this event only firing when you have received an
         * equivalent opening tag before. Closing tags without corresponding
         * opening tags will be ignored.
         */
        if (tagname === "script") {
            console.log("That's it?!");
        }
    },
});
parser.write(
    "Xyz <script type='text/javascript'>const foo = '<<bar>>';</script>"
);
parser.end();

Output (with multiple text events combined):

--> Xyz
JS! Hooray!
--> const foo = '<<bar>>';
That's it?!
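
The ontext comment in the example notes that text can arrive in several pieces. Below is a minimal sketch of one way to stitch those pieces together, buffering the text of each script element until its closing tag arrives. The names createScriptCollector and ScriptCollector are hypothetical helpers invented for this illustration and are not part of htmlparser2. The handler object only mirrors the three callbacks shown above, so the block itself does not import the library.

```typescript
// Hypothetical helper (not part of htmlparser2): collects the full text of
// each <script> element by buffering the pieces delivered to `ontext`.
interface ScriptCollector {
    scripts: string[];
    handler: {
        onopentag(name: string, attributes: Record<string, string>): void;
        ontext(text: string): void;
        onclosetag(name: string): void;
    };
}

function createScriptCollector(): ScriptCollector {
    const scripts: string[] = [];
    // Non-null only while the parser is inside a <script> element.
    let pieces: string[] | null = null;

    return {
        scripts,
        handler: {
            onopentag(name) {
                // Start a fresh buffer for every opened <script> tag.
                if (name === "script") {
                    pieces = [];
                }
            },
            ontext(text) {
                // May be called several times for one run of text;
                // text outside <script> is ignored.
                if (pieces !== null) {
                    pieces.push(text);
                }
            },
            onclosetag(name) {
                // Join the buffered pieces once the element is complete.
                if (name === "script" && pieces !== null) {
                    scripts.push(pieces.join(""));
                    pieces = null;
                }
            },
        },
    };
}
```

Assuming htmlparser2 is installed, the handler could be passed to the parser as new htmlparser2.Parser(collector.handler). After parser.end(), collector.scripts would hold one string per script element, regardless of how many text events the parser emitted.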
The usage example above only shows three of the possible events. Read more about the parser, its events and options in the wiki.

Usage with streams

To process streaming input, use the WritableStream interface:

import * as fs from "fs";
import { WritableStream } from "htmlparser2/lib/WritableStream";

const parserStream = new WritableStream({
    ontext(text) {
        console.log("Streaming:", text);
    },
});

const htmlStream = fs.createReadStream("./my-file.html");
htmlStream.pipe(parserStream).on("finish", () => console.log("done"));

Getting a DOM

parseDocument parses an HTML string into a DOM:

import * as htmlparser2 from "htmlparser2";

const dom = htmlparser2.parseDocument(htmlString);

Parsing RSS/RDF/Atom Feeds

const feed = htmlparser2.parseFeed(content, options);

Note: While the provided feed handler works for most feeds, you might want to use danmactough/node-feedparser, which is much better tested and actively maintained.

Performance

After having some artificial benchmarks for some time, @AndreasMadsen published his own benchmark. At the time of writing, performance figures for the latest versions of all supported parsers were measured on GitHub Actions.

How does this module differ from node-htmlparser?

In 2011, this module started as a fork of node-htmlparser. The parser now provides a callback interface inspired by sax.js (originally targeted at readabilitySAX). As a result, old handlers won't work anymore.

Security contact information

To report a security vulnerability, please use the Tidelift security contact. Tidelift will coordinate the fix and disclosure.