Open source project: microcosm-cc/bluemonday
Repository: https://github.com/microcosm-cc/bluemonday
Language: Go 99.7%

bluemonday

bluemonday is an HTML sanitizer implemented in Go. It is fast and highly configurable.

bluemonday takes untrusted user generated content as an input, and will return HTML that has been sanitised against an allowlist of approved HTML elements and attributes so that you can safely include the content in your web page.

If you accept user generated content, and your server uses Go, you need bluemonday.

The default policy for user generated content (bluemonday.UGCPolicy()) turns this:

Hello <STYLE>.XSS{background-image:url("javascript:alert('XSS')");}</STYLE><A CLASS=XSS></A>World

Into a harmless:

Hello World

And it turns this:

<a href="javascript:alert('XSS1')" onmouseover="alert('XSS2')">XSS<a>

Into this:

XSS

Whilst still allowing this:

<a href="http://www.google.com/">
<img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

To pass through mostly unaltered (it gained a rel="nofollow" which is a good thing for user generated content):

<a href="http://www.google.com/" rel="nofollow">
<img src="https://ssl.gstatic.com/accounts/ui/logo_2x.png"/>
</a>

It protects sites from XSS attacks. There are many vectors for an XSS attack and the best way to mitigate the risk is to sanitize user input against a known safe list of HTML elements and attributes.

You should always run bluemonday after any other processing. If you use blackfriday or Pandoc then bluemonday should be run after these steps. This ensures that no insecure HTML is introduced later in your process.

bluemonday is heavily inspired by both the OWASP Java HTML Sanitizer and the HTML Purifier.

Technical Summary

Allowlist based: you either build a policy describing the HTML elements and attributes to permit, or use one of the shipped policies.

The policy containing the allowlist is applied using a fast non-validating, forward only, token-based parser implemented in the Go net/html library by the core Go team.

We expect to be supplied with well-formatted HTML (closing elements for every applicable open element, nested correctly) and so we do not focus on repairing badly nested or incomplete HTML. We focus on simply ensuring that whatever elements do exist are described in the policy allowlist and that attributes and links are safe for use on your web page. GIGO does apply: if you feed it bad HTML, bluemonday is not tasked with figuring out how to make it good again.

Supported Go Versions

bluemonday is tested on all versions since Go 1.2 including tip.

We do not support Go 1.0 because of a dependency that requires a later Go release.

We support Go 1.1 but Travis no longer tests against it.

Is it production ready?

Yes.

We are using bluemonday in production having migrated from the widely used and heavily field tested OWASP Java HTML Sanitizer.

We are passing our extensive test suite (including AntiSamy tests as well as tests for any issues raised). Check for any unresolved issues to see whether anything may be a blocker for you.

We invite pull requests and issues to help us ensure we are offering comprehensive protection against various attacks via user generated content.

Usage

Install the package (import path github.com/microcosm-cc/bluemonday), then call it:

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	// Do this once for each unique policy, and use the policy for the life of the program
	// Policy creation/editing is not safe to use in multiple goroutines
	p := bluemonday.UGCPolicy()

	// The policy can then be used to sanitize lots of input and it is safe to use the policy in multiple goroutines
	html := p.Sanitize(
		`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`,
	)

	// Output:
	// <a href="http://www.google.com" rel="nofollow">Google</a>
	fmt.Println(html)
}

We offer three ways to call Sanitize:

p.Sanitize(string) string
p.SanitizeBytes([]byte) []byte
p.SanitizeReader(io.Reader) *bytes.Buffer

If you are obsessed about performance, p.SanitizeReader(r).Bytes() returns a []byte without any unnecessary conversion of the inputs or outputs.

You can build your own policies:

package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func main() {
	p := bluemonday.NewPolicy()

	// Require URLs to be parseable by net/url.Parse and either:
	//   mailto: http:// or https://
	p.AllowStandardURLs()

	// We only allow <p> and <a href="">
	p.AllowAttrs("href").OnElements("a")
	p.AllowElements("p")

	html := p.Sanitize(
		`<a onblur="alert(secret)" href="http://www.google.com">Google</a>`,
	)

	// Output (AllowStandardURLs also requires rel="nofollow" on links):
	// <a href="http://www.google.com" rel="nofollow">Google</a>
	fmt.Println(html)
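}

We ship two default policies: bluemonday.StrictPolicy(), which strips all HTML elements and their attributes, and bluemonday.UGCPolicy(), which allows a broad selection of HTML elements and attributes that are safe for user generated content.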

Policy Building

The essence of building a policy is to determine which HTML elements and attributes are considered safe for your scenario. OWASP provide an XSS prevention cheat sheet to help explain the risks.

Basically, you should be able to describe what HTML is fine for your scenario. If you do not have confidence that you can describe your policy please consider using one of the shipped policies such as bluemonday.UGCPolicy().

To create a new policy:

p := bluemonday.NewPolicy()

To add elements to a policy either add just the elements:

p.AllowElements("b", "strong")

Or using a regex:

Note: if an element is added by name as shown above, any matching regex will be ignored.

It is also recommended to ensure multiple patterns don't overlap, as order of execution is not guaranteed and can result in some rules being missed.

p.AllowElementsMatching(regexp.MustCompile(`^my-element-`))

Or add elements by virtue of adding an attribute:

// Not the recommended pattern, see the recommendation on using .Matching() below
p.AllowAttrs("nowrap").OnElements("td", "th")

Again, this also supports a regex pattern match alternative:

p.AllowAttrs("nowrap").OnElementsMatching(regexp.MustCompile(`^my-element-`))

Attributes can either be added to all elements:

p.AllowAttrs("dir").Matching(regexp.MustCompile("(?i)rtl|ltr")).Globally()

Or attributes can be added to specific elements:

// Not the recommended pattern, see the recommendation on using .Matching() below
p.AllowAttrs("value").OnElements("li")

It is always recommended that an attribute be made to match a pattern. XSS in HTML attributes is very easy otherwise:

// \p{L} matches unicode letters, \p{N} matches unicode numbers
p.AllowAttrs("title").Matching(regexp.MustCompile(`[\p{L}\p{N}\s\-_',:\[\]!\./\\\(\)&]*`)).Globally()

You can stop at any time and call .Sanitize():

// string htmlIn passed in from a HTTP POST
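htmlOut := p.Sanitize(htmlIn)

And you can take any existing policy and extend it:

p := bluemonday.UGCPolicy()
p.AllowElements("fieldset", "select", "option")

Inline CSS

Although it's possible to handle inline CSS using AllowAttrs with a Matching rule, you can instead allow the style attribute on the elements you choose and use AllowStyles to control which inline styles get through.

It is strongly recommended that you use Matching (with a suitable regular expression), MatchingEnum, or MatchingHandler to ensure each style matches your needs.

Similar to attributes, you can allow specific CSS properties to be set inline:

p.AllowAttrs("style").OnElements("span", "p")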
// Allow the 'color' property with valid RGB(A) hex values only (on any element allowed a 'style' attribute)
p.AllowStyles("color").Matching(regexp.MustCompile("(?i)^#([0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$")).Globally()

Additionally, you can allow a CSS property to be set only to an allowed value:

p.AllowAttrs("style").OnElements("span", "p")
// Allow the 'text-decoration' property to be set to 'underline', 'line-through' or 'none'
// on 'span' elements only
p.AllowStyles("text-decoration").MatchingEnum("underline", "line-through", "none").OnElements("span")

Or you can specify elements based on a regex pattern match:

p.AllowAttrs("style").OnElementsMatching(regexp.MustCompile(`^my-element-`))
// Allow the 'text-decoration' property to be set to 'underline', 'line-through' or 'none'
// on elements whose names match the pattern only
p.AllowStyles("text-decoration").MatchingEnum("underline", "line-through", "none").OnElementsMatching(regexp.MustCompile(`^my-element-`))

If you need more specific checking, you can create a handler that takes in a string and returns a bool to validate the values for a given property. The string parameter has been converted to lowercase and unicode code points have been converted.

myHandler := func(value string) bool {
	// Validate your input here
	return true
}
p.AllowAttrs("style").OnElements("span", "p")
// Allow the 'color' property with values validated by the handler (on any element allowed a 'style' attribute)
p.AllowStyles("color").MatchingHandler(myHandler).Globally()

Links

Links are difficult beasts to sanitise safely and also one of the biggest attack vectors for malicious content.

It is possible to do this:

p.AllowAttrs("href").Matching(regexp.MustCompile(`(?i)mailto|https?`)).OnElements("a")

But that will not protect you, as the regular expression is insufficient in this case to prevent a malformed value doing something unexpected.

We provide some additional global options for safely working with links.
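
To make the weakness of the pattern above concrete, here is a minimal sketch using only the Go standard library. It shows that the unanchored expression accepts a javascript: URL that merely contains the text "https", while parsing the value and comparing its scheme with an allowlist rejects it. schemeAllowed is a hypothetical helper written for this illustration; it is not bluemonday's implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// naivePattern is the unanchored expression from the text above.
var naivePattern = regexp.MustCompile(`(?i)mailto|https?`)

// schemeAllowed is a hypothetical helper: it parses the value with
// net/url and reports whether its scheme is in the given allowlist.
func schemeAllowed(raw string, schemes ...string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return false // unparseable values are rejected outright
	}
	for _, s := range schemes {
		if strings.EqualFold(u.Scheme, s) {
			return true
		}
	}
	return false
}

func main() {
	evil := "javascript:alert('https')"

	// The regex only looks for a substring, so this value slips through.
	fmt.Println("regex accepts:", naivePattern.MatchString(evil))

	// Checking the parsed scheme does not have that problem.
	fmt.Println("scheme check accepts:", schemeAllowed(evil, "mailto", "http", "https"))
	fmt.Println("scheme check accepts https URL:", schemeAllowed("https://example.com/", "mailto", "http", "https"))
}
```

This is the same idea the link options in this section are built around: parse first, then decide by scheme, rather than pattern-matching the raw attribute value.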

To require that URLs are parseable by net/url.Parse:

p.RequireParseableURLs(true)

If you have enabled parseable URLs then the following option will allow relative URLs:

p.AllowRelativeURLs(true)

If you have enabled parseable URLs then you can allow the schemes (commonly called protocol when thinking of http and https) that are permitted:

p.AllowURLSchemes("mailto", "http", "https")

Regardless of whether you have enabled parseable URLs, you can force all URLs to have a rel="nofollow" attribute. This will be added if it does not exist, but only when the href is valid:

// This applies to "a" "area" "link" elements that have a "href" attribute
p.RequireNoFollowOnLinks(true)

Similarly, you can force all URLs to have "noreferrer" in their rel attribute:

// This applies to "a" "area" "link" elements that have a "href" attribute
p.RequireNoReferrerOnLinks(true)

We provide a convenience method that applies all of the above, but you will still need to allow the linkable elements for the URL rules to be applied to:

p.AllowStandardURLs()
p.AllowAttrs("cite").OnElements("blockquote", "q")
p.AllowAttrs("href").OnElements("a", "area")
p.AllowAttrs("src").OnElements("img")

An additional complexity regarding links is the data URI as defined in RFC2397. The data URI allows for images to be served inline using this format:

<img src="data:image/webp;base64,UklGRh4AAABXRUJQVlA4TBEAAAAvAAAAAAfQ//73v/+BiOh/AAA=">

We have provided a helper to verify the mimetype followed by base64 content of data URI links:

p.AllowDataURIImages()

That helper will enable GIF, JPEG, PNG and WEBP images.

It should be noted that there is a potential security risk with the use of data URI links. You should only enable data URI links if you already trust the content.

We also have some features to help deal with user generated content:

p.AddTargetBlankToFullyQualifiedLinks(true)

This will ensure that anchor links that are fully qualified get target="_blank" added to them. Additionally any link that has target="_blank" after the policy has been applied will have noopener added to its rel attribute.

Policy Building Helpers

We also bundle some helpers to simplify policy building:

// Permits the "dir", "id", "lang", "title" attributes globally
p.AllowStandardAttributes()
// Permits the "img" element and its standard attributes
p.AllowImages()
// Permits ordered and unordered lists, and also definition lists
p.AllowLists()
// Permits HTML tables and all applicable elements and non-styling attributes
p.AllowTables()

Invalid Instructions

The following are invalid:

// This does not say where the attributes are allowed, you need to add
// .Globally() or .OnElements(...)
// This will be ignored without error.
p.AllowAttrs("value")
// This does not say where the attributes are allowed, you need to add
// .Globally() or .OnElements(...)
// This will be ignored without error.
p.AllowAttrs(
	"type",
).Matching(
	regexp.MustCompile("(?i)^(circle|disc|square|a|A|i|I|1)$"),
)

Both examples exhibit the same issue: they declare attributes but do not then specify whether they are allowed globally or only on specific elements (and which elements). Attributes belong to one or more elements, and the policy needs to declare this.

Limitations

We are not yet including any tools to help allow and sanitize CSS. This means that unless you wish to do the heavy lifting in a single regular expression (inadvisable), you should not allow the "style" attribute anywhere.

In the same theme, both <script> and <style> are considered harmful.

It is not the job of bluemonday to fix your bad HTML; it is merely the job of bluemonday to prevent malicious HTML getting through. If you have mismatched HTML elements, or non-conforming nesting of elements, those will remain. But if you have well-structured HTML, bluemonday will not break it.

TODO

Development

If you have cloned this repo you will probably need the dependency:
Gophers can use their familiar tools:
I personally use a Makefile as it spares typing the same args over and over whilst providing consistency for those of us who jump from language to language and enjoy just typing make.
Long term goals