Python sgml.SGMLParser Class Code Examples

This article collects typical usage examples of the w3af.core.data.parsers.doc.sgml.SGMLParser class in Python. If you are wondering how the SGMLParser class is used in practice, the selected examples below may help.

A total of 20 code examples of the SGMLParser class are shown below, sorted by popularity by default.

Example 1: test_extract_emails_mailto

    def test_extract_emails_mailto(self):
        body = u'<a href="mailto:[email protected]">test</a>'
        resp = build_http_response(self.url, body)
        p = SGMLParser(resp)
        p.parse()

        expected_res = {u'[email protected]'}
        self.assertEqual(p.get_emails(), expected_res)

Developer: andresriancho, project: w3af, lines of code: 8, source file: test_sgml.py

Example 2: test_mailto_ignored_in_links

    def test_mailto_ignored_in_links(self):
        body = u'<a href="mailto:[email protected]">a</a>'
        resp = build_http_response(self.url, body)
        p = SGMLParser(resp)
        p.parse()

        parsed, _ = p.references
        self.assertEqual(parsed, [])

Developer: andresriancho, project: w3af, lines of code: 8, source file: test_sgml.py
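
Examples 1 and 2 together show that a `mailto:` href feeds `get_emails()` but is kept out of `references`. The sketch below reproduces that split with the Python 3 standard library only. It is not w3af's implementation: `LinkAndEmailCollector`, its attributes, and the sample address are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndEmailCollector(HTMLParser):
    """Hypothetical stand-in: mailto: hrefs go to emails, other hrefs to links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.emails = set()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            return
        if href.lower().startswith('mailto:'):
            # Keep only the address, dropping any ?subject=...&body=... part
            address = href[len('mailto:'):].split('?', 1)[0]
            if address:
                self.emails.add(address)
        else:
            # Everything else is treated as a link relative to the base URL
            self.links.append(urljoin(self.base_url, href))


collector = LinkAndEmailCollector('http://w3af.com/')
collector.feed('<a href="mailto:user@example.com?subject=hi">mail</a>'
               '<a href="/abc">link</a>')
collector.close()
print(collector.emails)
print(collector.links)
```

Keeping the two collections separate means a crawler built on top of the parser never tries to request a `mailto:` target as if it were a page.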

Example 3: test_get_emails_filter

    def test_get_emails_filter(self):
        resp = build_http_response(self.url, '')
        p = SGMLParser(resp)
        p._emails = {'[email protected]', '[email protected]'}

        self.assertEqual(p.get_emails(), {'[email protected]', '[email protected]'})

        self.assertEqual(p.get_emails(domain='w3af.com'), ['[email protected]'])
        self.assertEqual(p.get_emails(domain='not.com'), ['[email protected]'])

Developer: andresriancho, project: w3af, lines of code: 9, source file: test_sgml.py
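
The domain filter exercised in Example 3 can be sketched as a small standalone function. This is an assumption about the behaviour, not w3af's code: `filter_emails` is a hypothetical name, the comparison is assumed to be an exact match on the part after `@`, and the addresses are placeholders. The return types (a set without a domain, a list with one) simply mirror the assertions in the test.

```python
def filter_emails(emails, domain=None):
    """Return all emails, or only those whose domain part equals `domain`."""
    if domain is None:
        return set(emails)
    # Exact match on the text after the last '@' (assumed semantics)
    return sorted(e for e in emails if e.rsplit('@', 1)[-1] == domain)


emails = {'a@w3af.com', 'b@not.com'}  # placeholder addresses
print(filter_emails(emails))
print(filter_emails(emails, domain='w3af.com'))
print(filter_emails(emails, domain='not.com'))
```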

Example 4: close

    def close(self):
        """
        Called by the parser when it ends
        """
        SGMLParser.close(self)

        # Don't call clear() here! That would call clear() on SGMLParser and
        # remove all the forms, references, etc.
        self._html_internals_clear()

Developer: 0x554simon, project: w3af, lines of code: 9, source file: html.py

Example 5: _handle_textarea_tag_inside_form

    def _handle_textarea_tag_inside_form(self, tag, tag_name, attrs):
        """
        Handler for textarea tag inside a form
        """
        SGMLParser._handle_textarea_tag_start(self, tag, tag_name, attrs)

        # Set the data and name
        self._text_area_data = tag.text
        self._text_area_tag_name = get_value_by_key(attrs, 'name', 'id')

Developer: 0x554simon, project: w3af, lines of code: 9, source file: html.py

Example 6: test_mailto_subject_body

    def test_mailto_subject_body(self):
        body = u'<a href="mailto:[email protected]?subject=testing out mailto'\
               u'&body=Just testing">test</a>'
        resp = build_http_response(self.url, body)
        p = SGMLParser(resp)
        p.parse()

        expected_res = {u'[email protected]'}
        self.assertEqual(p.get_emails(), expected_res)

Developer: andresriancho, project: w3af, lines of code: 9, source file: test_sgml.py

Example 7: test_get_clear_text_body

    def test_get_clear_text_body(self):
        html = 'header <b>ABC</b>-<b>DEF</b>-<b>XYZ</b> footer'
        clear_text = 'header ABC-DEF-XYZ footer'
        headers = Headers([('Content-Type', 'text/html')])
        r = build_http_response(self.url, html, headers)

        p = SGMLParser(r)
        p.parse()

        self.assertEqual(clear_text, p.get_clear_text_body())

Developer: andresriancho, project: w3af, lines of code: 10, source file: test_sgml.py
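
To make the expected result of Example 7 concrete, here is a minimal sketch of clear-text extraction using the Python 3 standard library. It only illustrates the idea of dropping tags and keeping text; `ClearTextExtractor` and `clear_text` are hypothetical names, and w3af's `get_clear_text_body()` may well work differently.

```python
from html.parser import HTMLParser


class ClearTextExtractor(HTMLParser):
    """Collects text nodes and ignores all tags."""

    def __init__(self):
        super().__init__()
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def get_clear_text(self):
        return ''.join(self._chunks)


def clear_text(html):
    extractor = ClearTextExtractor()
    extractor.feed(html)
    extractor.close()  # flush any text still buffered by the parser
    return extractor.get_clear_text()


print(clear_text('header <b>ABC</b>-<b>DEF</b>-<b>XYZ</b> footer'))
```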

Example 8: test_meta_tags

    def test_meta_tags(self):
        body = HTML_DOC % \
            {'head': META_REFRESH + META_REFRESH_WITH_URL,
             'body': ''}
        resp = build_http_response(self.url, body)
        p = SGMLParser(resp)
        p.parse()
        self.assertEqual(2, len(p.meta_redirs))
        self.assertTrue("2;url=http://crawler.w3af.com/" in p.meta_redirs)
        self.assertTrue("600" in p.meta_redirs)
        self.assertEqual([URL('http://crawler.w3af.com/')], p.references[0])

Developer: everping, project: w3af, lines of code: 11, source file: test_sgml.py

Example 9: test_nested_with_text

    def test_nested_with_text(self):
        body = '<html><a href="/abc">foo<div>bar</div></a></html>'
        url = URL('http://www.w3af.com/')
        headers = Headers()
        headers['content-type'] = 'text/html'
        resp = HTTPResponse(200, body, headers, url, url, charset='utf-8')

        p = SGMLParser(resp)
        tags = p.get_tags_by_filter(('a', 'b'), yield_text=True)
        tags = list(tags)

        self.assertEqual([Tag('a', {'href': '/abc'}, 'foo')], tags)

Developer: andresriancho, project: w3af, lines of code: 12, source file: test_sgml.py
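
The expected tag in Example 9 carries the text 'foo' but not 'bar', which matches the ElementTree convention where an element's `.text` is only the text before its first child. The sketch below shows that convention with the standard library's `xml.etree.ElementTree`. It is an illustration under assumptions: `tags_by_filter` is a hypothetical helper, and the stdlib XML parser only accepts well-formed markup, unlike a real HTML parser.

```python
import xml.etree.ElementTree as ET


def tags_by_filter(markup, tag_names=None):
    """Yield (name, attributes, text) for matching elements, in document order."""
    root = ET.fromstring(markup)
    for elem in root.iter():
        if tag_names is None or elem.tag in tag_names:
            # elem.text holds only the text that precedes the first child
            yield elem.tag, dict(elem.attrib), elem.text


body = '<html><a href="/abc">foo<div>bar</div></a></html>'
print(list(tags_by_filter(body, ('a', 'b'))))
print([name for name, _, _ in tags_by_filter(body)])
```

Passing no filter yields every element, here `html`, `a` and `div`; an HTML-aware parser may additionally report implied elements that are not written in the markup.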

Example 10: _handle_script_tag_start

    def _handle_script_tag_start(self, tag, tag_name, attrs):
        """
        Handle the script tags
        """
        SGMLParser._handle_script_tag_start(self, tag, tag_name, attrs)

        if tag.text is not None:
            re_extract = ReExtract(tag.text.strip(),
                                   self._base_url,
                                   self._encoding)
            re_extract.parse()
            self._re_urls.update(re_extract.get_references())

Developer: 0x554simon, project: w3af, lines of code: 12, source file: html.py

Example 11: test_meta_tags_with_single_quotes

    def test_meta_tags_with_single_quotes(self):
        body = HTML_DOC % {'head': META_REFRESH + META_REFRESH_WITH_URL_AND_QUOTES,
                           'body': ''}
        resp = build_http_response(self.url, body)

        p = SGMLParser(resp)
        p.parse()

        self.assertEqual(2, len(p.meta_redirs))
        self.assertIn("2;url='http://crawler.w3af.com/'", p.meta_redirs)
        self.assertIn("600", p.meta_redirs)
        self.assertEqual([URL('http://crawler.w3af.com/')], p.references[0])

Developer: andresriancho, project: w3af, lines of code: 12, source file: test_sgml.py
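
The meta refresh tests expect the raw `content` values in `meta_redirs` and the target URL, with or without quotes, in `references`. A minimal sketch of how such a `content` value could be split is shown below. `parse_meta_refresh` is a hypothetical helper written for this illustration, not a w3af function, and it assumes the simple `delay;url=...` layout used in the tests.

```python
def parse_meta_refresh(content):
    """Split a meta refresh content value into (delay_seconds, url_or_None)."""
    delay_part, _, rest = content.partition(';')
    delay = int(delay_part.strip())
    rest = rest.strip()
    if rest[:4].lower() != 'url=':
        return delay, None
    # Drop optional single or double quotes around the target URL
    url = rest[4:].strip().strip('\'"')
    return delay, url or None


print(parse_meta_refresh("600"))
print(parse_meta_refresh("2;url=http://crawler.w3af.com/"))
print(parse_meta_refresh("2;url='http://crawler.w3af.com/'"))
```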

Example 12: test_none

    def test_none(self):
        body = '<html><a href="/abc">foo<div>bar</div></a></html>'
        url = URL('http://www.w3af.com/')
        headers = Headers()
        headers['content-type'] = 'text/html'
        resp = HTTPResponse(200, body, headers, url, url, charset='utf-8')

        p = SGMLParser(resp)
        tags = p.get_tags_by_filter(None)
        tags = list(tags)
        tag_names = [tag.name for tag in tags]

        self.assertEqual(tag_names, ['html', 'body', 'a', 'div'])

Developer: andresriancho, project: w3af, lines of code: 13, source file: test_sgml.py

Example 13: test_reference_with_colon

    def test_reference_with_colon(self):
        body = """
        <html>
            <a href="d:url.html?id=13&subid=3">foo</a>
        </html>"""
        r = build_http_response(self.url, body)
        p = SGMLParser(r)
        p.parse()
        parsed_refs = p.references[0]
        #
        #    Finding zero URLs is the correct behavior based on what
        #    I've seen in Opera and Chrome.
        #
        self.assertEqual(0, len(parsed_refs))

Developer: andresriancho, project: w3af, lines of code: 14, source file: test_sgml.py
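
One plausible reading of Example 13 is that `d:url.html` parses as a URL with the scheme `d`, so it is not a relative HTTP link at all. The sketch below shows that behaviour with the standard library's `urllib.parse`. It is only an illustration: `resolve_reference` and `FOLLOWABLE_SCHEMES` are hypothetical names, and the source does not state that w3af filters references this way.

```python
from urllib.parse import urljoin, urlsplit

FOLLOWABLE_SCHEMES = {'http', 'https'}  # assumed whitelist for this sketch


def resolve_reference(base_url, href):
    """Return the absolute URL for href, or None if its scheme is not followable."""
    absolute = urljoin(base_url, href)
    if urlsplit(absolute).scheme not in FOLLOWABLE_SCHEMES:
        return None
    return absolute


print(urlsplit('d:url.html?id=13&subid=3').scheme)
print(resolve_reference('http://w3af.com/', 'd:url.html?id=13&subid=3'))
print(resolve_reference('http://w3af.com/', 'url.html?id=13&subid=3'))
```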

Example 14: _handle_form_tag_start

    def _handle_form_tag_start(self, tag, tag_name, attrs):
        """
        Handle the form tags.

        This method also looks if there are "pending inputs" in the
        self._saved_inputs list and parses them.
        """
        SGMLParser._handle_form_tag_start(self, tag, tag_name, attrs)

        method = attrs.get('method', 'GET').upper()
        action = attrs.get('action', None)
        form_encoding = attrs.get('enctype', DEFAULT_FORM_ENCODING)
        autocomplete = attrs.get('autocomplete', None)

        if action is None:
            action = self._source_url
        else:
            action = self._decode_url(action)
            try:
                action = self._base_url.url_join(action,
                                                 encoding=self._encoding)
            except ValueError:
                # The URL in the action is invalid, the best thing we can do
                # is to guess, and our best guess is that the URL will be the
                # current one.
                action = self._source_url

        # Create the form object and store everything for later use
        form_params = FormParameters(encoding=self._encoding,
                                     method=method,
                                     action=action,
                                     form_encoding=form_encoding,
                                     attributes=attrs,
                                     hosted_at_url=self._source_url)
        form_params.set_autocomplete(autocomplete)

        self._forms.append(form_params)

        # Now I verify if there are any input tags that were found
        # outside the scope of a form tag
        for input_attrs in self._saved_inputs:
            # Parse them just like if they were found AFTER the
            # form tag opening
            self._handle_input_tag_inside_form(tag, 'input', input_attrs)

        # All parsed, remove them.
        self._saved_inputs = []

Developer: everping, project: w3af, lines of code: 47, source file: html.py
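
The action handling in Example 14 follows a simple rule: a missing or invalid `action` falls back to the URL of the page that hosts the form. The sketch below restates that rule with `urllib.parse.urljoin` in Python 3. It is a simplification under assumptions: `resolve_form_target` is a hypothetical helper, and a single `source_url` stands in for both `self._base_url` and `self._source_url` from the original method.

```python
from urllib.parse import urljoin


def resolve_form_target(attrs, source_url):
    """Return (method, action_url) for a form, falling back to source_url."""
    method = attrs.get('method', 'GET').upper()
    action = attrs.get('action')
    if action is None:
        return method, source_url
    try:
        return method, urljoin(source_url, action)
    except ValueError:
        # Invalid action URL: the best guess is the current page
        return method, source_url


page = 'http://w3af.com/a/'
print(resolve_form_target({}, page))
print(resolve_form_target({'method': 'post', 'action': 'login.php'}, page))
print(resolve_form_target({'action': 'http://[::1'}, page))
```

The last call uses an unterminated IPv6 literal, which makes `urljoin` raise `ValueError` and triggers the fallback.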

Example 15: test_parsed_references

    def test_parsed_references(self):
        # The *parsed* URLs *must* come both from valid tags and tag attributes.
        # Invalid URLs (like javascript instructions) must be ignored.
        body = """
        <html>
            <a href="/x.py?a=1" Invalid_Attr="/invalid_url.php">
            <form action="javascript:history.back(1)">
                <tagX href="/py.py"/>
            </form>
        </html>"""
        r = build_http_response(self.url, body)
        p = SGMLParser(r)
        p.parse()
        parsed_refs = p.references[0]
        self.assertEqual(1, len(parsed_refs))
        self.assertEqual(
            'http://w3af.com/x.py?a=1', parsed_refs[0].url_string)

Developer: andresriancho, project: w3af, lines of code: 17, source file: test_sgml.py

Example 16: test_get_clear_text_body_encodings

    def test_get_clear_text_body_encodings(self):

        raise SkipTest('Not sure why this one is failing :S')

        for lang_desc, (body, encoding) in TEST_RESPONSES.iteritems():
            encoding_header = 'text/html; charset=%s' % encoding
            headers = Headers([('Content-Type', encoding_header)])

            encoded_body = body.encode(encoding)
            r = build_http_response(self.url, encoded_body, headers)

            p = SGMLParser(r)
            p.parse()

            ct_body = p.get_clear_text_body()

            # These test strings don't really have tags, so they should be eq
            self.assertEqual(ct_body, body)

Developer: andresriancho, project: w3af, lines of code: 18, source file: test_sgml.py

Example 17: __init__

    def __init__(self, http_resp):
        # An internal list to be used to save input tags found
        # outside of the scope of a form tag.
        self._saved_inputs = []

        # For <textarea> elems parsing
        self._text_area_tag_name = None
        self._text_area_data = None

        # Save for using in form parsing
        self._source_url = http_resp.get_url()
        self._re_urls = set()

        # For <select> and <option> parsing
        self._select_option_values = set()
        self._select_input_name = None

        # Call parent's __init__
        SGMLParser.__init__(self, http_resp)

Developer: 0x554simon, project: w3af, lines of code: 19, source file: html.py

Example 18: test_get_clear_text_issue_4402

    def test_get_clear_text_issue_4402(self):
        """
        :see: https://github.com/andresriancho/w3af/issues/4402
        """
        test_file_path = 'core/data/url/tests/data/encoding_4402.php'
        test_file = os.path.join(ROOT_PATH, test_file_path)
        body = file(test_file, 'rb').read()

        sample_encodings = [encoding for _, (_, encoding) in TEST_RESPONSES.iteritems()]
        sample_encodings.extend(['', 'utf-8'])

        for encoding in sample_encodings:
            encoding_header = 'text/html; charset=%s' % encoding
            headers = Headers([('Content-Type', encoding_header)])

            r = build_http_response(self.url, body, headers)

            p = SGMLParser(r)
            p.parse()

            p.get_clear_text_body()

Developer: andresriancho, project: w3af, lines of code: 21, source file: test_sgml.py

Example 19: test_case_sensitivity

    def test_case_sensitivity(self):
        """
        Ensure handler methods are *always* called with lowered-cased
        tag and attribute names
        """
        def islower(s):
            il = False
            if isinstance(s, basestring):
                il = s.islower()
            else:
                il = all(k.islower() for k in s)
            assert il, "'%s' is not lowered-case" % s
            return il

        def start_wrapper(orig_start, tag):
            islower(tag.tag)
            islower(tag.attrib)
            return orig_start(tag)

        tags = (A_LINK_ABSOLUTE, INPUT_CHECKBOX_WITH_NAME, SELECT_WITH_NAME,
                TEXTAREA_WITH_ID_AND_DATA, INPUT_HIDDEN)
        ops = "lower", "upper", "title"

        for indexes in combinations(range(len(tags)), 2):

            body_elems = []

            for index, tag in enumerate(tags):
                ele = tag
                if index in indexes:
                    ele = getattr(tag, choice(ops))()
                body_elems.append(ele)

            body = HTML_DOC % {'head': '', 'body': ''.join(body_elems)}
            resp = build_http_response(self.url, body)
            p = SGMLParser(resp)
            orig_start = p.start
            wrapped_start = partial(start_wrapper, orig_start)
            p.start = wrapped_start
            p.parse()

Developer: andresriancho, project: w3af, lines of code: 40, source file: test_sgml.py
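
The property checked in Example 19, that handlers always see lower-cased tag and attribute names, can be observed in isolation with the standard library's `html.parser`, which documents the same normalization. The sketch below is only an analogy in Python 3: `NameRecorder` is a hypothetical class, and it says nothing about how w3af performs the lowering internally.

```python
from html.parser import HTMLParser


class NameRecorder(HTMLParser):
    """Records the tag name and attribute names passed to the start handler."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, [name for name, _ in attrs]))


recorder = NameRecorder()
recorder.feed('<INPUT TYPE="checkbox" Name="vehicle"><A HREF="/x">x</A>')
recorder.close()
print(recorder.seen)
```

Only the names are normalized; attribute values such as "checkbox" are passed through with their original casing.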

Example 20: _handle_select_tag_end

    def _handle_select_tag_end(self, tag):
        """
        Handler for select end tag
        """
        SGMLParser._handle_select_tag_end(self, tag)

        if not self._forms:
            return

        if not self._select_input_name:
            return

        attrs = {'name': self._select_input_name,
                 'values': list(self._select_option_values),
                 'type': INPUT_TYPE_SELECT}

        # Work with the last form
        form_params = self._forms[-1]
        form_params.add_field_by_attrs(attrs)

        # Reset selects container
        self._select_option_values = set()
        self._select_input_name = None

Developer: 0x554simon, project: w3af, lines of code: 23, source file: html.py
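
The select handling in Example 20 accumulates option values while a `<select>` is open and turns them into one form field when it closes. A minimal standalone sketch of that pattern follows, again with the Python 3 standard library rather than w3af. `SelectCollector` is a hypothetical class, the `'select'` type string stands in for `INPUT_TYPE_SELECT` (whose actual value is not shown here), and a list is used instead of a set so the output order is deterministic.

```python
from html.parser import HTMLParser


class SelectCollector(HTMLParser):
    """Builds one field description per <select>, emitted on its end tag."""

    def __init__(self):
        super().__init__()
        self.fields = []
        self._select_name = None
        self._option_values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'select':
            self._select_name = attrs.get('name') or attrs.get('id')
            self._option_values = []
        elif tag == 'option' and self._select_name is not None:
            value = attrs.get('value')
            if value is not None and value not in self._option_values:
                self._option_values.append(value)

    def handle_endtag(self, tag):
        if tag != 'select' or self._select_name is None:
            return
        self.fields.append({'name': self._select_name,
                            'values': list(self._option_values),
                            'type': 'select'})
        # Reset the per-select state, as the original handler does
        self._select_name = None
        self._option_values = []


collector = SelectCollector()
collector.feed('<form><select name="vehicle">'
               '<option value="car">Car</option>'
               '<option value="bike">Bike</option>'
               '</select></form>')
collector.close()
print(collector.fields)
```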

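Note: The w3af.core.data.parsers.doc.sgml.SGMLParser class examples in this article were compiled by 纯净天空 from source code and documentation hosting platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.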
