Python pyquery.PyQuery class code examples


This article collects typical usage examples of the pyquery.pyquery.PyQuery class in Python. If you are wondering how the PyQuery class is used in practice, or are looking for working examples of it, the selected code examples below may help.

The 20 code examples of the PyQuery class shown below are sorted by popularity by default.

Example 1: parse

 async def parse(self, input_text, *k, **kk):
     if not await self._check_support(input_text):
         return []
     html_text = await get_url_service.get_url_async(input_text)
     html = PyQuery(html_text)
     title = html('h1.main_title > a').text()
     if not title:
         for a in html('div.crumb-item > a'):
             a = PyQuery(a)
             if a.attr('href') in input_text:
                 title = a.text()
     if not title:
         try:
             title = match1(html_text, '<title>([^<]+)').split('-')[0]
         except AttributeError:
             pass
     data = {
         "data": [],
         "more": False,
         "title": title,
         "total": 0,
         "type": "list",
         "caption": "271视频全集"
     }
     data["data"] = await self._get_list_info_api(html_text)
     return data

Developer ID: wwqgtxx, Project: wwqLyParse, Lines of code: 26, Source file: iqiyilistparser.py

Example 2: extract_data

def extract_data(text):
    global total_data
    pq = PyQuery(text)
    data = pq.find('p.data').text()
    total_data = total_data + data
    nextState = pq.find('.nextState').attr('value')
    return nextState

Developer ID: gleitz, Project: code-kata, Lines of code: 7, Source file: solution.py

Example 3: parse

 def parse(self, input_text, *k, **kk):
     html2 = get_url(input_text)
     html2 = PyQuery(html2)
     w120 = html2("div.gut > div.listTab > div.listPic > div.list > dl.w120 > dt > a")
     total = len(w120)
     title = html2("div.gut > div.listTab > div.listPic > div.tab:first-child > p.p1 > i").text()
     data = {
         "data": [],
         "more": False,
         "title": title,
         "total": total,
         "type": "list",
         "caption": "乐视视频全集"
     }
     for i in w120:
         i = PyQuery(i)
         url = i.attr("href")
         title = i("a > img").attr("title")
         info = {
             "name": title,
             "no": title,
             "subtitle": title,
             "url": url
         }
         data["data"].append(info)
     return data

Developer ID: erics8, Project: wwqLyParse, Lines of code: 26, Source file: lelistparser.py

Example 4: detail_page

    def detail_page(self, response):
        t = response.text.replace('&nbsp;', '')
        d = PyQuery(t)
        base = response.save
        base_url = response.url
        fenbu = dict(map(
                lambda x: (x.find('.field-righttit').text(), x.find('ul').text()),
                list(d.find(".right-border div").items())
        ))
        basic_info = dict(map(
                lambda x: (x.text().replace(u'：', "").strip(),
                           x.parent().text().replace(x.text(), "").strip()),
                list(d.find('.fc-gray').items())
        ))
        other_info = dict(map(
                lambda x: (x.text().replace(u'：', ''), x.next().text()), list(d.find('.xiaoqu-otherinfo dt').items())
        ))
        info_temp = {
            'base': base,
            'sell_rent_info': fenbu,
            'basic_info': basic_info,
            'other_info': other_info
        }
        url = base_url + 'amenities/'
        self.crawl(url, callback=self.amenities_page, save=info_temp, retries=100)

        return [
            2,
            response.url,
            json.dumps(info_temp),
            time.strftime('%Y-%m-%d %X', time.localtime())
        ]

Developer ID: yangmingsong, Project: python, Lines of code: 32, Source file: ganji_ershoufang.py

Example 5: urlHandle

 def urlHandle(self,input_text):
     html = PyQuery(common.getUrl(input_text))
     a = html.children('a')
     a = PyQuery(a)
     url = a.attr("href")
     print('urlHandle:"'+input_text+'"-->"'+url+'"')
     return url

Developer ID: v1-hermit, Project: wwqLyParse, Lines of code: 7, Source file: jumpurlhandle.py

Example 6: Parse_le

 def Parse_le(self, input_text):
     html = PyQuery(get_url(input_text))
     items = html('dt.d_tit')
     title = "LETV"
     i = 0
     data = {
         "data": [],
         "more": False,
         "title": title,
         "total": i,
         "type": "collection"
     }
     for item in items:
         a = PyQuery(item).children('a')
         name = a.text()
         no = a.text()
         subtitle = a.text()
         url = a.attr('href')
         if url is None:
             continue
         if not re.match(r'^http://www\.le\.com/.+\.html', url):
             continue
         info = {
             "name": name,
             "no": no,
             "subtitle": subtitle,
             "url": url,
             "caption": "首页地址列表"
         }
         data["data"].append(info)
         i = i + 1
     total = i
     data["total"] = total
     return data

Developer ID: erics8, Project: wwqLyParse, Lines of code: 34, Source file: indexparser.py

Example 7: url_handle

 async def url_handle(self, input_text):
     html = await get_url_service.get_url_async(input_text)
     html = PyQuery(html)
     a = html.children('a')
     a = PyQuery(a)
     url = a.attr("href")
     return url

Developer ID: wwqgtxx, Project: wwqLyParse, Lines of code: 7, Source file: jumpurlhandle.py

Example 8: onSuccess

 def onSuccess(self, tid, context, response,headers):
     resp = PyQuery(response)
     for h3 in resp.find("h3 a"):
         url="http://dev.open.taobao.com/bbs/"+h3.attrib['href']
         print(h3.text)
         Spider.executeSql(self,"insert into task (task_type,url,status,http_code,task_context) values('topbbs文章',%s,0,-1,%s)",(url,h3.text))
     Spider.onSuccess(self,tid, context,response,headers)

Developer ID: liuyun96, Project: python, Lines of code: 7, Source file: TopBBS.py

Example 9: __getPageAllLink

    def __getPageAllLink(self,p):        
#        if self.kind=="1":
#            lis=PyQuery(p)("div.qiuzu li")
#        elif self.kind=="2":
#            lis=PyQuery(p)("div.qiuzu li")
        if self.kind=="1" or self.kind=="2":
            lis=PyQuery(p)("div.house")
        else:
            lis=PyQuery(p)("div.qiuzu li")
        links=[]
        for li in lis:
#            if self.kind=="3":
#                tm=PyQuery(li)("p.time span").eq(1).text()
#                link=self.baseurl+PyQuery(li)("p.housetitle a").attr("href")
            if self.kind=="2" or self.kind=="1":
                tm=PyQuery(li)("p.time").text()
                tm=tm and tm.replace("个人","") or ""
                link=self.baseurl+PyQuery(li)("p.housetitle a").attr("href")
            else: 
                tm=PyQuery(li)("span.li5").text()
                link=self.baseurl+PyQuery(li)("span.li2 a").attr("href")
            if self.kind=="4": 
                if PyQuery(li)("span.li1").text()=="合租 ":
                    continue
#            tm=PyQuery(li)("span.li5").text()
#            link=self.baseurl+PyQuery(li)("span.li2 a").attr("href")
            #link=self.baseurl+PyQuery(li)("span.li2 a").attr("href")
#            print link
            if u"天" in tm:
                s=tm.find(u"天")
                tm=tm[:s]
                if int(tm)<8:
                    links.append(link)
                else:
                    break
            elif u"小时" in tm:
                links.append(link)
            elif u"分钟" in tm:
                links.append(link)
            else:
                continue
            if 1:#not checkPath(homepath,self.folder,link):
                LinkLog.info("%s|%s"%(self.kind,link))
                try:
                    getContent(link,self.citycode,self.kind)
                except Exception as e:
                    print("ganji getContent Exception %s" % e)
            time.sleep(int(self.st))
#            fetch_quere.put({"mod":"soufang","link":link,"citycode":self.citycode,"kind":self.kind})
#        self.clinks.extend(links)
       
        if self.kind=="1" or self.kind=="2":
            if len(links)!=30:
                return False
            else:
                return True
        else:
            if len(links)!=35:
                return False
            else:
                return True

Developer ID: ptphp, Project: PyLib, Lines of code: 60, Source file: soufun.py
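The time filter in the loop above keeps a link when its label mentions minutes (分钟) or hours (小时), or a day count (天) below 8. Pulled out as a standalone function, that rule might look like the sketch below. The function name `is_recent` and the `max_days` parameter are hypothetical, and the label shapes are inferred from the string checks in the example, not from the actual site.

```python
import re


def is_recent(label, max_days=8):
    """Return True if a relative-time label such as '3天前' counts as recent.

    Assumed label shapes: '<n>天...', '<n>小时...', '<n>分钟...'.
    """
    if u"天" in label:
        m = re.match(r"\s*(\d+)", label)
        # A day label without a leading number is treated as not recent.
        return bool(m) and int(m.group(1)) < max_days
    # Hour and minute labels are always recent, as in the example.
    return u"小时" in label or u"分钟" in label
```

Unlike the original loop, this sketch only classifies one label; stopping the scan at the first old entry (the `break` in the example) would stay in the caller.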

Example 10: url_handle

 def url_handle(self, input_text):
     html = PyQuery(get_url(input_text))
     a = html.children('a')
     a = PyQuery(a)
     url = a.attr("href")
     logging.debug('urlHandle:"' + input_text + '"-->"' + url + '"')
     return url

Developer ID: erics8, Project: wwqLyParse, Lines of code: 7, Source file: jumpurlhandle.py

Example 11: parse_html_page

    def parse_html_page(self):
        pq = PyQuery(self.html_page)
        main_table = pq('#mainBody > table.coltable')

        def find_row(text):
            for c in main_table.find('td:first-child').items():
                if c.text() == text:
                    return next(c.nextAll().items(), None)

        def find_row_text(text, default=''):
            row = find_row(text)
            if row:
                return row.text()
            return default

        def find_row_html(text, default=''):
            row = find_row(text)
            if row:
                return row.html()
            return default

        self.info_hash = find_row_text('Info hash')
        self.title = pq.find('#mainBody > h1').text()
        self.category, self.subcategory = find_row_text('Type').split(' - ', 1)
        self.language = find_row_text('Language')
        self.cover_url = find_row('Picture:').find('img').attr('src')
        self.small_description = find_row_html('Small Description')
        self.description = find_row_html('Description')
        self.torrent_url = find_row('Download').find('a#dlNormal').attr('href')
        size_string = find_row_text('Size')
        match = re.match(r'.* \((?P<size>\d+(,\d\d\d)*) bytes\)', size_string)
        self.torrent_size = int(match.group('size').replace(',', ''))

Developer ID: ChaosTherum, Project: WhatManager2, Lines of code: 32, Source file: models.py
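The last two lines of the example extract a byte count from a size string. Below is a standalone sketch of that step with a guard for input that does not match. The function name `parse_size_bytes` and the sample strings are assumptions, since the source does not show the page's real text.

```python
import re

# Same pattern as in the example, written as a raw string.
SIZE_RE = re.compile(r".* \((?P<size>\d+(,\d\d\d)*) bytes\)")


def parse_size_bytes(size_string):
    """Return the byte count from text like '1.18 MB (1,234,567 bytes)', or None."""
    match = SIZE_RE.match(size_string)
    if match is None:
        return None
    # Drop the thousands separators before converting.
    return int(match.group("size").replace(",", ""))
```

Returning `None` instead of raising lets the caller decide how to handle a page whose layout has changed.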

Example 12: Parse_v

 def Parse_v(self,input_text):
     print(input_text)
     html = PyQuery(common.getUrl(input_text))
     datainfo_navlist = PyQuery(html("#datainfo-navlist"))
     for a in datainfo_navlist.children('a'):
         a = PyQuery(a)
         url = a.attr("href")
         if url and re.search(r'www\.iqiyi\.com/(a_|lib/m)', url):
             return self.Parse(url)

Developer ID: v1-hermit, Project: wwqLyParse, Lines of code: 9, Source file: listparser.py

Example 13: parse

    async def parse(self, input_text, *k, **kk):
        html = await get_url_service.get_url_async(input_text)
        html = PyQuery(html)
        title = ""
        for meta in html('meta[itemprop="name"]'):
            meta = PyQuery(meta)
            title = meta.attr("content")
            break
        data = {
            "data": [],
            "more": False,
            "title": title,
            "total": 0,
            "type": "list",
            "caption": "QQ视频全集"
        }
        for a in html(".mod_episode a"):
            a = PyQuery(a)
            _title = ""
            for span in PyQuery(a("span")):
                span = PyQuery(span)
                if span.attr("itemprop") == "episodeNumber":
                    _title = "第%s集" % span.text()
                elif span.has_class("mark_v"):
                    _title += span.children("img").attr("alt") or ""
            info = {
                "name": _title,
                "no": _title,
                "subtitle": _title,
                "url": a.attr("href")
            }
            data["data"].append(info)
        data["total"] = len(data["data"])

        return data

Developer ID: wwqgtxx, Project: wwqLyParse, Lines of code: 35, Source file: qqlistparser.py

Example 14: parse

    async def parse(self, input_text, *k, **kk):
        html = await get_url_service.get_url_async(input_text)
        html = PyQuery(html)
        p_title = html("div.pl-title")
        title = p_title.attr("title")
        list_id = re.search(r'https?://list\.youku\.com/albumlist/show/id_(\d+)\.html', input_text).group(1)
        ep = 'https://list.youku.com/albumlist/items?id={}&page={}&size=20&ascending=1&callback=a'

        first_u = ep.format(list_id, 1)
        xhr_page = await get_url_service.get_url_async(first_u)
        json_data = json.loads(xhr_page[14:-2])
        # print(json_data)
        # video_cnt = json_data['data']['total']
        xhr_html = json_data['html']
        # print(xhr_html)
        data = {
            "data": [],
            "more": False,
            "title": title,
            "total": 0,
            "type": "collection",
            "caption": "优酷视频全集"
        }
        last_num = 1
        while True:
            new_url = ep.format(list_id, last_num)
                # Await first, then slice: indexing binds tighter than await.
                json_data = (await get_url_service.get_url_async(new_url))[14:-2]
            info = json.loads(json_data)
            if info.get("error", None) == 1 and info.get("message", None) == "success":
                new_html = info.get("html", None)
                if new_html:
                    new_html = PyQuery(new_html)
                    items = new_html("a[target='video'][data-from='2-1']")
                    for item in items:
                        item = PyQuery(item)
                        url = "http:" + item.attr("href")
                        title = item.attr("title")
                        info = {
                            "name": title,
                            "no": title,
                            "subtitle": title,
                            "url": url
                        }
                        data["data"].append(info)
                    last_num += 1
                else:
                    break
            else:
                break
        data["total"] = len(data["data"])
        # print(data)

        return data

Developer ID: wwqgtxx, Project: wwqLyParse, Lines of code: 53, Source file: youkulistparser.py
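The example strips the callback wrapper from the response with a fixed slice (`[14:-2]`), which depends on the exact length of the prefix. A more tolerant sketch is shown below, assuming the response has the general JSONP shape `name({...});`. The helper name `unwrap_jsonp` and the sample payloads in the usage note are hypothetical; the real response format is not shown in the source.

```python
import json
import re

# Everything before the first "(" is treated as the callback expression;
# the payload runs up to the last ")" (an optional ";" may follow).
_JSONP_RE = re.compile(r"^[^(]*\((.*)\)\s*;?\s*$", re.DOTALL)


def unwrap_jsonp(text):
    """Return the decoded JSON payload of a JSONP-style response."""
    m = _JSONP_RE.match(text.strip())
    if m is None:
        raise ValueError("not a JSONP response")
    return json.loads(m.group(1))
```

For instance, `unwrap_jsonp('a({"error": 1, "message": "success"});')` yields a dict with the keys `error` and `message`, which is the shape the loop in the example inspects.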

Example 15: __initPageNum

 def __initPageNum(self):
     initurl="%s/%s/&act=personal&options="%(self.baseUrl,self.urlpath)
     req=urllib2.Request(initurl, None, self.header)
     p=self.br.open(req).read()
     pg=PyQuery(p)("div#houses div.fl")
     if re.search('''(\d+)''',pg.text()):
         pg=re.search('''(\d+)''',pg.text()).group(1)
     r=self.__getPageAllLink(p)
     if not r:
         return
         
     self.pn= [i for i in range(int(pg)+1)][2:]
     print("")

Developer ID: aviatorBeijing, Project: ptpy, Lines of code: 13, Source file: soufang.py

Example 16: __getAllNeedLinks

    def __getAllNeedLinks(self):
        cond=True
        idx=0
        checkit="0"
        while  cond:
            url=self.baseUrl+self.urlpath%("f"+str(idx*32))
            #url="http://gz.ganji.com/fang2/u2f0/a1f768/"
#            print url
            try:
                req=urllib2.Request(url, None, self.header)
                p=self.br.open(req).read()
            except:
                continue
            else:
                check=PyQuery(p)("ul.pageLink li a.c").text()
                if check is None or check==checkit:
                    cond=False
                    break
                else:
                    checkit=check
                    links=PyQuery(p)("div.list dl")
                    p=None
#                    print len(links)
                    for link in links:
                        lk=self.baseUrl+PyQuery(link)(" a.list_title").attr("href")
#                        print lk
                        if self.kind=="3" or self.kind=="4":
                            tm=PyQuery(link)("dd span.time").text()
                            if re.match(r'\d{2}-\d{2}', tm):
                                Y=int(time.strftime('%Y', time.localtime()))
                                tm="%s-%s"%(Y,tm.strip())
                                if tm<self.endtime:
                                    cond=False
                                    break
                            elif "分钟" in tm:
                                pass
                            elif "小时" in tm:
                                pass
                            else:
                                cond=False
                                break
                        if not checkPath(homepath,self.folder,lk):
                            LinkLog.info("%s|%s"%(self.kind,lk))
                            try:
                                getContent(lk,self.citycode,self.kind,self.upc)
                            except Exception as e:
                                print("ganji getContent Exception %s" % e)
#                            fetch_quere.put({"mod":"ganji","link":lk,"citycode":self.citycode,"kind":self.kind})        
#                        if lk not in self.clinks:
#                            self.clinks.append(lk)
                idx=idx+1

Developer ID: ptphp, Project: PyLib, Lines of code: 50, Source file: ganji.py

Example 17: _parse

 def _parse(self, response):
     d = PyQuery(response)
     # page_turning
     __url = map(lambda x: x.attr('href'),
                 d.find(self.__css).items()
                 )
     if config_dictionary.get(self.__url_start).get('basejoin'):
         new_url = map(lambda u: urlparse.urljoin(self.__url_base, u), __url)
     else:
         new_url = __url
     self.__url_pool = self.__url_pool.union(set(new_url))
     # IP address extracting
     rst = ':'.join(d.text().split(' '))
     proxy_list = re.findall(pattern_ip_address, rst)
     proxy_port_queue.put((proxy_list, self.__url_base))

Developer ID: yangmingsong, Project: python, Lines of code: 15, Source file: proxy_collection.py
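The `basejoin` branch above resolves relative `href` values against a base URL. The example targets Python 2 (`urlparse.urljoin`); in Python 3 the same function lives in `urllib.parse`. A small sketch with made-up URLs:

```python
from urllib.parse import urljoin

base = "http://example.com/proxy/list/"  # hypothetical base URL
hrefs = ["page2.html", "/free/3.html", "http://other.example.org/x"]

# Relative paths are resolved against the base; absolute URLs pass through unchanged.
resolved = {urljoin(base, h) for h in hrefs}
```

Collecting the results in a set mirrors the example, which merges new URLs into a pool with `union`.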

Example 18: serializeArray

def serializeArray(form):
    form = PyQuery(form)
    if not form.is_('form'):
        return []

    source = form.find('input, select, textarea')

    data = []
    for input in source:
        input = PyQuery(input)
        if input.is_('[disabled]') or not input.is_('[name]'):
            continue
        if input.is_('[type=checkbox]') and not input.is_('[checked]'):
            continue

        data.append((input.attr('name'), input.val()))

    return data

Developer ID: ivanp, Project: emailsopener, Lines of code: 18, Source file: utils.py

Example 19: Parse

	def Parse(self,input_text):
		html = PyQuery(self.getUrl(input_text))
		items = html('a')
		title = html('title').text()
		i =0
		data = {
			"data": [],
			"more": False,
			"title": title,
			"total": i,
			"type": "collection"
		}
		for item in items:
			a = PyQuery(item)
			name = a.attr('title')
			if name is None:
				name = a.text()
			no = name
			subtitle = name
			url = a.attr('href')
			if url is None:
				continue
			if name is None or name == "":
				continue
			if not re.match(r'(^(http|https)://.+\.(shtml|html))|(^(http|https)://.+/video/)',url):
				continue
			if re.search('(list|mall|about|help|shop|map|vip|faq|support|download|copyright|contract|product|tencent|upload|common|index.html|v.qq.com/u/|open.baidu.com)',url):
				continue
			if re.search(r'(下载|播 放|播放|投诉|评论|(\d{1,2}:\d{1,2}))',no):
				continue
			unsure = False
			
			info = {
				"name": name,
				"no": no,
				"subtitle": subtitle,
				"url": url,
				"unsure": unsure			
			}
			data["data"].append(info)
			i = i+1
		total = i
		data["total"] = total
		return data

Developer ID: road0001, Project: wwqLyParse, Lines of code: 44, Source file: anypageparser.py

Example 20: __getAllNeedLinks

 def __getAllNeedLinks(self):
     cond=True
     idx=0
     checkit="0"
     while  cond:
         url=self.baseUrl+self.urlpath%("f"+str(idx*32))
         #url="http://gz.ganji.com/fang2/u2f0/a1f768/"
         print(url)
         try:
             req=urllib2.Request(url, None, self.header)
             p=self.br.open(req).read()
         except:
             pass
         else:
             check=PyQuery(p)("ul.pageLink li a.c").text()
             if check==checkit:
                 break
             else:
                 checkit=check
                 links=PyQuery(p)("div.list dl")
                 print(len(links))
                 for link in links:
                     lk=self.baseUrl+PyQuery(link)(" a.list_title").attr("href")
                     if self.kind=="3" or self.kind=="4":
                         tm=PyQuery(link)("dd span.time").text()
                         if re.match(r'\d{2}-\d{2}', tm):
                             Y=int(time.strftime('%Y', time.localtime()))
                             tm="%s-%s"%(Y,tm.strip())
                             if tm<self.endtime:
                                 break
                         elif "分钟" in tm:
                             pass
                         elif "小时" in tm:
                             pass
                         else:
                             break
                             
                     if lk not in self.clinks:
                         self.clinks.append(lk)
             idx=idx+1
         time.sleep(self.st)
     print(len(self.clinks))

Developer ID: ptphp, Project: PyLib, Lines of code: 42, Source file: ganji.py

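Note: The pyquery.pyquery.PyQuery class examples in this article were compiled by 纯净天空 from source code and documentation hosting platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects, and copyright of the source code belongs to the original authors. For distribution and use, refer to the license of the corresponding project. Do not reproduce without permission.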
