• 设为首页
  • 点击收藏
  • 手机版
    手机扫一扫访问
    迪恩网络手机版
  • 关注官方公众号
    微信扫一扫关注
    迪恩网络公众号

ckreibich/scholar.py: A parser for Google Scholar, written in Python

原作者: [db:作者] 来自: 网络 收藏 邀请

开源软件名称(OpenSource Name):

ckreibich/scholar.py

开源软件地址(OpenSource Url):

https://github.com/ckreibich/scholar.py

开源编程语言(OpenSource Language):

Python 100.0%

开源软件介绍(OpenSource Introduction):

scholar.py

scholar.py is a Python module that implements a querier and parser for Google Scholar's output. Its classes can be used independently, but it can also be invoked as a command-line tool.

The script used to live at http://icir.org/christian/scholar.html, and I've moved it here so I can more easily manage the various patches and suggestions I'm receiving for scholar.py. Thanks guys, for all your interest! If you'd like to get in touch, email me at [email protected] or ping me on Twitter.

Cheers,
Christian

Features

  • Extracts publication title, most relevant web link, PDF link, number of citations, number of online versions, link to Google Scholar's article cluster for the work, Google Scholar's cluster of all works referencing the publication, and excerpt of content.
  • Extracts total number of hits as reported by Scholar (new in version 2.5)
  • Supports the full range of advanced query options provided by Google Scholar, such as title-only search, publication date timeframes, and inclusion/exclusion of patents and citations.
  • Supports article cluster IDs, i.e., information relating to the variants of an article already identified by Google Scholar
  • Supports retrieval of citation details in standard external formats as provided by Google Scholar, including BibTeX and EndNote.
  • Command-line tool prints entries in CSV format, simple plain text, or in the citation export format.
  • Cookie support for higher query volume, including ability to persist cookies to disk across invocations.

Note

I will always strive to add features that increase the power of this API, but I will never add features that intentionally try to work around the query limits imposed by Google Scholar. Please don't ask me to add such features.

Examples

Try scholar.py --help for all available options. Note, the command line arguments changed considerably in version 2.0! A few examples:

Retrieve one article written by Einstein on quantum theory:

$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory"
         Title On the quantum theory of radiation
           URL http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
          Year 1917
     Citations 184
      Versions 3
    Cluster ID 17749203648027613321
      PDF link http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
Citations list http://scholar.google.com/scholar?cites=17749203648027613321&as_sdt=2005&sciodt=0,5&hl=en
 Versions list http://scholar.google.com/scholar?cluster=17749203648027613321&hl=en&as_sdt=0,5
       Excerpt The formal similarity between the chromatic distribution curve for thermal radiation [...]

Note the cluster ID in the above. Using this ID, you can directly access the cluster of articles Google Scholar has already determined to be variants of the same paper. So, let's see the versions:

$ scholar.py -C 17749203648027613321
         Title On the quantum theory of radiation
           URL http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
     Citations 184
      Versions 0
    Cluster ID 17749203648027613321
      PDF link http://icole.mut-es.ac.ir/downloads/Sci_Sec/W1/Einstein%201917.pdf
Citations list http://scholar.google.com/scholar?cites=17749203648027613321&as_sdt=2005&sciodt=0,5&hl=en
       Excerpt The formal similarity between the chromatic distribution curve for thermal radiation [...]

         Title ON THE QUANTUM THEORY OF RADIATION
           URL http://www.informationphilosopher.com/solutions/scientists/einstein/1917_Radiation.pdf
     Citations 0
      Versions 0
      PDF link http://www.informationphilosopher.com/solutions/scientists/einstein/1917_Radiation.pdf
       Excerpt The formal similarity between the chromatic distribution curve for thermal radiation [...]

         Title The Quantum Theory of Radiation
           URL http://web.ihep.su/dbserv/compas/src/einstein17/eng.pdf
     Citations 0
      Versions 0
      PDF link http://web.ihep.su/dbserv/compas/src/einstein17/eng.pdf
       Excerpt 1 on the assumption that there are discrete elements of energy, from which quantum [...]

Let's retrieve a BibTeX entry for that quantum theory paper. The best BibTeX often seems to be the one linked from search results, not those in the article cluster, so let's do a search again:

$ scholar.py -c 1 --author "albert einstein" --phrase "quantum theory" --citation bt
@article{einstein1917quantum,
  title={On the quantum theory of radiation},
  author={Einstein, Albert},
  journal={Phys. Z},
  volume={18},
  pages={121--128},
  year={1917}
}

Report the total number of articles Google Scholar has for Einstein:

$ scholar.py --txt-globals --author "albert einstein" | grep '\[G\]' | grep Results
[G]    Results 4190

License

scholar.py is using the standard BSD license.




鲜花

握手

雷人

路过

鸡蛋
该文章已有0人参与评论

请发表评论

全部评论

专题导读
热门推荐
阅读排行榜

扫描微信二维码

查看手机版网站

随时了解更新最新资讯

139-2527-9053

在线客服(服务时间 9:00~18:00)

在线QQ客服
地址:深圳市南山区西丽大学城创智工业园
电邮:jeky_zhao#qq.com
移动电话:139-2527-9053

Powered by 互联科技 X3.4© 2001-2213 极客世界.|Sitemap