gereeter/hsdecomp: A decompiler for GHC-compiled Haskell

Open-source project name:

gereeter/hsdecomp

Open-source project URL:

https://github.com/gereeter/hsdecomp

Programming language:

Python 98.8%

Project introduction:

hsdecomp

A decompiler for GHC-compiled Haskell

Dependencies

Trying it out

To decompile a file without any installation steps, simply run the runner.py script on the file you want to decompile:

python3 runner.py path/to/binary

Installation

hsdecomp uses setuptools for packaging and installation. To install:

python3 setup.py install

Known Limitations

Note that testing has been limited, so there are probably many other limitations not mentioned here.

  • No support for stripped binaries.
  • No support for direct manipulation of unboxed types. This generally shouldn't be a problem for unoptimized binaries, as all that manipulation should be hidden behind library calls.
  • No support for tail recursion (which gets compiled to a loop).
  • Limited ability to display useful patterns in case expressions. As a replacement for proper names, patterns of the form <tag n> are shown.
  • No support for FFI.
  • Limited to x86 and x86-64.
  • Limited to ELF files.
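
The last two limitations can be checked before running the decompiler. The following is a minimal sketch and is not part of hsdecomp: the function name check_supported is hypothetical, and it looks only at standard ELF header fields (the magic bytes, the byte-order flag, and e_machine).

```python
import struct

# e_machine values from the ELF specification
EM_386 = 3
EM_X86_64 = 62

def check_supported(header: bytes) -> str:
    """Hypothetical pre-flight check; returns 'ok' or a short reason."""
    # Every ELF file starts with the 4-byte magic 0x7f 'E' 'L' 'F'.
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine not in (EM_386, EM_X86_64):
        return "unsupported architecture"
    return "ok"
```

To try it on a real file, pass the first 20 bytes, for example check_supported(open(path, "rb").read(20)). This does not detect stripped binaries or FFI use, which would need a look at the section and symbol tables.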

How It Works

The decompiler is composed of several distinct stages:

  • Metadata Parsing. In this stage, we read basic metadata from the file, including the names of all symbols in the program, the version of GHC the program was compiled with, and whether the binary is 32 bit or 64 bit. Code for this process can be found in hsdecomp/metadata.py.
  • Code Parsing. In this stage, we recursively locate and parse every relevant section of code into an internal interpretation representation. This is the meat of the work done by the decompiler, and can be found primarily in hsdecomp/parse/__init__.py. Note that much of the analysis is done by means of simulation, for which the code can be found at hsdecomp/machine.py.
  • Type Inference. Although much of the interpretation of the binary can be found directly, the patterns which case expressions are branching on are initially opaque to the decompiler. Type inference allows displaying more precise patterns. Note that this stage is currently extremely primitive.
  • Optimization. At this stage in the pipeline, the decompiler has a fairly clear understanding of what is going on. However, the information is laid out as it is in the binary, with many small, uninlined expressions. To increase readability, the decompiler will perform various passes over the interpretations to clean them up and make them easier for a human to understand. The code for this is at hsdecomp/optimize.py.
  • Display. Finally, the decompiled code must be displayed to the user. This currently uses a fairly hacky pretty printer implemented at hsdecomp/show.py.
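
To give a feel for the kind of cleanup the Optimization stage describes, here is a toy sketch of inlining bindings that are referenced exactly once. It is an illustration only and is not taken from hsdecomp/optimize.py: the tuple-based expression format, the function names, and the single-use rule are all assumptions made for this example, and it assumes the bindings are not recursive.

```python
from collections import Counter

# Toy expression format (an assumption for this sketch, not hsdecomp's):
#   ("var", name) | ("lit", value) | ("app", function_expr, argument_expr)

def count_uses(expr, counts):
    """Count how often each variable name is referenced inside expr."""
    if expr[0] == "var":
        counts[expr[1]] += 1
    elif expr[0] == "app":
        count_uses(expr[1], counts)
        count_uses(expr[2], counts)

def substitute(expr, bindings, inlinable):
    """Replace references to inlinable names with their definitions."""
    if expr[0] == "var" and expr[1] in inlinable:
        return substitute(bindings[expr[1]], bindings, inlinable)
    if expr[0] == "app":
        return ("app",
                substitute(expr[1], bindings, inlinable),
                substitute(expr[2], bindings, inlinable))
    return expr

def inline_single_use(bindings, root):
    """Inline every binding (except root) that is referenced exactly once.

    Precondition: bindings must be acyclic; recursive definitions
    would make substitute() loop forever.
    """
    counts = Counter()
    for body in bindings.values():
        count_uses(body, counts)
    inlinable = {n for n in bindings if n != root and counts[n] == 1}
    return {name: substitute(body, bindings, inlinable)
            for name, body in bindings.items() if name not in inlinable}
```

For example, with main defined as putStrLn applied to s1, and s1 defined as unpack applied to a literal, the pass folds s1 into main and leaves a single, more readable binding.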

Unfortunately, I haven't written a full description of any of these stages or even adequately commented my code. However, I wrote a description of manually decompiling a file for the sCTF security competition. The output of this decompiler on that file can be found at test/lambda1/output in this repository.



