# Python toolbox for the Google Referring Expressions Dataset

- Repository: mjhucla/Google_Refexp_toolbox
- URL: https://github.com/mjhucla/Google_Refexp_toolbox
- Language: Jupyter Notebook (97.2%)

The Google RefExp dataset is a collection of text descriptions of objects in images, built on the publicly available MS-COCO dataset. Whereas the image captions in MS-COCO apply to the entire image, this dataset focuses on text descriptions that allow one to uniquely identify a single object or region within an image. See more details in this paper: *Generation and Comprehension of Unambiguous Object Descriptions*.

## Sample of the data

The green dot marks the object being referred to. The sentences are written by humans so that each uniquely describes the chosen object.

## Requirements
## Setup and data downloading

### Easy setup

Running the setup.py script will do the following five things:
At each step you will be prompted at the keyboard whether you would like to skip it. Note that the MS COCO images (13GB) and annotations (158MB) are very large, so downloading them all takes some time.

### Manual downloading and setup

You can download the GoogleRefexp data directly from this link. If you have already played with MS COCO and do not want to keep two copies of it, you can create a symbolic link from external to your COCO toolkit. E.g.:
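A minimal sketch of that symbolic link, assuming a hypothetical path for your existing COCO toolkit (substitute your own):

```shell
# Paths are assumptions for illustration; replace with the actual
# location of your previously downloaded MS COCO toolkit.
mkdir -p external
ln -s /path/to/your/coco-toolkit external/coco-toolkit
```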
Please make sure the following are on your path:
You can create symbolic links if you have already downloaded the data and compiled the COCO toolbox. Then run setup.py to download the Google Refexp data and compile this toolbox; you can skip steps 2, 3, and 4.

## Demos

For visualization and utility functions, please see google_refexp_dataset_demo.ipynb. For automatic and Amazon Mechanical Turk (AMT) evaluation of the comprehension and generation tasks, please see google_refexp_eval_demo.ipynb. The expected output format for a comprehension/generation algorithm is described in ./evaluation/format_comprehension_eval.md and ./evaluation/format_generation_eval.md.

We also provide two sample outputs for reference. For the comprehension task, we use a naive baseline that randomly shuffles the region candidates (./evaluation/sample_results/sample_results_comprehension.json). For the generation task, we use a naive baseline that outputs the class name of the object (./evaluation/sample_results/sample_results_generation.json).

If you are not familiar with AMT evaluations, please see this tutorial. The interface and APIs provided by this toolbox already group 5 evaluations into one HIT. In our experiments, paying 2 cents per HIT led to reasonable results.

## Citation

If you find the dataset and toolbox useful in your research, please consider citing:
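The naive comprehension baseline described above can be sketched as follows. This is a hedged illustration only: the field names (`annotation_id`, `region_candidates`, `ranked_candidates`) are assumptions, not the toolbox's exact schema, which is defined in ./evaluation/format_comprehension_eval.md.

```python
# Sketch of a random-shuffle comprehension baseline: each referring
# expression's region candidates are reordered at random, so the
# resulting ranking carries no information about the referred object.
import json
import random

def random_shuffle_baseline(annotations, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    results = []
    for ann in annotations:
        candidates = list(ann["region_candidates"])
        rng.shuffle(candidates)
        results.append({
            "annotation_id": ann["annotation_id"],
            "ranked_candidates": candidates,
        })
    return results

# Tiny illustrative input (hypothetical candidate region ids).
sample = [{"annotation_id": 1, "region_candidates": [10, 11, 12, 13]}]
print(json.dumps(random_shuffle_baseline(sample)))
```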
## License

This data is released by Google under the following license:
## Toolbox Developers