
microsoft/genomicsnotebook: Jupyter Notebooks on Azure for Genomics Data Analysi ...


Open-source software name:

microsoft/genomicsnotebook

Open-source software URL:

https://github.com/microsoft/genomicsnotebook

Programming language:

Jupyter Notebook 98.2%

Description:

Genomics Data Analysis with Jupyter Notebooks on Azure


Jupyter Notebook is a great tool for data scientists working on genomics data analysis. This repo demonstrates the use of Azure Notebooks for genomics data analysis via GATK, Picard, Bioconductor, and Python libraries.

3/31/2022: NEW DEMO VIDEO: How to use 'genomicsnotebook' repo in GitHub Codespaces?

For more information about Codespaces please visit the product page

Here is the list of sample notebooks on this repo:

  1. genomics.ipynb: Analysis from 'uBAM' to 'structured data table'.
  2. genomicsML.ipynb: Train Machine Learning models with Genomics + Clinical Data
  3. genomics-platinum-genomes.ipynb: Accessing Illumina Platinum Genomes data from Azure Open Datasets* and performing an initial data analysis.
  4. genomics-reference-genomes.ipynb: Accessing reference genomes from Azure Open Datasets*
  5. genomics-clinvar.ipynb: Accessing ClinVar data from Azure Open Datasets*
  6. genomics-giab.ipynb: Accessing Genome in a Bottle data from Azure Open Datasets*
  7. SnpEff.ipynb: Accessing SnpEff databases from Azure Open Datasets*
  8. 1000 Genomes.ipynb: Accessing 1000 Genomes dataset from Azure Open Datasets*
  9. GATKResourceBundle.ipynb: Accessing GATK resource bundle from Azure Open Datasets*
  10. ENCODE.ipynb: Accessing ENCODE dataset from Azure Open Datasets*
  11. genomics-OpenCRAVAT.ipynb: Accessing the OpenCRAVAT dataset from Azure Open Datasets* and deploying the built-in Azure Data Science VM for OpenCRAVAT
  12. Bioconductor.ipynb: Pulling Bioconductor Docker image from Microsoft Container Registry
  13. simtotable.ipynb: Simulate NGS data, use Cromwell on Azure OR Microsoft Genomics service for secondary analysis and convert the gVCF data to a structured data table.
  14. igv_jupyter_extension_sample.ipynb: Download sample VCF file from Azure Open Datasets and use igv-jupyter extension on Jupyter Lab environment.
  15. radiogenomics.ipynb: Combine DICOM, VCF and gene expression data for patient segmentation analysis.

*Technical note: Explore Azure Genomics Data Lake with Azure Storage Explorer
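Several of the notebooks listed above read files from public Azure Blob Storage containers. The following is a minimal sketch of how such blob URLs are composed, assuming the standard `https://<account>.blob.core.windows.net/<container>/<path>` layout and the Blob REST "List Blobs" query parameters. The helper names `blob_url` and `list_blobs_url` and the account name `exampleaccount` are hypothetical and do not come from this repo; the actual account and container names for each dataset are given in the corresponding notebook.

```python
from urllib.parse import quote, urlencode


def blob_url(account, container, blob_path=""):
    """Build the HTTPS URL of a container or of one blob inside it."""
    base = f"https://{account}.blob.core.windows.net/{quote(container)}"
    if blob_path:
        # Keep "/" separators, percent-encode everything else that needs it.
        return f"{base}/{quote(blob_path.lstrip('/'))}"
    return base


def list_blobs_url(account, container, prefix="", max_results=None):
    """Build a 'List Blobs' REST URL for a container.

    Anonymous listing only works when the container allows public
    container-level read access; otherwise a SAS token is required.
    """
    params = {"restype": "container", "comp": "list"}
    if prefix:
        params["prefix"] = prefix
    if max_results is not None:
        params["maxresults"] = str(max_results)
    return f"{blob_url(account, container)}?{urlencode(params)}"


# Hypothetical names, for illustration only.
example = blob_url("exampleaccount", "dataset", "release/sample.vcf.gz")
```

The resulting URL can be passed to any HTTP client or pasted into Azure Storage Explorer; the listing URL returns an XML document describing the blobs under the given prefix.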

1. Prerequisites

Create and manage Azure Machine Learning workspaces in the Azure portal


For further details on creating an Azure ML workspace, please visit this page.
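Once a workspace exists, notebooks that use the Azure Machine Learning Python SDK typically locate it through a small `config.json` file holding the keys `subscription_id`, `resource_group`, and `workspace_name`. The sketch below writes such a file under `.azureml/`, one of the locations the SDK searches. The helper name `write_workspace_config` and all values shown are placeholders, not part of this repo; the portal also offers a ready-made `config.json` download that serves the same purpose.

```python
import json
from pathlib import Path


def write_workspace_config(subscription_id, resource_group, workspace_name,
                           directory="."):
    """Write an Azure ML workspace config file and return its path."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    target = Path(directory) / ".azureml" / "config.json"
    # Create the .azureml folder if it is not there yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2))
    return target
```

For example, `write_workspace_config("<subscription-id>", "<resource-group>", "<workspace-name>")` creates `./.azureml/config.json` in the current directory.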

Run the notebook in your workspace

This chapter uses the cloud notebook server in your workspace for an install-free and pre-configured experience. Use your own environment if you prefer to have control over your environment, packages and dependencies.

Follow along with this video or use the detailed steps below to clone and run the tutorial from your workspace.

Watch the video

2. Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

3. References

  1. Jupyter Notebook on Azure
  2. Introduction to Azure Notebooks
  3. GATK
  4. Picard
  5. Azure Machine Learning
  6. Azure Open Datasets
  7. Cromwell on Azure
  8. Bioconductor


