fanyucai1/PopGen

Get started with PopGen

Illumina DRAGEN Resources for PopGen:https://developer.illumina.com/dragen/dragen-popgen

Accurate and efficient calling of small and large variants from popgen datasets using the DRAGEN Bio-IT Platform

DRAGEN reanalysis of the 1000 Genomes Dataset now available on the Registry of Open Data

UKBB Command Line for DRAGEN

dragen -r <hg38-ref-dir> \
--bam-input <input BAM file> \
--output-directory <out-dir> \
--output-file-prefix <prefix> \
--enable-variant-caller=true \
--vc-emit-ref-confidence=GVCF \
--vc-enable-vcf-output=true \
--enable-duplicate-marking=true \
--enable-map-align=true \
--enable-map-align-output=true \
--output-format=CRAM \
--vc-hard-filter 'DRAGENHardQUAL:all:QUAL<5.0;LowDepth:all:DP<=1' \
--vc-frd-max-effective-depth=40 \
--qc-cross-cont-vcf <path-to>/SNP_NCBI_GRCh38.vcf \
--qc-coverage-region-1 <path-to>/wgs_coverage_regions.hg38_minus_N.interval_list.bed \
--qc-coverage-reports-1 cov_report \
--qc-coverage-region-2 <path-to>/acmg59_allofus_19dec2019.GRC38.wGenes.NEW.bed \
--qc-coverage-reports-2 cov_report \
--qc-coverage-ignore-overlaps=true \
--qc-coverage-count-soft-clipped-bases=true \
--read-trimmers polyg \
--soft-read-trimmers none \
--intermediate-results-dir=/ephemeral/staging/tmp/ \
--repeat-genotype-enable=true \
--enable-cyp2d6=true \
--enable-sv=true \
--enable-cnv=true \
--cnv-enable-self-normalization=true \
--vc-enable-joint-detection=true
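When the command above has to be run over many samples, it can help to generate the per-sample invocations first and review them before anything is executed. The following is a minimal dry-run sketch, not part of the Illumina documentation: `build_dragen_cmd`, `REF_DIR`, and `OUT_ROOT` are hypothetical names introduced here, the output layout (one directory per sample) is an assumption, and only a few of the options listed above are shown. The remaining options would be appended in the same way.

```shell
#!/bin/sh
# Dry-run sketch: print one DRAGEN command line per input BAM.
# Nothing is executed; the printed lines are meant to be reviewed first.
set -eu

# Hypothetical settings; the defaults reuse the placeholders from the command above.
REF_DIR="${REF_DIR:-<hg38-ref-dir>}"
OUT_ROOT="${OUT_ROOT:-<out-dir>}"

# build_dragen_cmd BAM
# Derives the output prefix from the BAM file name (assumption: one BAM per
# sample, named <sample>.bam) and prints the matching command line.
# The body runs in a subshell so its variables stay local.
build_dragen_cmd() (
  bam="$1"
  prefix="$(basename "$bam" .bam)"
  printf 'dragen -r %s --bam-input %s --output-directory %s/%s --output-file-prefix %s --enable-variant-caller=true --vc-emit-ref-confidence=GVCF\n' \
    "$REF_DIR" "$bam" "$OUT_ROOT" "$prefix" "$prefix"
)

# Print a command for every BAM path given on the command line.
for bam in "$@"; do
  build_dragen_cmd "$bam"
done
```

Saved under a name of your choice (for example `dragen_dry_run.sh`), the script could be called as `REF_DIR=/refs/hg38 OUT_ROOT=/results sh dragen_dry_run.sh /data/*.bam`. Because the sketch does no shell quoting, paths containing spaces would need extra handling before the printed lines are run.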

PopGen data processing and analysis workflows using the DRAGEN Platform (left) and GATK Best Practices (right)

png/dragen-popgen.png

Global genomic biobanks and studies

png/cohort-studies.png

Carress H, Lawson D J, Elhaik E. Population genetic considerations for using biobanks as international resources in the pandemic era and beyond[J]. BMC genomics, 2021, 22: 1-19.

Tanjo T, Kawai Y, Tokunaga K, et al. Practical guide for managing large-scale human genome data in research[J]. Journal of Human Genetics, 2021, 66(1): 39-52.

png/genetic-data-sept2022.jpg

UK Biobank Allele Frequency Browser:https://afb.ukbiobank.ac.uk/

UK Biobank Whole-Genome Sequencing Consortium, Li S, Carss K J, et al. Whole-genome sequencing of half-a-million UK Biobank participants[J]. medRxiv, 2023: 2023.12.06.23299426.

Halldorsson B V, Eggertsson H P, Moore K H S, et al. The sequences of 150,119 genomes in the UK Biobank[J]. Nature, 2022, 607(7920): 732-740.

WGS-based genome study of patients with rare diseases, their families, and cancer patients in England.

Cancer_rare-disease_Covid-19

png/all_of_us.png

All of Us Research Program Genomics Investigators. Genomic data in the All of Us Research Program[J]. Nature.

Mahmoud M, Huang Y, Garimella K, et al. Utility of long-read sequencing for All of Us[J]. bioRxiv, 2023: 2023.01.23.525236.

Lennon N J, Kottyan L C, Kachulis C, et al. Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations[J]. Nature Medicine, 2024: 1-8.

png/Singapore.png

SG10K:Wu D, Dou J, Chai X, et al. Large-scale whole-genome sequencing of three diverse Asian populations in Singapore[J]. Cell, 2019, 179(3): 736-749. e15.

SG10K_Health:Wong E, Bertin N, Hebrard M, et al. The Singapore National Precision Medicine Strategy[J]. Nature Genetics, 2023, 55(2): 178-186.

SG10K_med:Chan S H, Bylstra Y, Teo J X, et al. Analysis of clinically relevant variants from ancestrally diverse Asian genomes[J]. Nature Communications, 2022, 13(1): 6694.

Precision Medicine Research Highlights:https://www.npm.sg/research/research-highlights/

Call for Proposals – Driver Projects for the PRECISE-SG100K Dataset:https://www.npm.sg/research/call-for-proposals/

“Call for Proposals” meeting:https://file.for.sg/sg100k-cfp.pdf

China

ChinaMAP:Cao Y, Li L, Xu M, et al. The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals[J]. Cell research, 2020, 30(9): 717-731.

Westlake BioBank:Cong P K, Bai W Y, Li J C, et al. Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project[J]. Nature Communications, 2022, 13(1): 2939.

Tian Z, Chen F, Wang J, et al. CAS Array: design and assessment of a genotyping array for Chinese biobanking[J]. Precision Clinical Medicine, 2023, 6(1): pbad002.

Zhang P, Luo H, Li Y, et al. NyuWa Genome resource: a deep whole-genome sequencing-based variation profile and reference panel for the Chinese population[J]. Cell reports, 2021, 37(7).

png/Indigenomes.jpg

Jain A, Bhoyar R C, Pandhare K, et al. IndiGenomes: a comprehensive resource of genetic variants from over 1000 Indian genomes[J]. Nucleic Acids Research, 2021, 49(D1): D1225-D1232.

png/Qatar_genome.jpg

Mbarek H, Devadoss Gandhi G, Selvaraj S, et al. Qatar genome: Insights on genomics from the Middle East[J]. Human mutation, 2022, 43(4): 499-510.

Qatar Genome Program is about to enter a new era thanks to Illumina #DRAGEN.

TaiwanGenomes:https://genomes.tw/#/

png/TaiwanGenomes.jpg

Hsu J S, Wu D C, Shih S H, et al. Complete genomic profiles of 1,496 Taiwanese reveal curated medical insights[J]. Journal of Advanced Research, 2023.

Determining human genetic variation by whole-genome sequencing at population scale.

Byrska-Bishop M, Evani U S, Zhao X, et al. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios[J]. Cell, 2022, 185(18): 3426-3440. e19.

1KG Project reference panel:http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/

GenomeAsia 100K Project:https://genomeasia100k.org/

WGS-based genome study of people in South and East Asia.

The GenomeAsia 100K Project enables genetic discoveries across Asia[J]. Nature, 2019, 576(7785): 106-111.

Mexico City

Ziyatdinov A, Torres J, Alegre-Díaz J, et al. Genotyping, sequencing and analysis of 140,000 adults from Mexico City[J]. Nature, 2023, 622(7984): 784-793.

Nationwide biobank and genome cohort study in Finland.

Kurki M I, Karjalainen J, Palta P, et al. FinnGen provides genetic insights from a well-phenotyped isolated population[J]. Nature, 2023, 613(7944): 508-518.

Simons Genome Diversity Project

Mallick S, Li H, Lipson M, et al. The Simons genome diversity project: 300 genomes from 142 diverse populations[J]. Nature, 2016, 538(7624): 201-206.

East Asian populations

Choi J, Kim S, Kim J, et al. A whole-genome reference panel of 14,393 individuals for East Asian populations accelerates discovery of rare functional variants[J]. Science Advances, 2023, 9(32): eadg6319.

Global Biobank Meta-analysis Initiative:https://www.globalbiobankmeta.org

png/GBMI.png

Zhou W, Kanai M, Wu K H H, et al. Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease[J]. Cell Genomics, 2022, 2(10).

Method
