IARCbioinfo/bam2peaks-nf

bam2peaks-nf

Nextflow pipeline for peak calling with MACS

Workflow representation

Description

Nextflow pipeline designed for peak calling using MACS and IDR, coupled with QC generation using deeptools. The saturation option generates peaks by successively considering increasing percentages of the total reads, repeating the operation multiple times within the range of 0.05 to 0.95.
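The saturation idea can be illustrated with a minimal Python sketch. It is not the pipeline's own code: the step size (0.10), the number of repetitions (3), the function names and the in-memory random subsampling are all assumptions made for illustration. Only the 0.05 to 0.95 range comes from the description above.

```python
import random

def saturation_fractions(start=0.05, stop=0.95, step=0.10):
    """Return increasing read fractions from start to stop (inclusive).

    The 0.05-0.95 range comes from the README; the step is an assumption.
    """
    n = int(round((stop - start) / step))
    return [round(start + i * step, 2) for i in range(n + 1)]

def subsample(read_ids, fraction, seed):
    """Randomly keep about `fraction` of the reads, reproducibly for a given seed."""
    rng = random.Random(seed)
    k = int(round(len(read_ids) * fraction))
    return rng.sample(read_ids, k)

# Toy data standing in for the reads of one BAM file (hypothetical).
reads = [f"read{i}" for i in range(1000)]

for frac in saturation_fractions():
    for rep in range(3):  # hypothetical number of repetitions per fraction
        subset = subsample(reads, frac, seed=rep)
        # In a real run, peaks would be called on each subset at this point.
        print(frac, rep, len(subset))
```

Plotting the number of peaks against the read fraction then shows whether peak discovery has levelled off at the available sequencing depth.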

Dependencies

  1. Nextflow : for common installation procedures see the IARC-nf repository.

MACS

  1. MACS2 or MACS3

IDR (Irreproducible Discovery Rate)

  1. IDR

Deeptools

  1. Deeptools

A conda recipe, as well as Docker and Singularity containers, are available with all the tools needed to run the pipeline (see "Usage").

Input

Type Description
--input_file Tab-separated values file with the columns sample (sample name), tag (short name for figures), bam (BAM file path) and group (group). In chip mode, an input column is also required: 0 for normal samples and 1 for input samples.

e.g.:

sample tag bam group input
SAM015 S15 S15.bam 1 0
SAM016 S16 S16.bam 1 0
SAM010 S10 S10.bam 1 1
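Before launching a run, it can help to check the input table. The following Python sketch is one possible check, not part of the pipeline: the function name `validate_input_table` and the error messages are hypothetical, and the rules it applies are only those stated above (four required columns, plus an input column holding 0 or 1 in chip mode). Requiring at least one input sample in chip mode is an interpretation of the `--mode` description.

```python
import csv
import io

REQUIRED = ["sample", "tag", "bam", "group"]

def validate_input_table(handle, mode="atac"):
    """Return a list of problems found in the tab-separated input table."""
    reader = csv.DictReader(handle, delimiter="\t")
    columns = reader.fieldnames or []
    required = REQUIRED + (["input"] if mode == "chip" else [])
    problems = [f"missing column: {c}" for c in required if c not in columns]
    if problems:
        return problems
    rows = list(reader)
    if mode == "chip":
        # Data rows start on line 2, after the header.
        for line_no, row in enumerate(rows, start=2):
            if row["input"] not in ("0", "1"):
                problems.append(f"line {line_no}: input must be 0 or 1")
        if not any(row["input"] == "1" for row in rows):
            problems.append("chip mode needs at least one input sample (input = 1)")
    return problems

# The example table from this README, written with real tab characters.
example = (
    "sample\ttag\tbam\tgroup\tinput\n"
    "SAM015\tS15\tS15.bam\t1\t0\n"
    "SAM016\tS16\tS16.bam\t1\t0\n"
    "SAM010\tS10\tS10.bam\t1\t1\n"
)
print(validate_input_table(io.StringIO(example), mode="chip"))  # prints []
```

For a file on disk, the same function can be called with `open("input.tsv", newline="")` as the handle.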

Parameters

  • Mandatory

Name Example value Description
--ref hg38 Reference fasta file (hg19, hg38 or mm10)
--gencode gencode.bed gencode file
  • Optional

Name Default value Description
--mode atac Two modes are available: atac or chip. The chip mode requires "input" sample(s)
--output_folder bam2peaks Output folder name
--cpu 16 number of CPUs
--mem 16 memory
--extsize 150 MACS extsize: size (in bp) to which reads are extended to obtain fixed-size fragments
  • Flags

Flags are special parameters without value.

Name Description
--help print usage and optional parameters
--broad Compute broadpeaks instead of narrowpeaks
--ignoreDuplicates Ignore duplicate reads
--saturation Run saturation process

Usage

To run the pipeline for ATAC, one can type:

nextflow run iarcbioinfo/bam2peaks-nf -r latest -profile singularity --input_file input.tsv --ref hg38 --gencode gencode.bed --output_folder output --ignoreDuplicates

To run the pipeline without Singularity, just remove "-profile singularity". Alternatively, one can run the pipeline using a Docker container (-profile docker) or the conda recipe containing all required dependencies (-profile conda).

ChIP-seq mode

To use the pipeline for ChIP-seq, set --mode chip and make sure the input file contains the input column:

nextflow run iarcbioinfo/bam2peaks-nf -r latest -profile singularity --input_file input.tsv --ref hg38 --gencode gencode.bed --output_folder output --mode chip --broad --extsize 320

Output

Type Description
bw/ Outputs of bamCoverage in bigWig format
Counts/ With --saturation, the number of reads in each subset
Peaks Peaks computed by MACS
Peaks_intersect Peak intersections computed by IDR
QCs Deeptools graphics
Saturation_peaks With --saturation, all peak files for each subset
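After a run, the layout above can be checked with a short Python sketch. It is a hypothetical helper, not something shipped with the pipeline. It assumes that every entry in the table is a sub-directory of the output folder, and that Counts and Saturation_peaks appear only when --saturation is used. Whether Peaks_intersect is produced for every design is also an assumption.

```python
import tempfile
from pathlib import Path

# Directory names taken from the output table in this README.
BASE_DIRS = ["bw", "Peaks", "Peaks_intersect", "QCs"]
SATURATION_DIRS = ["Counts", "Saturation_peaks"]

def missing_outputs(output_folder, saturation=False):
    """Return the expected sub-directories that are absent from output_folder."""
    expected = BASE_DIRS + (SATURATION_DIRS if saturation else [])
    root = Path(output_folder)
    return [name for name in expected if not (root / name).is_dir()]

# Demonstration on a temporary folder that mimics a run without --saturation.
with tempfile.TemporaryDirectory() as tmp:
    for name in BASE_DIRS:
        (Path(tmp) / name).mkdir()
    print(missing_outputs(tmp))                   # prints []
    print(missing_outputs(tmp, saturation=True))  # prints ['Counts', 'Saturation_peaks']
```

With the commands shown in "Usage", the folder to check would be the one passed to --output_folder.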

Contributions

Name Email Description
Vincent Cahais CahaisV@iarc.who.int Developer to contact for support
Claire Renard Renardc@iarc.who.int Developer
