schoolwiki

Using Nextflow

Required Tools

This pipeline uses Nextflow, a bioinformatics workflow tool, and Singularity, a containerization tool.

Make sure both tools are installed before running this pipeline. If you are running on an HPC cluster, load the required modules:

module load nextflow/20.01.0 singularity/3.5.3
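To confirm that both tools are actually reachable after loading the modules, a quick check along the following lines can help. This is only a sketch: the helper name check_tools is hypothetical and not part of the pipeline, and it only tests that the commands are on PATH, not that their versions match the modules above.

```shell
# Report whether each named tool can be found on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The pipeline needs both tools; on a cluster, run `module load` first.
check_tools nextflow singularity || echo "Install or load the missing tools before running the pipeline."
```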

Create Input Directory

The SCHOOL workflow can run in several configurations. In every case, the input FASTQ files must be placed in a fastq directory along with a design file.

cd school/
mkdir -p ./fastq
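Because the FASTQ files and the design file have to sit in the same directory, it can be worth checking that every file named in the design file is really there before starting a run. The sketch below assumes the design file is called design.txt and uses the FqR1 and FqR2 column names from the design file examples on this page. The function name missing_fastqs is hypothetical.

```shell
# Print every file named in the FqR1/FqR2 columns of <dir>/design.txt
# that does not exist in <dir>. Prints nothing when all files are present.
# Usage: missing_fastqs <fastq dir>
missing_fastqs() {
    dir=$1
    awk -F '\t' '
        # Header row: remember which columns hold FASTQ file names.
        NR == 1 {
            for (i = 1; i <= NF; i++)
                if ($i == "FqR1" || $i == "FqR2") col[i] = 1
            next
        }
        # Data rows: emit the file names found in those columns.
        { for (i in col) if ($i != "") print $i }
    ' "$dir/design.txt" | while IFS= read -r fq; do
        [ -e "$dir/$fq" ] || echo "$fq"
    done
}

# Example: missing_fastqs ./fastq
```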

Running Nextflow Workflows

This workflow can analyze either DNA or RNA sequencing data. Choose the configuration that matches your data so that the correct analysis is run.

DNA Workflow

DNA Design File

The design file must be named design.txt and be in tab-separated format. The DNA workflow can be run tumor-only or with tumor and normal pairs. If running tumor-only, do not include the NormalID column in the design file.

SampleID CaseID TumorID NormalID FqR1 FqR2
Sample1 Fam1 Sample1 Sample2 Sample1.R1.fastq.gz Sample1.R2.fastq.gz
Sample2 Fam1 Sample1 Sample2 Sample2.R1.fastq.gz Sample2.R2.fastq.gz
Sample3 Fam2 Sample3 Sample4 Sample3.R1.fastq.gz Sample3.R2.fastq.gz
Sample4 Fam2 Sample3 Sample4 Sample4.R1.fastq.gz Sample4.R2.fastq.gz
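A design file that is separated by spaces instead of tabs, or that is missing a column, is an easy mistake to make. The following sketch checks the header and the field count of each row with awk. The function name check_design is hypothetical, and the required columns are passed as arguments, so NormalID can be left out for a tumor-only run.

```shell
# Check that a design file is tab separated, contains the required
# header columns, and has as many fields in every row as in the header.
# Usage: check_design <design.txt> <required column>...
check_design() {
    design=$1
    shift
    awk -F '\t' -v required="$*" '
        NR == 1 {
            ncol = NF
            for (i = 1; i <= NF; i++) seen[$i] = 1
            n = split(required, req, " ")
            for (i = 1; i <= n; i++)
                if (!(req[i] in seen)) {
                    print "missing column: " req[i]
                    bad = 1
                }
            next
        }
        NF == 0 { next }                 # ignore blank lines
        NF != ncol {
            print "line " NR ": expected " ncol " fields, found " NF
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$design"
}

# Tumor and normal pairs:
#   check_design fastq/design.txt SampleID CaseID TumorID NormalID FqR1 FqR2
# Tumor-only:
#   check_design fastq/design.txt SampleID CaseID TumorID FqR1 FqR2
```

The function prints one line per problem and returns a non-zero status if any problem was found, so it can gate a run script.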

DNA Parameters

DNA Run Workflow

nextflow run -w $workdir ${baseDir}/dna.nf \
  --input /project/shared/bicf_workflow_ref/workflow_testdata/germline_variants/fastq \
  --output ${baseDir}/output \
  --seqrunid 'SHI1333-27' \
  --pon /project/shared/bicf_workflow_ref/human/grch38_cloud/panels/UTSW_V4_heme/mutect2.pon.vcf.gz \
  --capture /project/shared/bicf_workflow_ref/human/grch38_cloud/panels/UTSW_V4_heme/targetpanel.bed \
  --capturedir /project/shared/bicf_workflow_ref/human/grch38_cloud/panels/UTSW_V4_heme \
  --version $gittag \
  --genome /project/shared/bicf_workflow_ref/human/grch38_cloud/dnaref \
  -resume
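The run commands on this page rely on the shell variables $workdir, ${baseDir} and $gittag, which are not defined anywhere in this guide. One possible way to set them before calling nextflow is sketched below. Every value here is an assumption: it treats the current directory as the checkout of the workflow repository, puts the Nextflow work directory under it, and derives the version string from git. Adjust each line to your own setup.

```shell
# Assumed: the current directory is the root of the cloned workflow repository.
baseDir=$(pwd)

# Assumed: Nextflow work files go in a "work" directory under the repository.
workdir="$baseDir/work"

# Assumed: the version string is the current git tag (or commit id if untagged);
# falls back to "unknown" outside a git repository.
gittag=$(git -C "$baseDir" describe --tags --always 2>/dev/null || echo unknown)

echo "baseDir=$baseDir workdir=$workdir gittag=$gittag"
```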

RNASeq Workflow

The RNA workflow can be run with the whole genome or with a specific list of genes of interest.

RNA Design File

The design file must be named design.txt and be in tab-separated format. All RNA workflows can be run using the same design file format.

SampleID CaseID FqR1 FqR2
Sample1 Fam1 Sample1.R1.fastq.gz Sample1.R2.fastq.gz
Sample2 Fam1 Sample2.R1.fastq.gz Sample2.R2.fastq.gz
Sample3 Fam2 Sample3.R1.fastq.gz Sample3.R2.fastq.gz
Sample4 Fam2 Sample4.R1.fastq.gz Sample4.R2.fastq.gz
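When the FASTQ files follow the naming used in the example above (SampleID.R1.fastq.gz and SampleID.R2.fastq.gz), a first draft of the RNA design file can be generated from the directory listing. This is a sketch under that naming assumption. The function name make_rna_design is hypothetical, and because the CaseID cannot be inferred from the file names, a single value is passed in and should be edited by hand afterwards where samples belong to different cases.

```shell
# Print a tab-separated RNA design table with one row for every
# <SampleID>.R1.fastq.gz that has a matching R2 file in the directory.
# Usage: make_rna_design <fastq dir> <CaseID>
make_rna_design() {
    fastq_dir=$1
    case_id=$2
    printf 'SampleID\tCaseID\tFqR1\tFqR2\n'
    for r1 in "$fastq_dir"/*.R1.fastq.gz; do
        [ -e "$r1" ] || continue         # the glob matched nothing
        sample=$(basename "$r1" .R1.fastq.gz)
        r2="$sample.R2.fastq.gz"
        if [ ! -e "$fastq_dir/$r2" ]; then
            echo "skipping $sample: $r2 not found" >&2
            continue
        fi
        printf '%s\t%s\t%s\t%s\n' "$sample" "$case_id" "$sample.R1.fastq.gz" "$r2"
    done
}

# Example: make_rna_design ./fastq Fam1 > ./fastq/design.txt
```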

RNA Parameters

RNA Run Workflow

Whole-genome RNA-seq example

nextflow run -w $workdir ${baseDir}/rna.nf \
  --input /project/shared/bicf_workflow_ref/workflow_testdata/rnaseq/fastq \
  --output ${baseDir}/output \
  --seqrunid 'test' \
  --version $gittag \
  --genome /project/shared/bicf_workflow_ref/human/grch38_cloud/rnaref \
  --geneinfo /project/shared/bicf_workflow_ref/human/gene_info.human.txt \
  -resume

Genes of interest (GOI) RNA-seq example

nextflow run -w $workdir ${baseDir}/rna.nf \
  --input /project/shared/bicf_workflow_ref/workflow_testdata/rnaseq/fastq \
  --output ${baseDir}/output \
  --seqrunid 'test' \
  --version $gittag \
  --genome /project/shared/bicf_workflow_ref/human/grch38_cloud/rnaref \
  --geneinfo /project/shared/bicf_workflow_ref/human/gene_info.human.txt \
  --glist /project/shared/bicf_workflow_ref/human/grch38_cloud/panels/UTSW_V4_heme/genelist.txt \
  -resume