
Tutorial Main

danielm710 edited this page Nov 19, 2020 · 8 revisions

AXIOME3-GUI Usage Tutorial

Input preparation

Prior to following this tutorial, please download input files (FASTQ files, manifest file, metadata file) here

After downloading input files for the tutorial, make sure to...

  1. Place the FASTQ files in the directory of your choice

    Make sure the FASTQ files are on the same machine where the AXIOME3_GUI app is installed

    If you installed the app on a local computer, store the data on the local computer

    If you installed the app on a remote server, store the data on the remote server

  2. Modify 'sample_manifest.txt' file

    sample-id,absolute-filepath,direction
    sample1,ABSOLUTE_PATH_TO_DIR/sample1_forward.fastq.gz,forward
    sample1,ABSOLUTE_PATH_TO_DIR/sample1_reverse.fastq.gz,reverse
    sample2,ABSOLUTE_PATH_TO_DIR/sample2_forward.fastq.gz,forward
    sample2,ABSOLUTE_PATH_TO_DIR/sample2_reverse.fastq.gz,reverse
    sample3,ABSOLUTE_PATH_TO_DIR/sample3_forward.fastq.gz,forward
    sample3,ABSOLUTE_PATH_TO_DIR/sample3_reverse.fastq.gz,reverse
    

    Replace ABSOLUTE_PATH_TO_DIR with the path to the directory that contains the FASTQ files. If you installed the app on a remote server, the FASTQ files MUST be stored on that server, and ABSOLUTE_PATH_TO_DIR must be the path to the directory on the server.

    For example, if you place the FASTQ files in /home/some_user/my_dir, the manifest file would look like the following:

    sample-id,absolute-filepath,direction
    sample1,/home/some_user/my_dir/sample1_forward.fastq.gz,forward
    sample1,/home/some_user/my_dir/sample1_reverse.fastq.gz,reverse
    sample2,/home/some_user/my_dir/sample2_forward.fastq.gz,forward
    sample2,/home/some_user/my_dir/sample2_reverse.fastq.gz,reverse
    sample3,/home/some_user/my_dir/sample3_forward.fastq.gz,forward
    sample3,/home/some_user/my_dir/sample3_reverse.fastq.gz,reverse
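
The placeholder replacement described above can also be scripted. The following is a minimal sketch, not part of AXIOME3: the function name `fill_manifest` is hypothetical, and it assumes a POSIX-style absolute path and the placeholder text ABSOLUTE_PATH_TO_DIR used in the tutorial's sample_manifest.txt.

```python
PLACEHOLDER = "ABSOLUTE_PATH_TO_DIR"  # placeholder text from the tutorial manifest


def fill_manifest(template_text: str, fastq_dir: str) -> str:
    """Replace the placeholder with an absolute directory path.

    Hypothetical helper for illustration; assumes POSIX-style paths.
    """
    if not fastq_dir.startswith("/"):
        raise ValueError("fastq_dir must be an absolute path")
    # Drop a trailing slash so the result does not contain '//'
    directory = fastq_dir.rstrip("/")
    return template_text.replace(PLACEHOLDER, directory)


template = (
    "sample-id,absolute-filepath,direction\n"
    "sample1,ABSOLUTE_PATH_TO_DIR/sample1_forward.fastq.gz,forward\n"
    "sample1,ABSOLUTE_PATH_TO_DIR/sample1_reverse.fastq.gz,reverse\n"
)
filled = fill_manifest(template, "/home/some_user/my_dir/")
print(filled)
```

To use it on the real file, read sample_manifest.txt into a string, pass it to the function, and write the result back out.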
    

Input Upload

Input(s)

  1. Manifest file
  2. FASTQ files (stored on the same machine where the AXIOME3_GUI app is installed)

Steps

  1. Upload manifest file

  2. Choose options

    1. Sample Type: Currently, AXIOME3 only supports paired-end reads

    2. Input Format: The format name has three parts: [Single-end or paired-end] + [Phred encoding type] + [Manifest format]

      • PairedEndFastqManifestPhred33: paired-end FASTQ file with Phred+33 encoding and default manifest format
      • PairedEndFastqManifestPhred33V2: paired-end FASTQ file with Phred+33 encoding and manifest format V2
      • PairedEndFastqManifestPhred64: paired-end FASTQ file with Phred+64 encoding and default manifest format
      • PairedEndFastqManifestPhred64V2: paired-end FASTQ file with Phred+64 encoding and manifest format V2

      Most modern FASTQ sequence files are Phred+33 encoded.

      - Default manifest file format -
      
      sample-id,absolute-filepath,direction
      sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,forward
      sample_1,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz,reverse
      sample_2,/SOME_FOLDER/SAMPLE_2_R1.fastq.gz,forward
      sample_2,/SOME_FOLDER/SAMPLE_2_R2.fastq.gz,reverse
      
      - Manifest file format V2 -
      
      sample-id,forward-absolute-filepath,reverse-absolute-filepath
      sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz
      sample_2,/SOME_FOLDER/SAMPLE_2_R1.fastq.gz,/SOME_FOLDER/SAMPLE_2_R2.fastq.gz
      
      Note the different column headers between the default and V2 formats
      

      For the tutorial, we will be using PairedEndFastqManifestPhred33 for the Input Format option.

    3. multiple run: If you want to analyze multiple sequencing runs that were not sequenced together, set this option to Yes

      To use this option, add a run_ID (case sensitive) column to the manifest file, as in the examples below

      - Default manifest file format with multiple run -
      
      sample-id,absolute-filepath,direction,run_ID
      sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,forward,myRun1
      sample_1,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz,reverse,myRun1
      sample_2,/SOME_FOLDER/SAMPLE_2_R1.fastq.gz,forward,myRun2
      sample_2,/SOME_FOLDER/SAMPLE_2_R2.fastq.gz,reverse,myRun2
      
      - Manifest file format V2 with multiple run -
      
      sample-id,forward-absolute-filepath,reverse-absolute-filepath,run_ID
      sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz,myRun1
      sample_2,/SOME_FOLDER/SAMPLE_2_R1.fastq.gz,/SOME_FOLDER/SAMPLE_2_R2.fastq.gz,myRun2
      
      Note the different column headers between the default and V2 formats
      

      With the multiple run option, samples that share the same run_ID are analyzed as a separate group instead of all samples being analyzed together.

      For the tutorial, the multiple run option should be No

    When you are done uploading the input file and choosing options, it should look like the image below.

    Input upload step 1

  3. Click ANALYZE! to start analysis.

When it's done running, it should look like the following:

Input upload done

You may click on View Report to see the summary report.

Input upload report
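
Before uploading, it can help to confirm that the manifest header matches the Input Format you plan to select and whether it carries the run_ID column needed for the multiple run option. The following is a minimal sketch based only on the column headers shown in this tutorial; `describe_manifest` is a hypothetical helper, not an AXIOME3 function, and it inspects the header row only.

```python
import csv
import io

# Column headers as shown in the tutorial's manifest examples
DEFAULT_COLUMNS = ["sample-id", "absolute-filepath", "direction"]
V2_COLUMNS = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def describe_manifest(text):
    """Return (format_name, has_run_id) from the manifest header row.

    Hypothetical helper for illustration; file paths are not checked.
    """
    header = next(csv.reader(io.StringIO(text)))
    has_run_id = "run_ID" in header  # case sensitive, as the tutorial notes
    core = [column for column in header if column != "run_ID"]
    if core == DEFAULT_COLUMNS:
        return "default", has_run_id
    if core == V2_COLUMNS:
        return "V2", has_run_id
    raise ValueError(f"Unrecognized manifest header: {header}")


default_manifest = (
    "sample-id,absolute-filepath,direction\n"
    "sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,forward\n"
    "sample_1,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz,reverse\n"
)
v2_multi_run_manifest = (
    "sample-id,forward-absolute-filepath,reverse-absolute-filepath,run_ID\n"
    "sample_1,/SOME_FOLDER/SAMPLE_1_R1.fastq.gz,/SOME_FOLDER/SAMPLE_1_R2.fastq.gz,myRun1\n"
)
print(describe_manifest(default_manifest))
print(describe_manifest(v2_multi_run_manifest))
```

A "default" result corresponds to the PairedEndFastqManifestPhred33 or PairedEndFastqManifestPhred64 options, and a "V2" result to the options ending in V2.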

Denoise

Input(s)

  1. Manifest file
  2. FASTQ files (stored on the same machine where the AXIOME3_GUI app is installed)
  3. QIIME2 sequence visualization (.qzv) (output of Input Upload module)

Note that the manifest file and FASTQ files are the same inputs as for the 'Input Upload' module

Steps

  1. Prepare manifest file and FASTQ files as described in 'Input Upload'

  2. Choose options (Denoise module has 4 additional options compared to Input Upload module)

    1. trim-left-f: Position at which the forward sequence should be trimmed, starting at the 5' end
    2. trim-left-r: Position at which the reverse sequence should be trimmed, starting at the 5' end
    3. trunc-len-f: Position at which the forward sequence should be truncated, removing bases toward the 3' end. Forward and reverse reads should still overlap by at least 20 nucleotides after truncation
    4. trunc-len-r: Position at which the reverse sequence should be truncated, removing bases toward the 3' end. Forward and reverse reads should still overlap by at least 20 nucleotides after truncation
    5. cores: Number of cores to use (using more cores will also use more RAM due to job parallelization)

    trim-left-f and trim-left-r can be used to remove sequence artifacts (barcode sequence, adapter, and so forth) at the beginning of the sequences (the first n bases are removed).

    trunc-len-f and trunc-len-r can be used to remove low quality regions at the end of the sequences (Bases from the n-th base to the end of the sequences are removed).

    You may use QIIME2 View to determine low quality regions in the sequences (you will need QIIME2 sequence visualization (.qzv), which is the output of the Input Upload module)

    The image below illustrates these options with a hypothetical amplicon sequence.

    Denoise option example

    For the tutorial, we will use the following values.

    1. trim-left-f: 19 (First 19 bases in the forward reads are adapter sequences, hence remove them)
    2. trim-left-r: 21 (First 21 bases in the reverse reads are adapter sequences, hence remove them)
    3. trunc-len-f: 250 (No bases are removed since the forward read has good quality scores at all bases)
    4. trunc-len-r: 240 (Last 10 bases removed; Read length is 250 and there are 10 bases between the 240th base and the end of the sequence)
  3. Click ANALYZE! to start analysis.

This module takes some time to run. When it's done running, it should look like the following:

Denoise done

You may click on View Report to see the summary report.

Denoise report
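
The effect of the trim and truncation values chosen above can be checked with simple arithmetic. The following is a minimal sketch, not the denoising code itself: it treats a read as a plain string, and the fragment length of 460 used for the overlap estimate is a hypothetical value chosen for illustration (the tutorial does not state the amplicon length). The overlap estimate assumes both reads start at opposite ends of the fragment.

```python
def trim_and_truncate(read: str, trim_left: int, trunc_len: int) -> str:
    """Drop the first trim_left bases and everything after base trunc_len."""
    return read[trim_left:trunc_len]


def expected_overlap(fragment_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Bases covered by both reads after truncation.

    Assumes the forward and reverse reads start at opposite ends of the fragment.
    """
    return trunc_len_f + trunc_len_r - fragment_len


read = "A" * 250  # stand-in for a 250-base read

forward_kept = trim_and_truncate(read, trim_left=19, trunc_len=250)
reverse_kept = trim_and_truncate(read, trim_left=21, trunc_len=240)
print(len(forward_kept))  # 231 bases kept from the forward read
print(len(reverse_kept))  # 219 bases kept from the reverse read

# Hypothetical fragment length, for illustration only
overlap = expected_overlap(fragment_len=460, trunc_len_f=250, trunc_len_r=240)
print(overlap >= 20)
```

If the estimated overlap drops below 20 for your own data, consider truncating less aggressively.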

Analysis

Please download outputs of the previous section ('Denoise' module) prior to following this section.

Input(s)

  1. QIIME2 archived feature table (.qza)
    • output of Denoise module
    • should be named as merged_table.qza
  2. QIIME2 archived representative sequences (.qza)
    • output of Denoise module
    • should be named as merged_rep_seqs.qza
  3. Metadata file (refer to the sample metadata file)
  4. Classifier (optional): By default, AXIOME3 uses a Naive-Bayes classifier trained on the SILVA database (release 138).

Steps

  1. Upload inputs to the corresponding fields

    You may optionally upload a custom trained QIIME2 archived feature-classifier (if you don't upload it, AXIOME3 will use the default Naive-Bayes classifier)

  2. Choose options

    1. sampling depth: Samples with read count lower than this value will be discarded, and samples with read count higher than this value will be subsampled to this value

      You may refer to the report generated from the Denoise module to pick an appropriate sampling depth.

    2. cores: Number of cores to use (using more cores will also use more RAM due to job parallelization)

  3. After uploading inputs, start analysis!

Similar to Input Upload and Denoise modules, you may see the summary report once the job is done running.
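
To make the sampling depth option concrete, the following minimal sketch applies the rule described above to a small made-up feature table. It is not the AXIOME3 or QIIME2 implementation: the function name `rarefy`, the sample and feature names, and the counts are all hypothetical, and subsampling without replacement is an assumption.

```python
import random


def rarefy(table, depth, seed=0):
    """Discard samples below `depth`; subsample the rest down to `depth` reads.

    `table` maps sample ID -> {feature ID: read count}. Hypothetical helper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        if total < depth:
            continue  # read count lower than the sampling depth: discarded
        # Expand counts into individual reads, then draw without replacement
        pool = [feature for feature, n in counts.items() for _ in range(n)]
        drawn = rng.sample(pool, depth)
        result[sample] = {f: drawn.count(f) for f in counts if drawn.count(f) > 0}
    return result


# Made-up counts for illustration only
table = {
    "s1": {"featA": 5, "featB": 5},    # exactly 10 reads: kept as is
    "s2": {"featA": 2, "featB": 1},    # 3 reads: discarded at depth 10
    "s3": {"featA": 30, "featC": 10},  # 40 reads: subsampled to 10
}
rarefied = rarefy(table, depth=10)
print(sorted(rarefied))
print({sample: sum(counts.values()) for sample, counts in rarefied.items()})
```

A higher sampling depth keeps more reads per sample but discards more samples, which is why the Denoise report is useful for choosing the value.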

Extension

A further tutorial on Extension is available here