Skip to main content

SMRTLink/SMRTAnalysis

SMRTLink v10+ uses Cromwell workflow manager which offers additional flexibility and compatibility with SLURM. Commandline version of v10 does not depend on web GUI service and is always available.

There are two main steps involved - provide input parameters for your workflow and then submit the job via SLURM.

View the available workflows

module load smrtlink/13
$ pbcromwell show-workflows

cromwell.workflows.pb_detect_methyl: 5mC CpG Detection
cromwell.workflows.pb_ccs: Circular Consensus Sequencing (CCS)
cromwell.workflows.pb_demux_ccs: Demultiplex Barcodes
cromwell.workflows.pb_export_ccs: Export Reads
cromwell.workflows.pb_assembly_hifi: Genome Assembly
cromwell.workflows.pb_align_ccs: HiFi Mapping
cromwell.workflows.pb_target_enrichment: HiFi Target Enrichment
cromwell.workflows.pb_sars_cov2_kit: HiFiViral SARS-CoV-2 Analysis
cromwell.workflows.pb_isoseq: Iso-Seq Analysis
cromwell.workflows.pb_mark_duplicates: Mark PCR Duplicates
cromwell.workflows.pb_microbial_analysis: Microbial Genome Analysis
cromwell.workflows.pb_segment_reads: Read Segmentation
cromwell.workflows.pb_segment_reads_and_isoseq: Read Segmentation and Iso-Seq
cromwell.workflows.pb_segment_reads_and_sc_isoseq: Read Segmentation and Single-Cell Iso-Seq
cromwell.workflows.pb_sc_isoseq: Single-Cell Iso-Seq
cromwell.workflows.pb_sv_ccs: Structural Variant Calling
cromwell.workflows.pb_trim_adapters: Trim Ultra-Low Adapters
cromwell.workflows.pb_undo_demux: Undo Demultiplexing
cromwell.workflows.pb_variant_calling: Variant Calling

View input options for a workflow

Using Genome Assembly as an example -

$ pbcromwell show-workflow-details pb_assembly_hifi


Workflow Summary
Workflow Id    : cromwell.workflows.pb_assembly_hifi
Name           : Genome Assembly
Description    : Cromwell workflow pb_assembly_hifi
Required Inputs: 
Optional Inputs: ConsensusReadSet XML
Tags           : auto-analysis, analysis, assembly, ccs 
Task Options:
  reads = None
    reads (file)
  ipa2_genome_size = 0k
    Genome Length (string)
  ipa2_downsampled_coverage = 0
    Downsampled coverage (integer)
  ipa2_advanced_options = 
    Advanced Assembly Options (string)
  ipa2_run_polishing = True
    Run polishing (boolean)
  ipa2_run_phasing = True
    Run phasing (boolean)
  ipa2_run_purge_dups = True
    Purge duplicate contigs from the assembly (boolean)
  ipa2_ctg_prefix = ctg
    ipa2_ctg_prefix (string)
  ipa2_reads_db_prefix = reads
    ipa2_reads_db_prefix (string)
  ipa2_cleanup_intermediate_files = True
    Cleanup intermediate files (boolean)
  dataset_filters = 
    Filters to Add to the Data Set (string)
  filter_min_qv = 20
    Min. CCS Predicted Accuracy (Phred Scale) (integer)
  downsample_factor = 0
    Downsampling Factor (integer)
  mem_scale_factor = 8
    Memory Scale Factor (EXPERIMENTAL) (integer)
  add_memory_mb = 0
    Add task memory (MB) (integer)


Example Usage:

  $ pbcromwell run pb_assembly_hifi \

  $ pbcromwell run pb_assembly_hifi \
      -e input1.consensusreadset.xml \
      --task-option reads=None \
      --task-option ipa2_genome_size="0k" \
      --task-option ipa2_downsampled_coverage=0 \
      --task-option ipa2_advanced_options="" \
      --task-option ipa2_run_polishing=True \
      --task-option ipa2_run_phasing=True \
      --task-option ipa2_run_purge_dups=True \
      --task-option ipa2_ctg_prefix="ctg" \
      --task-option ipa2_reads_db_prefix="reads" \
      --task-option ipa2_cleanup_intermediate_files=True \
      --task-option dataset_filters="" \
      --task-option filter_min_qv=20 \
      --task-option downsample_factor=0 \
      --task-option mem_scale_factor=8 \
      --task-option add_memory_mb=0 \
      --config cromwell.conf \
      --nproc 8

Use cromwell config files for Ceres

As shown above, the pbcromwell run command requires a cromwell config file for the jobs to be submitted via SLURM. On ceres, the config files are avaiable in a central location. Users can point to the files directly or can copy and modify based on their individual requirements. The config files are located at

/system/smrtanalysis/10/slurm_template/cromwell-slurm-short.conf
/system/smrtanalysis/10/slurm_template/cromwell-slurm-medium.conf
/system/smrtanalysis/10/slurm_template/cromwell-slurm-mem.conf

The file names correspond to the partitions the jobs will be submitted to.

Priority users can copy those files to their work directory and modify the following (lines 130-131)

        runtime-attributes = """
        Int cpu = 8
        Int requested_memory_mb_per_core = 8000
        String queue_name = "short"
        String? jms_args
        """

to

        runtime-attributes = """
        Int cpu = 8
        Int requested_memory_mb_per_core = 8000
        String queue_name = "priority"
        String? jms_args = "--qos=your_QOS --time=14:00:00" 
        """

Users can also modify the CPU threads or memory per core values but these default values should suffice for most workflows.

Sample batch script

#!/bin/bash

#SBATCH -N 1 # No. of nodes used
#SBATCH -n 4      # Threads 
#SBATCH -t 240    # Minutes

module load smrtlink/10

pbcromwell run pb_assembly_hifi \
      -e input1.consensusreadset.xml \
      --task-option reads=None \                       # Task options vary based on the workflow
      --task-option ipa2_genome_size=0 \               # These task options are optional and will use default values if not specified
      --task-option ipa2_downsampled_coverage=0 \
      --task-option ipa2_advanced_options="" \
      --task-option ipa2_run_polishing=True \
      --task-option ipa2_run_phasing=True \
      --task-option ipa2_run_purge_dups=True \
      --task-option ipa2_ctg_prefix="ctg." \
      --task-option ipa2_reads_db_prefix="reads" \
      --task-option ipa2_cleanup_intermediate_files=True \
      --task-option dataset_filters="" \
      --task-option filter_min_qv=20 \
      --config /system/smrtanalysis/10/slurm_template/cromwell-slurm-short.conf \
      --nproc 8 \                                     # this option is required for some stages in the pipeline
      --backend SLURM \                               # Set the default backend
      --tmp-dir \${TMPDIR} \                          # Use TMPDIR variable
      -c 8 \                                          # Number of chunks
      --output-dir hifi-out			      #