Running a Pipeline

SeqDesk supports two launch contexts:

  • Study pipelines run across the selected samples of a study.
  • Order pipelines run on the linked sequencing files of samples in an order.

In both cases, SeqDesk prepares the package inputs automatically and executes the workflow either locally or on a SLURM cluster.

Common Prerequisites

Before running a pipeline:

  • Pipelines must be enabled in admin settings (pipelines.enabled: true)
  • The execution environment must be configured (local or SLURM)
  • You need the FACILITY_ADMIN role

Some packages require linked reads. For example, MAG, FASTQ Checksum, and FastQC require FASTQ files to already be linked. Simulate Reads is the exception because it generates read files instead of consuming existing ones.

Launching a Study Pipeline

Study pipelines are the right choice for workflows that combine multiple samples into larger analyses, reports, or submission jobs.

Open the study

Navigate to the study that contains your samples. Go to the Pipelines tab.

Select a pipeline

Choose from the available pipelines (e.g., MAG). Each pipeline shows its description and requirements.

Configure parameters

Adjust pipeline-specific settings:

MAG Pipeline options:

Parameter      Default   Description
Stub Mode      false     Test mode; runs quickly without performing real analysis
Skip MEGAHIT   false     Skip the MEGAHIT assembler
Skip SPAdes    true      Skip the SPAdes assembler
Skip Prokka    true      Skip gene annotation
Skip CONCOCT   true      Skip CONCOCT binning
Skip Bin QC    false     Skip bin quality control
Skip GTDB-Tk   false     Skip taxonomy classification
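Boolean toggles like the ones above are typically translated into command-line flags when the execution command is built. A hedged sketch of that translation; the flag names (`--skip_spades`, `--skip_prokka`, `--skip_megahit`) are assumptions for illustration, not SeqDesk's actual CLI:

```shell
# Hypothetical mapping of skip toggles to pipeline flags (names are assumptions)
skip_megahit=false
skip_spades=true
skip_prokka=true

flags=""
[ "$skip_megahit" = true ] && flags="$flags --skip_megahit"
[ "$skip_spades" = true ] && flags="$flags --skip_spades"
[ "$skip_prokka" = true ] && flags="$flags --skip_prokka"

# Only the enabled skips end up on the command line
echo "$flags"
```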

Select samples

Choose which samples from the study to include. All selected samples must have reads assigned.

Launch

Confirm and start the run. SeqDesk:

  1. Generates a samplesheet CSV from your samples and reads
  2. Creates a run directory (e.g., MAG-20240126-001/)
  3. Builds the Nextflow execution command
  4. Starts the pipeline (locally or via SLURM)
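The four steps above can be sketched as a small script. Everything here is illustrative: the run-directory name reuses the example from step 2, and the `nextflow run` invocation is a placeholder, not SeqDesk's exact generated command.

```shell
# Step 2: create the run directory (name from the example above)
run_dir="MAG-20240126-001"
mkdir -p "$run_dir"

# Step 1: generate the samplesheet (header only, for illustration)
printf 'sample,group,short_reads_1,short_reads_2\n' > "$run_dir/samplesheet.csv"

# Step 3: build the Nextflow command into script.sh (placeholder invocation)
printf '%s\n' \
  "nextflow run mag --input $run_dir/samplesheet.csv --outdir $run_dir/results" \
  > "$run_dir/script.sh"

# Step 4 would execute script.sh locally or submit it via SLURM
cat "$run_dir/script.sh"
```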

Launching an Order Pipeline

Order pipelines are the right choice for sample-level sequencing utilities such as read simulation, checksum validation, and read QC.

Open the order

Navigate to the order you want to work on. Use the sequencing or pipeline area for that order, depending on the package and your current workflow.

Review linked sequencing files

Check whether the samples already have linked FASTQ files. This is required for packages such as FASTQ Checksum and FastQC. If the order has no reads yet, you can start with Simulate Reads.

Select an order pipeline

Choose the package you want to run for that order. The current built-in order catalog includes Simulate Reads, FASTQ Checksum, and FastQC.

Configure parameters

Order pipelines typically have narrower configuration than study pipelines. Examples:

Pipeline         Example parameters
Simulate Reads   Mode, read count, read length, replace existing files
FASTQ Checksum   Usually no additional configuration
FastQC           Usually no additional configuration

Launch

Confirm and start the run. SeqDesk:

  1. Generates the package inputs from order samples and linked reads
  2. Creates a run directory for the package
  3. Builds the Nextflow execution command
  4. Starts the pipeline and tracks the run
  5. Resolves artifacts and writes validated Read records back after completion
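Conceptually, a checksum package like FASTQ Checksum hashes each linked read file and records the result. A minimal sketch using `md5sum`; the file name and contents are made up, and the real package may use a different hash or output format:

```shell
# Create a tiny stand-in FASTQ file (contents are illustrative)
printf '@read1\nACGT\n+\nIIII\n' > S001_R1.fastq

# Compute and record the checksum, then verify it
md5sum S001_R1.fastq > S001_R1.fastq.md5
md5sum -c S001_R1.fastq.md5
```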

Run Number Format

Each run gets a unique number: {PIPELINE}-{YYYYMMDD}-{NNN} (e.g., MAG-20240126-001).
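The scheme can be reproduced with `printf`; the zero-padded counter logic below is an assumption (the docs only show the resulting format):

```shell
# Build a run number in the {PIPELINE}-{YYYYMMDD}-{NNN} format
pipeline="MAG"
n=1   # assumed per-day counter, zero-padded to three digits
run_number=$(printf '%s-%s-%03d' "$pipeline" "$(date +%Y%m%d)" "$n")
echo "$run_number"
```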

Input Generation

SeqDesk auto-generates the package inputs that Nextflow expects. The exact file shape depends on the package, but the source data always comes from canonical SeqDesk records.

Scope            Typical generated input
Study pipeline   A study-level samplesheet built from selected samples and their reads
Order pipeline   A samplesheet or manifest generated from order samples and linked reads

For the MAG pipeline, each row contains:

Column          Source
sample          Sample alias or ID
group           Sample group (from study)
short_reads_1   Path to R1 FASTQ file
short_reads_2   Path to R2 FASTQ file

The generated input is saved in the run directory, typically as samplesheet.csv or a package-specific manifest file.
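A samplesheet with the MAG columns above would look like this; the sample names, groups, and file paths are hypothetical:

```shell
# Write an example MAG samplesheet (data rows are made up)
cat > samplesheet.csv <<'EOF'
sample,group,short_reads_1,short_reads_2
S001,groupA,/data/reads/S001_R1.fastq.gz,/data/reads/S001_R2.fastq.gz
S002,groupB,/data/reads/S002_R1.fastq.gz,/data/reads/S002_R2.fastq.gz
EOF
head -n 1 samplesheet.csv
```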

Execution Modes

Local

Nextflow runs directly on the SeqDesk server. Suitable for testing and small datasets.

SLURM

Nextflow submits jobs to a SLURM cluster. Configure in admin settings:

Setting              Default   Description
Queue                default   SLURM partition name
Cores                4         CPUs per job
Memory               16GB      Memory per job
Time Limit           24h       Maximum run time
Additional Options             Extra SLURM flags

The SLURM job ID is tracked in the queueJobId field for status monitoring.
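With the job ID in hand, status can be queried with standard SLURM accounting tools. A sketch; the job ID is made up, and `sacct` with `--format` is standard SLURM, not a SeqDesk-specific command:

```shell
# Build a status query from the tracked queueJobId (ID is illustrative)
queue_job_id="12345678"
status_cmd="sacct -j $queue_job_id --format=JobID,State,Elapsed"

# In a real deployment this command would be run on the cluster
echo "$status_cmd"
```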

Run Directory Structure

Each run creates a directory under the configured pipelineRunDir. The exact outputs differ by package, but the common execution files are similar:

{PIPELINE}-{YYYYMMDD}-{NNN}/
├── script.sh            # Generated Nextflow command
├── samplesheet.csv      # Or another generated package input
├── cluster_config.cfg   # Nextflow configuration
├── nextflow.log         # Execution log
├── trace.txt            # Process trace (TSV)
├── output               # stdout
├── error                # stderr
└── ...                  # Package-specific outputs and artifacts