
Running a Pipeline

Pipelines run from a study context. SeqDesk generates the samplesheet automatically and executes Nextflow workflows either locally or on a SLURM cluster.

Prerequisites

Before running a pipeline:

  • Pipelines must be enabled in admin settings (pipelines.enabled: true)
  • The execution environment must be configured (local or SLURM)
  • Samples must have reads assigned (FASTQ files linked)
  • You need the FACILITY_ADMIN role

Launching a Pipeline Run

Open the study

Navigate to the study that contains your samples. Go to the Pipelines tab.

Select a pipeline

Choose from the available pipelines (e.g., MAG). Each pipeline shows its description and requirements.

Configure parameters

Adjust pipeline-specific settings:

MAG Pipeline options:

| Parameter | Default | Description |
|---|---|---|
| Stub Mode | false | Test mode; runs fast without performing the actual analysis |
| Skip MEGAHIT | false | Skip the MEGAHIT assembler |
| Skip SPAdes | true | Skip the SPAdes assembler |
| Skip Prokka | true | Skip gene annotation with Prokka |
| Skip CONCOCT | true | Skip CONCOCT binning |
| Skip Bin QC | false | Skip bin quality control |
| Skip GTDB-Tk | false | Skip taxonomic classification with GTDB-Tk |

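These toggles typically map to pipeline command-line flags. A minimal sketch of such a mapping; the flag names follow nf-core/mag conventions, but the exact flags SeqDesk passes are an assumption:

```python
# Hypothetical mapping from SeqDesk UI options to nf-core/mag flags.
MAG_FLAGS = {
    "stub_mode": "-stub-run",          # Nextflow-level stub execution
    "skip_megahit": "--skip_megahit",
    "skip_spades": "--skip_spades",
    "skip_prokka": "--skip_prokka",
    "skip_concoct": "--skip_concoct",
    "skip_binqc": "--skip_binqc",
    "skip_gtdbtk": "--skip_gtdbtk",
}

def flags_for(options: dict) -> list:
    """Return the CLI flags for every option set to True."""
    return [MAG_FLAGS[name] for name, on in options.items() if on]
```

For example, enabling only "Skip SPAdes" would yield a single `--skip_spades` flag.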
Select samples

Choose which samples from the study to include. All selected samples must have reads assigned.

Launch

Confirm and start the run. SeqDesk:

  1. Generates a samplesheet CSV from your samples and reads
  2. Creates a run directory (e.g., MAG-20240126-001/)
  3. Builds the Nextflow execution command
  4. Starts the pipeline (locally or via SLURM)
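Step 3 above can be sketched as assembling a Nextflow invocation. The pipeline name, `--input`/`--outdir` parameters, and helper name here are illustrative assumptions, not SeqDesk's actual command builder:

```python
def nextflow_command(run_dir: str, samplesheet: str, flags: list) -> str:
    """Assemble an illustrative nf-core/mag invocation string."""
    parts = [
        "nextflow", "run", "nf-core/mag",
        "--input", samplesheet,
        "--outdir", run_dir,
        *flags,  # pipeline-specific toggles, e.g. --skip_spades
    ]
    return " ".join(parts)
```

In a real deployment this command would be written to `script.sh` in the run directory and executed either directly or via the SLURM submitter.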

Run Number Format

Each run gets a unique number: {PIPELINE}-{YYYYMMDD}-{NNN} (e.g., MAG-20240126-001).
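A sketch of how such a run number can be formatted (the function name and sequence-counter argument are hypothetical):

```python
from datetime import date

def run_number(pipeline: str, seq: int, day: date) -> str:
    """Format a run number as {PIPELINE}-{YYYYMMDD}-{NNN}."""
    return f"{pipeline}-{day:%Y%m%d}-{seq:03d}"

print(run_number("MAG", 1, date(2024, 1, 26)))  # MAG-20240126-001
```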

Samplesheet Generation

SeqDesk auto-generates the samplesheet that Nextflow expects. For the MAG pipeline, each row contains:

| Column | Source |
|---|---|
| sample | Sample alias or ID |
| group | Sample group (from the study) |
| short_reads_1 | Path to the R1 FASTQ file |
| short_reads_2 | Path to the R2 FASTQ file |

The samplesheet is saved in the run directory as samplesheet.csv.

Execution Modes

Local

Nextflow runs directly on the SeqDesk server. Suitable for testing and small datasets.

SLURM

Nextflow submits jobs to a SLURM cluster. Configure in admin settings:

| Setting | Default | Description |
|---|---|---|
| Queue | default | SLURM partition name |
| Cores | 4 | CPUs per job |
| Memory | 16GB | Memory per job |
| Time Limit | 24h | Maximum run time |
| Additional Options | (none) | Extra SLURM flags |

The SLURM job ID is tracked in the queueJobId field for status monitoring.
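The settings above roughly correspond to a Nextflow process block for the SLURM executor. A sketch that renders such a configuration; this is illustrative, and the actual contents of `cluster_config.cfg` may differ:

```python
def slurm_config(queue: str = "default", cores: int = 4,
                 memory: str = "16GB", time: str = "24h",
                 extra: str = "") -> str:
    """Render a minimal Nextflow process block for the SLURM executor."""
    # Extra SLURM flags are commonly passed via clusterOptions.
    opts = f"\n    clusterOptions = '{extra}'" if extra else ""
    return (
        "process {\n"
        "    executor = 'slurm'\n"
        f"    queue = '{queue}'\n"
        f"    cpus = {cores}\n"
        f"    memory = '{memory}'\n"
        f"    time = '{time}'{opts}\n"
        "}\n"
    )
```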

Run Directory Structure

Each run creates a directory under the configured pipelineRunDir:

MAG-20240126-001/
├── script.sh            # Generated Nextflow command
├── samplesheet.csv      # Auto-generated input
├── cluster_config.cfg   # Nextflow configuration
├── nextflow.log         # Execution log
├── trace.txt            # Process trace (TSV)
├── output               # stdout
├── error                # stderr
├── Assembly/            # Assembled contigs
├── GenomeBinning/       # Genome bins
├── Taxonomy/            # Classification results
└── multiqc/             # QC reports