# Running a Pipeline
Pipelines run from a study context. SeqDesk generates the samplesheet automatically and executes Nextflow workflows either locally or on a SLURM cluster.
## Prerequisites
Before running a pipeline:
- Pipelines must be enabled in admin settings (`pipelines.enabled: true`)
- The execution environment must be configured (local or SLURM)
- Samples must have reads assigned (FASTQ files linked)
- You need the `FACILITY_ADMIN` role
## Launching a Pipeline Run
### Open the study
Navigate to the study that contains your samples. Go to the Pipelines tab.
### Select a pipeline
Choose from the available pipelines (e.g., MAG). Each pipeline shows its description and requirements.
### Configure parameters
Adjust pipeline-specific settings:
MAG Pipeline options:
| Parameter | Default | Description |
|---|---|---|
| Stub Mode | false | Test mode — runs fast without actual analysis |
| Skip MEGAHIT | false | Skip the MEGAHIT assembler |
| Skip SPAdes | true | Skip the SPAdes assembler |
| Skip Prokka | true | Skip gene annotation |
| Skip CONCOCT | true | Skip CONCOCT binning |
| Skip Bin QC | false | Skip bin quality control |
| Skip GTDB-Tk | false | Skip taxonomy classification |
### Select samples
Choose which samples from the study to include. All selected samples must have reads assigned.
### Launch
Confirm and start the run. SeqDesk:
- Generates a samplesheet CSV from your samples and reads
- Creates a run directory (e.g., `MAG-20240126-001/`)
- Builds the Nextflow execution command
- Starts the pipeline (locally or via SLURM)
## Run Number Format
Each run gets a unique number in the form `{PIPELINE}-{YYYYMMDD}-{NNN}` (e.g., `MAG-20240126-001`).
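A minimal sketch of that format in Python (the function name is illustrative, not SeqDesk's actual API):

```python
from datetime import date

def run_number(pipeline: str, run_date: date, seq: int) -> str:
    # {PIPELINE}-{YYYYMMDD}-{NNN}: pipeline name, run date, zero-padded counter
    return f"{pipeline}-{run_date:%Y%m%d}-{seq:03d}"

print(run_number("MAG", date(2024, 1, 26), 1))  # MAG-20240126-001
```

The three-digit counter means runs of the same pipeline on the same day get distinct numbers (`-001`, `-002`, ...).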
## Samplesheet Generation
SeqDesk auto-generates the samplesheet that Nextflow expects. For the MAG pipeline, each row contains:
| Column | Source |
|---|---|
| sample | Sample alias or ID |
| group | Sample group (from study) |
| short_reads_1 | Path to R1 FASTQ file |
| short_reads_2 | Path to R2 FASTQ file |
The samplesheet is saved in the run directory as `samplesheet.csv`.
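The generation step can be sketched roughly like this (sample records and FASTQ paths are invented for illustration):

```python
import csv
import io

# Hypothetical sample records; keys mirror the samplesheet columns above.
samples = [
    {"sample": "sample_01", "group": "0",
     "short_reads_1": "/data/reads/sample_01_R1.fastq.gz",
     "short_reads_2": "/data/reads/sample_01_R2.fastq.gz"},
    {"sample": "sample_02", "group": "0",
     "short_reads_1": "/data/reads/sample_02_R1.fastq.gz",
     "short_reads_2": "/data/reads/sample_02_R2.fastq.gz"},
]

buf = io.StringIO()
writer = csv.DictWriter(
    buf, fieldnames=["sample", "group", "short_reads_1", "short_reads_2"]
)
writer.writeheader()       # header row: sample,group,short_reads_1,short_reads_2
writer.writerows(samples)  # one row per sample with its paired FASTQ paths
samplesheet_csv = buf.getvalue()
print(samplesheet_csv.splitlines()[0])  # sample,group,short_reads_1,short_reads_2
```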
## Execution Modes
### Local
Nextflow runs directly on the SeqDesk server. Suitable for testing and small datasets.
### SLURM
Nextflow submits jobs to a SLURM cluster. Configure in admin settings:
| Setting | Default | Description |
|---|---|---|
| Queue | default | SLURM partition name |
| Cores | 4 | CPUs per job |
| Memory | 16GB | Memory per job |
| Time Limit | 24h | Maximum run time |
| Additional Options | — | Extra SLURM flags |
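For orientation, here is a rough sketch of what an equivalent Nextflow configuration for the SLURM executor could look like, with values matching the defaults above. This is an assumption for illustration; the `cluster_config.cfg` that SeqDesk actually writes may differ.

```groovy
process {
    executor = 'slurm'
    queue    = 'default'   // SLURM partition
    cpus     = 4
    memory   = '16 GB'
    time     = '24h'
    // "Additional Options" would typically be passed via clusterOptions
}
```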
The SLURM job ID is tracked in the `queueJobId` field for status monitoring.
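Monitoring with that job ID usually boils down to querying SLURM. A small Python sketch that only builds the `squeue` command line (so it runs without a SLURM cluster; the helper name is hypothetical):

```python
def slurm_status_command(queue_job_id: str) -> list[str]:
    # squeue reports the job state (PENDING, RUNNING, ...) while the job
    # is queued; -h suppresses the header, -o %T prints only the state column
    return ["squeue", "-h", "-j", queue_job_id, "-o", "%T"]

print(slurm_status_command("123456"))
```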
## Run Directory Structure
Each run creates a directory under the configured `pipelineRunDir`:
```
MAG-20240126-001/
├── script.sh            # Generated Nextflow command
├── samplesheet.csv      # Auto-generated input
├── cluster_config.cfg   # Nextflow configuration
├── nextflow.log         # Execution log
├── trace.txt            # Process trace (TSV)
├── output               # stdout
├── error                # stderr
├── Assembly/            # Assembled contigs
├── GenomeBinning/       # Genome bins
├── Taxonomy/            # Classification results
└── multiqc/             # QC reports
```
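Since `trace.txt` is tab-separated, any TSV reader can inspect it, e.g. to list failed processes. A sketch with made-up trace content (real Nextflow traces carry more columns):

```python
import csv
import io

# Two invented trace rows for illustration only.
trace_tsv = (
    "task_id\tname\tstatus\texit\n"
    "1\tFASTQC (sample_01)\tCOMPLETED\t0\n"
    "2\tMEGAHIT (group_0)\tFAILED\t1\n"
)

with io.StringIO(trace_tsv) as fh:
    rows = list(csv.DictReader(fh, delimiter="\t"))

# Collect the names of any processes that did not complete.
failed = [r["name"] for r in rows if r["status"] != "COMPLETED"]
print(failed)  # ['MEGAHIT (group_0)']
```

In practice you would open the real file with `open("trace.txt")` instead of the in-memory string.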