Running a Pipeline
SeqDesk supports two launch contexts:
- Study pipelines run across the selected samples of a study.
- Order pipelines run on the linked sequencing files of samples in an order.
In both cases, SeqDesk prepares the package inputs automatically and executes the workflow either locally or on a SLURM cluster.
Common Prerequisites
Before running a pipeline:
- Pipelines must be enabled in admin settings (`pipelines.enabled: true`)
- The execution environment must be configured (local or SLURM)
- You need the FACILITY_ADMIN role
Some packages require linked reads. For example, MAG, FASTQ Checksum, and FastQC require FASTQ files to already be linked. Simulate Reads is the exception because it generates read files instead of consuming existing ones.
Launching a Study Pipeline
Study pipelines are the right choice for workflows that combine multiple samples into larger analyses, reports, or submission jobs.
Open the study
Navigate to the study that contains your samples. Go to the Pipelines tab.
Select a pipeline
Choose from the available pipelines (e.g., MAG). Each pipeline shows its description and requirements.
Configure parameters
Adjust pipeline-specific settings:
MAG Pipeline options:
| Parameter | Default | Description |
|---|---|---|
| Stub Mode | false | Test mode — runs fast without actual analysis |
| Skip MEGAHIT | false | Skip the MEGAHIT assembler |
| Skip SPAdes | true | Skip the SPAdes assembler |
| Skip Prokka | true | Skip gene annotation |
| Skip CONCOCT | true | Skip CONCOCT binning |
| Skip Bin QC | false | Skip bin quality control |
| Skip GTDB-Tk | false | Skip taxonomy classification |
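These options translate into boolean flags on the generated pipeline command. A minimal sketch of that mapping, assuming flag names in the style of nf-core/mag and Nextflow's built-in `-stub` run mode (the exact names SeqDesk emits are an assumption here):

```python
# Sketch: map MAG form options to command-line flags.
# Flag names follow nf-core/mag conventions but are assumptions,
# not SeqDesk's confirmed implementation.
MAG_FLAGS = {
    "stub_mode": "-stub",            # Nextflow stub run (fast, no real analysis)
    "skip_megahit": "--skip_megahit",
    "skip_spades": "--skip_spades",
    "skip_prokka": "--skip_prokka",
    "skip_concoct": "--skip_concoct",
    "skip_binqc": "--skip_binqc",
    "skip_gtdbtk": "--skip_gtdbtk",
}

def build_mag_args(options: dict) -> list[str]:
    """Return the extra CLI arguments for the options that are enabled."""
    return [flag for key, flag in MAG_FLAGS.items() if options.get(key)]

# Defaults from the table above: SPAdes, Prokka, and CONCOCT are skipped.
defaults = {"skip_spades": True, "skip_prokka": True, "skip_concoct": True}
print(build_mag_args(defaults))
# → ['--skip_spades', '--skip_prokka', '--skip_concoct']
```

Leaving every option at `false` produces an empty argument list, i.e. a full run of all stages.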
Select samples
Choose which samples from the study to include. All selected samples must have reads assigned.
Launch
Confirm and start the run. SeqDesk:
- Generates a samplesheet CSV from your samples and reads
- Creates a run directory (e.g., `MAG-20240126-001/`)
- Builds the Nextflow execution command
- Starts the pipeline (locally or via SLURM)
Launching an Order Pipeline
Order pipelines are the right choice for sample-level sequencing utilities such as read simulation, checksum validation, and read QC.
Open the order
Navigate to the order you want to work on. Use the sequencing or pipeline area for that order, depending on the package and your current workflow.
Review linked sequencing files
Check whether the samples already have linked FASTQ files. This is required for packages such as FASTQ Checksum and FastQC. If the order has no reads yet, you can start with Simulate Reads.
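A quick way to see which samples would block a read-consuming package is to list those without linked files. A sketch, assuming a hypothetical sample record shape with `alias` and `fastq_files` fields (SeqDesk's real field names may differ):

```python
def missing_reads(samples: list[dict]) -> list[str]:
    """Return aliases of samples that have no linked FASTQ files.

    The record shape here is hypothetical, for illustration only.
    """
    return [s["alias"] for s in samples if not s.get("fastq_files")]

samples = [
    {"alias": "S1", "fastq_files": ["S1_R1.fastq.gz", "S1_R2.fastq.gz"]},
    {"alias": "S2", "fastq_files": []},
]
print(missing_reads(samples))
# → ['S2']
```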
Select an order pipeline
Choose the package you want to run for that order. The current built-in order catalog includes Simulate Reads, FASTQ Checksum, and FastQC.
Configure parameters
Order pipelines typically have narrower configuration than study pipelines. Examples:
| Pipeline | Example parameters |
|---|---|
| Simulate Reads | Mode, read count, read length, replace existing files |
| FASTQ Checksum | Usually no additional configuration |
| FastQC | Usually no additional configuration |
Launch
Confirm and start the run. SeqDesk:
- Generates the package inputs from order samples and linked reads
- Creates a run directory for the package
- Builds the Nextflow execution command
- Starts the pipeline and tracks the run
- Resolves artifacts and writes validated reads back after completion
Run Number Format
Each run gets a unique number: `{PIPELINE}-{YYYYMMDD}-{NNN}` (e.g., `MAG-20240126-001`).
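The format can be sketched as follows; the assumption (not stated above) is that the `NNN` counter resets per pipeline and per day:

```python
from datetime import date

def next_run_number(pipeline: str, existing: list[str], today: date) -> str:
    """Build the next {PIPELINE}-{YYYYMMDD}-{NNN} run number.

    Assumes the sequence counter is scoped per pipeline and day,
    which is an interpretation of the format, not confirmed behavior.
    """
    stamp = today.strftime("%Y%m%d")
    prefix = f"{pipeline}-{stamp}-"
    seq = sum(1 for r in existing if r.startswith(prefix)) + 1
    return f"{prefix}{seq:03d}"

print(next_run_number("MAG", ["MAG-20240126-001"], date(2024, 1, 26)))
# → MAG-20240126-002
```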
Input Generation
SeqDesk auto-generates the package inputs that Nextflow expects. The exact file shape depends on the package, but the source data always comes from canonical SeqDesk records.
| Scope | Typical generated input |
|---|---|
| Study pipeline | A study-level samplesheet built from selected samples and their reads |
| Order pipeline | A samplesheet or manifest generated from order samples and linked reads |
For the MAG pipeline, each row contains:
| Column | Source |
|---|---|
| sample | Sample alias or ID |
| group | Sample group (from study) |
| short_reads_1 | Path to R1 FASTQ file |
| short_reads_2 | Path to R2 FASTQ file |
The generated input is saved in the run directory, typically as `samplesheet.csv` or a package-specific manifest file.
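The MAG samplesheet described above can be sketched like this; the input record shape (`alias`, `group`, `r1`, `r2`) is hypothetical and stands in for SeqDesk's internal sample and read records:

```python
import csv
import io

def write_mag_samplesheet(samples: list[dict], out) -> None:
    """Write a samplesheet with the MAG columns from the table above.

    The sample record shape is an assumption for illustration;
    SeqDesk's real field names may differ.
    """
    writer = csv.writer(out)
    writer.writerow(["sample", "group", "short_reads_1", "short_reads_2"])
    for s in samples:
        writer.writerow([s["alias"], s["group"], s["r1"], s["r2"]])

buf = io.StringIO()
write_mag_samplesheet(
    [{"alias": "S1", "group": "0",
      "r1": "/data/S1_R1.fastq.gz", "r2": "/data/S1_R2.fastq.gz"}],
    buf,
)
print(buf.getvalue())
```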
Execution Modes
Local
Nextflow runs directly on the SeqDesk server. Suitable for testing and small datasets.
SLURM
Nextflow submits jobs to a SLURM cluster. Configure in admin settings:
| Setting | Default | Description |
|---|---|---|
| Queue | default | SLURM partition name |
| Cores | 4 | CPUs per job |
| Memory | 16GB | Memory per job |
| Time Limit | 24h | Maximum run time |
| Additional Options | — | Extra SLURM flags |
The SLURM job ID is tracked in the `queueJobId` field for status monitoring.
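The settings above map onto Nextflow's SLURM executor options. A sketch of how they might be rendered into a process configuration block; the keys mirror Nextflow's documented `queue`, `cpus`, `memory`, `time`, and `clusterOptions` settings, but the exact template SeqDesk writes into `cluster_config.cfg` is an assumption:

```python
def slurm_config(queue="default", cores=4, memory="16GB",
                 time="24h", extra="") -> str:
    """Render a minimal Nextflow process block from the SLURM settings.

    Assumed template, for illustration only; SeqDesk's generated
    cluster_config.cfg may look different.
    """
    lines = [
        "process {",
        "  executor = 'slurm'",
        f"  queue = '{queue}'",
        f"  cpus = {cores}",
        f"  memory = '{memory}'",
        f"  time = '{time}'",
    ]
    if extra:
        # "Additional Options" from the table become raw SLURM flags.
        lines.append(f"  clusterOptions = '{extra}'")
    lines.append("}")
    return "\n".join(lines)

print(slurm_config(extra="--account=lab"))
```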
Run Directory Structure
Each run creates a directory under the configured `pipelineRunDir`. The exact outputs differ by package, but the common execution files are similar:
```
{PIPELINE}-{YYYYMMDD}-{NNN}/
├── script.sh           # Generated Nextflow command
├── samplesheet.csv     # Or another generated package input
├── cluster_config.cfg  # Nextflow configuration
├── nextflow.log        # Execution log
├── trace.txt           # Process trace (TSV)
├── output              # stdout
├── error               # stderr
└── ...                 # Package-specific outputs and artifacts
```