Adding Custom Pipelines

SeqDesk uses a modular pipeline package system. Each pipeline is defined by a set of configuration files in the pipelines/ directory.

Package Structure

    pipelines/{pipeline-id}/
    ├── manifest.json       # Inputs, outputs, execution config
    ├── definition.json     # Workflow DAG with step dependencies
    ├── registry.json       # UI configuration and config schema
    ├── samplesheet.yaml    # Samplesheet column definitions
    ├── README.md           # Documentation
    ├── parsers/            # Optional output parsers
    │   ├── checkm.yaml
    │   └── gtdbtk.yaml
    └── scripts/
        └── generate.sh     # Optional custom scripts

manifest.json

The manifest defines what the pipeline needs and produces:

    {
      "id": "my-pipeline",
      "name": "My Pipeline",
      "version": "1.0.0",
      "nfcorePipeline": "nf-core/my-pipeline",
      "nfcoreVersion": "2.0.0",
      "inputs": {
        "samplesheet": {
          "type": "csv",
          "required": true,
          "description": "Input samplesheet"
        }
      },
      "outputs": {
        "assemblies": {
          "type": "directory",
          "pattern": "Assembly/**/*.fa.gz",
          "description": "Assembled contigs"
        }
      },
      "execution": {
        "profile": "conda",
        "additionalArgs": "--outdir results"
      }
    }
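As a rough illustration of how these fields could translate into a Nextflow invocation, the sketch below assembles an argument list from a manifest. The `Manifest` type and `buildNextflowArgs` function are hypothetical and not part of SeqDesk. `-r` and `-profile` are standard Nextflow options; passing the samplesheet through `--input` follows the usual nf-core convention and is an assumption here.

```typescript
// Hypothetical type covering only the manifest fields used in this sketch.
interface Manifest {
  nfcorePipeline: string;
  nfcoreVersion: string;
  execution: { profile: string; additionalArgs?: string };
}

// Assemble the arguments for `nextflow` from a manifest (illustrative only).
function buildNextflowArgs(manifest: Manifest, samplesheetPath: string): string[] {
  const args = [
    "run", manifest.nfcorePipeline,
    "-r", manifest.nfcoreVersion,            // pipeline revision
    "-profile", manifest.execution.profile,  // execution profile
    "--input", samplesheetPath,              // assumed nf-core samplesheet parameter
  ];
  // Naive whitespace split: quoted arguments containing spaces are not handled.
  const extra = (manifest.execution.additionalArgs ?? "").trim();
  if (extra.length > 0) {
    args.push(...extra.split(/\s+/));
  }
  return args;
}

const exampleManifest: Manifest = {
  nfcorePipeline: "nf-core/my-pipeline",
  nfcoreVersion: "2.0.0",
  execution: { profile: "conda", additionalArgs: "--outdir results" },
};

const nextflowArgs = buildNextflowArgs(exampleManifest, "samplesheet.csv");
console.log(["nextflow", ...nextflowArgs].join(" "));
```

Returning an array rather than a single string keeps the arguments safe to hand to a process spawner without shell quoting.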

definition.json

The definition describes the Nextflow workflow as a DAG (directed acyclic graph) of steps, listing each step's processes and the steps it depends on:

    {
      "steps": [
        {
          "id": "FASTQC",
          "name": "FastQC",
          "category": "QC",
          "processes": ["FASTQC_RAW"],
          "dependsOn": []
        },
        {
          "id": "ASSEMBLY",
          "name": "Assembly",
          "category": "Assembly",
          "processes": ["MEGAHIT"],
          "dependsOn": ["FASTQC"]
        }
      ]
    }

This powers the DAG visualization in the monitoring UI.
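To make the `dependsOn` semantics concrete, here is a minimal sketch that orders steps so that every step comes after its dependencies, using Kahn's topological sort. The `Step` type mirrors the fields shown above; `orderSteps` is a hypothetical helper, and nothing here claims to be how SeqDesk renders the graph.

```typescript
// Mirrors the step fields from definition.json.
interface Step {
  id: string;
  name: string;
  category: string;
  processes: string[];
  dependsOn: string[];
}

// Return step IDs in an order where dependencies always come first.
// Throws on unknown dependencies and on cycles.
function orderSteps(steps: Step[]): string[] {
  const knownIds = new Set(steps.map((s) => s.id));
  const remaining = new Map<string, number>();   // unmet dependency count per step
  const dependents = new Map<string, string[]>(); // step ID -> steps waiting on it

  for (const step of steps) {
    remaining.set(step.id, step.dependsOn.length);
    for (const dep of step.dependsOn) {
      if (!knownIds.has(dep)) {
        throw new Error(`Step ${step.id} depends on unknown step ${dep}`);
      }
      dependents.set(dep, [...(dependents.get(dep) ?? []), step.id]);
    }
  }

  const ready = steps.filter((s) => s.dependsOn.length === 0).map((s) => s.id);
  const ordered: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    ordered.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  if (ordered.length !== steps.length) {
    throw new Error("Cycle detected in step dependencies");
  }
  return ordered;
}

// The two steps from the example, deliberately listed out of order.
const exampleSteps: Step[] = [
  { id: "ASSEMBLY", name: "Assembly", category: "Assembly", processes: ["MEGAHIT"], dependsOn: ["FASTQC"] },
  { id: "FASTQC", name: "FastQC", category: "QC", processes: ["FASTQC_RAW"], dependsOn: [] },
];
console.log(orderSteps(exampleSteps)); // [ 'FASTQC', 'ASSEMBLY' ]
```

A check like this is also a cheap way to catch typos in `dependsOn` before a definition is deployed.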

registry.json

The registry configures how the pipeline appears in the UI:

    {
      "id": "my-pipeline",
      "name": "My Pipeline",
      "description": "Description shown to users",
      "category": "Analysis",
      "canStart": "FACILITY_ADMIN",
      "showToUsers": true,
      "requirements": {
        "pairedEnd": true,
        "minSamples": 1,
        "readsRequired": true
      },
      "configSchema": {
        "stubMode": {
          "type": "boolean",
          "default": false,
          "label": "Stub Mode",
          "description": "Run in test mode"
        }
      }
    }
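One way to read `configSchema` is as a set of typed options with defaults. The sketch below merges user-supplied values with those defaults and rejects values of the wrong type. `resolveConfig` is a hypothetical helper, and the `"number"` and `"string"` field types are assumptions; the example above only shows `"boolean"`.

```typescript
type ConfigValue = boolean | number | string;

// Mirrors a configSchema entry; types other than "boolean" are assumed.
interface ConfigField {
  type: "boolean" | "number" | "string";
  default: ConfigValue;
  label: string;
  description: string;
}

type ConfigSchema = Record<string, ConfigField>;

// Fill in defaults for missing options and type-check the supplied ones.
// Keys not declared in the schema are ignored.
function resolveConfig(
  schema: ConfigSchema,
  userValues: Record<string, unknown>
): Record<string, ConfigValue> {
  const resolved: Record<string, ConfigValue> = {};
  for (const [key, field] of Object.entries(schema)) {
    const value = userValues[key];
    if (value === undefined) {
      resolved[key] = field.default;
    } else if (typeof value === field.type) {
      resolved[key] = value as ConfigValue;
    } else {
      throw new Error(`Option ${key} must be of type ${field.type}`);
    }
  }
  return resolved;
}

const exampleSchema: ConfigSchema = {
  stubMode: {
    type: "boolean",
    default: false,
    label: "Stub Mode",
    description: "Run in test mode",
  },
};

console.log(resolveConfig(exampleSchema, {}));                 // { stubMode: false }
console.log(resolveConfig(exampleSchema, { stubMode: true })); // { stubMode: true }
```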

samplesheet.yaml

Defines how to generate the input samplesheet from the database:

    columns:
      - name: sample
        source: sample.sampleAlias
        fallback: sample.sampleId
      - name: group
        source: sample.studyId
        default: "0"
      - name: short_reads_1
        source: read.file1
        required: true
      - name: short_reads_2
        source: read.file2
        required: false
    format: csv
    header: true

SeqDesk uses this definition to automatically generate the CSV samplesheet from the selected study samples and their assigned reads.
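The generation step can be pictured with the following sketch, which resolves each column from a flat record and writes CSV. It is not SeqDesk's implementation. The lookup order (`source`, then `fallback`, then `default`), the flat `Row` shape keyed by source path, and the function names are all assumptions made for illustration.

```typescript
// Mirrors one entry under `columns` in samplesheet.yaml.
interface ColumnDef {
  name: string;
  source: string;
  fallback?: string;
  default?: string;
  required?: boolean;
}

// Assumed input shape: values keyed by source path, e.g. "sample.sampleAlias".
type Row = Record<string, string | null | undefined>;

// Resolve one cell: source first, then fallback, then default (assumed order).
function resolveCell(col: ColumnDef, row: Row): string {
  const candidates = [
    row[col.source],
    col.fallback ? row[col.fallback] : undefined,
    col.default,
  ];
  const value = candidates.find((v) => v !== undefined && v !== null && v !== "");
  if (value === undefined || value === null) {
    if (col.required) {
      throw new Error(`Missing value for required column ${col.name}`);
    }
    return "";
  }
  return value;
}

// Quote a field only when it contains a comma, quote, or newline.
function escapeCsv(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function buildSamplesheet(columns: ColumnDef[], rows: Row[], header = true): string {
  const lines: string[] = [];
  if (header) {
    lines.push(columns.map((c) => escapeCsv(c.name)).join(","));
  }
  for (const row of rows) {
    lines.push(columns.map((c) => escapeCsv(resolveCell(c, row))).join(","));
  }
  return lines.join("\n") + "\n";
}

const exampleColumns: ColumnDef[] = [
  { name: "sample", source: "sample.sampleAlias", fallback: "sample.sampleId" },
  { name: "group", source: "sample.studyId", default: "0" },
  { name: "short_reads_1", source: "read.file1", required: true },
  { name: "short_reads_2", source: "read.file2", required: false },
];

// A sample with no alias, no study ID, and single-end reads.
const exampleRows: Row[] = [
  { "sample.sampleId": "S1", "read.file1": "S1_R1.fastq.gz" },
];

console.log(buildSamplesheet(exampleColumns, exampleRows));
// sample,group,short_reads_1,short_reads_2
// S1,0,S1_R1.fastq.gz,
```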

Output Parsers

Optional YAML files in the parsers/ directory define how to parse pipeline outputs into database records:

    # parsers/checkm.yaml
    type: tsv
    pattern: "GenomeBinning/QC/checkm_summary.tsv"
    target: bin
    columns:
      binName: "Bin Id"
      completeness: "Completeness"
      contamination: "Contamination"
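The `columns` mapping reads as "record field: TSV header". A minimal sketch of applying such a mapping to TSV text follows. The `TsvParser` type and `parseTsv` function are hypothetical, values are kept as strings, and any numeric conversion SeqDesk may perform is not shown.

```typescript
// Mirrors the parser definition above.
interface TsvParser {
  type: "tsv";
  pattern: string;
  target: string;
  columns: Record<string, string>; // record field -> TSV header name
}

// Extract the mapped columns from TSV text, one record per data line.
function parseTsv(parser: TsvParser, text: string): Record<string, string>[] {
  const [headerLine, ...dataLines] = text
    .split(/\r?\n/)
    .filter((line) => line.length > 0);
  if (headerLine === undefined) return [];

  const header = headerLine.split("\t");
  // Look up each mapped header once, failing early if one is missing.
  const indexes = Object.entries(parser.columns).map(([field, heading]) => {
    const index = header.indexOf(heading);
    if (index === -1) {
      throw new Error(`Column "${heading}" not found in TSV header`);
    }
    return [field, index] as const;
  });

  return dataLines.map((line) => {
    const cells = line.split("\t");
    const record: Record<string, string> = {};
    for (const [field, index] of indexes) {
      record[field] = cells[index] ?? "";
    }
    return record;
  });
}

const checkmParser: TsvParser = {
  type: "tsv",
  pattern: "GenomeBinning/QC/checkm_summary.tsv",
  target: "bin",
  columns: {
    binName: "Bin Id",
    completeness: "Completeness",
    contamination: "Contamination",
  },
};

// Made-up sample input; unmapped columns are ignored.
const exampleTsv = "Bin Id\tMarker lineage\tCompleteness\tContamination\nbin.1\troot\t98.5\t1.2\n";
console.log(parseTsv(checkmParser, exampleTsv));
// [ { binName: 'bin.1', completeness: '98.5', contamination: '1.2' } ]
```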

Pipeline Adapter

For complex output resolution, you can create a TypeScript adapter in src/lib/pipelines/adapters/. The adapter discovers output files, matches them to samples, and creates database records (Assemblies, Bins, Artifacts).
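The adapter interface itself is not documented here, so the following is only a sketch of the "match files to samples" step. `SampleRef`, `AssemblyRecord`, and `matchAssemblies` are hypothetical names, and the rule that an assembly file name starts with the sample alias followed by a dot is an assumed naming convention, not something SeqDesk or the pipeline guarantees.

```typescript
// Hypothetical shapes; the real adapter types in SeqDesk may differ.
interface SampleRef {
  id: string;
  alias: string;
}

interface AssemblyRecord {
  sampleId: string;
  path: string;
}

// Match assembly files to samples by file-name prefix (assumed convention:
// "<alias>.<anything>.fa.gz"). Files that match no sample are skipped.
function matchAssemblies(files: string[], samples: SampleRef[]): AssemblyRecord[] {
  const records: AssemblyRecord[] = [];
  for (const path of files) {
    const fileName = path.split("/").pop() ?? "";
    if (!fileName.endsWith(".fa.gz")) continue;
    // Requiring the trailing dot keeps "S1" from matching "S10.contigs.fa.gz".
    const sample = samples.find((s) => fileName.startsWith(`${s.alias}.`));
    if (sample) {
      records.push({ sampleId: sample.id, path });
    }
  }
  return records;
}

const exampleSamples: SampleRef[] = [
  { id: "sample-1", alias: "S1" },
  { id: "sample-10", alias: "S10" },
];

const exampleFiles = [
  "Assembly/MEGAHIT/S1.contigs.fa.gz",
  "Assembly/MEGAHIT/S10.contigs.fa.gz",
  "Assembly/MEGAHIT/S1.log",
];

console.log(matchAssemblies(exampleFiles, exampleSamples));
```

Keeping the matching logic in a pure function like this makes it straightforward to unit-test separately from file discovery and database writes.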

Registration

New pipelines are automatically discovered from the pipelines/ directory. The pipeline registry loads all registry.json files at startup and makes them available in the UI.

To enable a pipeline, set its enabled flag in the PipelineConfig database table or through the admin settings.