Adding Custom Pipelines

SeqDesk uses a modular pipeline package system. Each installed package lives in pipelines/{pipeline-id}/ and combines runtime metadata, UI configuration, generated input definitions, and optional discovery helpers.

Package Structure


pipelines/{pipeline-id}/
├── manifest.json         # Runtime contract: targets, inputs, outputs, writeback
├── definition.json       # Workflow DAG with step dependencies
├── registry.json         # UI configuration and config schema
├── samplesheet.yaml      # Rules for generating the input samplesheet
├── README.md             # Documentation
└── scripts/
    └── discover-outputs.mjs  # Optional output discovery/writeback helper

manifest.json

The manifest is the runtime source of truth for what a package supports, what it reads, how it executes, what it produces, and which canonical records it may write back to.


{
  "manifestVersion": 1,
  "package": {
    "id": "my-pipeline",
    "name": "My Pipeline",
    "version": "1.0.0",
    "description": "Description shown in SeqDesk"
  },
  "targets": {
    "supported": ["order"]
  },
  "inputs": [
    {
      "id": "reads",
      "scope": "sample",
      "source": "sample.reads",
      "required": true
    }
  ],
  "execution": {
    "type": "nextflow",
    "pipeline": "./workflow",
    "version": "1.0.0",
    "profiles": ["conda"]
  },
  "outputs": [
    {
      "id": "sample_checksums",
      "scope": "sample",
      "destination": "sample_reads",
      "type": "artifact",
      "writeback": {
        "target": "Read",
        "mode": "merge",
        "fields": {
          "checksum1": "checksum1",
          "checksum2": "checksum2"
        }
      },
      "discovery": {
        "pattern": "checksums/*.json",
        "matchSampleBy": "filename"
      }
    }
  ]
}

Important manifest responsibilities:

- targets.supported declares whether a package is meant for study, order, or both.
- inputs[].source describes which SeqDesk records the package consumes.
- outputs[].discovery tells SeqDesk how to locate produced files.
- outputs[].writeback declares safe canonical destinations such as Read.
- Actual database writes are still validated and executed centrally by SeqDesk.
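
The responsibilities above can be illustrated with a small pre-flight check on a parsed manifest. This is only a sketch: checkManifest is a hypothetical helper, not a SeqDesk API, and the rules it applies are assumptions drawn from the example manifest. SeqDesk's own central validation may be stricter or different.

```javascript
// Hypothetical helper, not part of SeqDesk. It only inspects fields that
// appear in the example manifest above.
const KNOWN_TARGETS = ["study", "order"];

function checkManifest(manifest) {
  const problems = [];

  // targets.supported: expect a non-empty list of known targets.
  const supported = manifest?.targets?.supported;
  if (!Array.isArray(supported) || supported.length === 0) {
    problems.push("targets.supported must list at least one target");
  } else {
    for (const target of supported) {
      if (!KNOWN_TARGETS.includes(target)) {
        problems.push(`unknown target: ${target}`);
      }
    }
  }

  // inputs[]: each input needs an id and a source record path.
  for (const input of manifest?.inputs ?? []) {
    if (!input.id || !input.source) {
      problems.push(`input is missing id or source: ${JSON.stringify(input)}`);
    }
  }

  // outputs[].writeback: assumed to need a target when present.
  for (const output of manifest?.outputs ?? []) {
    if (output.writeback && !output.writeback.target) {
      problems.push(`output ${output.id} declares writeback without a target`);
    }
  }

  return problems; // an empty array means no problems were found
}
```

Running a check like this before installing a package catches obvious mistakes early, but it does not replace the validation SeqDesk performs when it loads the manifest.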

definition.json

The definition describes the Nextflow workflow as a DAG of steps. Each step lists the processes it covers and the steps it depends on:


{
  "steps": [
    {
      "id": "FASTQC",
      "name": "FastQC",
      "category": "QC",
      "processes": ["FASTQC_RAW"],
      "dependsOn": []
    },
    {
      "id": "ASSEMBLY",
      "name": "Assembly",
      "category": "Assembly",
      "processes": ["MEGAHIT"],
      "dependsOn": ["FASTQC"]
    }
  ]
}

This powers the DAG visualization in the monitoring UI.
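
Because each step names its prerequisites in dependsOn, a consistent display or execution order can be derived with a topological sort. The sketch below shows one way to do that and to reject cycles or unknown step ids. orderSteps is a hypothetical helper written for illustration, not the code SeqDesk uses for its visualization.

```javascript
// Hypothetical helper: returns step ids so that every step appears after
// the steps listed in its dependsOn array. Throws on unknown ids or cycles.
function orderSteps(definition) {
  const steps = new Map(definition.steps.map((step) => [step.id, step]));
  const state = new Map(); // step id -> "visiting" | "done"
  const ordered = [];

  function visit(id, path) {
    if (!steps.has(id)) {
      const parent = path.length > 0 ? path[path.length - 1] : "definition";
      throw new Error(`unknown step "${id}" referenced by ${parent}`);
    }
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") {
      throw new Error(`dependency cycle: ${[...path, id].join(" -> ")}`);
    }
    state.set(id, "visiting");
    for (const dependency of steps.get(id).dependsOn ?? []) {
      visit(dependency, [...path, id]);
    }
    state.set(id, "done");
    ordered.push(id);
  }

  for (const id of steps.keys()) visit(id, []);
  return ordered;
}

// For the example definition above:
// orderSteps(definition) returns ["FASTQC", "ASSEMBLY"]
```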

registry.json

registry.json covers presentation and configuration. It is not the runtime source of truth for scope or writeback; the manifest is. It controls how the package appears in the UI and which settings the user can edit before launch:


{
  "id": "my-pipeline",
  "name": "My Pipeline",
  "description": "Description shown to users",
  "category": "Analysis",
  "canStart": "FACILITY_ADMIN",
  "showToUsers": true,
  "configSchema": {
    "stubMode": {
      "type": "boolean",
      "default": false,
      "label": "Stub Mode",
      "description": "Run in test mode"
    }
  }
}
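
To make the role of configSchema concrete, the sketch below merges user-edited values with the schema defaults to produce the settings for a launch. resolveConfig is a hypothetical helper, and the type check only covers schema types whose names match JavaScript's typeof results (boolean, string, number); how SeqDesk itself resolves and validates settings is not described here.

```javascript
// Schema types that can be compared directly against typeof (an assumption).
const CHECKABLE_TYPES = new Set(["boolean", "string", "number"]);

// Hypothetical helper: builds launch settings from registry.configSchema.
// Keys that are not declared in the schema are ignored.
function resolveConfig(registry, userValues = {}) {
  const config = {};
  for (const [key, field] of Object.entries(registry.configSchema ?? {})) {
    const value = key in userValues ? userValues[key] : field.default;
    if (
      value !== undefined &&
      CHECKABLE_TYPES.has(field.type) &&
      typeof value !== field.type
    ) {
      throw new TypeError(`${key} must be of type ${field.type}`);
    }
    config[key] = value;
  }
  return config;
}

// With the example registry above:
// resolveConfig(registry)                     returns { stubMode: false }
// resolveConfig(registry, { stubMode: true }) returns { stubMode: true }
```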

samplesheet.yaml

Defines how SeqDesk generates the package input file from database records:


columns:
  - name: sample
    source: sample.sampleAlias
    fallback: sample.sampleId
  - name: group
    source: sample.studyId
    default: "0"
  - name: short_reads_1
    source: read.file1
    required: true
  - name: short_reads_2
    source: read.file2
    required: false
format: csv
header: true

SeqDesk uses this definition to automatically generate the package input from the selected study samples or order-linked reads.
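
The column rules can be read as a small resolution procedure: try source, then fallback, then default, and fail if a required column is still empty. The sketch below applies that reading to an already-parsed samplesheet definition. The precedence order, the record shape ({ sample, read }), and the function names are assumptions for illustration, not SeqDesk's generator.

```javascript
// Resolve a dotted path such as "sample.sampleAlias" against a record.
function lookup(record, path) {
  if (!path) return undefined;
  return path.split(".").reduce((value, key) => value?.[key], record);
}

function isBlank(value) {
  return value === undefined || value === null || value === "";
}

// Quote a CSV cell only when it contains a comma, quote, or line break.
function csvCell(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical generator: spec is the parsed samplesheet.yaml,
// records is an array of objects like { sample: {...}, read: {...} }.
function buildSamplesheet(spec, records) {
  const lines = [];
  if (spec.header) {
    lines.push(spec.columns.map((column) => csvCell(column.name)).join(","));
  }
  for (const record of records) {
    const cells = spec.columns.map((column) => {
      let value = lookup(record, column.source);
      if (isBlank(value)) value = lookup(record, column.fallback);
      if (isBlank(value)) value = column.default;
      if (isBlank(value)) {
        if (column.required) {
          throw new Error(`missing required column "${column.name}"`);
        }
        value = "";
      }
      return csvCell(value);
    });
    lines.push(cells.join(","));
  }
  return lines.join("\n") + "\n";
}
```

Under these assumptions, a record with only sample.sampleId and read.file1 set would produce a row that uses the sampleId fallback, the group default "0", and an empty short_reads_2 cell.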

Output Discovery

For many packages, outputs can be located using only the declarative discovery patterns in the manifest. For more complex cases, a package can also ship a discovery script, such as scripts/discover-outputs.mjs, that matches files back to samples and emits the metadata keys referenced by writeback.

Because packages only declare where results should go, order pipelines stay modular while SeqDesk keeps central validation of the actual database updates.
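
As an illustration of filename-based matching, the sketch below filters a list of relative output paths with a discovery pattern and pairs each hit with a sample. It assumes that "filename" matching means the file name without its extension equals the sample id, and it supports only single "*" wildcards. Both are assumptions; the interface SeqDesk expects from a real discover-outputs.mjs is not shown here, and matchOutputs is a hypothetical name.

```javascript
// Convert a glob that uses only "*" wildcards into an anchored RegExp.
// "*" matches any run of characters except "/".
function globToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^/]*");
  return new RegExp(`^${escaped}$`);
}

// Hypothetical matcher: files are paths relative to the run output directory.
function matchOutputs(files, sampleIds, discovery) {
  const pattern = globToRegExp(discovery.pattern);
  const known = new Set(sampleIds);
  const matched = [];
  const unmatched = [];

  for (const file of files) {
    if (!pattern.test(file)) continue;
    const base = file.slice(file.lastIndexOf("/") + 1);
    const dot = base.lastIndexOf(".");
    const stem = dot > 0 ? base.slice(0, dot) : base;
    if (known.has(stem)) {
      matched.push({ sampleId: stem, path: file });
    } else {
      unmatched.push(file); // fits the pattern but belongs to no known sample
    }
  }
  return { matched, unmatched };
}
```

Reporting unmatched files separately, rather than dropping them silently, makes it easier to spot naming mismatches between pipeline outputs and sample ids.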

Registration

New pipelines are automatically discovered from the pipelines/ directory. SeqDesk loads the manifest and registry files together, derives pipeline scope from the manifest, and exposes the package through the installed-pipeline list and public registry metadata.

To enable a pipeline, set its enabled flag in the PipelineConfig database table or through the admin settings.
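
The way the manifest, the registry, and the enabled flag come together can be sketched as follows. describePipeline and the shape of the returned entry are hypothetical, and treating an id mismatch between the two files as an error is an assumption; the sketch only illustrates that scope comes from the manifest while presentation fields come from the registry.

```javascript
// Hypothetical helper: combines one package's manifest and registry with the
// set of enabled pipeline ids (for example, loaded from PipelineConfig).
function describePipeline(manifest, registry, enabledIds) {
  const id = manifest.package.id;
  if (registry.id !== id) {
    // Assumption: both files are expected to describe the same package.
    throw new Error(`registry id "${registry.id}" does not match manifest id "${id}"`);
  }
  return {
    id,
    name: registry.name ?? manifest.package.name,
    version: manifest.package.version,
    scope: [...manifest.targets.supported], // derived from the manifest only
    category: registry.category,
    enabled: enabledIds.has(id),
  };
}
```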