How SeqDesk Works

SeqDesk is a self-hosted platform for managing sequencing orders, running bioinformatics pipelines, and submitting results to public archives. It runs entirely on your own infrastructure, with no cloud dependencies. Data leaves your network only when you choose to submit results to an archive.

The Big Picture

SeqDesk connects three stages of a typical sequencing facility workflow:

Order & Sample Management — Researchers submit sequencing requests, facility staff track and process them.
Analysis — Automated pipelines (powered by Nextflow) assemble genomes, bin metagenomes, and produce quality reports.
Submission — Results and metadata are packaged and submitted to the European Nucleotide Archive (ENA).

Data Flow


Researcher                    Facility Admin
    │                              │
    ├─ Create Order ──────────────▶│
    ├─ Add Samples                 │
    ├─ Submit ────────────────────▶│
    │                              ├─ Assign FASTQ files to samples
    │                              ├─ Create Study (group samples)
    │                              ├─ Launch Pipeline (MAG, SubMG, …)
    │                              ├─ Review results (assemblies, bins)
    │                              └─ Submit to ENA
    │                              │
    ◀── View results ──────────────┘

Key Entities

Entity	Purpose
Order	A sequencing request containing one or more samples. Tracks status from draft through completion.
Sample	An individual biological sample with metadata (organism, taxonomy, MIxS fields).
Read	A FASTQ file (or pair of files) linked to a sample. Includes checksums for integrity.
Study	A logical grouping of samples for analysis and/or submission.
Pipeline Run	An execution of a Nextflow workflow against a set of samples.
Assembly	Contigs produced by an assembly pipeline (e.g. MAG).
Bin	A genome bin extracted from a metagenomic assembly, with completeness and contamination scores.
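
The relationships between these entities can be sketched as TypeScript types. This is only an illustration: the field names and status values below are assumptions made for the example, not SeqDesk's actual database schema.

```typescript
// Hypothetical shapes for the entities described above.
// All field names are illustrative assumptions, not SeqDesk's real schema.

type OrderStatus = "draft" | "submitted" | "completed"; // assumed status values

interface Read {
  file1: string;      // path to the FASTQ file (R1 for paired-end data)
  file2?: string;     // path to R2, present only for paired-end data
  checksum1: string;  // integrity checksum for file1
  checksum2?: string; // integrity checksum for file2
}

interface Sample {
  id: string;
  organism: string;
  taxId: number;
  reads: Read[];
}

interface Order {
  id: string;
  status: OrderStatus;
  samples: Sample[];
}

interface Study {
  id: string;
  sampleIds: string[]; // a study groups samples, possibly from several orders
}

// Return the samples of an order that have no FASTQ files assigned yet.
function samplesMissingReads(order: Order): Sample[] {
  return order.samples.filter((sample) => sample.reads.length === 0);
}
```

A helper like samplesMissingReads shows why reads are modelled per sample: a facility admin needs to know which samples still lack FASTQ files before a study can be analysed.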

Architecture

SeqDesk is built as a single Next.js application that bundles the web UI, API layer, and pipeline orchestration into one process.

Layer	Technology	Role
Frontend	React + Tailwind CSS	Interactive UI with real-time pipeline monitoring
API	Next.js API Routes	REST endpoints for all operations
Database	PostgreSQL	Persistent storage via Prisma ORM
Auth	NextAuth.js	Session-based authentication with role-based access
Pipelines	Nextflow	Workflow execution — local or SLURM cluster

Pipeline System

Pipelines are defined as self-contained packages with a manifest-driven architecture. Each pipeline package includes:

manifest.json — declares inputs, outputs, parameters, and execution commands
definition.json — describes the workflow DAG for visualization
samplesheet.yaml — declarative rules for generating input samplesheets
parsers/ — YAML definitions for extracting structured results from output files
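
To make the manifest idea concrete, here is a minimal sketch of how declared parameters might be resolved against user input. The manifest shape is hypothetical: the description above only says that manifest.json declares inputs, outputs, parameters, and execution commands, so every field name here is an assumption.

```typescript
// Hypothetical manifest shape. All field names are assumed for illustration.
type ParamValue = string | number | boolean;

interface ParameterSpec {
  name: string;
  required: boolean;
  default?: ParamValue;
}

interface PipelineManifest {
  id: string;
  inputs: string[];
  outputs: string[];
  parameters: ParameterSpec[];
  command: string[];
}

interface ResolvedParameters {
  values: Record<string, ParamValue>;
  missing: string[]; // required parameters with no value and no default
}

// Apply defaults to user-supplied values and report required parameters
// that are still unset.
function resolveParameters(
  manifest: PipelineManifest,
  supplied: Record<string, ParamValue | undefined>
): ResolvedParameters {
  const values: Record<string, ParamValue> = {};
  const missing: string[] = [];
  for (const spec of manifest.parameters) {
    const value = supplied[spec.name] ?? spec.default;
    if (value === undefined) {
      if (spec.required) missing.push(spec.name);
      continue;
    }
    values[spec.name] = value;
  }
  return { values, missing };
}
```

Keeping this logic driven by the manifest rather than by pipeline-specific code is what allows a new pipeline to be added as a package without changing the application.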

When a pipeline runs, SeqDesk:

1. Generates the samplesheet from the study’s samples and reads
2. Builds the Nextflow command with configured parameters
3. Executes via local process or SLURM submission
4. Monitors progress through trace files and weblog events
5. Discovers output files and parses results into the database
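
The first two steps above (samplesheet generation and command building) can be sketched as follows. This is an assumption-laden illustration, not SeqDesk's implementation: the function names, the CSV column names, and the --input/--outdir pipeline parameters are hypothetical choices. The -profile, -with-trace, and -with-weblog options are standard Nextflow run options.

```typescript
// Sketch of samplesheet generation and Nextflow argument building.
// Type and function names are hypothetical.

interface SampleReads {
  sampleId: string;
  fastq1: string;
  fastq2?: string; // absent for single-end data
}

// Render a CSV samplesheet. Column names are assumed (each pipeline defines
// its own), and values are assumed to contain no commas.
function renderSamplesheet(rows: SampleReads[]): string {
  const header = "sample,fastq_1,fastq_2";
  const lines = rows.map((r) => [r.sampleId, r.fastq1, r.fastq2 ?? ""].join(","));
  return [header, ...lines].join("\n") + "\n";
}

interface RunOptions {
  pipeline: string;        // pipeline name or path passed to `nextflow run`
  samplesheetPath: string;
  outdir: string;
  profile?: string;        // Nextflow configuration profile
  traceFile?: string;      // where Nextflow writes the execution trace
  weblogUrl?: string;      // endpoint that receives weblog events
  params?: Record<string, string | number | boolean>;
}

// Build the argument list for the `nextflow` executable. Returning an array
// (rather than one shell string) avoids quoting problems when spawning.
function buildNextflowArgs(opts: RunOptions): string[] {
  const args: string[] = ["run", opts.pipeline];
  if (opts.profile) args.push("-profile", opts.profile);
  if (opts.traceFile) args.push("-with-trace", opts.traceFile);
  if (opts.weblogUrl) args.push("-with-weblog", opts.weblogUrl);
  // Pipeline parameters use a double dash; --input and --outdir are assumed.
  args.push("--input", opts.samplesheetPath, "--outdir", opts.outdir);
  const params = opts.params ?? {};
  for (const key of Object.keys(params)) {
    args.push(`--${key}`, String(params[key]));
  }
  return args;
}
```

The resulting array could be handed to a local process spawner or wrapped in a SLURM batch script, which is where the local and cluster execution modes would diverge.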

Configuration Resolution

SeqDesk merges configuration from multiple sources in this priority order:

Environment variables — highest priority, ideal for deployment automation
Config file — seqdesk.config.json for structured, version-controlled settings
Database — runtime settings changed through the admin UI
Defaults — built-in fallback values

This layered approach lets you override specific values at deployment time while keeping the rest configurable through the UI.
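
The priority order can be illustrated with a small merge function. This is a sketch under stated assumptions: the setting keys (port, executor, dataDir) are invented for the example, and environment values are assumed to have been parsed into typed settings already.

```typescript
// Layered configuration merge. Setting keys below are hypothetical.
type Settings = Record<string, string | number | boolean>;

interface ConfigLayers {
  defaults: Settings; // built-in fallback values (lowest priority)
  database: Settings; // runtime settings from the admin UI
  file: Settings;     // contents of seqdesk.config.json
  env: Settings;      // parsed environment variables (highest priority)
}

// Spread the layers from lowest to highest priority so that later layers
// overwrite earlier ones key by key.
function resolveConfig(layers: ConfigLayers): Settings {
  return {
    ...layers.defaults,
    ...layers.database,
    ...layers.file,
    ...layers.env,
  };
}

const resolved = resolveConfig({
  defaults: { port: 3000, executor: "local" },
  database: { executor: "slurm" },
  file: { port: 4000, dataDir: "/data/seqdesk" },
  env: { port: 8080 },
});
// resolved.port     is 8080            (environment beats file and defaults)
// resolved.executor is "slurm"         (database beats defaults)
// resolved.dataDir  is "/data/seqdesk" (set only in the config file)
```

Because the merge works key by key, overriding one value in the environment leaves every other setting under the control of the config file or the admin UI.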

Two Roles

SeqDesk uses a simple role model:

Researcher — creates orders, adds samples, views results. Can only see their own data (unless department sharing is enabled).

Facility Admin — full access to all orders, studies, and settings. Can configure the system, run pipelines, and submit to ENA.
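
The visibility rule for orders can be expressed as a single check. The sketch below is illustrative only: the role identifiers, the department field, and the function name are assumptions, not SeqDesk's actual authorization code.

```typescript
// Order visibility check for the two roles. All names are hypothetical.
type Role = "RESEARCHER" | "FACILITY_ADMIN";

interface User {
  id: string;
  role: Role;
  department?: string;
}

interface OrderRef {
  ownerId: string;
  ownerDepartment?: string;
}

function canViewOrder(user: User, order: OrderRef, departmentSharing: boolean): boolean {
  // Facility admins have full access to all orders.
  if (user.role === "FACILITY_ADMIN") return true;
  // Researchers always see their own orders.
  if (order.ownerId === user.id) return true;
  // With department sharing enabled, researchers also see orders owned by
  // colleagues in the same department.
  return (
    departmentSharing &&
    user.department !== undefined &&
    user.department === order.ownerDepartment
  );
}
```

Checking the admin role first keeps the researcher rules simple, and requiring a defined department prevents two users without a department from being treated as colleagues.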

Self-Hosted by Design

SeqDesk is designed to run on your own infrastructure:

No external dependencies — everything runs locally (database, file storage, pipeline execution)
Your data stays on your network — sequencing files are referenced by path, never uploaded to a remote service
Single-command install — npm i -g seqdesk && seqdesk gets you running
Automatic updates — built-in update system with backup and rollback support