Assigning Files to Samples

After file discovery, sequencing files need to be linked to the correct samples. SeqDesk provides both automatic matching and manual assignment.

Auto-Detect Matching

The matching engine compares discovered file identifiers against sample records:

  1. Extract the sample identifier from the filename
  2. Compare against sampleId, sampleAlias, and sampleTitle (in priority order)
  3. Calculate a confidence score (0–1)
  4. Suggest the match to the facility admin
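The four steps above can be sketched roughly as follows. This is an illustration only, not SeqDesk's actual implementation: the filename pattern, the `Sample` and `Match` classes, the helper names, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions. Only the three field names and their priority order come from the list above.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional

# Fields compared, in the priority order given above.
FIELDS = ("sampleId", "sampleAlias", "sampleTitle")

# Hypothetical filename convention: optional Illumina-style _S<n>, _L<lane>,
# read marker (_R1, _R2, _1, _2), chunk number, then .fastq/.fq[.gz].
_SUFFIX = re.compile(
    r"(_S\d+)?(_L\d{3})?([_.]R?[12])?(_\d{3})?\.(fastq|fq)(\.gz)?$",
    re.IGNORECASE,
)


@dataclass
class Sample:  # hypothetical stand-in for a sample record
    sampleId: str
    sampleAlias: str = ""
    sampleTitle: str = ""


@dataclass
class Match:  # hypothetical suggestion shown to the facility admin
    sampleId: str
    field: str
    score: float


def extract_identifier(filename: str) -> str:
    """Step 1: drop the directory part and the read/extension suffix."""
    return _SUFFIX.sub("", filename.rsplit("/", 1)[-1])


def similarity(identifier: str, candidate: str) -> float:
    """Step 3: score in [0, 1], case-insensitive (assumed measure)."""
    if not identifier or not candidate:
        return 0.0
    return SequenceMatcher(None, identifier.lower(), candidate.lower()).ratio()


def best_match(filename: str, samples: List[Sample]) -> Optional[Match]:
    """Steps 2 and 4: compare against each field and keep the best score."""
    identifier = extract_identifier(filename)
    best = None
    for sample in samples:
        for priority, field in enumerate(FIELDS):
            score = similarity(identifier, getattr(sample, field))
            # Higher score wins; on a tie the higher-priority field wins.
            key = (score, -priority)
            if best is None or key > best[0]:
                best = (key, Match(sample.sampleId, field, score))
    return best[1] if best else None
```

Under these assumptions, calling `best_match("liver_01_R1.fastq.gz", samples)` reduces the filename to `liver_01` and suggests whichever sample has that string in the highest-priority field.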

Confidence Levels

Score      Level    Action
≥ 0.9      High     Auto-fill suggested (exact or near-exact match)
0.5–0.9    Medium   Suggested with review required
< 0.5      Low      No automatic suggestion

High-confidence matches occur when the filename identifier matches the sample ID or alias exactly or almost exactly. Medium-confidence matches typically result from partial name overlaps.
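As a minimal sketch, the confidence levels can be expressed as a threshold lookup. The function name and return shape are hypothetical, and treating a score of exactly 0.9 as High is an assumption based on the "≥ 0.9" row.

```python
from typing import Tuple


def confidence_level(score: float) -> Tuple[str, str]:
    """Map a confidence score in [0, 1] to a (level, action) pair."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within 0-1, got {score}")
    if score >= 0.9:  # assumption: the boundary value 0.9 counts as High
        return ("High", "Auto-fill suggested")
    if score >= 0.5:
        return ("Medium", "Suggested with review required")
    return ("Low", "No automatic suggestion")
```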

Manual Assignment

For files that do not auto-match or need correction:

  1. Navigate to the Files page or the order’s file tab
  2. Browse the discovered files list
  3. For each file pair (R1/R2), select the target sample from a dropdown
  4. Confirm the assignment

The file browser shows:

  • File path relative to the data base path
  • File size
  • Current assignment status (assigned / unassigned)
  • The sample it is assigned to (if any)

What Gets Stored

When files are assigned, a Read record is created linking the files to the sample:

Field                  Value
file1                  Path to forward reads (R1)
file2                  Path to reverse reads (R2), null for single-end
checksum1/checksum2    MD5 checksums for integrity verification
sampleId               The assigned sample

File paths are stored relative to the data base path and resolved at runtime.
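To make the record shape concrete, here is a hedged sketch of a Read record with runtime path resolution and MD5 checksums. The field names follow the table above. The `Read` dataclass, the helper functions, and the `base_path` argument are illustrative assumptions rather than SeqDesk's actual schema or API.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Read:
    sampleId: str                    # the assigned sample
    file1: str                       # R1 path, relative to the data base path
    file2: Optional[str] = None      # R2 path, None for single-end
    checksum1: Optional[str] = None  # MD5 of file1
    checksum2: Optional[str] = None  # MD5 of file2


def resolve(base_path: Path, relative: str) -> Path:
    """Turn a stored relative path into a full path at runtime."""
    return base_path / relative


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_read(base_path: Path, sample_id: str, r1: str,
              r2: Optional[str] = None) -> Read:
    """Build a Read record, checksumming each file that is present."""
    return Read(
        sampleId=sample_id,
        file1=r1,
        file2=r2,
        checksum1=md5_of(resolve(base_path, r1)),
        checksum2=md5_of(resolve(base_path, r2)) if r2 else None,
    )
```

Storing relative paths means the records stay valid if the data base path is moved or remounted, since only the base passed to `resolve` changes.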

Bulk Assignment

For orders with many samples, the auto-detect feature can process all samples at once:

  1. Open the order detail page
  2. Click Discover Files
  3. The system scans and suggests matches for all samples
  4. Review the suggestions
  5. Confirm to apply all high-confidence matches

Low-confidence matches are left unassigned for manual review.
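The confirmation step can be pictured as splitting the suggestions by score. This is a sketch under assumptions: the `Suggestion` class and function name are hypothetical, the 0.9 threshold is taken from the confidence table, and keeping medium-confidence suggestions in the review list is an assumption based on their "review required" action.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Suggestion:  # hypothetical shape of one suggested match
    sampleId: str
    file1: str
    file2: Optional[str]
    score: float


def split_suggestions(
    suggestions: List[Suggestion], high_threshold: float = 0.9
) -> Tuple[List[Suggestion], List[Suggestion]]:
    """Return (to_apply, to_review): only high-confidence matches are applied."""
    to_apply = [s for s in suggestions if s.score >= high_threshold]
    to_review = [s for s in suggestions if s.score < high_threshold]
    return to_apply, to_review
```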

Requirements for Pipelines

Before a pipeline can run on a study:

  • All included samples must have at least one read record
  • Read files must exist at the configured paths
  • For paired-end pipelines, both R1 and R2 must be assigned

The pipeline launcher validates these requirements before allowing a run to start. See Running a Pipeline.
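The three requirements could be checked along the following lines. This is a minimal sketch, not the launcher's real validation code: the `Read` dataclass, the `validate_study` function, and its arguments are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional


@dataclass
class Read:  # hypothetical stand-in for a stored read record
    sampleId: str
    file1: str
    file2: Optional[str] = None


def validate_study(
    reads_by_sample: Dict[str, List[Read]],
    base_path: Path,
    paired_end: bool,
) -> List[str]:
    """Return a list of problems; an empty list means the run may start."""
    problems: List[str] = []
    for sample_id, reads in reads_by_sample.items():
        # Requirement 1: every included sample has at least one read record.
        if not reads:
            problems.append(f"{sample_id}: no read record")
            continue
        for read in reads:
            # Requirement 3: paired-end pipelines need both R1 and R2.
            if paired_end and read.file2 is None:
                problems.append(f"{sample_id}: R2 not assigned")
            # Requirement 2: assigned files exist at the configured paths.
            for relative in (read.file1, read.file2):
                if relative is not None and not (base_path / relative).is_file():
                    problems.append(f"{sample_id}: file not found: {relative}")
    return problems
```

Collecting all problems instead of stopping at the first one lets an admin fix every missing assignment in a single pass.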