Assigning Files to Samples

After file discovery, sequencing files must be linked to the correct samples. SeqDesk supports both automatic matching and manual assignment.

Auto-Detect Matching

The matching engine compares discovered file identifiers against sample records:

Extract the sample identifier from the filename
Compare it against `sampleId`, `sampleAlias`, and `sampleTitle` (in priority order)
Calculate a confidence score (0–1)
Suggest the match to the facility admin
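The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the suffix pattern, the `extract_identifier` and `best_match` helpers, and the use of `difflib.SequenceMatcher` for scoring are hypothetical and are not taken from SeqDesk. Only the field names and their priority order come from the list above.

```python
import re
from difflib import SequenceMatcher

# Assumed pattern: strips Illumina-style suffixes such as
# "_S1_L001_R1_001.fastq.gz". SeqDesk's real extraction rules may differ.
_SUFFIX = re.compile(
    r"(_S\d+)?(_L\d{3})?(_R?[12])?(_\d{3})?\.(fastq|fq)(\.gz)?$",
    re.IGNORECASE,
)

# Fields are compared in this priority order (from the documentation).
FIELDS = ("sampleId", "sampleAlias", "sampleTitle")


def extract_identifier(filename):
    """Return the filename with the read/lane suffix and extension removed."""
    return _SUFFIX.sub("", filename)


def best_match(filename, samples):
    """Return the best-scoring sample for a file, or None if there are no candidates.

    The score is a similarity ratio between 0 and 1 (an assumed metric).
    On equal scores, the higher-priority field wins.
    """
    ident = extract_identifier(filename).lower()
    best = None
    for sample in samples:
        for priority, field in enumerate(FIELDS):
            value = sample.get(field)
            if not value:
                continue
            score = SequenceMatcher(None, ident, value.lower()).ratio()
            candidate = (score, -priority, sample["sampleId"], field)
            if best is None or candidate[:2] > best[:2]:
                best = candidate
    if best is None:
        return None
    score, _, sample_id, field = best
    return {"sampleId": sample_id, "field": field, "score": score}
```

For example, `best_match("S001_S1_L001_R1_001.fastq.gz", samples)` would extract `S001` and compare it against each sample's three fields in turn.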

Confidence Levels

Score	Level	Action
≥ 0.9	High	Auto-fill suggested (exact or near-exact match)
0.5 to < 0.9	Medium	Suggested with review required
< 0.5	Low	No automatic suggestion

High-confidence matches occur when the filename identifier matches the sample ID or alias exactly or almost exactly. Medium-confidence matches typically result from partial name overlaps.
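The thresholds in the table map onto a small classification function. The sketch below is illustrative: the name `confidence_level` is hypothetical, and it assumes that a score of exactly 0.9 counts as High and exactly 0.5 counts as Medium, which is how the table reads.

```python
def confidence_level(score):
    """Map a confidence score in [0, 1] to the level named in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.9:
        return "High"    # auto-fill suggested
    if score >= 0.5:
        return "Medium"  # suggested, review required
    return "Low"         # no automatic suggestion
```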

Manual Assignment

For files that do not auto-match or need correction:

Navigate to the Files page or the order’s file tab
Browse the discovered files list
For each file pair (R1/R2), select the target sample from a dropdown
Confirm the assignment

The file browser shows:

File path relative to the data base path
File size
Current assignment status (assigned / unassigned)
The sample it is assigned to (if any)

What Gets Stored

When files are assigned, a Read record is created linking the files to the sample:

Field	Value
`file1`	Path to forward reads (R1)
`file2`	Path to reverse reads (R2), null for single-end
`checksum1`, `checksum2`	MD5 checksums for integrity verification
`sampleId`	The assigned sample

File paths are stored relative to the data base path and resolved at runtime.
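As a rough illustration of the record described above, the following sketch models a Read record, computes MD5 checksums, and resolves a stored relative path at runtime. The field names come from the table. The `Read` dataclass, `md5_of`, `resolve`, and `make_read` are hypothetical helpers, not SeqDesk's actual schema or API.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Read:
    sampleId: str
    file1: str                        # R1 path, relative to the data base path
    file2: Optional[str] = None       # R2 path, None for single-end
    checksum1: Optional[str] = None   # MD5 of file1
    checksum2: Optional[str] = None   # MD5 of file2


def md5_of(path, chunk_size=1 << 20):
    """Compute an MD5 hex digest without loading the whole file into memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve(data_base_path, relative):
    """Turn a stored relative path into an absolute one at runtime."""
    return Path(data_base_path) / relative


def make_read(data_base_path, sample_id, r1, r2=None):
    """Build a Read record with checksums for the given relative paths."""
    return Read(
        sampleId=sample_id,
        file1=r1,
        file2=r2,
        checksum1=md5_of(resolve(data_base_path, r1)),
        checksum2=md5_of(resolve(data_base_path, r2)) if r2 else None,
    )
```

Storing relative paths means the data directory can be moved or remounted without rewriting existing records; only the configured base path changes.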

Bulk Assignment

For orders with many samples, the auto-detect feature can process all samples at once:

Open the order detail page
Click Discover Files
The system scans and suggests matches for all samples
Review the suggestions
Confirm to apply all high-confidence matches

Low-confidence matches are left unassigned for manual review.
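The confirm step can be pictured as splitting the suggestions into those applied automatically and those left for review. The sketch below is an assumption-laden illustration: `split_suggestions` and the suggestion dictionaries are hypothetical. The text above does not say how medium-confidence matches are handled in bulk, so this sketch conservatively leaves everything below the high-confidence threshold for manual review.

```python
HIGH_CONFIDENCE = 0.9  # threshold from the Confidence Levels table


def split_suggestions(suggestions, threshold=HIGH_CONFIDENCE):
    """Partition suggestions into (to_apply, to_review) by score.

    Each suggestion is assumed to be a dict with at least a "score" key.
    """
    to_apply, to_review = [], []
    for suggestion in suggestions:
        if suggestion["score"] >= threshold:
            to_apply.append(suggestion)
        else:
            to_review.append(suggestion)
    return to_apply, to_review
```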

Requirements for Pipelines

Before a pipeline can run on a study:

All included samples must have at least one read record
Read files must exist at the configured paths
For paired-end pipelines, both R1 and R2 must be assigned

The pipeline launcher validates these requirements before allowing a run to start. See Running a Pipeline.
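A pre-flight check along these lines could cover the three requirements listed above. This is a minimal sketch, not the launcher's actual logic: `validate_for_pipeline`, the shape of `reads_by_sample`, and the message texts are all assumptions made for illustration.

```python
from pathlib import Path


def validate_for_pipeline(reads_by_sample, data_base_path, paired_end):
    """Return a list of problems; an empty list means the run may start.

    reads_by_sample maps a sample ID to a list of read records, each assumed
    to be a dict with "file1" and optionally "file2" (relative paths).
    """
    problems = []
    for sample_id, reads in reads_by_sample.items():
        # Requirement 1: at least one read record per sample.
        if not reads:
            problems.append(f"{sample_id}: no read record")
            continue
        for read in reads:
            paths = [read.get("file1"), read.get("file2")]
            # Requirement 3: paired-end pipelines need both R1 and R2.
            if paired_end and not all(paths):
                problems.append(f"{sample_id}: paired-end run needs both R1 and R2")
            # Requirement 2: every assigned file must exist on disk.
            for rel in filter(None, paths):
                if not (Path(data_base_path) / rel).is_file():
                    problems.append(f"{sample_id}: missing file {rel}")
    return problems
```

Collecting all problems rather than stopping at the first one lets an admin fix every issue in a single pass before retrying.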