Assigning Files to Samples
After file discovery, sequencing files need to be linked to the correct samples. SeqDesk provides both automatic matching and manual assignment.
Auto-Detect Matching
The matching engine compares discovered file identifiers against sample records:
- Extract the sample identifier from the filename
- Compare against sampleId, sampleAlias, and sampleTitle (in priority order)
- Calculate a confidence score (0–1)
- Suggest the match to the facility admin
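The steps above can be sketched roughly as follows. This is a minimal illustration, not SeqDesk's actual matching engine: the suffix pattern, the use of `difflib.SequenceMatcher` as the similarity measure, and the function names `extract_identifier` and `best_match` are all assumptions made for the example. Only the field names and their priority order come from the description above.

```python
import re
from difflib import SequenceMatcher

# Assumed pattern: strips common read/lane suffixes such as
# "_S1_L001_R1_001.fastq.gz" or "_R2.fq". Real naming schemes vary.
_SUFFIX = re.compile(
    r"(_S\d+)?(_L\d{3})?[_.]R?[12](_\d{3})?\.(fastq|fq)(\.gz)?$",
    re.IGNORECASE,
)

# Fields are compared in this priority order (from the text above).
FIELD_PRIORITY = ("sampleId", "sampleAlias", "sampleTitle")


def extract_identifier(filename):
    """Return the filename with the read/lane suffix and extension removed."""
    return _SUFFIX.sub("", filename)


def similarity(identifier, candidate):
    """Score in the range 0 to 1; 1.0 means the strings are identical (ignoring case)."""
    if not candidate:
        return 0.0
    return SequenceMatcher(None, identifier.lower(), candidate.lower()).ratio()


def best_match(filename, samples):
    """Return (sample, field, score) for the best-scoring sample field.

    The strict '>' comparison means a tie is won by the higher-priority field.
    """
    identifier = extract_identifier(filename)
    best = (None, None, 0.0)
    for field in FIELD_PRIORITY:
        for sample in samples:
            score = similarity(identifier, sample.get(field))
            if score > best[2]:
                best = (sample, field, score)
    return best
```

Under this assumed pattern, `extract_identifier("ABC123_S1_L001_R1_001.fastq.gz")` yields `ABC123`, which would then score 1.0 against a sample whose sampleAlias is `ABC123`.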
Confidence Levels
| Score | Level | Action |
|---|---|---|
| ≥ 0.9 | High | Auto-fill suggested (exact or near-exact match) |
| ≥ 0.5 and < 0.9 | Medium | Suggested with review required |
| < 0.5 | Low | No automatic suggestion |
High-confidence matches occur when the filename identifier matches the sample alias or ID exactly or almost exactly. Medium-confidence matches may result from partial name overlaps.
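As an illustration, the thresholds in the table can be expressed as a small helper. The function name `confidence_level` is hypothetical, and a score of exactly 0.9 is treated as High, following the "≥ 0.9" row.

```python
def confidence_level(score):
    """Map a confidence score (0 to 1) to the level named in the table above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.9:
        return "high"    # auto-fill suggested
    if score >= 0.5:
        return "medium"  # suggested, review required
    return "low"         # no automatic suggestion
```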
Manual Assignment
For files that do not auto-match or need correction:
- Navigate to the Files page or the order’s file tab
- Browse the discovered files list
- For each file pair (R1/R2), select the target sample from a dropdown
- Confirm the assignment
The file browser shows:
- File path relative to the data base path
- File size
- Current assignment status (assigned / unassigned)
- The sample it is assigned to (if any)
What Gets Stored
When files are assigned, a Read record is created linking the files to the sample:
| Field | Value |
|---|---|
| file1 | Path to forward reads (R1) |
| file2 | Path to reverse reads (R2), null for single-end |
| checksum1/checksum2 | MD5 checksums for integrity verification |
| sampleId | The assigned sample |
File paths are stored relative to the data base path and resolved at runtime.
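The sketch below models such a record, to show how relative paths and MD5 checksums fit together. The field names follow the table above. The `Read` dataclass, the helpers `md5sum`, `make_read`, and `resolve`, and the `base_path` parameter are assumptions for illustration and do not reflect SeqDesk's actual schema or code.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Read:
    sampleId: str
    file1: str                       # forward reads (R1), relative path
    file2: Optional[str] = None      # reverse reads (R2), None for single-end
    checksum1: Optional[str] = None  # MD5 of file1
    checksum2: Optional[str] = None  # MD5 of file2


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_read(sample_id, base_path, r1, r2=None):
    """Build a Read whose paths are stored relative to base_path."""
    base = Path(base_path)
    return Read(
        sampleId=sample_id,
        file1=Path(r1).relative_to(base).as_posix(),
        file2=Path(r2).relative_to(base).as_posix() if r2 else None,
        checksum1=md5sum(r1),
        checksum2=md5sum(r2) if r2 else None,
    )


def resolve(base_path, relative):
    """Turn a stored relative path back into a full path at runtime."""
    return Path(base_path) / relative
```

Storing paths relative to a base means that, in a design like this one, moving the data directory only requires changing the configured base path and leaves the stored records untouched.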
Bulk Assignment
For orders with many samples, the auto-detect feature can process all samples at once:
- Open the order detail page
- Click Discover Files
- The system scans and suggests matches for all samples
- Review the suggestions
- Confirm to apply all high-confidence matches
Low-confidence matches are left unassigned for manual review.
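The confirm step can be pictured as splitting the suggestions into two groups. In this sketch the `Suggestion` class and `apply_high_confidence` function are hypothetical names, and the 0.9 threshold is taken from the Confidence Levels table. Anything below the threshold, including medium-confidence suggestions, is held back for review.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    file: str
    sample_id: str
    score: float  # confidence score, 0 to 1


def apply_high_confidence(suggestions, threshold=0.9):
    """Split suggestions into (applied, pending) by confidence score.

    'applied' holds the matches that would be assigned on confirm;
    'pending' holds everything left for manual review.
    """
    applied, pending = [], []
    for suggestion in suggestions:
        if suggestion.score >= threshold:
            applied.append(suggestion)
        else:
            pending.append(suggestion)
    return applied, pending
```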
Requirements for Pipelines
Before a pipeline can run on a study:
- All included samples must have at least one read record
- Read files must exist at the configured paths
- For paired-end pipelines, both R1 and R2 must be assigned
The pipeline launcher validates these requirements before allowing a run to start. See Running a Pipeline.
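A pre-flight check along these lines could look like the sketch below. It is an assumption-laden illustration, not the launcher's actual code: the function name `validate_for_pipeline`, the dictionary shape of the samples (a `reads` list holding `file1`/`file2` entries), and the error message wording are all invented for the example.

```python
from pathlib import Path


def validate_for_pipeline(samples, base_path, paired_end):
    """Return a list of problems; an empty list means the run may start."""
    errors = []
    required = ("file1", "file2") if paired_end else ("file1",)
    for sample in samples:
        name = sample["sampleId"]
        reads = sample.get("reads") or []
        if not reads:
            # Every included sample needs at least one read record.
            errors.append(f"{name}: no read record")
            continue
        for read in reads:
            for key in ("file1", "file2"):
                relative = read.get(key)
                if relative is None:
                    # Paired-end pipelines need both R1 and R2 assigned.
                    if key in required:
                        errors.append(f"{name}: {key} not assigned")
                elif not (Path(base_path) / relative).is_file():
                    # Assigned files must exist at the resolved path.
                    errors.append(f"{name}: {key} not found at {relative}")
    return errors
```

Collecting every problem in a list, instead of stopping at the first one, lets an admin fix all missing assignments in one pass before retrying.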