Nanoseq
Overview
Flow provides the nf-core/nanoseq v3.1.0 pipeline for processing and analyzing Oxford Nanopore Technologies (ONT) long-read sequencing data. It supports both DNA and RNA sequencing data, including direct RNA sequencing (dRNA-seq), and provides comprehensive quality control, alignment, and downstream analysis.
Supported data types are genomic DNA, cDNA, and direct RNA sequencing, supplied either as basecalled FASTQ files or as raw FAST5 files.
Pipeline Summary
The pipeline performs these key steps:
Basecalling (Optional)
- Guppy basecaller for raw FAST5 files
- Demultiplexing of barcoded samples
- Modified base detection (5mC, 6mA)
Quality Control
- NanoPlot for read statistics
- PycoQC for sequencing run metrics
- FastQC adapted for long reads
Read Processing
- Adapter trimming (Porechop)
- Quality filtering
- Length filtering
Alignment
- minimap2 for genome alignment
- Optional transcriptome alignment
- SAMtools for BAM processing
Downstream Analysis
- Transcript identification (RNA/cDNA)
- Differential expression (RNA)
- Variant calling (DNA)
- Methylation analysis
Visualization
- MultiQC report
- Coverage plots
- Transcript abundance
Input Requirements
Sequencing Data
- FASTQ files (basecalled)
- FAST5 files (raw signal)
- Multi-FAST5 format supported
- Minimum read length: 200bp
- Recommended coverage: 30X (genome), 10M reads (transcriptome)
Sample Sheet Format
group,replicate,barcode,input_file
control,1,barcode01,sample1.fastq.gz
control,2,barcode02,sample2.fastq.gz
treated,1,barcode03,sample3.fastq.gz
treated,2,barcode04,sample4.fastq.gz
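A sample sheet in this layout can be checked before a run is submitted. The following is a minimal sketch rather than the pipeline's own validation: the function name `validate_samplesheet` and the specific rules (exact header match, positive integer replicate, unique group/replicate pairs) are assumptions made for illustration.

```python
import csv
import io

# Column names taken from the sample sheet format shown above.
REQUIRED_COLUMNS = ["group", "replicate", "barcode", "input_file"]

EXAMPLE_SHEET = """group,replicate,barcode,input_file
control,1,barcode01,sample1.fastq.gz
control,2,barcode02,sample2.fastq.gz
treated,1,barcode03,sample3.fastq.gz
treated,2,barcode04,sample4.fastq.gz
"""


def validate_samplesheet(text):
    """Parse sample sheet text and return its rows as dicts.

    Raises ValueError on the first problem found. The rules here are
    illustrative assumptions, not the pipeline's actual checks.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != REQUIRED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")

    rows = []
    seen = set()
    for line_no, row in enumerate(reader, start=2):
        # DictReader stores surplus fields under the key None and fills
        # missing fields with the value None.
        if None in row or any(value is None for value in row.values()):
            raise ValueError(f"line {line_no}: wrong number of columns")
        if not row["replicate"].isdigit() or int(row["replicate"]) < 1:
            raise ValueError(f"line {line_no}: replicate must be a positive integer")
        key = (row["group"], int(row["replicate"]))
        if key in seen:
            raise ValueError(f"line {line_no}: duplicate group/replicate {key}")
        seen.add(key)
        rows.append(row)

    if not rows:
        raise ValueError("sample sheet contains no samples")
    return rows


if __name__ == "__main__":
    for sample in validate_samplesheet(EXAMPLE_SHEET):
        print(sample["group"], sample["replicate"], sample["input_file"])
```

Running a check like this locally catches header typos and duplicated replicates before any compute time is spent.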
Reference Files
- Reference genome (FASTA)
- Annotation file (GTF/GFF3)
- Transcriptome FASTA (optional)
Key Parameters
Input Options
- --input: Sample sheet path
- --protocol: Sequencing protocol
  - DNA: Genomic DNA sequencing
  - cDNA: PCR-amplified cDNA
  - directRNA: Direct RNA sequencing
Basecalling (Raw Data)
- --flowcell: Flow cell version
- --kit: Sequencing kit
- --guppy_config: Guppy configuration
- --guppy_model: Custom basecalling model
- --guppy_gpu: Enable GPU basecalling
Quality Control
- --skip_basecalling: Skip if pre-basecalled
- --skip_qc: Skip QC steps
- --min_read_length: Minimum read length
- --max_read_length: Maximum read length
- --min_read_qual: Minimum mean quality
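To make the effect of --min_read_length, --max_read_length, and --min_read_qual concrete, here is a small sketch of a length and mean-quality filter. It is not the pipeline's implementation. The helper names `mean_quality` and `passes_filters` are hypothetical, and the sketch assumes Phred+33 quality strings and a mean quality computed by averaging per-base error probabilities (the convention commonly used by long-read QC tools) rather than averaging the Phred scores directly.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean read quality from a FASTQ quality string (Phred+33 assumed).

    Each Phred score is converted to an error probability, the
    probabilities are averaged, and the mean is converted back to Phred.
    """
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(char) - offset) / 10) for char in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def passes_filters(sequence, quality_string,
                   min_len=200, max_len=None, min_qual=7.0):
    """Return True if a read satisfies the length and quality thresholds.

    The defaults mirror the 200 bp minimum and Q7 floor mentioned in this
    document; they are illustrative, not the pipeline's defaults.
    """
    length = len(sequence)
    if length < min_len:
        return False
    if max_len is not None and length > max_len:
        return False
    return mean_quality(quality_string) >= min_qual


if __name__ == "__main__":
    # '5' encodes Q20 in Phred+33, so this 300-base read passes.
    print(passes_filters("A" * 300, "5" * 300))
    # '#' encodes Q2, so the same read at low quality is rejected.
    print(passes_filters("A" * 300, "#" * 300))
```

Averaging error probabilities is stricter than averaging Phred scores: a read with a few very poor bases gets a noticeably lower mean quality, which is usually the desired behavior for noisy long reads.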
Alignment
- --aligner: Alignment tool (minimap2)
- --minimap2_opts: Additional minimap2 options
- --save_align_intermeds: Save intermediate files
Analysis Options
- --quantification_method:
  - bambu: Transcript discovery/quantification
  - stringtie2: Alternative quantification
- --skip_quantification: Skip transcript analysis
- --skip_differential_analysis: Skip DE analysis
Pipeline Outputs
Quality Control
Sequencing Metrics
- nanoplot/: Read length, quality distributions
- pycoqc/: Comprehensive run statistics
- fastqc/: Per-base quality scores
Processing Reports
- Adapter trimming statistics
- Filtering summary
- Demultiplexing report
Alignment Results
BAM Files
- Sorted, indexed alignments
- Alignment statistics
- Coverage tracks (BigWig)
Alignment Metrics
- Mapping rates
- Error profiles
- Coverage statistics
Downstream Analysis
Transcriptomics (RNA/cDNA)
- Gene/transcript counts
- Novel transcript annotations
- Differential expression results
- Isoform switching analysis
Genomics (DNA)
- Variant calls (VCF)
- Structural variants
- Methylation calls (if applicable)
Visualizations
- IGV-ready tracks
- Expression heatmaps
- PCA plots
Protocol-Specific Workflows
Direct RNA Sequencing
--protocol directRNA
--skip_alignment false
--quantification_method bambu
--skip_differential_analysis false
Genomic DNA Analysis
--protocol DNA
--call_variants true
--structural_variants true
--skip_quantification true
cDNA Isoform Analysis
--protocol cDNA
--quantification_method stringtie2
--skip_alignment false
--save_reference_annotation true
Targeted Sequencing
--protocol DNA
--targeted_alignment true
--bed_file targets.bed
--skip_quantification true
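The four flag sets above can be kept as named presets and expanded into an argument list when a run is configured. The sketch below only assembles the flags exactly as they are listed in this section. The preset names and the `build_args` helper are hypothetical, and how Flow passes these values to the pipeline is not specified here.

```python
import shlex

# Flag sets copied from the protocol-specific workflows above.
# The preset keys are hypothetical labels chosen for this example.
PRESETS = {
    "direct_rna": {
        "protocol": "directRNA",
        "skip_alignment": "false",
        "quantification_method": "bambu",
        "skip_differential_analysis": "false",
    },
    "genomic_dna": {
        "protocol": "DNA",
        "call_variants": "true",
        "structural_variants": "true",
        "skip_quantification": "true",
    },
    "cdna_isoform": {
        "protocol": "cDNA",
        "quantification_method": "stringtie2",
        "skip_alignment": "false",
        "save_reference_annotation": "true",
    },
    "targeted": {
        "protocol": "DNA",
        "targeted_alignment": "true",
        "bed_file": "targets.bed",
        "skip_quantification": "true",
    },
}


def build_args(preset, overrides=None):
    """Expand a preset into a flat ['--name', 'value', ...] list.

    Values in `overrides` replace or extend the preset's own values.
    """
    params = dict(PRESETS[preset])
    params.update(overrides or {})
    args = []
    for name, value in params.items():
        args += [f"--{name}", str(value)]
    return args


if __name__ == "__main__":
    # Print the targeted preset with a different BED file, shell-quoted.
    print(shlex.join(build_args("targeted", {"bed_file": "panel.bed"})))
```

Keeping presets in one mapping makes it easy to see which options differ between protocols and to override a single value, such as the BED file, without retyping the rest.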
Best Practices
Sample Preparation
- Use high molecular weight DNA/RNA
- Avoid PCR amplification when possible
- Include spike-in controls for quantification
- Sequence sufficient depth for your application
Basecalling
- Use latest Guppy version
- Select appropriate config for chemistry
- Enable GPU acceleration for speed
- Consider live basecalling for large runs
Quality Control
- Set length filters based on expected sizes
- Remove low-quality reads (Q7 minimum)
- Check for adapter contamination
- Verify barcode assignments
Analysis
- Use appropriate presets for minimap2
- Enable splice-aware alignment for RNA
- Consider multi-mapped reads for repeats
- Validate novel transcripts
Troubleshooting
Common Issues
Low Yield
- Check RNA/DNA quality (degradation)
- Verify library preparation
- Review pore occupancy
- Examine adapter ligation efficiency
Poor Alignment
- Confirm correct reference genome
- Check for contamination
- Adjust minimap2 parameters
- Consider reference quality
Basecalling Problems
- Update Guppy version
- Verify flow cell/kit selection
- Check GPU memory (if applicable)
- Monitor system resources
Advanced Features
Modified Base Detection
--guppy_model template_r9.4.1_450bps_modbases_5mc_cg_hac.cfg
--skip_demethylation false
--methylation_threshold 0.8
Fusion Detection
--protocol cDNA
--fusion_detection true
--fusion_tool arriba
Allele-Specific Expression
--protocol directRNA
--phased_vcf sample.vcf.gz
--quantification_method bambu
Real-Time Analysis
--watch_path /path/to/sequencing/run
--real_time true
--min_batch_size 4000
Output Interpretation
Key Metrics
- Read N50: Read length such that reads of this length or longer contain at least half of all sequenced bases
- Mean Quality: Average Phred score (above Q10 is generally good)
- Mapping Rate: >80% expected for good data
- Transcript Detection: Number of expressed genes
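Read N50 is a length-weighted statistic, so it generally differs from the median read length. As a small worked sketch (the function name `read_n50` is chosen here for illustration and is not part of the pipeline), it can be computed by sorting reads from longest to shortest and accumulating bases until half of the total is reached:

```python
import statistics


def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    N50 is the length L such that reads of length >= L together contain
    at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("no read lengths supplied")
    total_bases = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_bases:
            return length


if __name__ == "__main__":
    lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    # Total is 32 bases; the two longest reads already cover 16 of them.
    print("N50:", read_n50(lengths))
    print("median:", statistics.median(lengths))
```

For these eight reads the N50 is 8 while the median is 3, which shows how a few long reads dominate the N50 in a typical long-read length distribution.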
RNA-Seq Specific
- Full-length Reads: Percentage with 5' and 3' ends
- Isoform Diversity: Novel vs known transcripts
- Poly(A) Tail Length: Direct RNA only
- RNA Modifications: If detection enabled
DNA-Seq Specific
- Coverage Uniformity: Evenness across genome
- Variant Quality: QUAL scores in VCF
- SV Validation: Split read support
- Methylation Patterns: CpG island coverage
Additional Resources
- Full documentation: nf-core/nanoseq documentation
- Pipeline source code: GitHub - nf-core/nanoseq
- Oxford Nanopore Technologies: nanoporetech.com
- Support: Join the #nanoseq channel on nf-core Slack
- Citation: doi.org/10.5281/zenodo.7138644