Ampliseq
Overview
Flow provides the nf-core/ampliseq v2.7.1 pipeline for analyzing 16S rRNA, 18S rRNA, and ITS amplicon sequencing data. It supports demultiplexing, quality control, taxonomic classification, and diversity analysis for microbiome studies.
The pipeline uses DADA2 or QIIME2 for amplicon sequence variant (ASV) calling and provides comprehensive diversity metrics and visualizations for ecological analysis.
Pipeline Summary
The workflow includes:
Quality Control
- Read quality assessment (FastQC)
- Primer trimming (Cutadapt)
- Quality filtering
Denoising & ASV Calling
- DADA2 or QIIME2 denoising
- Chimera removal
- ASV table generation
Taxonomic Classification
- Multiple classifier options:
  - DADA2 native classifier
  - QIIME2 feature-classifier
  - SINTAX
- Reference database assignment
Diversity Analysis
- Alpha diversity metrics
- Beta diversity analysis
- Ordination (PCoA, NMDS)
- Differential abundance testing
Visualization
- Interactive plots
- Taxonomic bar plots
- Diversity boxplots
- Rarefaction curves
Input Requirements
Sequencing Data
- Paired-end or single-end FASTQ files
- Demultiplexed or multiplexed samples
- Illumina, IonTorrent, or PacBio platforms
Metadata File
Tab-separated file with sample information:
sampleID group treatment timepoint
sample1 control none T0
sample2 control none T0
sample3 treatment antibiotics T1
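Because the metadata file is tab-separated, a row that was pasted with spaces or is missing a field is easy to overlook. The following is a minimal sketch of a pre-flight check, not part of the pipeline: the function name `check_metadata_tsv` is hypothetical, and it only verifies that every row has the same number of tab-separated fields as the header.

```shell
#!/usr/bin/env bash
# check_metadata_tsv is a hypothetical helper, not part of nf-core/ampliseq.
# It reports every row whose tab-separated field count differs from the header.
check_metadata_tsv() {
  local file="$1"
  awk -F'\t' '
    NR == 1 { expected = NF; next }   # the header defines the column count
    NF != expected {
      printf "line %d: expected %d fields, found %d\n", NR, expected, NF
      bad = 1
    }
    END { exit bad + 0 }              # non-zero exit status if any row is malformed
  ' "$file"
}
```

It could be called as `check_metadata_tsv metadata.tsv` before launching a run; a non-zero exit status indicates at least one inconsistent row.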
Primer Sequences
- Forward primer sequence (required)
- Reverse primer sequence (for paired-end)
- Allow mismatches for degenerate primers
Key Parameters
Basic Settings
--input
: Path to samplesheet
--metadata
: Sample metadata file
--FW_primer
: Forward primer sequence
--RV_primer
: Reverse primer sequence
Amplicon Type
--amplicon_type
: Target region
  - 16S: Bacterial 16S rRNA
  - 18S: Eukaryotic 18S rRNA
  - ITS: Fungal ITS region
  - custom: User-defined markers
Analysis Parameters
--trunclenf
: Forward read truncation
--trunclenr
: Reverse read truncation
--trunc_qmin
: Quality truncation threshold
--max_ee
: Maximum expected errors
Taxonomic Classification
--dada_ref_taxonomy
: Reference database
  - silva: SILVA database
  - greengenes: Greengenes
  - unite: UNITE (for ITS)
  - pr2: PR2 (for protists)
--classifier
: Classification method
Diversity Analysis
--metadata_category
: Grouping variable
--min_samples
: Minimum samples per group
--diversity_alpha_metrics
: Alpha metrics to calculate
--diversity_beta_metrics
: Beta metrics to calculate
Pipeline Outputs
ASV Data
ASV Table
ASV_table.tsv
: Abundance matrix
ASV_sequences.fasta
: Representative sequences
ASV_tax.tsv
: Taxonomic assignments
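A quick consistency check on these files is to confirm that the abundance matrix and the sequence file describe the same number of ASVs. The sketch below is illustrative only: `check_asv_counts` is a hypothetical helper, and it assumes that ASV_table.tsv has a single header line followed by one row per ASV and that ASV_sequences.fasta has one `>` header per ASV.

```shell
#!/usr/bin/env bash
# check_asv_counts is a hypothetical helper for this sketch.
# Assumption: the table has one header line, then one row per ASV;
# the FASTA file has one ">" header line per ASV.
check_asv_counts() {
  local table="$1" fasta="$2"
  local n_table n_fasta
  # Count non-empty data rows (everything after the header line).
  n_table=$(awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$table")
  # Count FASTA records by their header lines.
  n_fasta=$(awk '/^>/ { n++ } END { print n + 0 }' "$fasta")
  echo "table=$n_table fasta=$n_fasta"
  [ "$n_table" -eq "$n_fasta" ]       # exit status reflects whether the counts agree
}
```

For example, `check_asv_counts ASV_table.tsv ASV_sequences.fasta` prints both counts and returns a non-zero status if they differ.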
Quality Reports
- Read quality profiles
- Denoising statistics
- Chimera removal stats
Diversity Results
Alpha Diversity
- Shannon index
- Simpson index
- Observed ASVs
- Chao1 richness
- Statistical comparisons
Beta Diversity
- Distance matrices
- PCoA coordinates
- PERMANOVA results
- Ordination plots
Visualizations
Taxonomic Plots
- Relative abundance bar charts
- Krona plots
- Heatmaps
Diversity Plots
- Alpha diversity boxplots
- Beta diversity ordinations
- Rarefaction curves
Quality Control
- MultiQC report
- DADA2 QC plots
- Read tracking table
Example Usage
Standard 16S V3-V4 Analysis
nextflow run nf-core/ampliseq \
--input samplesheet.tsv \
--amplicon_type 16S \
--FW_primer CCTACGGGNGGCWGCAG \
--RV_primer GACTACHVGGGTATCTAATCC \
--metadata metadata.tsv \
--trunclenf 250 \
--trunclenr 200 \
--outdir results \
-profile docker
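The same settings can also be kept in a parameters file and passed with Nextflow's `-params-file` option, which keeps a record of the run configuration alongside the results. The sketch below writes the values from the command above into a YAML file; the file name params.yaml is an arbitrary choice for illustration.

```shell
#!/usr/bin/env bash
# Write the parameters from the standard 16S example into a YAML file.
# The file name params.yaml is an arbitrary choice for this sketch.
cat > params.yaml <<'EOF'
input: "samplesheet.tsv"
amplicon_type: "16S"
FW_primer: "CCTACGGGNGGCWGCAG"
RV_primer: "GACTACHVGGGTATCTAATCC"
metadata: "metadata.tsv"
trunclenf: 250
trunclenr: 200
outdir: "results"
EOF

# The run would then be launched as (not executed here):
#   nextflow run nf-core/ampliseq -params-file params.yaml -profile docker
```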
ITS2 Fungal Analysis
nextflow run nf-core/ampliseq \
--input samplesheet.tsv \
--amplicon_type ITS \
--FW_primer GTGARTCATCGAATCTTTG \
--RV_primer TCCTCCGCTTATTGATATGC \
--its_partial true \
--outdir results \
-profile docker
Long-Read 16S (PacBio)
nextflow run nf-core/ampliseq \
--input samplesheet.tsv \
--pacbio true \
--amplicon_type 16S \
--max_len 1600 \
--min_len 1000 \
--outdir results \
-profile docker
Custom Database Analysis
nextflow run nf-core/ampliseq \
--input samplesheet.tsv \
--FW_primer GTGCCAGCMGCCGCGGTAA \
--RV_primer GGACTACHVGGGTWTCTAAT \
--dada_ref_taxonomy silva_taxonomy.txt \
--dada_ref_tax_levels "Kingdom,Phylum,Class,Order,Family,Genus,Species" \
--outdir results \
-profile docker
Tips and Best Practices
Sample Preparation
- Include negative controls
- Add mock communities for validation
- Randomize samples during sequencing
- Maintain consistent amplification conditions
Parameter Selection
- Set truncation based on quality profiles
- Use error-rate settings appropriate for your sequencing platform
- Choose a reference database that suits the sampled environment
- Include all relevant metadata
Quality Control
- Check rarefaction curves for saturation
- Verify mock community composition
- Examine negative control contamination
- Review taxonomic assignments
Troubleshooting
Common Issues
Issue: Low ASV recovery or many reads filtered out
- Solution: Check primer trimming success in cutadapt logs
- Adjust quality filtering parameters (--trunclenf, --trunclenr)
- Review expected amplicon length for your primers
- Verify correct --amplicon_type selection
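To check primer trimming across many samples at once, the summary line from each cutadapt log can be pulled out with `grep`. This is a rough sketch under stated assumptions: `summarize_cutadapt_logs` is a hypothetical helper, the logs are assumed to sit in one directory with a `.log` extension (the actual location in the results folder may differ), and it relies on cutadapt's report containing a "passing filters" line.

```shell
#!/usr/bin/env bash
# summarize_cutadapt_logs is a hypothetical helper for this sketch.
# Assumption: cutadapt logs are plain-text files ending in .log inside one directory.
# It prints the "passing filters" summary line of each log, prefixed by the file name.
summarize_cutadapt_logs() {
  local log_dir="$1"
  grep -H 'passing filters' "$log_dir"/*.log
}
```

Samples where only a small percentage of reads pass are the first candidates for a primer sequence or orientation problem.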
Issue: Poor taxonomic assignment
- Solution: Update classifier database to latest version
- Verify primer specificity for target organisms
- Check if amplicon region has good database coverage
- Try different classifier (--skip_dada2 to use QIIME2)
Issue: Diversity analysis failures
- Solution: Ensure adequate sequencing depth (>10,000 reads/sample)
- Check for batch effects in PCoA plots
- Use appropriate normalization method for your data
- Verify all samples are included in metadata file
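The sequencing-depth check can be approximated from the ASV table by summing each sample's column. The following is a sketch rather than a pipeline feature: `low_depth_samples` is a hypothetical helper, and it assumes the first column of ASV_table.tsv holds ASV identifiers and every other column holds read counts for one sample. Counts after denoising are lower than raw read counts, so the 10,000 threshold is only a rough guide here.

```shell
#!/usr/bin/env bash
# low_depth_samples is a hypothetical helper for this sketch.
# Assumption: column 1 is the ASV identifier, columns 2..N are per-sample counts.
# It prints "sample<TAB>total" for every sample whose total is below the threshold.
low_depth_samples() {
  local table="$1" min_reads="${2:-10000}"
  awk -F'\t' -v min="$min_reads" '
    NR == 1 { ncol = NF; for (i = 2; i <= NF; i++) name[i] = $i; next }
    { for (i = 2; i <= ncol; i++) total[i] += $i }   # accumulate counts per sample
    END {
      for (i = 2; i <= ncol; i++)
        if (total[i] + 0 < min + 0) print name[i] "\t" total[i] + 0
    }
  ' "$table"
}
```

For example, `low_depth_samples ASV_table.tsv 10000` lists the samples that fall under the threshold; an empty result means all samples pass.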
Issue: Memory errors during DADA2
- Solution: Reduce --max_cpus to limit parallelization
- Process samples in smaller batches
- Increase memory allocation with --max_memory
- Consider --sample_inference pseudo for large datasets
Issue: Primer trimming failures
- Solution: Check primer orientation (try reverse complement)
- Allow more mismatches with --cutadapt_mismatches
- Verify primer sequences including degenerate bases
- Use --retain_untrimmed to diagnose issues
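When testing primer orientation, the reverse complement has to account for IUPAC degenerate bases as well as A, C, G, and T. A minimal sketch using standard Unix tools follows; `revcomp` is a hypothetical helper name, and it assumes the primer is written in uppercase.

```shell
#!/usr/bin/env bash
# revcomp is a hypothetical helper: reverse the sequence, then complement
# each base, including IUPAC degenerate codes (R<->Y, K<->M, B<->V, D<->H;
# S, W and N are their own complements). Uppercase input is assumed.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVSWN' 'TGCAYRMKVHDBSWN'
}

# Reverse primer from the standard 16S example:
revcomp GACTACHVGGGTATCTAATCC   # prints GGATTAGATACCCBDGTAGTC
```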
Additional Resources
- Full documentation: nf-core/ampliseq documentation
- Pipeline source code: GitHub - nf-core/ampliseq
- QIIME2 documentation: docs.qiime2.org
- DADA2 tutorial: benjjneb.github.io/dada2
- Support: Join the #ampliseq channel on nf-core Slack
- Citation: Straub et al. (2020) doi.org/10.3389/fmicb.2020.550420