Testing

Flow employs rigorous testing at every level to ensure reliability, reproducibility, and scientific accuracy. This guide covers testing approaches for different components of the Flow ecosystem.


Testing Philosophy

Flow's testing strategy is built on these principles:

  • Test at every level - Unit, integration, end-to-end, and acceptance tests
  • Automate everything - Manual testing doesn't scale
  • Test early and often - Catch issues before they reach production
  • Real-world scenarios - Test with actual biological data and workflows
  • Performance matters - Test for speed and resource usage, not just correctness

Backend Testing (Django API)

Test Structure

Flow's API tests are organized by Django app:

api/
├── analysis/tests/
│   ├── test_models.py
│   ├── test_views.py
│   ├── test_api.py
│   └── test_celery.py
├── samples/tests/
│   ├── test_models.py
│   ├── test_filters.py
│   └── test_serializers.py
└── tests/
    ├── testcase.py      # Base test class
    └── test_*.py        # Integration tests

Running Tests

# Run all tests
uv run python manage.py test

# Run specific app tests
uv run python manage.py test analysis

# Run specific test class
uv run python manage.py test analysis.tests.test_models.ExecutionModelTest

# Run with coverage
uv run coverage run --source='.' manage.py test
uv run coverage report

Writing Tests

Model Tests

from django.core.exceptions import ValidationError

from tests.testcase import TestCase
from analysis.models import Execution

class ExecutionModelTest(TestCase):
    def test_execution_creation(self):
        """Test creating an execution with required fields."""
        execution = Execution.objects.create(
            pipeline_version=self.pipeline_version,
            user=self.user,
            status="pending"
        )
        self.assertEqual(execution.status, "pending")
        self.assertIsNotNone(execution.created)
    
    def test_execution_state_transitions(self):
        """Test valid state transitions for executions."""
        execution = self.create_execution()
        
        # Valid transition
        execution.status = "running"
        execution.save()
        self.assertEqual(execution.status, "running")
        
        # Invalid transition should raise
        execution.status = "pending"
        with self.assertRaises(ValidationError):
            execution.save()

API Tests

class ExecutionAPITest(TestCase):
    def test_get_execution(self):
        """Test retrieving execution details via API."""
        execution = self.create_execution()
        
        response = self.client.get(
            f'/api/executions/{execution.id}/',
            HTTP_AUTHORIZATION=f'Bearer {self.token}'
        )
        
        self.assertEqual(response.status_code, 200)
        data = response.json()
        self.assertEqual(data['id'], str(execution.id))
        self.assertEqual(data['status'], execution.status)
        self.assertEqual(data['pipeline']['name'], execution.pipeline.name)

Celery Task Tests

from unittest.mock import patch, MagicMock
from analysis.celery import run_pipeline

class CeleryTaskTest(TestCase):
    @patch('analysis.celery.nextflow.run')
    def test_run_pipeline_task(self, mock_run):
        """Test pipeline execution via Celery."""
        mock_run.return_value = MagicMock(
            execution_id="NF-123",
            status="completed"
        )
        
        execution = self.create_execution()
        result = run_pipeline.delay(execution.id)
        
        # Wait for task
        result.get(timeout=10)
        
        execution.refresh_from_db()
        self.assertEqual(execution.status, "completed")
        mock_run.assert_called_once()
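
The result.get(timeout=10) call above only returns without a running worker when tasks execute eagerly during tests. Below is a minimal sketch of the relevant settings, assuming Flow's Celery app loads its configuration from Django settings with the CELERY_ namespace (the settings module path is illustrative):

# settings/test.py (illustrative path for a test-only settings module)

# Run Celery tasks synchronously in the test process, so that
# run_pipeline.delay() executes immediately and result.get() never
# waits on a broker or worker.
CELERY_TASK_ALWAYS_EAGER = True

# Re-raise exceptions from eagerly executed tasks so failures surface in tests.
CELERY_TASK_EAGER_PROPAGATES = True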

Test Utilities

Flow provides test utilities for common scenarios:

# tests/testcase.py
from django.contrib.auth import get_user_model
from django.test import TestCase as DjangoTestCase
from mixer.backend.django import mixer
from rest_framework.test import APIClient

from samples.models import Sample

User = get_user_model()


class TestCase(DjangoTestCase):
    """Base test class with helper methods."""

    def setUp(self):
        self.user = self.create_user()
        self.client = self.create_authenticated_client()

    def create_user(self, **kwargs):
        return mixer.blend(User, **kwargs)

    def create_sample(self, **kwargs):
        defaults = {
            "name": "Test Sample",
            "user": self.user,
            "organism": self.create_organism()
        }
        defaults.update(kwargs)
        return mixer.blend(Sample, **defaults)

    def create_authenticated_client(self):
        # force_authenticate is provided by DRF's APIClient, not django.test.Client
        client = APIClient()
        client.force_authenticate(user=self.user)
        return client
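
A test built on this base class only needs to override the fields it cares about; a minimal illustration using the helpers shown above:

from tests.testcase import TestCase


class SampleHelperTest(TestCase):
    def test_create_sample_overrides_defaults(self):
        """Keyword arguments passed to create_sample() replace the defaults."""
        sample = self.create_sample(name="Liver RNA")
        self.assertEqual(sample.name, "Liver RNA")
        self.assertEqual(sample.user, self.user)  # default owner from setUp()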

Frontend Testing (React)

Test Structure

src/
├── components/
│   ├── Button.jsx
│   └── Button.test.jsx
├── pages/
│   ├── ProjectPage.jsx
│   └── ProjectPage.test.jsx
└── __tests__/
    ├── integration/
    └── e2e/

Running Tests

# Run all tests
npm test

# Run in watch mode
npm test -- --watch

# Run with coverage
npm test -- --coverage

# Run specific test file
npm test Button.test.jsx

Writing Component Tests

import { render, screen, fireEvent } from '@testing-library/react';
import { MockedProvider } from '@apollo/client/testing';
import ProjectCard from './ProjectCard';

describe('ProjectCard', () => {
  const mockProject = {
    id: '123',
    name: 'Test Project',
    sampleCount: 5,
    created: '2024-01-01',
    owner: { username: 'testuser' }
  };
  
  test('renders project information', () => {
    render(
      <MockedProvider>
        <ProjectCard project={mockProject} />
      </MockedProvider>
    );
    
    expect(screen.getByText('Test Project')).toBeInTheDocument();
    expect(screen.getByText('5 samples')).toBeInTheDocument();
  });
  
  test('handles click events', () => {
    const handleClick = jest.fn();
    
    render(
      <MockedProvider>
        <ProjectCard 
          project={mockProject}
          onClick={handleClick}
        />
      </MockedProvider>
    );
    
    fireEvent.click(screen.getByRole('article'));
    expect(handleClick).toHaveBeenCalledWith(mockProject);
  });
});

Testing API Calls

import { render, screen, waitFor } from '@testing-library/react';
import ProjectPage from './ProjectPage';
import { getProject } from '../api';

// Mock the API module
jest.mock('../api');

test('loads and displays project', async () => {
  // Mock API response
  getProject.mockResolvedValue({
    id: '123',
    name: 'Test Project',
    samples: []
  });
  
  render(<ProjectPage projectId="123" />);
  
  // Loading state
  expect(screen.getByText('Loading...')).toBeInTheDocument();
  
  // Wait for API call
  await waitFor(() => {
    expect(screen.getByText('Test Project')).toBeInTheDocument();
  });
  
  expect(getProject).toHaveBeenCalledWith('123');
});

E2E Tests with Playwright

// tests/e2e/create-project.spec.js
const { test, expect } = require('@playwright/test');

test.describe('Project Creation', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/login');
    await page.fill('[name="username"]', 'testuser');
    await page.fill('[name="password"]', 'testpass');
    await page.click('button[type="submit"]');
  });
  
  test('creates new project', async ({ page }) => {
    await page.goto('/projects/new');
    
    // Fill form
    await page.fill('[name="name"]', 'E2E Test Project');
    await page.fill('[name="description"]', 'Created by E2E test');
    
    // Submit
    await page.click('button:has-text("Create Project")');
    
    // Verify redirect and creation
    await expect(page).toHaveURL(/\/projects\/\d+/);
    await expect(page.locator('h1')).toHaveText('E2E Test Project');
  });
});

Pipeline Testing

nf-test Framework

Flow pipelines use nf-test for testing:

# Install nf-test
conda install -c bioconda nf-test

# Run all tests
nf-test test

# Run specific test
nf-test test tests/main.nf.test

# Generate test snapshot
nf-test test --update-snapshot

Writing Pipeline Tests

// tests/main.nf.test
nextflow_pipeline {
    name "Test RNA-seq pipeline"
    script "../main.nf"
    
    test("Default parameters") {
        when {
            params {
                input = "$baseDir/test_data/samplesheet.csv"
                genome = "GRCh38"
                outdir = "test_results"
            }
        }
        
        then {
            assert workflow.success
            assert path("test_results/multiqc/multiqc_report.html").exists()
            assert path("test_results/star").list().size() == 3
        }
    }
    
    test("Single-end mode") {
        when {
            params {
                input = "$baseDir/test_data/samplesheet_se.csv"
                single_end = true
                outdir = "test_results"
            }
        }
        
        then {
            assert workflow.success
            assert !path("test_results/star/*_R2.fastq.gz").exists()
        }
    }
}

Module Testing

// tests/modules/star_align.nf.test
nextflow_process {
    name "Test STAR alignment"
    script "../../modules/nf-core/star/align/main.nf"
    process "STAR_ALIGN"
    
    test("Paired-end alignment") {
        when {
            process {
                input[0] = Channel.of([
                    [id: "test"],
                    [file("test_R1.fastq.gz"), file("test_R2.fastq.gz")]
                ])
                input[1] = file("star_index")
                input[2] = file("genome.gtf")
            }
        }
        
        then {
            assert process.success
            assert process.out.bam.size() == 1
            assert path(process.out.bam[0][1]).exists()
        }
    }
}

Integration Testing

API Integration Tests

class FlowIntegrationTest(TestCase):
    """Test complete workflows across multiple components."""
    
    def test_complete_analysis_workflow(self):
        """Test from upload to results."""
        # 1. Upload data
        upload_response = self.client.post(
            '/api/upload/',
            {'file': self.test_file, 'project': self.project.id},
            format='multipart'
        )
        data_id = upload_response.data['id']
        
        # 2. Create sample
        sample = self.create_sample(data=[data_id])
        
        # 3. Run pipeline
        execution = self.run_pipeline(
            pipeline="RNA-seq",
            samples=[sample.id]
        )
        
        # 4. Wait for completion
        self.wait_for_execution(execution)
        
        # 5. Verify results
        results = self.get_execution_results(execution)
        self.assertIn('multiqc_report.html', results)
        self.assertIn('gene_counts.txt', results)

Acceptance Tests

# acceptance-tests/features/complete_workflow.feature
Feature: Complete Analysis Workflow
    As a researcher
    I want to analyze my RNA-seq data
    So that I can identify differentially expressed genes
    
    Scenario: Run RNA-seq analysis
        Given I am logged in as "researcher@example.com"
        And I have uploaded "sample_R1.fastq.gz" and "sample_R2.fastq.gz"
        When I create a sample named "Treatment Sample"
        And I run the "RNA-seq" pipeline with default parameters
        Then the execution should complete successfully
        And I should see a MultiQC report
        And I should see gene expression counts
        And I should be able to download all results
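
Each scenario step needs a step definition to drive the application. The runner Flow uses for these features is not shown here; as an illustration only, with a Python BDD tool such as behave and a Playwright page created in behave's environment.py hooks, the login step might be wired up like this (path, runner, and fixtures are assumptions):

# acceptance-tests/features/steps/auth_steps.py (hypothetical path and runner)
from behave import given


@given('I am logged in as "{email}"')
def step_logged_in(context, email):
    # context.page is assumed to be a Playwright page set up in environment.py;
    # the selectors mirror the login form used in the E2E tests above.
    context.page.goto("/login")
    context.page.fill('[name="username"]', email)
    context.page.fill('[name="password"]', "testpass")  # matches the E2E fixture user
    context.page.click('button[type="submit"]')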

Performance Testing

Load Testing with Locust

# locustfile.py
from locust import HttpUser, task, between

class FlowUser(HttpUser):
    wait_time = between(1, 3)
    
    def on_start(self):
        # Login
        response = self.client.post(
            "/api/auth/login/",
            json={"username": "testuser", "password": "testpass"}
        )
        self.token = response.json()["token"]
        self.client.headers.update({
            "Authorization": f"Bearer {self.token}"
        })
    
    @task(3)
    def view_projects(self):
        self.client.get("/api/projects/")
    
    @task(2)
    def view_samples(self):
        self.client.get("/api/samples/")
    
    @task(1)
    def run_pipeline(self):
        self.client.post(
            "/api/executions/",
            json={
                "pipeline": "RNA-seq",
                "samples": [1, 2, 3]
            }
        )

Pipeline Performance Tests

# Benchmark pipeline performance
time nextflow run main.nf \
    -profile test,docker \
    -with-report performance.html \
    -with-timeline timeline.html \
    -with-dag dag.png

# Analyze resource usage
nextflow log last -f 'name,status,realtime,pcpu,vmem'

CI/CD Pipeline

GitHub Actions Workflow

# .github/workflows/test.yml
name: Test Suite
on: [push, pull_request]

jobs:
  backend:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
        # Expose the service port so DATABASE_URL can reach localhost:5432
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      
      - name: Install dependencies
        run: |
          pip install uv
          uv pip install -r requirements.txt
      
      - name: Run tests
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost/test
        run: |
          uv run python manage.py test
          uv run mypy .
  
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: '18'
      
      - name: Install and test
        run: |
          npm ci
          npm test -- --coverage
          npm run build
  
  pipelines:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: nf-core/setup-nextflow@v1
      
      - name: Run pipeline tests
        run: |
          nf-test test

Best Practices

  1. Test Data Management

    • Use minimal test datasets that execute quickly
    • Store test data in version control when small
    • Use data fixtures for consistent testing
  2. Test Isolation

    • Each test should be independent
    • Clean up after tests (use transactions)
    • Mock external services (see the sketch after this list)
  3. Coverage Goals

    • Aim for >80% code coverage
    • Focus on critical paths
    • Test error conditions
  4. Performance

    • Keep test suite fast (<10 minutes)
    • Parallelize where possible
    • Use test databases in memory
  5. Documentation

    • Document what each test verifies
    • Include examples in docstrings
    • Maintain test data documentation
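
To make the isolation guidance concrete, here is a small sketch that reuses the base class helpers and the Nextflow mock target shown earlier in this guide; it is illustrative rather than an excerpt from Flow's test suite:

from unittest.mock import patch

from tests.testcase import TestCase


class IsolationExampleTest(TestCase):
    # Patch the same runner mocked in the Celery tests above so no real
    # pipeline is ever launched from the test suite.
    @patch('analysis.celery.nextflow.run')
    def test_sample_creation_is_isolated(self, mock_run):
        sample = self.create_sample(name="Isolated Sample")
        self.assertEqual(sample.name, "Isolated Sample")

        # Django's TestCase wraps each test method in a transaction that is
        # rolled back afterwards, so this sample never leaks between tests.
        # Any attempt to launch a pipeline would hit the mock, not Nextflow.
        mock_run.assert_not_called()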