Installation

Flow can be deployed in multiple ways depending on your needs:

  • Cloud Hosted - Use the managed Flow platform at app.flow.bio
  • Self-Hosted - Deploy Flow in your own infrastructure
  • Local Development - Run Flow on your laptop for testing or development

Cloud Hosted

The easiest way to use Flow is through our managed cloud platform:

  1. Visit app.flow.bio
  2. Create an account
  3. Start uploading data and running pipelines

No installation required! The cloud platform includes:

  • Automatic updates and maintenance
  • Scalable compute resources
  • Data backup and security
  • Technical support

Self-Hosted Deployment

Organizations can deploy Flow in their own infrastructure for complete control over data and compute resources.

Prerequisites

  • Docker and Docker Compose (v2.0+)
  • 16GB RAM minimum (32GB recommended)
  • 100GB storage for application data
  • Additional storage for biological data
  • Linux server (Ubuntu 20.04+ recommended)
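
To confirm the host meets these requirements before installing, you can run a few standard checks. This is a minimal sketch using common Linux tools; adjust thresholds and paths to your environment.

# Verify Docker and Docker Compose are installed
docker --version
docker-compose --version

# Check available memory (16GB minimum, 32GB recommended)
free -h

# Check free disk space on the volume that will hold application data
df -h /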

Quick Start with Docker Compose

  1. Clone the deployment repository

    git clone https://github.com/goodwright/flow-deploy
    cd flow-deploy
    
  2. Configure environment variables

    cp .env.example .env
    # Edit .env with your settings
    
  3. Set up required volumes

    mkdir -p volumes/{db,uploads,executions,pipelines,media,configs}
    chmod -R 777 volumes/  # permissive; in production, chown to the container user (e.g. 1000:1000) and restrict instead
    
  4. Start Flow

    docker-compose up -d
    
  5. Access Flow

    • Frontend: http://localhost:3000
    • API: http://localhost:8000
    • Admin: http://localhost:8000/admin
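
Once the containers are running, a quick sanity check (assuming the default ports above) is to confirm the services are up and responding:

# List container status
docker-compose ps

# The frontend and API should answer (expect a 200 or a redirect)
curl -I http://localhost:3000
curl -I http://localhost:8000/admin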

Production Configuration

SSL/TLS Setup

For production, configure HTTPS using a reverse proxy:

# /etc/nginx/sites-available/flow
server {
    listen 443 ssl http2;
    server_name flow.yourdomain.com;
    
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    
    location /api/ {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
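
After saving the configuration, enable the site and reload nginx. The paths below assume the Debian/Ubuntu sites-available layout used in the example above:

sudo ln -s /etc/nginx/sites-available/flow /etc/nginx/sites-enabled/flow
sudo nginx -t                 # validate the configuration
sudo systemctl reload nginx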

Database Configuration

For production, use an external PostgreSQL database:

# docker-compose.override.yml
services:
  api:
    environment:
      - DATABASE_URL=postgresql://user:pass@db.yourdomain.com/flow
      - REDIS_URL=redis://redis.yourdomain.com:6379
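
Before restarting Flow, confirm that the external services are reachable from the Docker host. A minimal check, assuming the PostgreSQL and Redis client tools are installed locally:

# PostgreSQL should report "accepting connections"
pg_isready -h db.yourdomain.com -d flow -U user

# Redis should reply PONG
redis-cli -h redis.yourdomain.com -p 6379 ping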

Storage Configuration

Network Attached Storage (NAS)

# docker-compose.override.yml
volumes:
  uploads:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nas.yourdomain.com,rw
      device: ":/flow/uploads"

Cloud Storage (S3-compatible)

# .env
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_STORAGE_BUCKET_NAME=flow-data
AWS_S3_REGION_NAME=us-east-1
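
To verify the credentials and bucket before restarting Flow, you can list the bucket with the AWS CLI (assuming it is installed; add --endpoint-url for non-AWS S3-compatible providers):

aws s3 ls s3://flow-data --region us-east-1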

Kubernetes Deployment

For enterprise deployments, use our Helm charts:

# Add Flow Helm repository
helm repo add flow https://charts.flow.bio
helm repo update

# Install Flow
helm install flow flow/flow-app \
  --namespace flow \
  --create-namespace \
  --values values.yaml
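
After installation, check the release and wait for the pods to become ready:

# Show release status
helm status flow -n flow

# Watch the pods come up
kubectl get pods -n flow --watch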

Example values.yaml:

# values.yaml
api:
  replicas: 3
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"

celery:
  workers:
    replicas: 5
    resources:
      requests:
        memory: "4Gi"
        cpu: "2"
      limits:
        memory: "8Gi"
        cpu: "4"

postgresql:
  enabled: false
  external:
    host: "postgres.yourdomain.com"
    database: "flow"
    existingSecret: "flow-db-secret"

storage:
  className: "fast-ssd"
  size: "1Ti"

HPC Integration

Connect Flow to your HPC cluster:

Slurm Configuration

# flow-custom/slurm.py
SLURM_CONFIG = {
    "partition": "compute",
    "account": "bioinformatics",
    "time": "24:00:00",
    "mem": "32G",
    "cpus": 8,
    "modules": [
        "nextflow/23.04.3",
        "singularity/3.8.0"
    ]
}

# SSH connection to login node
HPC_SSH_CONFIG = {
    "hostname": "hpc.yourdomain.com",
    "username": "flow-service",
    "key_filename": "/secrets/hpc-key"
}
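
To check that the service account can reach the cluster, test the SSH connection and Slurm access using the host and key from the configuration above (run this from a machine that has access to the key; the partition name is the one set in SLURM_CONFIG):

ssh -i /secrets/hpc-key flow-service@hpc.yourdomain.com "sinfo -p compute"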

Shared Filesystem

Mount your HPC filesystem:

# docker-compose.override.yml
volumes:
  hpc-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=hpc-nfs.yourdomain.com,rw
      device: ":/shared/flow"

services:
  celery:
    volumes:
      - hpc-data:/mnt/hpc
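
After recreating the Celery container, confirm the share is mounted and writable from inside it:

docker-compose exec celery df -h /mnt/hpc
docker-compose exec celery touch /mnt/hpc/.flow-write-test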

Local Development

Run Flow locally for development or testing.

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • Docker Desktop
  • 8GB RAM minimum

Backend Setup

  1. Clone the API repository

    git clone https://github.com/goodwright/flow-api
    cd flow-api/api
    
  2. Install dependencies

    pip install uv
    uv pip install -r requirements.txt
    
  3. Set up database

    uv run python manage.py migrate
    uv run python manage.py createsuperuser
    
  4. Start the API server

    FRONTEND_URL=http://localhost:3019 \
    SERVE_FILES=yes \
    UPLOADS_ROOT=./uploads \
    CONFIGS_ROOT=./configs \
    EXECUTIONS_ROOT=./executions \
    PIPELINES_ROOT=./pipelines \
    MEDIA_ROOT=./media \
    BULK_DOWNLOADS_ROOT=./downloads \
    uv run python manage.py runserver 8019
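
With the server running, a quick check is to request the admin interface; a 200 response or a redirect to the login page indicates the API is serving:

curl -I http://localhost:8019/admin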
    

Frontend Setup

  1. Clone the frontend repository

    git clone https://github.com/goodwright/flow-front
    cd flow-front
    
  2. Install dependencies

    npm install
    
  3. Start the development server

    PORT=3019 \
    REACT_APP_BACKEND=http://localhost:8019 \
    REACT_APP_MEDIA=http://localhost:8019/media \
    REACT_APP_DATA=http://localhost:8019/data \
    npm start
    

Message Queue Setup

  1. Start RabbitMQ

    docker run -d \
      --name flow-rabbit \
      -p 5673:5673 \
      -e RABBITMQ_NODE_PORT=5673 \
      rabbitmq:3
    
  2. Start Celery worker

    cd flow-api/api
    EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
    UPLOADS_ROOT=./uploads \
    CONFIGS_ROOT=./configs \
    EXECUTIONS_ROOT=./executions \
    PIPELINES_ROOT=./pipelines \
    uv run celery -A analysis worker -l INFO
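
To confirm the worker has connected to the broker, you can ping both from the same directory (the broker URL matches the worker command above):

# The broker itself should respond
docker exec flow-rabbit rabbitmq-diagnostics ping

# The worker should reply with "pong"
EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
uv run celery -A analysis inspect ping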
    

Initial Configuration

  1. Access the admin interface

    • Navigate to http://localhost:8019/admin
    • Log in with your superuser credentials
  2. Set up pipeline categories

    • Create at least one category and subcategory
  3. Add a test pipeline

    • Add pipeline repo: https://github.com/goodwright/nfcore-wrappers
    • Create pipeline: "Faidx"
    • Add version: "dev" (branch: master)
    • Path: wrappers/faidx.nf
    • Schema: schema/faidx.json
  4. Configure Nextflow

    // Create pipeline config in admin
    docker.enabled = true
    process {
        container = 'biocontainers/samtools:1.16.1'
        cpus = 2
        memory = '4.GB'
    }
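
To make sure the execution environment can run this configuration, you can pull the container image named above and confirm Nextflow is available on the host (a sanity check outside Flow itself):

docker pull biocontainers/samtools:1.16.1
nextflow -version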
    

System Requirements

Minimum Requirements

  • CPU: 4 cores
  • RAM: 16GB
  • Storage: 100GB SSD
  • Network: 100Mbps

Recommended for Production

  • CPU: 16+ cores
  • RAM: 64GB+
  • Storage: 1TB+ NVMe SSD
  • Network: 1Gbps+
  • Database: PostgreSQL 14+ (dedicated)
  • Cache: Redis 6+
  • Queue: RabbitMQ 3.9+

Scaling Considerations

  • API servers: 1 per 50 concurrent users
  • Celery workers: 1 per 10 concurrent pipeline executions
  • Database: Use read replicas for large deployments
  • Storage: Plan for 10-100GB per analysis project

Troubleshooting

Common Issues

Services won't start

# Check logs
docker-compose logs -f api

# Verify port availability
sudo lsof -i :8000
sudo lsof -i :3000

Database connection errors

# Test database connection
docker-compose exec api python manage.py dbshell

# Reset database
docker-compose down -v
docker-compose up -d

File permission errors

# Fix volume permissions
sudo chown -R 1000:1000 volumes/

Pipeline execution failures

# Check Celery worker logs
docker-compose logs -f celery

# Verify Docker is accessible
docker run hello-world

Getting Help
