Installation

Flow can be deployed in multiple ways depending on your needs:

  • Cloud Hosted - Use the managed Flow platform at app.flow.bio
  • Self-Hosted - Deploy Flow in your own infrastructure
  • Local Development - Run Flow on your laptop for testing or development

Cloud Hosted

The easiest way to use Flow is through our managed cloud platform:

  1. Visit app.flow.bio
  2. Create an account
  3. Start uploading data and running pipelines

No installation required! The cloud platform includes:

  • Automatic updates and maintenance
  • Scalable compute resources
  • Data backup and security
  • Technical support

Self-Hosted Deployment

Organizations can deploy Flow in their own infrastructure for complete control over data and compute resources.

Prerequisites

  • Docker and Docker Compose (v2.0+)
  • 16GB RAM minimum (32GB recommended)
  • 100GB storage for application data
  • Additional storage for biological data
  • Linux server (Ubuntu 20.04+ recommended)
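
To confirm the host meets these requirements before installing, you can run a few standard checks. This is a minimal sketch using common Linux tools; adjust thresholds and paths to your environment.

# Verify Docker and Docker Compose are installed
docker --version
docker-compose --version

# Check available memory (16GB minimum, 32GB recommended)
free -h

# Check free disk space on the volume that will hold application data
df -h /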

Quick Start with Docker Compose

  1. Clone the deployment repository

    git clone https://github.com/goodwright/flow-deploy
    cd flow-deploy
    
  2. Configure environment variables

    cp .env.example .env
    # Edit .env with your settings
    
  3. Set up required volumes

    mkdir -p volumes/{db,uploads,executions,pipelines,media,configs}
    chmod -R 777 volumes/  # permissive; in production, chown to the container user (e.g. 1000:1000) and restrict instead
    
  4. Start Flow

    docker-compose up -d
    
  5. Access Flow

    • Frontend: http://localhost:3000
    • API: http://localhost:8000
    • Admin: http://localhost:8000/admin
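
Once the containers are running, a quick sanity check (assuming the default ports above) is to confirm the services are up and responding:

# List container status
docker-compose ps

# The frontend and API should answer (expect a 200 or a redirect)
curl -I http://localhost:3000
curl -I http://localhost:8000/admin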

Production Configuration

SSL/TLS Setup

For production, configure HTTPS using a reverse proxy:

# /etc/nginx/sites-available/flow
server {
    listen 443 ssl http2;
    server_name flow.yourdomain.com;
    
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
    
    location /api/ {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
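
After saving the configuration, enable the site and reload nginx. The paths below assume the Debian/Ubuntu sites-available layout used in the example above:

sudo ln -s /etc/nginx/sites-available/flow /etc/nginx/sites-enabled/flow
sudo nginx -t                 # validate the configuration
sudo systemctl reload nginx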

Database Configuration

For production, use an external PostgreSQL database:

# docker-compose.override.yml
services:
  api:
    environment:
      - DATABASE_URL=postgresql://user:pass@db.yourdomain.com/flow
      - REDIS_URL=redis://redis.yourdomain.com:6379
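
Before restarting Flow, confirm that the external services are reachable from the Docker host. A minimal check, assuming the PostgreSQL and Redis client tools are installed locally:

# PostgreSQL should report "accepting connections"
pg_isready -h db.yourdomain.com -d flow -U user

# Redis should reply PONG
redis-cli -h redis.yourdomain.com -p 6379 ping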

Storage Configuration

Network Attached Storage (NAS)

# docker-compose.override.yml
volumes:
  uploads:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nas.yourdomain.com,rw
      device: ":/flow/uploads"

Cloud Storage (S3-compatible)

# .env
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_STORAGE_BUCKET_NAME=flow-data
AWS_S3_REGION_NAME=us-east-1
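
To verify the credentials and bucket before restarting Flow, you can list the bucket with the AWS CLI (assuming it is installed; add --endpoint-url for non-AWS S3-compatible providers):

aws s3 ls s3://flow-data --region us-east-1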

Kubernetes Deployment

For enterprise deployments, use our Helm charts:

# Add Flow Helm repository
helm repo add flow https://charts.flow.bio
helm repo update

# Install Flow
helm install flow flow/flow-app \
  --namespace flow \
  --create-namespace \
  --values values.yaml
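
After installation, check the release and wait for the pods to become ready:

# Show release status
helm status flow -n flow

# Watch the pods come up
kubectl get pods -n flow --watch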

Example values.yaml:

# values.yaml
api:
  replicas: 3
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"

celery:
  workers:
    replicas: 5
    resources:
      requests:
        memory: "4Gi"
        cpu: "2"
      limits:
        memory: "8Gi"
        cpu: "4"

postgresql:
  enabled: false
  external:
    host: "postgres.yourdomain.com"
    database: "flow"
    existingSecret: "flow-db-secret"

storage:
  className: "fast-ssd"
  size: "1Ti"

HPC Integration

Connect Flow to your HPC cluster:

Slurm Configuration

# flow-custom/slurm.py
SLURM_CONFIG = {
    "partition": "compute",
    "account": "bioinformatics",
    "time": "24:00:00",
    "mem": "32G",
    "cpus": 8,
    "modules": [
        "nextflow/23.04.3",
        "singularity/3.8.0"
    ]
}

# SSH connection to login node
HPC_SSH_CONFIG = {
    "hostname": "hpc.yourdomain.com",
    "username": "flow-service",
    "key_filename": "/secrets/hpc-key"
}
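
To check that the service account can reach the cluster, test the SSH connection and Slurm access using the host and key from the configuration above (run this from a machine that has access to the key; the partition name is the one set in SLURM_CONFIG):

ssh -i /secrets/hpc-key flow-service@hpc.yourdomain.com "sinfo -p compute"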

Shared Filesystem

Mount your HPC filesystem:

# docker-compose.override.yml
volumes:
  hpc-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=hpc-nfs.yourdomain.com,rw
      device: ":/shared/flow"

services:
  celery:
    volumes:
      - hpc-data:/mnt/hpc
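
After recreating the Celery container, confirm the share is mounted and writable from inside it:

docker-compose exec celery df -h /mnt/hpc
docker-compose exec celery touch /mnt/hpc/.flow-write-test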

Local Development

Run Flow locally for development or testing.

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • Docker Desktop
  • 8GB RAM minimum

Backend Setup

  1. Clone the API repository

    git clone https://github.com/goodwright/flow-api
    cd flow-api/api
    
  2. Install dependencies

    pip install uv
    uv pip install -r requirements.txt
    
  3. Set up database

    uv run python manage.py migrate
    uv run python manage.py createsuperuser
    
  4. Start the API server

    FRONTEND_URL=http://localhost:3019 \
    SERVE_FILES=yes \
    UPLOADS_ROOT=./uploads \
    CONFIGS_ROOT=./configs \
    EXECUTIONS_ROOT=./executions \
    PIPELINES_ROOT=./pipelines \
    MEDIA_ROOT=./media \
    BULK_DOWNLOADS_ROOT=./downloads \
    uv run python manage.py runserver 8019
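
With the server running, a quick check is to request the admin interface; a 200 response or a redirect to the login page indicates the API is serving:

curl -I http://localhost:8019/admin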
    

Frontend Setup

  1. Clone the frontend repository

    git clone https://github.com/goodwright/flow-front
    cd flow-front
    
  2. Install dependencies

    npm install
    
  3. Start the development server

    PORT=3019 \
    REACT_APP_BACKEND=http://localhost:8019 \
    REACT_APP_MEDIA=http://localhost:8019/media \
    REACT_APP_DATA=http://localhost:8019/data \
    npm start
    

Message Queue Setup

  1. Start RabbitMQ

    docker run -d \
      --name flow-rabbit \
      -p 5673:5673 \
      -e RABBITMQ_NODE_PORT=5673 \
      rabbitmq:3
    
  2. Start Celery worker

    cd flow-api/api
    EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
    UPLOADS_ROOT=./uploads \
    CONFIGS_ROOT=./configs \
    EXECUTIONS_ROOT=./executions \
    PIPELINES_ROOT=./pipelines \
    uv run celery -A analysis worker -l INFO
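
To confirm the worker has connected to the broker, you can ping both from the same directory (the broker URL matches the worker command above):

# The broker itself should respond
docker exec flow-rabbit rabbitmq-diagnostics ping

# The worker should reply with "pong"
EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
uv run celery -A analysis inspect ping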
    

Initial Configuration

  1. Access the admin interface

    • Navigate to http://localhost:8019/admin
    • Log in with your superuser credentials
  2. Set up pipeline categories

    • Create at least one category and subcategory
  3. Add a test pipeline

    • Add pipeline repo: https://github.com/goodwright/nfcore-wrappers
    • Create pipeline: "Faidx"
    • Add version: "dev" (branch: master)
    • Path: wrappers/faidx.nf
    • Schema: schema/faidx.json
  4. Configure Nextflow

    // Create pipeline config in admin
    docker.enabled = true
    process {
        container = 'biocontainers/samtools:1.16.1'
        cpus = 2
        memory = '4.GB'
    }
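
To make sure the execution environment can run this configuration, you can pull the container image named above and confirm Nextflow is available on the host (a sanity check outside Flow itself):

docker pull biocontainers/samtools:1.16.1
nextflow -version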
    

System Requirements

Minimum Requirements

  • CPU: 4 cores
  • RAM: 16GB
  • Storage: 100GB SSD
  • Network: 100Mbps

Recommended for Production

  • CPU: 16+ cores
  • RAM: 64GB+
  • Storage: 1TB+ NVMe SSD
  • Network: 1Gbps+
  • Database: PostgreSQL 14+ (dedicated)
  • Cache: Redis 6+
  • Queue: RabbitMQ 3.9+

Scaling Considerations

  • API servers: 1 per 50 concurrent users
  • Celery workers: 1 per 10 concurrent pipeline executions
  • Database: Use read replicas for large deployments
  • Storage: Plan for 10-100GB per analysis project

Troubleshooting

Common Issues

Services won't start

# Check logs
docker-compose logs -f api

# Verify port availability
sudo lsof -i :8000
sudo lsof -i :3000

Database connection errors

# Test database connection
docker-compose exec api python manage.py dbshell

# Reset database
docker-compose down -v
docker-compose up -d

File permission errors

# Fix volume permissions
sudo chown -R 1000:1000 volumes/

Pipeline execution failures

# Check Celery worker logs
docker-compose logs -f celery

# Verify Docker is accessible
docker run hello-world

Getting Help
