Administration
Installation
Flow can be deployed in multiple ways depending on your needs:
- Cloud Hosted - Use the managed Flow platform at app.flow.bio
- Self-Hosted - Deploy Flow in your own infrastructure
- Local Development - Run Flow on your laptop for testing or development
Cloud Hosted (Recommended)
The easiest way to use Flow is through our managed cloud platform:
- Visit app.flow.bio
- Create an account
- Start uploading data and running pipelines
No installation required! The cloud platform includes:
- Automatic updates and maintenance
- Scalable compute resources
- Data backup and security
- Technical support
Self-Hosted Deployment
Organizations can deploy Flow in their own infrastructure for complete control over data and compute resources.
Prerequisites
- Docker and Docker Compose (v2.0+)
- 16GB RAM minimum (32GB recommended)
- 100GB storage for application data
- Additional storage for biological data
- Linux server (Ubuntu 20.04+ recommended)
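A quick pre-flight check can save time before installing. The following is a minimal sketch assuming an Ubuntu host with the standard nproc, free, and df utilities; compare the output against the prerequisites above.
# Pre-flight check (sketch; assumes an Ubuntu host with standard utilities)
docker --version || echo "Docker is not installed"
docker compose version || echo "Docker Compose v2 is not installed"
nproc                                                  # CPU cores available
free -g | awk '/^Mem:/ {print "RAM (GB): " $2}'        # expect 16+ (32 recommended)
df -h . | awk 'NR==2 {print "Free disk here: " $4}'    # expect 100GB+ for application data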
Quick Start with Docker Compose
Clone the deployment repository
git clone https://github.com/goodwright/flow-deploy
cd flow-deploy
Configure environment variables
cp .env.example .env # Edit .env with your settings
Set up required volumes
mkdir -p volumes/{db,uploads,executions,pipelines,media,configs}
chmod -R 777 volumes/  # Adjust permissions as needed
Start Flow
docker-compose up -d
Access Flow
- Frontend: http://localhost:3000
- API: http://localhost:8000
- Admin: http://localhost:8000/admin
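Once the containers are up, a quick check that each service responds (a sketch; exact status codes depend on your configuration):
# Verify the stack is responding
docker-compose ps                      # all services should be "Up"
curl -I http://localhost:3000          # frontend
curl -I http://localhost:8000/admin/   # API admin login page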
Production Configuration
SSL/TLS Setup
For production, configure HTTPS using a reverse proxy:
# /etc/nginx/sites-available/flow
server {
    listen 443 ssl http2;
    server_name flow.yourdomain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /api/ {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
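After saving the file, enable the site and reload nginx in the usual way:
# Enable the site and reload nginx
sudo ln -s /etc/nginx/sites-available/flow /etc/nginx/sites-enabled/
sudo nginx -t                 # validate the configuration before reloading
sudo systemctl reload nginx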
Database Configuration
For production, use an external PostgreSQL database:
# docker-compose.override.yml
services:
  api:
    environment:
      - DATABASE_URL=postgresql://user:pass@db.yourdomain.com/flow
      - REDIS_URL=redis://redis.yourdomain.com:6379
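Before restarting the stack, it is worth confirming that the external database and Redis instance are reachable from the Docker host. A sketch, assuming the postgresql-client and redis-tools packages are installed:
# Check connectivity to the external services (sketch)
pg_isready -h db.yourdomain.com -p 5432 -d flow -U user
redis-cli -h redis.yourdomain.com -p 6379 ping    # expect PONG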
Storage Configuration
Network Attached Storage (NAS)
# docker-compose.override.yml
volumes:
  uploads:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nas.yourdomain.com,rw
      device: ":/flow/uploads"
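You can confirm the export is visible from the Docker host before wiring it into Compose. A sketch, assuming the NFS client utilities (nfs-common on Ubuntu) are installed:
# Check the NFS export and test a temporary mount (sketch)
showmount -e nas.yourdomain.com    # the /flow/uploads export should be listed
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs nas.yourdomain.com:/flow/uploads /mnt/nfs-test && sudo umount /mnt/nfs-test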
Cloud Storage (S3-compatible)
# .env
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_STORAGE_BUCKET_NAME=flow-data
AWS_S3_REGION_NAME=us-east-1
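To confirm the credentials and bucket before starting Flow, a sketch using the AWS CLI (for a non-AWS, S3-compatible store, pass an --endpoint-url):
# Verify the bucket is reachable with the credentials from .env (sketch)
export AWS_ACCESS_KEY_ID=your_key AWS_SECRET_ACCESS_KEY=your_secret
aws s3 ls s3://flow-data --region us-east-1
# S3-compatible provider: aws s3 ls s3://flow-data --endpoint-url https://s3.example.com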
Kubernetes Deployment
For enterprise deployments, use our Helm charts:
# Add Flow Helm repository
helm repo add flow https://charts.flow.bio
helm repo update
# Install Flow
helm install flow flow/flow-app \
  --namespace flow \
  --create-namespace \
  --values values.yaml
Example values.yaml:
# values.yaml
api:
  replicas: 3
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "4Gi"
      cpu: "2"

celery:
  workers:
    replicas: 5
    resources:
      requests:
        memory: "4Gi"
        cpu: "2"
      limits:
        memory: "8Gi"
        cpu: "4"

postgresql:
  enabled: false
  external:
    host: "postgres.yourdomain.com"
    database: "flow"
    existingSecret: "flow-db-secret"

storage:
  className: "fast-ssd"
  size: "1Ti"
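After installation, a standard check that the release is healthy:
# Verify the release
helm status flow --namespace flow
kubectl get pods --namespace flow    # API and worker pods should reach Running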
HPC Integration
Connect Flow to your HPC cluster:
Slurm Configuration
# flow-custom/slurm.py
SLURM_CONFIG = {
    "partition": "compute",
    "account": "bioinformatics",
    "time": "24:00:00",
    "mem": "32G",
    "cpus": 8,
    "modules": [
        "nextflow/23.04.3",
        "singularity/3.8.0",
    ],
}

# SSH connection to login node
HPC_SSH_CONFIG = {
    "hostname": "hpc.yourdomain.com",
    "username": "flow-service",
    "key_filename": "/secrets/hpc-key",
}
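A quick way to confirm the service account can reach the login node and load the listed modules before wiring this into Flow. A sketch; if modules are not initialised for non-interactive shells on your cluster, source the modules init script first:
# Test SSH access and module availability on the login node (sketch)
ssh -i /secrets/hpc-key flow-service@hpc.yourdomain.com \
    'module load nextflow/23.04.3 singularity/3.8.0 && nextflow -version && singularity --version'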
Shared Filesystem
Mount your HPC filesystem:
# docker-compose.override.yml
volumes:
  hpc-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=hpc-nfs.yourdomain.com,rw
      device: ":/shared/flow"

services:
  celery:
    volumes:
      - hpc-data:/mnt/hpc
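After restarting the services, verify the mount is visible and writable from inside the worker container:
# Check the HPC filesystem from the Celery worker container
docker-compose exec celery df -h /mnt/hpc
docker-compose exec celery touch /mnt/hpc/.flow-write-test
docker-compose exec celery rm /mnt/hpc/.flow-write-test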
Local Development
Run Flow locally for development or testing.
Prerequisites
- Python 3.11+
- Node.js 18+
- Docker Desktop
- 8GB RAM minimum
Backend Setup
Clone the API repository
git clone https://github.com/goodwright/flow-api
cd flow-api/api
Install dependencies
pip install uv
uv pip install -r requirements.txt
Set up database
uv run python manage.py migrate
uv run python manage.py createsuperuser
Start the API server
FRONTEND_URL=http://localhost:3019 \
SERVE_FILES=yes \
UPLOADS_ROOT=./uploads \
CONFIGS_ROOT=./configs \
EXECUTIONS_ROOT=./executions \
PIPELINES_ROOT=./pipelines \
MEDIA_ROOT=./media \
BULK_DOWNLOADS_ROOT=./downloads \
uv run python manage.py runserver 8019
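With the server running, a quick smoke test (the admin route is available once migrations have been applied):
# Smoke test the local API
curl -I http://localhost:8019/admin/    # expect a 200 or a redirect to the login page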
Frontend Setup
Clone the frontend repository
git clone https://github.com/goodwright/flow-front
cd flow-front
Install dependencies
npm install
Start the development server
PORT=3019 \
REACT_APP_BACKEND=http://localhost:8019 \
REACT_APP_MEDIA=http://localhost:8019/media \
REACT_APP_DATA=http://localhost:8019/data \
npm start
Message Queue Setup
Start RabbitMQ
docker run -d \
  --name flow-rabbit \
  -p 5673:5673 \
  -e RABBITMQ_NODE_PORT=5673 \
  rabbitmq:3
Start Celery worker
cd flow-api/api
EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
UPLOADS_ROOT=./uploads \
CONFIGS_ROOT=./configs \
EXECUTIONS_ROOT=./executions \
PIPELINES_ROOT=./pipelines \
uv run celery -A analysis worker -l INFO
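To confirm the worker has connected to the broker, you can ping it from the same directory. This sketch assumes the analysis app reads EX_BROKER_URL as in the command above:
# Ping the running worker (sketch)
EX_BROKER_URL=amqp://guest:guest@localhost:5673 \
uv run celery -A analysis inspect ping    # a healthy worker replies with "pong"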
Initial Configuration
Access the admin interface
- Navigate to http://localhost:8019/admin
- Log in with your superuser credentials
Set up pipeline categories
- Create at least one category and subcategory
Add a test pipeline
- Add pipeline repo: https://github.com/goodwright/nfcore-wrappers
- Create pipeline: "Faidx"
- Add version: "dev" (branch: master)
- Path: wrappers/faidx.nf
- Schema: schema/faidx.json
Configure Nextflow
// Create pipeline config in admin
docker.enabled = true

process {
    container = 'biocontainers/samtools:1.16.1'
    cpus = 2
    memory = 4.GB
}
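Before launching a run, it can help to confirm that Nextflow and the container referenced in the config are usable on the machine that executes pipelines. A sketch; the image tag is taken from the config above and may need substituting for whichever samtools image you actually use:
# Sanity-check the execution environment (sketch)
nextflow -version
docker pull biocontainers/samtools:1.16.1                          # tag from the config above
docker run --rm biocontainers/samtools:1.16.1 samtools --version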
System Requirements
Minimum Requirements
- CPU: 4 cores
- RAM: 16GB
- Storage: 100GB SSD
- Network: 100Mbps
Recommended for Production
- CPU: 16+ cores
- RAM: 64GB+
- Storage: 1TB+ NVMe SSD
- Network: 1Gbps+
- Database: PostgreSQL 14+ (dedicated)
- Cache: Redis 6+
- Queue: RabbitMQ 3.9+
Scaling Considerations
- API servers: 1 per 50 concurrent users
- Celery workers: 1 per 10 concurrent pipeline executions
- Database: Use read replicas for large deployments
- Storage: Plan for 10-100GB per analysis project
Troubleshooting
Common Issues
Services won't start
# Check logs
docker-compose logs -f api
# Verify port availability
sudo lsof -i :8000
sudo lsof -i :3000
Database connection errors
# Test database connection
docker-compose exec api python manage.py dbshell
# Reset database
docker-compose down -v
docker-compose up -d
File permission errors
# Fix volume permissions
sudo chown -R 1000:1000 volumes/
Pipeline execution failures
# Check Celery worker logs
docker-compose logs -f celery
# Verify Docker is accessible
docker run hello-world
Getting Help
- Documentation: docs.flow.bio
- Community Forum: community.flow.bio
- GitHub Issues: github.com/goodwright/flow/issues
- Enterprise Support: support@flow.bio