Basics of Flow
Flow is a comprehensive bioinformatics platform that makes it easy to analyze biological data at scale. Whether you're running a single RNA-seq analysis or managing data for an entire research institute, Flow provides the tools and infrastructure you need.
What is Flow?
Flow is a cloud-based platform that brings together everything you need for bioinformatics analysis:
- Data Management: Upload, organize, and share your biological data
- Pipeline Execution: Run state-of-the-art analysis pipelines with a few clicks
- Results Visualization: Explore your results through interactive reports and visualizations
- Collaboration: Share data and results with your team or the broader scientific community
Who uses Flow?
Flow is designed for:
- Wet lab scientists who need to analyze sequencing data without command-line expertise
- Bioinformaticians who want to standardize and scale their analysis workflows
- Research groups looking for a centralized platform to manage their data
- Core facilities that need to deliver consistent, reproducible analyses
Core Concepts Overview
Flow is built around five interconnected concepts that work together to enable seamless bioinformatics analysis:
- Projects: Organize related samples and analyses
- Samples: Represent biological specimens and their data
- Data: The actual files used and generated
- Pipelines: Pre-configured analysis workflows
- Executions: Individual analysis runs with full tracking
For a detailed explanation of these concepts and how they relate to each other, see our comprehensive Core Concepts Guide.
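If it helps to picture how these pieces fit together, the short sketch below models their relationships as plain Python dataclasses. The class and field names are illustrative assumptions for this guide, not Flow's actual schema or API.

```python
# Hypothetical sketch of how Flow's core concepts relate to one another.
# Class and field names are illustrative assumptions, not Flow's real schema.
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataFile:
    """A single file, e.g. a FASTQ input or a pipeline output."""
    filename: str
    size_bytes: int


@dataclass
class Sample:
    """A biological specimen and the files that belong to it."""
    name: str
    metadata: dict
    files: List[DataFile] = field(default_factory=list)


@dataclass
class Execution:
    """One run of a pipeline over a set of samples, with full tracking."""
    pipeline_name: str
    pipeline_version: str
    parameters: dict
    samples: List[Sample]
    outputs: List[DataFile] = field(default_factory=list)


@dataclass
class Project:
    """Groups related samples and the executions performed on them."""
    name: str
    samples: List[Sample] = field(default_factory=list)
    executions: List[Execution] = field(default_factory=list)
```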
The Flow Workflow
A typical analysis in Flow follows these steps:
1. Upload Your Data
Start by uploading your sequencing files. Flow supports:
- Direct upload from your computer
- Import from cloud storage (S3, Google Cloud)
- Transfer from sequencing facilities
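If you script your uploads rather than using the web interface, the task usually boils down to an authenticated HTTP upload. The sketch below is a hypothetical illustration using the `requests` library: the endpoint URL, token header, and form field name are assumptions, not Flow's documented API.

```python
# Hypothetical upload sketch: the endpoint, token header, and form field
# are illustrative assumptions, not Flow's documented API.
import requests

API_URL = "https://flow.example.org/api/uploads"   # assumed endpoint
TOKEN = "your-api-token"                           # assumed auth scheme

for path in ["control1_R1.fastq.gz", "control1_R2.fastq.gz"]:
    with open(path, "rb") as handle:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {TOKEN}"},
            files={"file": (path, handle)},
        )
    response.raise_for_status()
    print(f"Uploaded {path}")
```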
2. Organize into Projects and Samples
Create a project to organize your work, then create samples with appropriate metadata:
My RNA-seq Project/
├── Control Sample 1
│   ├── control1_R1.fastq.gz
│   └── control1_R2.fastq.gz
├── Control Sample 2
│   ├── control2_R1.fastq.gz
│   └── control2_R2.fastq.gz
└── Treatment Sample 1
    ├── treatment1_R1.fastq.gz
    └── treatment1_R2.fastq.gz
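A convenient way to plan this layout before creating anything is a small sample sheet. The sketch below writes one with Python's standard `csv` module; the column names (`sample`, `condition`, `fastq_1`, `fastq_2`) are assumptions borrowed from common sequencing sample sheets, not a format Flow requires.

```python
# Hypothetical sample sheet matching the project layout above.
# Column names are illustrative assumptions, not a required Flow format.
import csv

samples = [
    ("Control Sample 1", "control", "control1_R1.fastq.gz", "control1_R2.fastq.gz"),
    ("Control Sample 2", "control", "control2_R1.fastq.gz", "control2_R2.fastq.gz"),
    ("Treatment Sample 1", "treatment", "treatment1_R1.fastq.gz", "treatment1_R2.fastq.gz"),
]

with open("samplesheet.csv", "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["sample", "condition", "fastq_1", "fastq_2"])
    writer.writerows(samples)
```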
3. Select and Configure a Pipeline
Choose an appropriate pipeline for your data type:
- Browse the pipeline catalog
- Review the pipeline documentation
- Configure parameters (or use sensible defaults)
- Select which samples to process
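Under the hood, a configured run is just a named pipeline, a pinned version, a set of samples, and a handful of parameter values. The sketch below records such a configuration as JSON; the parameter names are assumptions based on typical RNA-seq pipelines rather than a definitive list of Flow's options.

```python
# Hypothetical RNA-seq launch configuration. Parameter names are
# illustrative assumptions based on typical RNA-seq pipelines.
import json

launch_config = {
    "pipeline": "rnaseq",
    "pipeline_version": "3.14.0",   # pin the version for reproducibility
    "samples": ["Control Sample 1", "Control Sample 2", "Treatment Sample 1"],
    "parameters": {
        "genome": "GRCh38",
        "strandedness": "reverse",
        "trim_adapters": True,
    },
}

with open("launch_config.json", "w") as handle:
    json.dump(launch_config, handle, indent=2)
```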
4. Monitor Execution
Once launched, you can:
- Track progress in real-time
- View resource usage
- Check quality control metrics
- Get notified when complete
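If you would rather watch an execution from a script than from the browser, a simple polling loop is enough. The endpoint, execution ID, and `status` field below are hypothetical placeholders used only to illustrate the idea.

```python
# Hypothetical status-polling loop. The endpoint and the "status" field
# in the response are illustrative assumptions, not Flow's documented API.
import time
import requests

STATUS_URL = "https://flow.example.org/api/executions/12345"  # assumed endpoint
TOKEN = "your-api-token"

while True:
    response = requests.get(STATUS_URL, headers={"Authorization": f"Bearer {TOKEN}"})
    response.raise_for_status()
    status = response.json().get("status")
    print(f"Execution status: {status}")
    if status in ("completed", "failed", "cancelled"):
        break
    time.sleep(60)  # check once a minute
```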
5. Explore Results
When the pipeline finishes:
- View interactive quality reports
- Download processed data
- Share results with collaborators
- Export figures for publications
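Downloaded outputs are ordinary files, so they drop straight into standard analysis tooling. The sketch below loads a gene-count matrix into pandas for downstream work; the file name and layout are assumptions about a typical RNA-seq output rather than a guaranteed Flow artifact.

```python
# Hypothetical downstream step: load a downloaded gene-count matrix.
# The file name and its layout are assumptions about typical RNA-seq output.
import pandas as pd

counts = pd.read_csv("gene_counts.tsv", sep="\t", index_col=0)

print(counts.shape)          # (genes, samples)
print(counts.sum(axis=0))    # total counts per sample, a quick sanity check
```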
Key Features
Reproducibility
Every analysis in Flow is fully reproducible:
- Pipeline versions are tracked
- Parameters are recorded
- Software environments are containerized
- You can re-run any analysis exactly
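Conceptually, what makes a run repeatable is a complete record of what was executed. The hypothetical record below illustrates the kind of information that is captured; the exact fields Flow stores may differ.

```python
# Hypothetical record of everything needed to re-run an analysis exactly.
# Field names are illustrative; Flow's stored metadata may differ.
execution_record = {
    "pipeline": "rnaseq",
    "pipeline_version": "3.14.0",
    "parameters": {"genome": "GRCh38", "strandedness": "reverse"},
    "containers": {"star": "<image digest recorded at launch>"},
    "inputs": ["control1_R1.fastq.gz", "control1_R2.fastq.gz"],
    "launched_by": "jane.doe",
    "launched_at": "2024-05-01T09:30:00Z",
}
```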
Collaboration
Flow makes it easy to work with others:
- Share projects with specific users or groups
- Control permissions (view-only, can run analyses, full access)
- Track who did what and when
- Comment on results and findings
Scalability
Flow scales with your needs:
- Process one sample or thousands
- Automatic resource allocation
- Parallel processing for speed
- No infrastructure to manage
Security
Your data is protected:
- Encrypted storage and transmission
- Fine-grained access controls
- Audit trails for compliance
- Regular backups
Getting Started
Ready to begin? Here's how to get started with Flow:
- Create an account - Sign up for free
- Upload your first dataset - Get your data into Flow
- Run a pipeline - Analyze your data
- Explore the results - Understand your outputs
For more detailed information about Flow's architecture and technical implementation, see the Architecture Guide.