API Documentation

API Overview

Flow provides comprehensive APIs for programmatic access to all platform features. Whether you're automating workflows, building integrations, or creating custom interfaces, Flow's APIs give you full control over your bioinformatics data and analyses.


API Types

Flow offers three ways to interact with the platform programmatically:

1. REST API

Comprehensive endpoints for all platform operations:

  • File uploads and downloads with chunking support
  • Sample and project management
  • Pipeline execution and monitoring
  • Data management with sharing and permissions
  • Group and user management
  • Advanced search and filtering
  • Real-time execution monitoring

Base URL: https://api.flow.bio/
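As a minimal sketch, a direct REST request in Python might look like the following. The `/samples/owned` endpoint is the one used in the pagination examples later in this guide; the placeholder token stands in for a real JWT obtained as described under Authentication.

```python
import requests

BASE_URL = "https://api.flow.bio"

def auth_headers(token):
    """Build the Authorization header Flow expects on every request."""
    return {"Authorization": f"Bearer {token}"}

if __name__ == "__main__":
    # List the samples you own; replace the placeholder with a real access token.
    response = requests.get(
        f"{BASE_URL}/samples/owned",
        headers=auth_headers("<access_token>"),
    )
    response.raise_for_status()
    print(response.json())
```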

2. GraphQL API

Flexible query interface for complex data retrieval:

  • Single request for related data
  • Precise field selection
  • Real-time subscriptions
  • Batch mutations
  • Type-safe schema

Endpoint: https://api.flow.bio/graphql
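GraphQL queries are sent as JSON POST requests to this endpoint. Below is a hedged sketch; the `me { username }` query is illustrative only, so check the GraphQL schema reference for the fields your deployment actually exposes.

```python
import requests

GRAPHQL_URL = "https://api.flow.bio/graphql"

def graphql_payload(query, variables=None):
    """Package a query and its variables as the GraphQL endpoint expects."""
    return {"query": query, "variables": variables or {}}

if __name__ == "__main__":
    # Illustrative query; consult the schema for real field names.
    payload = graphql_payload("query { me { username } }")
    response = requests.post(
        GRAPHQL_URL,
        json=payload,
        headers={"Authorization": "Bearer <access_token>"},
    )
    print(response.json())
```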

3. Python Client (flowbio)

High-level Python library that wraps both REST and GraphQL APIs:

  • Simplified authentication with automatic token refresh
  • Chunked file upload with progress tracking
  • Pipeline execution and monitoring
  • Batch operations
  • Async support

Installation: pip install flowbio


Authentication

All API requests require authentication using JSON Web Tokens (JWTs).

Obtaining Tokens

import flowbio

client = flowbio.Client()
client.login("username", "password")

# Access tokens are managed automatically by the client
print(f"Access token: {client.access_token}")
print(f"Refresh token: {client.refresh_token}")

Using Tokens

Include the access token in the Authorization header:

Authorization: Bearer <access_token>
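In Python, a convenient way to do this is to set the header once on a `requests.Session`, so every call carries it automatically (a sketch; the placeholder token stands in for a real one):

```python
import requests

session = requests.Session()
# Attach the access token once; every request made through this
# session then carries the required Authorization header.
session.headers["Authorization"] = "Bearer <access_token>"

if __name__ == "__main__":
    response = session.get("https://api.flow.bio/samples/owned")
    print(response.status_code)
```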

Token Management

  • Access tokens expire after 5 minutes
  • Refresh tokens are valid for 7 days
  • The Python client automatically refreshes tokens
  • For direct API access, use the token endpoint:
curl -X GET https://api.flow.bio/token \
  -H "Authorization: Bearer <access_token>"
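If you manage tokens yourself, track when the access token was issued and refresh it shortly before the 5-minute lifetime elapses. The bookkeeping below is a minimal sketch of what the Python client does for you automatically; the 30-second safety margin is an assumption, not a documented value.

```python
import time

ACCESS_TOKEN_LIFETIME = 5 * 60  # seconds, per the limits above

class TokenStore:
    """Minimal client-side token bookkeeping sketch; the flowbio
    client handles all of this automatically."""

    def __init__(self, access_token, refresh_token):
        self.access_token = access_token
        self.refresh_token = refresh_token
        self.obtained_at = time.time()

    def is_expired(self, margin=30):
        # Refresh slightly early so in-flight requests don't race expiry.
        return time.time() - self.obtained_at > ACCESS_TOKEN_LIFETIME - margin
```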

Core Concepts

Permissions

Flow uses a three-level permission system:

  1. Read (1) - View the resource
  2. Edit (2) - Modify the resource
  3. Share (3) - Manage permissions for the resource
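Because the levels are numbered, they can be modeled as an ordered enum. Treating higher levels as including lower ones is an assumption based on the ownership rules below (owners hold share-level, i.e. full, permissions):

```python
from enum import IntEnum

class Permission(IntEnum):
    """Flow's three permission levels. Assumes higher levels
    include lower ones (share implies edit and read)."""
    READ = 1
    EDIT = 2
    SHARE = 3

def can_edit(level):
    # A user with SHARE permission can also edit and read.
    return level >= Permission.EDIT
```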

Privacy

Resources can be:

  • Private - Only accessible to owner and explicitly shared users/groups
  • Public - Accessible to all authenticated users

Ownership

Every resource has an owner (user) and follows these rules:

  • Owners have full permissions (share level)
  • Permissions can be granted to users or groups
  • Group members inherit group permissions

Rate Limits

To ensure fair usage and system stability:

  • API Requests: 1000 requests per hour per user
  • File Upload: 100 uploads per hour per user
  • File Download: 500 downloads per hour per user
  • Search: 300 searches per hour per user

Rate limit headers are included in responses:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
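A client can use these headers to pause before the limit is exhausted. The helper below is a sketch: it reads `X-RateLimit-Remaining` to decide whether to pause, and `X-RateLimit-Reset` (a Unix timestamp, as in the example above) to compute how long to wait.

```python
import time

def should_pause(headers):
    """True once the current rate-limit window is exhausted."""
    return int(headers["X-RateLimit-Remaining"]) <= 0

def seconds_until_reset(headers, now=None):
    """Seconds to wait until the rate-limit window resets."""
    now = time.time() if now is None else now
    return max(0.0, int(headers["X-RateLimit-Reset"]) - now)
```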

Error Handling

API Errors

REST endpoints use standard HTTP status codes:

  • 200 OK - Success
  • 201 Created - Resource created
  • 400 Bad Request - Invalid request
  • 401 Unauthorized - Authentication required
  • 403 Forbidden - Insufficient permissions
  • 404 Not Found - Resource not found
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error
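A useful pattern is to triage these codes into coarse outcomes before deciding how to react. The mapping below is a sketch of one reasonable policy, not a prescribed one: 429 and 500 are worth retrying with backoff, 401 calls for re-authentication, and the remaining 4xx codes indicate a request that should be fixed rather than retried.

```python
RETRYABLE = {429, 500}

def classify(status):
    """Coarse triage of the status codes listed above."""
    if status in (200, 201):
        return "success"
    if status in RETRYABLE:
        return "retry"            # back off and try again
    if status == 401:
        return "reauthenticate"   # obtain a fresh token
    return "fail"                 # 400/403/404: fix the request, don't retry
```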

Pagination

Flow uses two pagination methods:

Offset-based Pagination (REST API)

Used for most list endpoints:

# Using the Python client
samples = client.get_samples(limit=50, offset=100)

# Direct API call
import requests

response = requests.get(
    'https://api.flow.bio/samples/owned',
    headers={'Authorization': f'Bearer {token}'},
    params={'limit': 50, 'offset': 100}
)

Response includes count and page information:

{
  "count": 150,
  "page": 3,
  "samples": [...]
}
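To walk every page, advance the offset by the page size until it reaches the reported count. The generator below sketches this; `fetch_page` stands in for the REST call above and is assumed to return the response dict with `count` and `samples` keys.

```python
def paginate(fetch_page, limit=50):
    """Yield every item across all pages of an offset-paginated endpoint.

    `fetch_page(limit, offset)` stands in for the REST call shown above
    and must return a dict with `count` and `samples` keys."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page["samples"]
        offset += limit
        if offset >= page["count"]:
            break
```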

Cursor-based Pagination (GraphQL)

Used for GraphQL queries:

query GetSamples($cursor: String) {
  samples(first: 20, after: $cursor) {
    edges {
      node { id, name }
    }
    pageInfo {
      hasNextPage
      endCursor
    }
  }
}
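Consuming a cursor-paginated connection means re-issuing the query with `endCursor` until `hasNextPage` is false. The loop below sketches this; `run_query(cursor)` stands in for posting the `GetSamples` query above and is assumed to return the `samples` connection object.

```python
def iterate_cursor(run_query):
    """Yield every node from a GraphQL connection, page by page.

    `run_query(cursor)` stands in for executing the GetSamples query
    above and must return the connection (edges + pageInfo)."""
    cursor = None
    while True:
        connection = run_query(cursor)
        for edge in connection["edges"]:
            yield edge["node"]
        info = connection["pageInfo"]
        if not info["hasNextPage"]:
            break
        cursor = info["endCursor"]
```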

Filtering and Searching

Basic Filtering

Most list endpoints support filtering:

# Using the Python client
samples = client.get_samples(
    filter="RNA-seq",
    organism="human",
    project=123
)

# Direct API call
import requests

response = requests.get(
    'https://api.flow.bio/samples/owned',
    headers={'Authorization': f'Bearer {token}'},
    params={
        'filter': 'RNA-seq',
        'organism': 'human',
        'project': 123
    }
)

Advanced Search

Dedicated search endpoints handle complex queries:

# Quick search across all entities
results = client.search("BRCA1")

# Entity-specific search
samples = client.search_samples(
    filter="cancer",
    organism="human",
    created_after="2024-01-01"
)

# Direct API call
import requests

response = requests.get(
    'https://api.flow.bio/samples/search',
    headers={'Authorization': f'Bearer {token}'},
    params={
        'filter': 'cancer',
        'organism': 'human',
        'created_after': '2024-01-01'
    }
)

File Operations

Uploading Files

File uploads use a chunked approach for large files:

# Using the Python client
import flowbio

client = flowbio.Client()
client.login("username", "password")

data = client.upload_data(
    "/path/to/file.fastq.gz",
    progress=True,
    retries=5
)

Downloading Files

Flow supports multiple download methods:

Individual Files

# Direct download with nginx acceleration
curl -H "Authorization: Bearer <token>" \
  https://api.flow.bio/downloads/<data_id>/<filename> \
  -o output.fastq.gz

Bulk Downloads

# Request bulk download
job = client.create_bulk_download(
    data_ids=[123, 124, 125],
    name="my_dataset.zip"
)

# Check status
status = client.get_download_status(job.id)

# Download when ready
if status['status'] == 'completed':
    client.download_file(status['download_url'], 'dataset.zip')


Best Practices

Performance

  1. Batch Operations: Use bulk endpoints when available
  2. Field Selection: In GraphQL, request only needed fields
  3. Pagination: Use appropriate page sizes (50-100 items)
  4. Caching: Implement client-side caching for static data

Security

  1. Token Storage: Never store tokens in code or version control
  2. HTTPS Only: Always use encrypted connections
  3. Minimal Permissions: Request only necessary access levels
  4. Audit Logs: Monitor API usage for anomalies

Error Handling

  1. Retry Logic: Implement exponential backoff for transient errors
  2. Rate Limiting: Respect rate limit headers
  3. Validation: Validate inputs before sending requests
  4. Logging: Log errors with context for debugging
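The retry-with-exponential-backoff practice above can be sketched as a small wrapper. `TransientError` is a hypothetical stand-in for a retryable failure (an HTTP 429 or 5xx response); the jitter term is an assumption added to avoid synchronized retries.

```python
import random
import time

class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (HTTP 429/5xx)."""

def with_backoff(call, max_attempts=5, base_delay=1.0):
    """Retry `call` on transient errors, doubling the delay each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Exponential backoff plus a little jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```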

Getting Started

Core Guides

  1. Authentication - Set up API authentication
  2. Quick Start - Your first API calls
  3. Python Client - Using the flowbio library

API References

  1. REST API Reference - Complete REST endpoints
  2. GraphQL Schema - GraphQL types and queries
  3. Error Handling - Error codes and recovery

Feature Guides

  1. Uploading Data - File upload strategies
  2. Downloading Data - Efficient data retrieval
  3. Search & Discovery - Advanced search features
  4. Permissions - Access control system
