Merge pull request #205 from seqeralabs/main-fusion-docs-audit
Fusion docs overhaul
Showing 18 changed files with 625 additions and 406 deletions.
---
title: Get started
description: "Use the Fusion v2 file system in Seqera Platform and Nextflow"
date: "23 Aug 2024"
tags: [fusion, storage, compute, file system, posix, client]
---

Use Fusion directly in Seqera Platform compute environments, or add Fusion to your Nextflow pipeline configuration.

### Seqera Platform

Use Fusion directly in the following Seqera Platform compute environments:

- [AWS Batch](https://docs.seqera.io/platform/latest/compute-envs/aws-batch)
- [Azure Batch](https://docs.seqera.io/platform/latest/compute-envs/azure-batch)
- [Google Cloud Batch](https://docs.seqera.io/platform/latest/compute-envs/google-cloud-batch)
- [Amazon EKS](https://docs.seqera.io/platform/latest/compute-envs/eks)
- [Google GKE](https://docs.seqera.io/platform/latest/compute-envs/gke)

See the Platform compute environment page for your cloud provider for Fusion configuration instructions and optimal compute and storage recommendations.

### Nextflow

:::note
Fusion requires Nextflow `22.10.0` or later.
:::

Fusion integrates with Nextflow directly and does not require any installation or changes to pipeline code. It only requires the use of a container runtime or a container computing service such as Kubernetes, AWS Batch, or Google Cloud Batch.

#### Nextflow installation

If you already have Nextflow installed, update to the latest version with this command:

```bash
nextflow self-update
```

Otherwise, install Nextflow with this command:

```bash
curl -s https://get.nextflow.io | bash
```
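
Before enabling Fusion, you can confirm that your installed version meets the `22.10.0` minimum. A minimal sketch of a version check using `sort -V`; the `version_ge` helper and the example version string are illustrative, not part of Nextflow:

```bash
# Succeeds when the first version is greater than or equal to the second,
# using sort -V for natural version ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Substitute the version reported by `nextflow -version` here:
if version_ge "23.04.1" "22.10.0"; then
  echo "version OK for Fusion"
else
  echo "too old; run 'nextflow self-update'"
fi
```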

#### Fusion configuration

To enable Fusion in your Nextflow pipeline, add the following snippet to your `nextflow.config` file:

```groovy
fusion.enabled = true
wave.enabled = true
tower.accessToken = '<your Platform access token>' // Optional
```

:::tip
A Platform access token is not mandatory. However, it is required to access private repositories, and it grants higher service rate limits than anonymous use.
:::
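
If you prefer not to hardcode the token in `nextflow.config`, Nextflow also reads it from the `TOWER_ACCESS_TOKEN` environment variable, for example:

```bash
# Export the Platform access token so Nextflow picks it up at runtime
# instead of reading tower.accessToken from nextflow.config:
export TOWER_ACCESS_TOKEN=<your Platform access token>
```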
---
title: AWS Batch
description: "Use Fusion with AWS Batch and S3 storage"
date: "23 Aug 2024"
tags: [fusion, storage, compute, aws batch, s3]
---

Fusion simplifies and improves the efficiency of Nextflow pipelines in [AWS Batch](https://aws.amazon.com/batch/) in several ways:

- No need to use the AWS CLI tool for copying data to and from S3 storage.
- No need to create a custom AMI or custom containers to include the AWS CLI tool.
- Fusion uses an efficient data transfer and caching algorithm that provides much faster throughput than the AWS CLI and does not require a local copy of data files.
- By replacing the AWS CLI with a native API client, transfers are much more robust at scale.

### Platform AWS Batch compute environments

Seqera Platform supports Fusion in Batch Forge and manual AWS Batch compute environments.

See [AWS Batch](https://docs.seqera.io/platform/latest/compute-envs/aws-batch) for compute and storage recommendations and instructions to enable Fusion.

### Nextflow CLI

:::tip
The Fusion file system implements a lazy download and upload algorithm that runs in the background to transfer files in parallel to and from object storage, using the container-local temporary directory (`/tmp`). To achieve optimal performance, set up an SSD volume as the temporary directory.

Several AWS EC2 instance types include one or more NVMe SSD volumes. These volumes must be formatted before use. See [SSD instance storage](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html) for details. Seqera Platform automatically formats and configures NVMe instance storage with the "Fast instance storage" option when you create an AWS Batch compute environment.
:::
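
When you manage the compute environment yourself, the instance-store volume can be formatted and mounted before jobs run, for example from an EC2 user-data script. A minimal sketch, assuming a single NVMe device at `/dev/nvme1n1` and a `/scratch` mount point, both of which vary by instance type and setup:

```bash
# Sketch only: format and mount an NVMe instance-store volume as the
# scratch area bound to the container's /tmp. The device name
# /dev/nvme1n1 is an assumption; check `lsblk` on your instance.
mkfs.ext4 /dev/nvme1n1      # format the instance-store volume
mkdir -p /scratch           # directory later bound into containers
mount /dev/nvme1n1 /scratch
chmod 1777 /scratch         # world-writable with sticky bit, like /tmp
```

The `/scratch` path here corresponds to the `/path/to/ssd` host path used in the `process.containerOptions` setting below.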

1. Add the following to your `nextflow.config` file:

   ```groovy
   process.executor = 'awsbatch'
   process.queue = '<YOUR AWS BATCH QUEUE>'
   process.scratch = false
   process.containerOptions = '-v /path/to/ssd:/tmp' // Required for SSD volumes
   aws.region = '<YOUR AWS REGION>'
   fusion.enabled = true
   wave.enabled = true
   ```

   Replace `<YOUR AWS BATCH QUEUE>` and `<YOUR AWS REGION>` with your AWS Batch queue and region.

1. Run the pipeline with the usual run command:

   ```bash
   nextflow run <YOUR PIPELINE SCRIPT> -w s3://<YOUR-BUCKET>/work
   ```

   Replace `<YOUR PIPELINE SCRIPT>` with your pipeline Git repository URI and `<YOUR-BUCKET>` with your S3 bucket.