
Commit

Updated docs
davidecrs committed Sep 18, 2024
1 parent 1c6ff56 commit e35e150
Showing 5 changed files with 24 additions and 13 deletions.
4 changes: 0 additions & 4 deletions docs/source/contributes/extdb.md
@@ -115,10 +115,6 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/extdb/envs_key.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

In this section:
4 changes: 0 additions & 4 deletions docs/source/contributes/magspackage.md
@@ -143,10 +143,6 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/magspackage/envs_key.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

In this section:
5 changes: 1 addition & 4 deletions docs/source/contributes/simplepackage.md
@@ -140,11 +140,8 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/simplepackage/envs.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we won't need any external database. Look to this example to understand how this section works.
This section is useful to organize external databases for the package that we are going to integrate. In this example, we won't need any external database. Look at [this example](extdb) to understand how this section works.

However, let's do a brief introduction to this section:
- each package that requires an extdb has a key which contains two other keys:
19 changes: 18 additions & 1 deletion docs/source/tips/faq.md
@@ -6,6 +6,21 @@ Welcome to our section of _Frequently Asked Questions_. Over time, we plan to up

<br>

## How to ignore samples that failed a module computation
For instance, if the assembly failed for some samples, you can ignore them in the subsequent modules when preparing the scripts with `geomosaic prerun` (using the `--ignore_samples` option).
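A hypothetical invocation could look like the following (the sample names and the argument style are illustrative assumptions, not taken from the Geomosaic docs; check `geomosaic prerun --help` for the exact syntax):

```shell
# Sketch only: sample identifiers are made up, and the exact argument
# format of --ignore_samples should be verified with --help.
geomosaic prerun --ignore_samples sample_3 sample_7
```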


## How to submit an array job that executes only a specific number of samples at a time?
Using the Slurm array specification, users can limit how many jobs run concurrently.
Once `geomosaic prerun` has been executed, you can modify the Slurm script generated by Geomosaic by appending `%2` to the array job line.
Specifically, the result should be something like this:
```
...
#SBATCH --array=1-32%2
...
```
meaning that Slurm will run at most 2 jobs at a time.
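For context, the header of the modified script might look like this (the job name and log path are illustrative assumptions; only the `--array=1-32%2` throttle comes from the text above):

```shell
#!/bin/bash
#SBATCH --job-name=geomosaic_unit        # illustrative name
#SBATCH --output=slurm_logs/%A_%a.out    # illustrative log path
#SBATCH --array=1-32%2                   # 32 tasks, at most 2 running concurrently

# ...the rest of the script generated by `geomosaic prerun` stays unchanged
```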


## How can I see the SLURM queue of my jobs?
Assuming my account is `dcorso`, I can see the queue with `squeue -u dcorso` (only my jobs) or `squeue` (the whole queue).
@@ -15,7 +30,7 @@ squeue -u dcorso
```

## How can I check SLURM status of my jobs?
Assuming that you know which is the job id (for example 123456), you can use the following command
Assuming you know the job ID (for example 123456), on clusters running SLURM you can use the following command
```
sacct -j 123456
```
@@ -25,6 +40,8 @@ If you are working in a TMUX session, we suggest piping the output to `less` to be able to scroll it
sacct -j 123456 | less
```

This command is useful to inspect the logs and identify the samples that failed a computation. For instance, if the assembly failed for some samples, you can ignore them in the next modules when preparing the scripts with `geomosaic prerun` (using the `--ignore_samples` option).

## How can I use the same conda environments and external databases for different executions of Geomosaic?
As suggested in the [Walkthrough tutorial](../walkthrough/tutorial.md#geomosaic-setup---command), it is good practice to specify the same folders for the `-c` and `-e` options of the geomosaic setup, for the conda environments and the external databases respectively.

5 changes: 5 additions & 0 deletions docs/source/tips/suggestions.md
@@ -6,6 +6,11 @@ In this page you will find some basic suggestions about the Geomosaic execution

<br>

## Create a specific log folder for each tool
When executing `geomosaic prerun`, we suggest specifying a dedicated log folder for the tool or workflow you are running. For instance, if you are running the assembly with metaspades through the `geomosaic unit` command, you can pass `-f slurm_logs/metaspades` to the prerun so that all the corresponding logs end up in that folder.
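As a sketch (the folder name is just an example), you can create the folder up front and then point the prerun at it:

```shell
# Create a per-tool log folder (the name is illustrative)
mkdir -p slurm_logs/metaspades

# Then pass it to the prerun; other options are omitted here
# geomosaic prerun -f slurm_logs/metaspades
```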

## Delete corresponding output folder before re-executing a tool
If you would like to re-execute a tool of a specific module, we suggest deleting the previous output folder from the sample directories, or renaming it. Snakemake checks whether the output already exists and, if it does, skips the computation.
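A minimal sketch of the renaming approach, assuming a workspace layout of `geomosaic_ws/sample_*/` and `metaspades` as the module output folder (both names are assumptions, adjust them to your setup):

```shell
# For every sample directory, move the previous metaspades output aside,
# so Snakemake treats the module as not yet computed and reruns it.
for sample_dir in geomosaic_ws/sample_*; do
    if [ -d "$sample_dir/metaspades" ]; then
        mv "$sample_dir/metaspades" "$sample_dir/metaspades_old"
    fi
done
```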

## Execute commands with the geomosaic conda environment activated
The geomosaic conda environment must be activated __before__ each command, even when submitting jobs through SLURM or executing the workflow with GNU Parallel.
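For example, assuming the environment is named `geomosaic` (check `conda env list` if unsure):

```shell
# Activate the environment first; every subsequent command then runs inside it
conda activate geomosaic
# geomosaic prerun --help    # any geomosaic command now uses the right env
```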
