
Commit

Updated docs
davidecrs committed Sep 18, 2024
1 parent 1c6ff56 commit e35e150
Showing 5 changed files with 24 additions and 13 deletions.
4 changes: 0 additions & 4 deletions docs/source/contributes/extdb.md
@@ -115,10 +115,6 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/extdb/envs_key.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

In this section:
4 changes: 0 additions & 4 deletions docs/source/contributes/magspackage.md
@@ -143,10 +143,6 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/magspackage/envs_key.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

In this section:
5 changes: 1 addition & 4 deletions docs/source/contributes/simplepackage.md
@@ -140,11 +140,8 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/simplepackage/envs.png)

### Step 6.6: `external_db` section
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we won't need any external database. Look to this example to understand how this section works.
This section is useful to organize external databases for the package that we are going to integrate. In this example, we won't need any external database. Look at [this example](extdb) to understand how this section works.

However, let's do a brief introduction to this section:
- each package that requires an extdb has a key which contains two other keys:
19 changes: 18 additions & 1 deletion docs/source/tips/faq.md
@@ -6,6 +6,21 @@ Welcome to our section of _Frequently Asked Questions_. Over time, we plan to up

<br>

## How to ignore samples that failed a module computation
For instance, if the assembly failed for some samples, you can ignore them in the subsequent modules when preparing the scripts with `geomosaic prerun` (using the `--ignore_samples` option).
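A hypothetical invocation could look like the following (the sample names and the argument style are illustrative assumptions, not taken from the Geomosaic docs; check `geomosaic prerun --help` for the exact syntax):

```shell
# Sketch only: sample identifiers are made up, and the exact argument
# format of --ignore_samples should be verified with --help.
geomosaic prerun --ignore_samples sample_3 sample_7
```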


## How to submit an array job that executes only a specific number of samples at a time?
Using the Slurm array specification, users can limit how many jobs run concurrently.
Once `geomosaic prerun` has been executed, you can modify the Slurm script generated by Geomosaic by appending `%2` to the array job line.
Specifically, the result should be something like this:
```
...
#SBATCH --array=1-32%2
...
```
meaning that Slurm will run at most 2 jobs at a time.
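For context, the header of the modified script might look like this (the job name and log path are illustrative assumptions; only the `--array=1-32%2` throttle comes from the text above):

```shell
#!/bin/bash
#SBATCH --job-name=geomosaic_unit        # illustrative name
#SBATCH --output=slurm_logs/%A_%a.out    # illustrative log path
#SBATCH --array=1-32%2                   # 32 tasks, at most 2 running concurrently

# ...the rest of the script generated by `geomosaic prerun` stays unchanged
```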


## How can I see the SLURM queue of my jobs?
Assuming my account is `dcorso`, I can see the queue with `squeue -u dcorso` (only my jobs) or `squeue` (the whole queue).
@@ -15,7 +30,7 @@ squeue -u dcorso
```

## How can I check SLURM status of my jobs?
Assuming that you know which is the job id (for example 123456), you can use the following command
Assuming you know the job ID (for example 123456), on clusters running SLURM you can use the following command
```
sacct -j 123456
```
@@ -25,6 +40,8 @@ If you are working in a TMUX session, we suggest piping the output to `less` to be able to scroll it
sacct -j 123456 | less
```

This command is useful to inspect the logs and identify the samples that failed a computation. For instance, if the assembly failed for some samples, you can ignore them in the next modules when preparing the scripts with `geomosaic prerun` (using the `--ignore_samples` option).

## How can I use the same conda environments and external databases for different executions of Geomosaic?
As suggested in the [Walkthrough tutorial](../walkthrough/tutorial.md#geomosaic-setup---command), it is good practice to specify the same folders for the `-c` and `-e` options of the geomosaic setup, for the conda environments and the external databases respectively.

5 changes: 5 additions & 0 deletions docs/source/tips/suggestions.md
@@ -6,6 +6,11 @@ In this page you will find some basic suggestions about the Geomosaic execution

<br>

## Create a specific log folder for each tool
When executing `geomosaic prerun`, we suggest specifying a dedicated log folder for the tool or workflow you are running. For instance, if you are running the assembly with metaspades through the `geomosaic unit` command, you can pass `-f slurm_logs/metaspades` to the prerun so that all the corresponding logs end up in that folder.
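As a sketch (the folder name is just an example), you can create the folder up front and then point the prerun at it:

```shell
# Create a per-tool log folder (the name is illustrative)
mkdir -p slurm_logs/metaspades

# Then pass it to the prerun; other options are omitted here
# geomosaic prerun -f slurm_logs/metaspades
```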

## Delete corresponding output folder before re-executing a tool
If you would like to re-execute a tool of a specific module, we suggest deleting the previous output folder from the sample directories, or renaming it. Snakemake checks whether the output already exists and, if it does, skips the computation.
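A minimal sketch of the renaming approach, assuming a workspace layout of `geomosaic_ws/sample_*/` and `metaspades` as the module output folder (both names are assumptions, adjust them to your setup):

```shell
# For every sample directory, move the previous metaspades output aside,
# so Snakemake treats the module as not yet computed and reruns it.
for sample_dir in geomosaic_ws/sample_*; do
    if [ -d "$sample_dir/metaspades" ]; then
        mv "$sample_dir/metaspades" "$sample_dir/metaspades_old"
    fi
done
```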

## Execute commands with the geomosaic conda environment activated
The geomosaic conda environment must be activated __before__ each command, even when submitting jobs through SLURM or executing the workflow with GNU Parallel.
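For example, assuming the environment is named `geomosaic` (check `conda env list` if unsure):

```shell
# Activate the environment first; every subsequent command then runs inside it
conda activate geomosaic
# geomosaic prerun --help    # any geomosaic command now uses the right env
```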
