Updated contributes section
davidecrs committed Jul 30, 2024
1 parent db75441 commit d752475
Showing 3 changed files with 80 additions and 41 deletions.
36 changes: 24 additions & 12 deletions docs/source/contributes/extdb.md
@@ -35,11 +35,15 @@ In this example we are going to integrate a package that perform a functional an

We need to create the package folder inside the corresponding module, which in this case is `assembly_func_annotation`. Since we are going to integrate the program called `KOfam Scan`, we can create a folder called `kofam_scan`.

{: .highlight }
> {: .warning }
> __Do not__ use any special characters or insert spaces in the name.
>
> Just rely on _underscore_ and all lower-case characters
```{warning}
__Do not__ use any special characters or insert spaces in the name.
```

```{admonition} Highlight
:class: important
Just rely on _underscore_ and all lower-case characters
```
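
The naming rule above can be checked mechanically before the folder is created. The following is a minimal Python sketch, not part of Geomosaic: the helper names `is_valid_name` and `create_package_folder` are hypothetical, the location of the `modules` root is an assumption about the repository layout, and allowing digits in names is also an assumption (the rule itself only mentions lower-case characters and underscores).

```python
import re
from pathlib import Path

# Assumption: lower-case letters, digits and underscores are allowed.
_NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def is_valid_name(name: str) -> bool:
    """Return True if the name has no spaces, upper-case or special characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def create_package_folder(modules_root: Path, module: str, package: str) -> Path:
    """Create <modules_root>/<module>/<package> after validating both names."""
    for name in (module, package):
        if not is_valid_name(name):
            raise ValueError(f"invalid folder name: {name!r}")
    folder = modules_root / module / package
    # parents=True also creates the module folder if it does not exist yet.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

For this example the call would look like `create_package_folder(Path("modules"), "assembly_func_annotation", "kofam_scan")`, while a name such as `"KOfam Scan"` would be rejected with a `ValueError`.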


## Step 4: Create package's snakefiles
@@ -51,8 +55,9 @@ Now we need to create the three files where we are going to implement all the ne

For now you can leave them empty.

{: .important }
```{important}
The names of these files are standard and are the same for every package. Do not change the filenames.
```

![modules_folder](assets/images/extdb/module.png)

@@ -76,8 +81,9 @@ We can skip the `order` section, since our module `assembly_func_annotation` alr

Similarly, we can skip also this section since our module `assembly_func_annotation` already existed before this integration.

{: .important }
> If you want to understand what a dependency really means in Geomosaic, you can read the [Modules Dependencies description](../modules.md#description).
```{important}
If you want to understand what a dependency really means in Geomosaic, you can read the [Modules Dependencies description](../modules.md#description).
```

### Step 6.3: `modules` section

@@ -86,16 +92,20 @@ In the corresponding `modules` section, we need to add the name of our package i

In particular, the **key** (the blue string in the image) is the string shown in the terminal as a choice during the workflow decision, while the **value** (in orange) is the actual name of the package, the same one we used to create the folder in step 3.

{: .important }
```{important}
The package name in the **value** must match the folder created in step 3.
```

![modules_key](assets/images/extdb/modules_key.png)
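
The requirement that each **value** matches an existing package folder can be verified with a short script. This is only a sketch under stated assumptions: the function name `missing_package_folders` is hypothetical, the `choices` dictionary below is an illustrative stand-in for the real entry shown in the image, and the key text `"KOfam Scan"` is invented for the example.

```python
from pathlib import Path


def missing_package_folders(choices: dict[str, str], module_dir: Path) -> list[str]:
    """Return the package names (values) that have no matching folder in module_dir."""
    return [
        package
        for package in choices.values()
        if not (module_dir / package).is_dir()
    ]


# Hypothetical entry: the key is what the user sees in the terminal,
# the value is the package folder name created in step 3.
choices = {"KOfam Scan": "kofam_scan"}
```

An empty list returned by `missing_package_folders(choices, Path("modules") / "assembly_func_annotation")` would mean that every value has a matching folder; the `modules` path here is again an assumption about where the module folders live.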

### Step 6.4: `additional_input` section
If the package requires any additional input, you can add it to the corresponding `additional_input` section. In this case we don't need any additional argument.

{: .highlight }
```{admonition} Highlight
:class: important
Additional arguments are parameters that are widely known in metagenomic workflows and that should be chosen by the user, for example Completeness and Contamination.
```

This section also lets you specify a folder that contains HMM models (for `assembly_hmm_annotation` and `mags_hmm_annotation`), as well as the name of the output folder for these two modules, so that different sets of HMMs get differently named output folders.

@@ -105,8 +115,9 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/extdb/envs_key.png)

### Step 6.6: `external_db` section
{: .note }
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

@@ -134,8 +145,9 @@ In this folder, we create two files named:
- `snakefile.smk`
- `target.txt`

{: .important }
```{important}
Do not change these filenames.
```
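
As an illustration, the two files can be created as empty placeholders with a few lines of Python. This is a hedged sketch: `create_extdb_files` is a hypothetical helper, the folder path is supplied by the caller because it is not fixed here, and starting from empty files is an assumption.

```python
from pathlib import Path

# File names taken from the text above; they must not be changed.
EXTDB_FILENAMES = ("snakefile.smk", "target.txt")


def create_extdb_files(folder: Path) -> list[Path]:
    """Create the two standard files inside folder and return their paths."""
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for filename in EXTDB_FILENAMES:
        path = folder / filename
        # touch() creates an empty file and leaves an existing one untouched.
        path.touch(exist_ok=True)
        created.append(path)
    return created
```

Because `touch(exist_ok=True)` does not overwrite existing content, the helper can be run again safely after the files have been filled in.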

![modules_extdb](assets/images/extdb/modules_extdb.png)

34 changes: 23 additions & 11 deletions docs/source/contributes/magspackage.md
@@ -28,21 +28,26 @@ git checkout -b mags_kofam_scan

## Step 2: Create the module folder (if it does not exist)

{: .note }
As one might expect, KOfam Scan is a package for functional annotation, in this case for MAGs. Before this integration, the only package that belonged to this module was DRAM. However, DRAM usually takes as input a folder that contains all the FASTA files of the MAGs, whereas KOfam Scan takes the predicted ORFs from a single MAG.
```{note}
As one might expect, KOfam Scan is a package for functional annotation, in this case for MAGs. Before this integration, the only package that belonged to this module was DRAM. However, DRAM usually takes as input a folder that contains all the FASTA files of the MAGs, whereas KOfam Scan takes the predicted ORFs from a single MAG.
```

Because of this difference, the two packages cannot be in the same module, as their dependencies are different. So I decided to change the module that DRAM belongs to, calling it `mags_metabolism_annotation`, and to insert `mags_kofam_scan` in `mags_functional_annotation`, which depends on `mags_orf_prediction`. Module names only describe what the packages do; it would have worked the same way with `mags_functional_annotation` for DRAM and `mags_functional_annotation_2` for KOfam Scan.

## Step 3: Create the package folder

We need to create the package folder inside the corresponding module, which in this case is `mags_func_annotation`. Since we are going to integrate the program called `KOfam Scan` for MAGs, we can create a folder called `mags_kofam_scan`.

{: .highlight }
> {: .warning }
> __Do not__ use any special characters or insert spaces in the name.
>
> Just rely on _underscore_ and all lower-case characters

```{warning}
__Do not__ use any special characters or insert spaces in the name.
```

```{admonition} Highlight
:class: important
Just rely on _underscore_ and all lower-case characters
```

## Step 4: Create package's snakefiles

@@ -53,8 +58,9 @@ Now we need to create the three files where we are going to implement all the ne

For now you can leave them empty.

{: .important }
```{important}
The names of these files are standard and are the same for every package. Do not change the filenames.
```

## Step 5: create the corresponding `conda` env file
For this step, we don't need to create the corresponding `conda` env file, as we already created it in [Integration Example 2](extdb.md#step-5-create-the-corresponding-conda-env-file).
@@ -95,8 +101,9 @@ In the corresponding `modules` section, we need to add the name of our package i

In particular, the **key** (the blue string in the image) is the string shown in the terminal as a choice during the workflow decision, while the **value** (in orange) is the actual name of the package, the same one we used to create the folder in step 3.

{: .important }
```{important}
The package name in the **value** must match the folder created in step 3.
```

Here we can see how DRAM was moved into the `mags_metabolism_annotation` module.
```
@@ -121,8 +128,12 @@ Here we can see how DRAM was moved into the `mags_metabolism_annotation` module.
### Step 6.4: `additional_input` section
If the package requires any additional input, you can add it to the corresponding `additional_input` section. In this case we don't need any additional argument.

{: .highlight }

```{admonition} Highlight
:class: important
Additional arguments are parameters that are widely known in metagenomic workflows and that should be chosen by the user, for example Completeness and Contamination.
```

This section also lets you specify a folder that contains HMM models (for `assembly_hmm_annotation` and `mags_hmm_annotation`), as well as the name of the output folder for these two modules, so that different sets of HMMs get differently named output folders.

@@ -132,8 +143,9 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/magspackage/envs_key.png)

### Step 6.6: `external_db` section
{: .note }
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we need an external database (extdb).

51 changes: 33 additions & 18 deletions docs/source/contributes/simplepackage.md
@@ -32,26 +32,33 @@ git checkout -b fastqc
## Step 2: Create the module folder (if it does not exist)
In this case we are going to integrate a package that belongs to a module related to the quality checks of the reads after the `pre_processing` step, so we create a module folder called `reads_qc` inside the `modules` folder (Figure below in [Step 4](#step-4-create-packages-snakefiles)).

{: .important }
```{important}
This step is necessary only if the module folder does not exist.
```

```{warning}
__Do not__ use any special characters or insert spaces in the name.
```

{: .highlight }
> {: .warning }
> __Do not__ use any special characters or insert spaces in the name.
>
> Just rely on _underscore_ and all lower-case characters
```{admonition} Highlight
:class: important
Just rely on _underscore_ and all lower-case characters
```

## Step 3: Create the package folder

In this step we create the package folder inside the module of interest. In this case, our package folder will be `fastqc_readscount` (Figure below in [Step 4](#step-4-create-packages-snakefiles)).

{: .highlight }
> {: .warning }
> __Do not__ use any special characters or insert spaces in the name.
>
> Just rely on _underscore_ and all lower-case characters
```{warning}
__Do not__ use any special characters or insert spaces in the name.
```

```{admonition} Highlight
:class: important
Just rely on _underscore_ and all lower-case characters
```

## Step 4: Create package's snakefiles

@@ -62,8 +69,9 @@ Create three code files inside the package folder, with the following filename:

For now you can leave them empty.

{: .important }
```{important}
The names of these files are standard and are the same for every package. Do not change the filenames.
```

![modules_folder](assets/images/simplepackage/modulefolder.png)

@@ -86,8 +94,9 @@ Since `reads_qc` is a module that we thought to be after the processing of the r

### Step 6.2: `graph` section

{: .important }
> Before going further in this section, you should understand what a dependency really means in Geomosaic by reading the [Modules Dependencies description](../modules.md#description).
```{important}
Before going further in this section, you should understand what a dependency really means in Geomosaic by reading the [Modules Dependencies description](../modules.md#description).
```

The package that we are going to integrate into this module depends on the output reads obtained from the `pre_processing` module, so we add the following line to `graph`:

@@ -104,19 +113,24 @@ In the corresponding `modules` section, we need to add the name of the module, wh
- `choices` - which is a dictionary containing all the packages belonging to that module. <br>
In particular, the **key** (the blue string in the image) is the string shown in the terminal as a choice during the workflow decision, while the **value** (in orange) is the actual name of the package, the same one we used to create the folder in step 3.

{: .important }
```{important}
The package name in the **value** must match the folder created in step 3.
```
{: .important }
```{important}
Remember the last comma after the last parenthesis.
```
![modules](assets/images/simplepackage/modules.png)
### Step 6.4: `additional_input` section
If the package requires any additional input, you can add it to the corresponding `additional_input` section. In this case we don't need any additional argument.
{: .highlight }
```{admonition} Highlight
:class: important
Additional arguments are parameters that are widely known in metagenomic workflows and that should be chosen by the user, for example Completeness and Contamination.
```

This section also lets you specify a folder that contains HMM models (for `assembly_hmm_annotation` and `mags_hmm_annotation`), as well as the name of the output folder for these two modules, so that different sets of HMMs get differently named output folders.

@@ -126,8 +140,9 @@ This section is very simple, we only need to add the conda env file for our pack
![envs](assets/images/simplepackage/envs.png)

### Step 6.6: `external_db` section
{: .note }
```{note}
Still under optimization
```

This section is useful to organize external databases for the package that we are going to integrate. In this example, we won't need any external database. Look at this example to understand how this section works.
