Prepare for bioconda (#21)
* fix: entirely remove `pairix` (not `pypairix`) dependency

* ci: move micomplete to conda deps

* ci: rm docker files

* ci: move artifacts to /src

* ci: comment out definition of $CXX variable

* bump to 1.3.6
js2264 authored Feb 21, 2025
1 parent 5fd32e3 commit 30b3d81
Showing 18 changed files with 59 additions and 123 deletions.
2 changes: 1 addition & 1 deletion .codecov.yml
@@ -4,6 +4,6 @@ coverage:
default:
target: auto
threshold: 1%
if_ci_failed: ignore #success, failure, error, ignore
if_ci_failed: error #success, failure, error, ignore
informational: false
only_pulls: false
9 changes: 0 additions & 9 deletions .dockerignore

This file was deleted.

3 changes: 2 additions & 1 deletion .gitignore
@@ -139,4 +139,5 @@ artifacts/
gen-louvain/
networkanalysis/
pairix/
bowtie2/
bowtie2/
bin/
7 changes: 6 additions & 1 deletion CHANGELOG.md
@@ -1,8 +1,13 @@

# Change Log

All notable changes to this project will be documented in this file.

## [1.3.6] - 2025-02-21
- Prepare for bioconda release.
- Only rely on python `pypairix` package (installable from pip).
- Binaries are now put into `./bin` instead of `./external/artifacts/`.
- micomplete is now available from `bioconda`.

## [1.3.4] - 2025-02-19
- Package now relies on `pyproject.toml` for build configuration with `hatch`.
- Binaries for `louvain` and `leiden` clustering algorithms are now embedded in the package.
21 changes: 0 additions & 21 deletions Dockerfile

This file was deleted.

2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -1,4 +1,4 @@
include README.md
include external/artifacts/*
include bin/*
recursive-exclude test_data *
recursive-exclude tests *
25 changes: 2 additions & 23 deletions README.md
@@ -4,7 +4,6 @@
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/metator.svg)
[![Build Status](https://github.com/koszullab/metator/actions/workflows/ci.yml/badge.svg)](https://github.com/koszullab/metaTOR/actions)
[![codecov](https://codecov.io/gh/koszullab/metator/branch/master/graph/badge.svg)](https://codecov.io/gh/koszullab/metator)
<!-- [![Docker Cloud Build Status](https://img.shields.io/docker/cloud/build/koszullab/metator)](https://hub.docker.com/r/koszullab/metator) -->
[![Read the docs](https://readthedocs.org/projects/metator/badge)](https://metator.readthedocs.io)
[![License: GPLv3](https://img.shields.io/badge/License-GPL%203-0298c3.svg)](https://opensource.org/licenses/bo-3.0)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
@@ -44,7 +43,6 @@ before installing `metator`:
* Python `3.9` to `3.11` is required.
* The following dependencies should also be locally installed and available in the `$PATH`:
* [`bowtie2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) or `bwa`
* [`pairix`](https://github.com/4dn-dcic/pairix)
* [`samtools`](https://www.htslib.org/download/)
* [`hmmer`](http://hmmer.org/documentation.html)
* [`prodigal`](https://github.com/hyattpd/Prodigal)
Expand All @@ -53,42 +51,23 @@ before installing `metator`:
* The following non-Python libraries are **embedded** when installing `metator` with `pip`: [`louvain 0.3`](https://sourceforge.net/projects/louvain/files/GenericLouvain/) and [`leiden 1.3.0`](https://github.com/CWTSLeiden/networkanalysis).

```sh
# Install bowtie2, sameools, hmmer, prodigal and java-jdk:
# Install bowtie2, samtools, hmmer, prodigal and java-jdk:
sudo apt update && sudo apt install bowtie2 samtools hmmer prodigal default-jdk

# Also install pairix:
wget https://github.com/4dn-dcic/pairix/archive/refs/tags/0.3.9.zip -O pairix-0.3.9.zip
unzip pairix-0.3.9.zip
mv pairix-0.3.9 ~/.local/lib/pairix
cd ~/.local/lib/pairix
make
chmod +x bin/pairix
echo 'export PATH=$PATH:~/.local/lib/pairix/bin' >> ~/.bashrc

# Install metator from Pypi
pip3 install metator
```

To use the development version:

```sh
# Install bowtie2, sameools, hmmer, prodigal, java-jdk and pairix, see above
# Install bowtie2, samtools, hmmer, prodigal, java-jdk, see above

git clone https://github.com/koszullab/metator
cd metator
pip3 install -e .[dev]
```

<!--
### Using docker container
A dockerfile is also available if that is of interest. You may fetch the image by running the following:
```sh
docker pull koszullab/metator
```
-->

## Usage

```sh
35 changes: 9 additions & 26 deletions external/setup_dependencies.sh
@@ -1,11 +1,12 @@
#!/bin/bash

## Purge existing artifacts (required for local rebuild)
rm -rf artifacts/ gen-louvain/ pairix/ bowtie2/ networkanalysis/
rm -rf ../bin/ gen-louvain/ networkanalysis/

## Install louvain
tar -k -xzf louvain-generic.tar.gz
cd gen-louvain
sed -i 's/^CXX=g++/#&/' Makefile
make
cd ..

@@ -14,30 +15,12 @@ cd ..
mkdir -p networkanalysis/build/libs/
cp networkanalysis-1.3.0.jar networkanalysis/build/libs/

## Install pairix
# wget https://github.com/4dn-dcic/pairix/archive/refs/tags/0.3.9.zip -O pairix-0.3.9.zip
# zip -d pairix-0.3.9.zip "pairix-0.3.9/samples/*"
# unzip pairix-0.3.9.zip
# mv pairix-0.3.9 pairix
# cd pairix
# make
# chmod +x bin/pairix
# cd ..

# ## Install bowtie2
# wget https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.5.1/bowtie2-2.5.1-source.zip/download -O bowtie2-2.5.1-source.zip
# zip -d bowtie2-2.5.1-source.zip "bowtie2-2.5.1/example/*"
# unzip bowtie2-2.5.1-source.zip
# mv bowtie2-2.5.1/ bowtie2
# cd bowtie2
# make
# cd ..

## Move artifacts to the correct location
mkdir -p artifacts/networkanalysis/build artifacts/pairix artifacts/bowtie2/bin
mv gen-louvain/ artifacts/
mv networkanalysis/build artifacts/networkanalysis/
# mv pairix/* artifacts/pairix/
# mv bowtie2/bowtie2* artifacts/bowtie2/bin/
mkdir -p ../bin/
mv gen-louvain/louvain ../bin/
mv gen-louvain/convert ../bin/
mv gen-louvain/hierarchy ../bin/
mv gen-louvain/matrix ../bin/
mv networkanalysis/build/libs/networkanalysis-1.3.0.jar ../bin/networkanalysis-1.3.0.jar

rm -rf gen-louvain/ pairix/ bowtie2/ networkanalysis/
rm -rf gen-louvain/ networkanalysis/
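
The `sed -i 's/^CXX=g++/#&/' Makefile` line added to this script comments out the Makefile line that hard-codes the compiler (the commit message describes it as "comment out definition of $CXX variable"). Presumably this lets a `CXX` provided by the build environment take effect, though the commit does not say so. A minimal Python sketch of the same substitution, where the helper name `comment_out_cxx` and the sample Makefile text are made up for illustration:

```python
import re


def comment_out_cxx(makefile_text: str) -> str:
    """Prefix lines starting with 'CXX=g++' with '#', like sed 's/^CXX=g++/#&/'."""
    # In sed's basic regular expressions '+' is a literal character,
    # so the Python pattern has to escape it.
    pattern = re.compile(r"^" + re.escape("CXX=g++"), flags=re.MULTILINE)
    # '\g<0>' plays the role of sed's '&' (the whole matched text).
    return pattern.sub(r"#\g<0>", makefile_text)


# Hypothetical Makefile content, for illustration only.
sample = "CXX=g++\nCXXFLAGS=-O3\n"
print(comment_out_cxx(sample))
```

Only lines that begin with `CXX=g++` are touched; an indented or differently spelled assignment is left as-is, which matches the anchored `^` in the `sed` expression.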
2 changes: 1 addition & 1 deletion metator.yaml
@@ -27,6 +27,6 @@ dependencies:
- hmmer
- gcc
- java-jdk
- micomplete
- pip:
- micomplete
- metator
10 changes: 5 additions & 5 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "metator"
version = "1.3.4"
version = "1.3.6"
description = "A pipeline for binning metagenomic datasets from metaHiC data."
readme = "README.md"
requires-python = ">=3.9,<3.13"
@@ -45,11 +45,11 @@ dependencies = [
"scipy",
"seaborn",
"looseversion",
"micomplete"
"micomplete",
"pypairix"

# NON PIP DEPENDENCIES
#"bowtie2"
#"pairix"
#"bwa"
#"samtools"
#"prodigal"
@@ -114,13 +114,13 @@ path = "src/metator/version.py"
allow-direct-references = true

[[tool.hatch.build.hooks.build-scripts.scripts]]
out_dir = "external/artifacts/"
out_dir = "bin/"
work_dir = "external"
commands = ["bash setup_dependencies.sh"]
artifacts = []

[tool.hatch.build.force-include]
"external/artifacts" = "metator/external/artifacts"
"bin" = "metator/bin"

[tool.black]
line-length = 130
11 changes: 5 additions & 6 deletions src/metator/__init__.py
@@ -6,9 +6,8 @@
from .version import __version__ as version
from . import *

__author__ = "Amaury Bignaud, Jacques Serizay, Lyam Baudry, Théo Foutel-Rodier,\
Martial Marbouty"
__copyright__ = "Copyright © 2017-2018, Institut Pasteur, Paris, France"
__author__ = "Amaury Bignaud, Jacques Serizay, Lyam Baudry, Théo Foutel-Rodier, Martial Marbouty"
__copyright__ = "Copyright © 2017-2025, Institut Pasteur, Paris, France"
__credits__ = [
"Amaury Bignaud",
"Jacques Serizay",
@@ -40,6 +39,6 @@ def is_editable_install():
__metator_root__ = __metator_source__
if is_editable_install():
__metator_root__ = os.path.abspath(os.path.join(__metator_source__, "../../"))
__leiden_dir__ = Path(__metator_root__, "external", "artifacts", "networkanalysis", "build", "libs")
LEIDEN_PATH = str(next(__leiden_dir__.glob("networkanalysis-1.3.0*.jar")))
LOUVAIN_PATH = str(Path(__metator_root__, "external", "artifacts", "gen-louvain"))
__bin_dir__ = Path(__metator_root__, "bin")
LEIDEN_PATH = str(next(__bin_dir__.glob("networkanalysis-1.3.0*.jar")))
LOUVAIN_PATH = str(__bin_dir__)
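
After this change, both the Leiden jar and the Louvain executables are looked up in a single `bin` directory. One thing worth noting is that `next(...glob(...))` raises a bare `StopIteration` when no jar is present. The sketch below shows the same lookup with a more explicit error; `find_leiden_jar` is a hypothetical helper, not part of metator, and the temporary directory only stands in for `bin/`:

```python
import tempfile
from pathlib import Path


def find_leiden_jar(bin_dir: Path) -> str:
    """Hypothetical helper: return the first networkanalysis-1.3.0*.jar in bin_dir."""
    try:
        # next() on an empty glob raises StopIteration, which is hard to
        # interpret in a traceback, so convert it to an explicit error.
        return str(next(bin_dir.glob("networkanalysis-1.3.0*.jar")))
    except StopIteration:
        raise FileNotFoundError(f"no networkanalysis-1.3.0*.jar in {bin_dir}") from None


# Demonstration on a throwaway directory standing in for `bin/`.
with tempfile.TemporaryDirectory() as tmp:
    bin_dir = Path(tmp)
    (bin_dir / "networkanalysis-1.3.0.jar").touch()
    print(find_leiden_jar(bin_dir))
```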
2 changes: 1 addition & 1 deletion src/metator/commands.py
@@ -1372,7 +1372,7 @@ class Pairs(AbstractCommand):
"""Sort, compress and index pairs files for faster access to the data.
Sort the pairs file using pairtools. Compress them using bgzip. Index them
using pairix.
using pypairix.
usage:
pairs [--force] [--remove] [--threads=1] <pairsfile>...
4 changes: 2 additions & 2 deletions src/metator/contact_map.py
@@ -237,9 +237,9 @@ def extract_pairs(metator_data):
)
)
for pairs_file in metator_data.pairs_files:
# Check if the pairix index exist.
# Check if the pypairix index exist.
pairs_data = mio.get_pairs_data(pairs_file)
# Need a sorted (chr1 chr2 pos1 pos2) pair file indexed with pairix.
# Need a sorted (chr1 chr2 pos1 pos2) pair file indexed with pypairix.
for contig_id1, contig in enumerate(metator_data.contigs):
# Only need to retrieve the upper triangle.
for contig_id2 in range(contig_id1, len(metator_data.contigs)):
31 changes: 15 additions & 16 deletions src/metator/io.py
@@ -8,7 +8,7 @@
- check_fasta_index
- check_is_fasta
- check_louvain_cpp
- check_pairix
- check_pypairix
- check_pairtools
- generate_fasta_index
- generate_temp_dir
@@ -48,6 +48,7 @@
from metator.log import logger
from os.path import join, exists, isfile
from random import getrandbits
from packaging import version
from packaging.version import Version


@@ -177,21 +178,20 @@ def check_louvain_cpp(louvain_path):
return True


def check_pairix():
def check_pypairix():
"""
Function to test if pairix is in the path.
Function to test if pypairix is available.
Returns:
--------
bool:
True if pairix found in the path, False otherwise.
True if pypairix is available.
"""
try:
pairix = sp.check_output(f"pairix --help", stderr=sp.STDOUT, shell=True)
except sp.CalledProcessError:
logger.error("Cannot find 'pairix' in your path please install it or add it in your path.")
raise ImportError
return False
v = version.parse(pypairix.__version__)
except AttributeError:
logger.error("Cannot find 'pypairix' installed.")
raise AttributeError
return True
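
The new `check_pypairix` reads `pypairix.__version__`, so it can only run if `pypairix` was already imported successfully at the top of `io.py` (that import is outside this hunk, so this is an assumption). As a sketch of an alternative that does not depend on a prior import, the standard library can report whether a module is importable at all; `module_available` is a hypothetical helper, not part of metator:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Hypothetical helper: True if a top-level module can be found for import."""
    # find_spec returns None when no top-level module with this name exists.
    return importlib.util.find_spec(name) is not None


print(module_available("json"))      # standard-library module
print(module_available("pypairix"))  # depends on the local environment
```

This only checks that the module can be located, not that its version is suitable, so it complements rather than replaces a version check.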


@@ -293,15 +293,15 @@ def get_pairs_data(pairfile, threads=1, remove=False, force=False):
str :
Path to the sorted and indexed pair file.
"""
# Check if pairix index exists, generate it otherwise.
# Check if pypairix index exists, generate it otherwise.
try:
pairs_data = pypairix.open(pairfile)
except pypairix.PairixError:
try:
pairfile_sorted = f"{os.path.splitext(pairfile)[0]}_sorted.pairs.gz"
pairs_data = pypairix.open(pairfile_sorted)
except pypairix.PairixError:
logger.warning("No pairix index found. Build the index.")
logger.warning("No pypairix index found. Build the index.")
pairfile = sort_pairs_pairtools(pairfile, threads, remove, force)
pairs_data = pypairix.open(pairfile)
return pairs_data
@@ -771,8 +771,8 @@ def sort_pairs_pairtools(pairfile, threads=1, remove=False, force=False):
# Extract basename of the file.
basename = os.path.splitext(pairfile)[0]

# Test if pairix and pairtools are installed and in the path.
_ = check_pairix()
# Test if pypairix and pairtools are installed and in the path.
_ = check_pypairix()
_ = check_pairtools()

# Set the force parameter and delete files or raise an error accordingly.
@@ -800,9 +800,8 @@ def sort_pairs_pairtools(pairfile, threads=1, remove=False, force=False):
process = sp.Popen(cmd, shell=True)
_out, _err = process.communicate()
# Indexed pairs.
cmd = f"set -eu ; pairix{force} {basename}_sorted.pairs.gz"
process = sp.Popen(cmd, shell=True)
_out, _err = process.communicate()
force_pypairix = 1 if force else 0
pypairix.build_index(f"{basename}_sorted.pairs.gz", force=force_pypairix)

# Remove original pairfile if remove setup.
if remove:
2 changes: 1 addition & 1 deletion src/metator/main.py
@@ -26,7 +26,7 @@
final output of metaTOR.
scaffold Scaffold a metator bin based on pairs files.
pairs Sort the pairs file using pairtools. Compress them using bgzip.
Index them using pairix.
Index them using pypairix.
host Detect bacterial host from a metaHiC network binned by metaTOR
given a annotated MGE list.
mge Build MGE MAGs based on metagenomic binning using metabat2
6 changes: 3 additions & 3 deletions src/metator/mge.py
@@ -68,14 +68,14 @@ def build_matrix(
mat = np.zeros((n, n))
# Write one pair file for all the ones given.
for pairs_file in pairs_files:
# Check if the pairix index exist
# Check if the pypairix index exist
try:
pairs_data = pypairix.open(pairs_file)
pypairix_index = True
except pypairix.PairixError:
logger.warning("No pairix index found. Iterates on the pairs.")
logger.warning("No pypairix index found. Iterates on the pairs.")
pypairix_index = False
# Need a sorted (chr1 chr2 pos1 pos2) pair file indexed with pairix.
# Need a sorted (chr1 chr2 pos1 pos2) pair file indexed with pypairix.
if pypairix_index:
for i, contig in enumerate(contigs):
# Only need to retrieve the upper triangle.
6 changes: 3 additions & 3 deletions tests/__init__.py
@@ -18,6 +18,6 @@ def is_editable_install():
__metator_root__ = __metator_source__
if is_editable_install():
__metator_root__ = os.path.abspath(os.path.join(__metator_source__, "../../"))
__leiden_dir__ = Path(__metator_root__, "external", "artifacts", "networkanalysis", "build", "libs")
LEIDEN_PATH = str(next(__leiden_dir__.glob("networkanalysis-1.3.0*.jar")))
LOUVAIN_PATH = str(Path(__metator_root__, "external", "artifacts", "gen-louvain"))
__bin_dir__ = Path(__metator_root__, "bin")
LEIDEN_PATH = str(next(__bin_dir__.glob("networkanalysis-1.3.0*.jar")))
LOUVAIN_PATH = str(__bin_dir__)