diff --git a/README.rst b/README.rst
index dbbbb65..4849d58 100644
--- a/README.rst
+++ b/README.rst
@@ -127,7 +127,7 @@ SISTR will automatically initialize database of *Salmonella* serovar determinati
 The SISTR database v1.3 got minor updates by collapsing some of the serovars detailed in ``CHANGELOG.md`` file
 - SISTR v1.1 database is available at https://zenodo.org/records/13618515 or via a direct url https://zenodo.org/records/13618515/files/SISTR_V_1.1_db.tar.gz?download=1
-- SISTR v1.3 database is available at https://zenodo.org/records/13693495 or via a direct url https://zenodo.org/records/13693495/files/SISTR_V_1.1.3_db.tar.gz?download=1
+- SISTR v1.3 database is available at https://zenodo.org/records/13693495 or va a direct url https://zenodo.org/records/13693495/files/SISTR_V_1.1.3_db.tar.gz?download=1
 Dependencies
@@ -171,66 +171,66 @@ If you run ``sistr -h``, you should see the following usage info:
 [-l [LIST_OF_SEROVARS]] [-v] [-V] [F [F ...]]
-SISTR (Salmonella In Silico Typing Resource) Command-line Tool
-==============================================================
-Serovar predictions from whole-genome sequence assemblies by determination of antigen gene and cgMLST gene alleles using BLAST.
+ SISTR (Salmonella In Silico Typing Resource) Command-line Tool
+ ==============================================================
+ Serovar predictions from whole-genome sequence assemblies by determination of antigen gene and cgMLST gene alleles using BLAST.
-Note about using the "--use-full-cgmlst-db" flag:
+ Note about using the "--use-full-cgmlst-db" flag:
 The "centroid" allele database is ~10% the size of the full set so analysis is much quicker with the "centroid" vs "full" set of alleles. Results between 2 cgMLST allele sets should not differ.
-If you find this program useful in your research, please cite as:
-
-The Salmonella In Silico Typing Resource (SISTR): an open web-accessible tool for rapidly typing and subtyping draft Salmonella genome assemblies.
-Catherine Yoshida, Peter Kruczkiewicz, Chad R. Laing, Erika J. Lingohr, Victor P.J. Gannon, John H.E. Nash, Eduardo N. Taboada.
-PLoS ONE 11(1): e0147101. doi: 10.1371/journal.pone.0147101
-
-positional arguments:
- F Input genome FASTA file
-
-optional arguments:
- -h, --help show this help message and exit
- -i fasta_path genome_name, --input-fasta-genome-name fasta_path genome_name
- fasta file path to genome name pair
- -f OUTPUT_FORMAT, --output-format OUTPUT_FORMAT
- Output format (json, csv, tab, pickle)
- -o OUTPUT_PREDICTION, --output-prediction OUTPUT_PREDICTION
- SISTR serovar prediction output path
- -M, --more-results Output more detailed results (-M) and all antigen
- search blastn results (-MM)
- -p CGMLST_PROFILES, --cgmlst-profiles CGMLST_PROFILES
- Output CSV file destination for cgMLST allelic
- profiles
- -n NOVEL_ALLELES, --novel-alleles NOVEL_ALLELES
- Output FASTA file destination of novel cgMLST alleles
- from input genomes
- -a ALLELES_OUTPUT, --alleles-output ALLELES_OUTPUT
- Output path of allele sequences and info to JSON
- -T TMP_DIR, --tmp-dir TMP_DIR
- Base temporary working directory for intermediate
- analysis files.
- -K, --keep-tmp Keep temporary analysis files.
- --use-full-cgmlst-db Use the full set of cgMLST alleles which can include
- highly similar alleles. By default the smaller
- "centroid" alleles or representative alleles are used
- for each marker.
- --no-cgmlst Do not run cgMLST serovar prediction
- -m, --run-mash Determine Mash MinHash genomic distances to Salmonella
- genomes with trusted serovar designations. Mash binary
- must be in accessible via $PATH (e.g. /usr/bin).
- --qc Perform basic QC to provide level of confidence in
- serovar prediction results.
- -t THREADS, --threads THREADS
- Number of parallel threads to run sistr_cmd analysis.
- -l [LIST_OF_SEROVARS], --list-of-serovars [LIST_OF_SEROVARS]
- A path to a single column text file containing list of
- serovar(s) to check serovar prediction against. Report
- predicted serovar in "serovar_in_serovar_list" field
- as Y (present) and N (absent) in the list. The default
- list will be used if not file specified.
- -v, --verbose Logging verbosity level (-v == show warnings; -vvv ==
- show debug info)
- -V, --version show program's version number and exit
-
+ If you find this program useful in your research, please cite as:
+
+ The Salmonella In Silico Typing Resource (SISTR): an open web-accessible tool for rapidly typing and subtyping draft Salmonella genome assemblies.
+ Catherine Yoshida, Peter Kruczkiewicz, Chad R. Laing, Erika J. Lingohr, Victor P.J. Gannon, John H.E. Nash, Eduardo N. Taboada.
+ PLoS ONE 11(1): e0147101. doi: 10.1371/journal.pone.0147101
+
+ positional arguments:
+ F Input genome FASTA file
+
+ optional arguments:
+ -h, --help show this help message and exit
+ -i fasta_path genome_name, --input-fasta-genome-name fasta_path genome_name
+ fasta file path to genome name pair
+ -f OUTPUT_FORMAT, --output-format OUTPUT_FORMAT
+ Output format (json, csv, tab, pickle)
+ -o OUTPUT_PREDICTION, --output-prediction OUTPUT_PREDICTION
+ SISTR serovar prediction output path
+ -M, --more-results Output more detailed results (-M) and all antigen
+ search blastn results (-MM)
+ -p CGMLST_PROFILES, --cgmlst-profiles CGMLST_PROFILES
+ Output CSV file destination for cgMLST allelic
+ profiles
+ -n NOVEL_ALLELES, --novel-alleles NOVEL_ALLELES
+ Output FASTA file destination of novel cgMLST alleles
+ from input genomes
+ -a ALLELES_OUTPUT, --alleles-output ALLELES_OUTPUT
+ Output path of allele sequences and info to JSON
+ -T TMP_DIR, --tmp-dir TMP_DIR
+ Base temporary working directory for intermediate
+ analysis files.
+ -K, --keep-tmp Keep temporary analysis files.
+ --use-full-cgmlst-db Use the full set of cgMLST alleles which can include
+ highly similar alleles. By default the smaller
+ "centroid" alleles or representative alleles are used
+ for each marker.
+ --no-cgmlst Do not run cgMLST serovar prediction
+ -m, --run-mash Determine Mash MinHash genomic distances to Salmonella
+ genomes with trusted serovar designations. Mash binary
+ must be in accessible via $PATH (e.g. /usr/bin).
+ --qc Perform basic QC to provide level of confidence in
+ serovar prediction results.
+ -t THREADS, --threads THREADS
+ Number of parallel threads to run sistr_cmd analysis.
+ -l [LIST_OF_SEROVARS], --list-of-serovars [LIST_OF_SEROVARS]
+ A path to a single column text file containing list of
+ serovar(s) to check serovar prediction against. Report
+ predicted serovar in "serovar_in_serovar_list" field
+ as Y (present) and N (absent) in the list. The default
+ list will be used if not file specified.
+ -v, --verbose Logging verbosity level (-v == show warnings; -vvv ==
+ show debug info)
+ -V, --version show program's version number and exit
+
 Example Usage
 -------------