SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data

SINBAD is an R package for processing single-cell DNA methylation data. It takes FASTQ files as input and performs demultiplexing, adapter trimming, mapping, quantification, dimensionality reduction, and differential methylation analysis.

NOTE: SINBAD has been tested with paired-end snmC-Seq data.

System requirements

R 3.6.0 or later is required for installation.
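A quick way to confirm the requirement above before installing is to compare the running R version against 3.6.0. This is a minimal sketch; the variable names are illustrative and not part of SINBAD.

```r
# Minimum R version stated in the system requirements.
required_version <- "3.6.0"

# getRversion() returns the running R version as a comparable version object.
r_version_ok <- getRversion() >= required_version

if (r_version_ok) {
  message("R ", as.character(getRversion()), " meets the minimum requirement.")
} else {
  message("R ", as.character(getRversion()), " is older than ",
          required_version, "; please upgrade before installing.")
}
```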

Installation

To install SINBAD, run the following command at the R prompt:

devtools::install_github("yasin-uzun/SINBAD")

Once you have installed SINBAD, you can verify the installation as follows:

SINBAD::test()

If SINBAD was installed without problems, you should see the following message:

>SINBAD installation is ok.
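The install command relies on the devtools package, which may not be present on a fresh R setup. The sketch below wraps the two commands shown above in a helper that installs devtools first when it is missing. The helper name install_sinbad is hypothetical and is not part of SINBAD, and the sketch assumes a CRAN mirror is configured for install.packages().

```r
# Hypothetical helper (not part of SINBAD): install devtools if needed,
# then install SINBAD from GitHub and run its built-in check.
install_sinbad <- function() {
  # requireNamespace() returns FALSE when the package is not installed.
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github("yasin-uzun/SINBAD")
  # Verification step from this README.
  SINBAD::test()
}
```

Defining the function does not install anything; call install_sinbad() from an interactive R session when you are ready.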

Dependencies

SINBAD has the following software dependencies:

Adapter Trimmer: Cutadapt
Aligner: Bismark
Duplicate removal: samtools
Perl dependencies: SINBAD uses two Perl scripts for demultiplexing (see below).

You can install these tools yourself. For convenience, we also provide the binaries here. Please cite each tool you use, in addition to SINBAD.

You can download the Perl scripts from our repository.

You also need the genomic sequence and annotated genomic regions for quantification of methylation calls. We provide the sequence data for the hg38 and mm10 assemblies here.
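Before configuring SINBAD, it can help to see which of the external tools are already visible to R. The sketch below uses base R's Sys.which() for this. The command names are assumptions about how the tools are installed on your system, and a tool missing from the PATH can still be used if its location is set in config.general.R.

```r
# Command names are assumptions; adjust them to match your installation.
tools <- c("cutadapt", "bismark", "samtools", "perl")

# Sys.which() returns the full path of each executable, or "" when not found.
tool_paths <- Sys.which(tools)
missing_tools <- names(tool_paths)[!nzchar(tool_paths)]

if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All listed tools were found on PATH.")
}
```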

Graphical User Interface

SINBAD can be run using simple R instructions. It also has an easy-to-use Graphical User Interface (GUI). Users with no R programming background can use the GUI to process and analyze their single-cell DNA methylation sequencing datasets. Please see the user manual (below) for how to use SINBAD via the GUI.

User Manual

Detailed instructions for using SINBAD are available in the SINBAD User Manual, which explains how to set the parameters and execute the analysis steps.

Configuration

To run SINBAD, you need to modify three configuration files:

config.general.R : Sets the program paths to be used by SINBAD. You need to edit this file only once.
config.genome.R : Sets the genomic information and paths to be used by SINBAD. You need to generate one for each organism. We provide a built-in configuration for the hg38 assembly.
config.project.R : You need to configure this file for your project.

You can download the templates for the configuration files from the repository and edit them for your purposes.
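A small check can catch a missing configuration file before a run starts. The sketch below assumes the three files keep the names listed above and sit directly inside one directory. The function name missing_config_files and the example path are hypothetical and not part of SINBAD.

```r
# Hypothetical helper (not part of SINBAD): return the names of the expected
# configuration files that are absent from config_dir.
missing_config_files <- function(config_dir) {
  expected <- c("config.general.R", "config.genome.R", "config.project.R")
  expected[!file.exists(file.path(config_dir, expected))]
}

# Example with a placeholder path; replace it with your own directory.
missing <- missing_config_files("path/to/config_files")
if (length(missing) > 0) {
  message("Missing configuration files: ", paste(missing, collapse = ", "))
}
```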

Running

SINBAD is run in two steps:

Read configuration files:

read_configs(config_dir)

config_dir should point to your configuration file directory (mentioned above).

Process data:

process_sample_wrapper(raw_fastq_dir, demux_index_file, working_dir, sample_name)

raw_fastq_dir should point to the directory containing the input FASTQ files.
demux_index_file should point to the demultiplexing index file for the FASTQ files.
working_dir should point to the directory where all outputs will be written.
sample_name (optional) is the name for the sample or project.

This function reads FASTQ files, demultiplexes them into single cells, performs filtering, mapping (alignment), DNA methylation calling and quantification, dimensionality reduction, clustering and differential methylation analysis for the given input. All the outputs are placed into related directories in working_dir.
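Put together, the two steps can be sketched as a short script. All paths and the sample name below are placeholders to replace with your own values. The script assumes that read_configs and process_sample_wrapper are available after library(SINBAD), and creating working_dir in advance is a precaution added here, not a documented requirement.

```r
# Placeholder values (assumptions); replace them with your own locations.
config_dir       <- "path/to/config_files"
raw_fastq_dir    <- "path/to/raw_fastq"
demux_index_file <- "path/to/demux_index.txt"
working_dir      <- "path/to/working_dir"
sample_name      <- "my_sample"

# Check that the inputs exist before starting a long-running job.
inputs_ok <- dir.exists(config_dir) &&
  dir.exists(raw_fastq_dir) &&
  file.exists(demux_index_file)

if (!inputs_ok) {
  message("One or more input paths do not exist; nothing was run.")
} else if (!requireNamespace("SINBAD", quietly = TRUE)) {
  message("SINBAD is not installed; nothing was run.")
} else {
  library(SINBAD)
  # Precaution (assumption): make sure the output directory exists.
  dir.create(working_dir, recursive = TRUE, showWarnings = FALSE)
  # Step 1: read the configuration files.
  read_configs(config_dir)
  # Step 2: run the full pipeline on the sample.
  process_sample_wrapper(raw_fastq_dir, demux_index_file, working_dir, sample_name)
}
```

With the placeholder paths left unchanged, the script only prints a message and exits, so it is safe to run as a dry check.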

Example Data

For testing SINBAD, we provide example single-end and paired-end datasets generated with the snmC-Seq protocol.

Citation

If you use SINBAD in your study, please cite it as follows:

SINBAD: A pipeline for processing SINgle cell Bisulfite sequencing samples and Analysis of Data, GitHub, 2021.

Contact

For any questions or comments, please contact Yasin Uzun (uzuny at email chop edu)
