Implementation of Context Binning and Model Clustering for Compression of Genetic Data

My master's thesis written as part of the computer science course at Jagiellonian University.

Abstract

In recent years, DNA sequencing methods have become dramatically faster, making it possible to quickly sequence the DNA of complex organisms such as humans. However, this increases the demand for disk storage, as databases containing such data can easily reach dozens of terabytes. In his article "Context binning, model clustering and adaptivity for data compression of genetic data", Jarek Duda proposes promising compression techniques that should help build a compressor better than the current state of the art. This thesis describes a compressor built to evaluate those techniques, tests it on real-world data, and compares it with other genetic data compression tools.

Download

The PDF file can be downloaded from the GitHub Releases page.

Building

Make sure you have Inkscape and a LaTeX distribution installed on your system, then run:

make
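
If the build fails, a quick check of the prerequisites may help. The sketch below is only an illustration: `missing_tools` is a hypothetical helper, and the executable names `inkscape` and `pdflatex` are assumptions, since the README does not say which LaTeX engine the Makefile invokes.

```shell
#!/bin/sh
# Hypothetical helper: print each tool from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumption: the build calls `inkscape` and a LaTeX engine such as `pdflatex`.
# Adjust the list to match whatever the Makefile actually uses.
missing=$(missing_tools make inkscape pdflatex)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names are printed on one line.
    echo "Missing tools:" $missing >&2
else
    echo "All prerequisites found; run: make"
fi
```

Any tool reported as missing would need to be installed through your system's package manager before running `make`.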

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.