cleaning up some crumbs in the methods section
Russell Neches committed Jun 1, 2017
1 parent ef2b9f7 commit 1519fd9
Showing 1 changed file with 20 additions and 24 deletions.
44 changes: 20 additions & 24 deletions source/FishPoo/methods.tex
@@ -5,48 +5,44 @@ \subsection{Specimen collection and housing}
\subfile{FishPoo/figure1}
\subfile{FishPoo/table1}

\begin{figure}
\includegraphics[width=\textwidth]{FishPoo/figures/mcgee_tree.pdf}
\caption{A maximum likelihood phylogeny of the host organisms.}
\label{FP_host_tree}
\end{figure}
%\begin{figure}
% \includegraphics[width=\textwidth]{FishPoo/figures/mcgee_tree.pdf}
% \caption{A maximum likelihood phylogeny of the host organisms.}
% \label{FP_host_tree}
%\end{figure}

\subsection{Sample collection}

\subsection{Sample preparation and processing}
\subfile{FishPoo/figure2}

\subsection{Sequencing}
\subsection{Sample collection}

\subsection{Building the observation table}
Specimens were fed until satiated and placed into a sterile autoclave bag prepared with 10 liters of molecular water augmented with a small amount of sodium chloride and calcium chloride. Sterile charcoal pellets were added to sequester nitrates, and a sterile plastic tube was submerged and connected to an air pump to aerate the water. Stool was removed using a sterile serological pipette and frozen.

Adapter removal, quality trimming and overlap alignment are performed using Trimmomatic.
\subsection{Sample preparation, processing and sequencing}

Chimera checking is performed with {\tt vsearch}.
Stool samples were subjected to bead beating for 60 seconds, and DNA was extracted using the MoBio PowerSoil DNA Isolation kit. 16S PCR and Illumina sequencing were carried out using the Earth Microbiome 16S Illumina Amplicon Protocol.

Unique reads are identified using {\tt hat-trie}.
\subsection{Building the observation table}

A table of observation counts is constructed as a {\tt Pandas} {\tt DataFrame} object, and a count threshold is applied. Tables of raw counts and normalized counts are written as comma-separated value files, and the corresponding sequences are written as a FASTA file.
Adapter removal, quality trimming and overlap alignment are performed using Trimmomatic. Chimeras are identified with {\tt vsearch}, and unique reads are identified using {\tt hat-trie}. A table of observation counts is constructed as a {\tt Pandas} {\tt DataFrame} object, and a count threshold is applied. Tables of raw counts and normalized counts are written as comma-separated value files, and the corresponding sequences are written as a FASTA file.
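
The table-building step described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline's actual code: the toy counts, the threshold value {\tt MIN\_TOTAL}, the choice to threshold on the total count across samples, the per-sample normalization, and the helper names {\tt build\_tables} and {\tt to\_fasta} are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical unique-read counts: one row per unique sequence, one column per sample.
counts = pd.DataFrame(
    {"sample_A": [120, 3, 0, 45], "sample_B": [80, 0, 1, 60]},
    index=["ACGTACGT", "ACGTTCGT", "TTGTACGA", "GGGTACGT"],
)
counts.index.name = "sequence"

MIN_TOTAL = 5  # assumed threshold on a sequence's total count across samples


def build_tables(counts, min_total):
    """Apply the count threshold, then normalize each sample to relative abundance."""
    keep = counts.sum(axis=1) >= min_total
    raw = counts.loc[keep]
    # Divide each column (sample) by its own total so that columns sum to 1.
    normalized = raw.div(raw.sum(axis=0), axis=1)
    return raw, normalized


def to_fasta(sequences):
    """Format the retained sequences as FASTA records with placeholder names."""
    return "".join(f">OTU_{i}\n{seq}\n" for i, seq in enumerate(sequences))


raw, normalized = build_tables(counts, MIN_TOTAL)

# With no path argument, to_csv returns the CSV text; passing a path writes a file.
raw_csv = raw.to_csv()
normalized_csv = normalized.to_csv()
fasta = to_fasta(raw.index)
```

Keeping the sequences as the row index means the FASTA records and the rows of both tables stay in the same order.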

\subsection{Building phylogeny of OTUs}

Alignment of the observed sequences is performed using {\tt Clustal Omega}, and an approximate maximum likelihood phylogeny is constructed using {\tt FastTree}.

\begin{figure}
\includegraphics[width=\textwidth]{FishPoo/figures/fishpoo_tree.png}
\caption{An approximate maximum likelihood phylogeny of the guest organisms.}
\label{FP_guest_tree}
\end{figure}
%\begin{figure}
% \includegraphics[width=\textwidth]{FishPoo/figures/fishpoo_tree.png}
% \caption{An approximate maximum likelihood phylogeny of the guest organisms.}
% \label{FP_guest_tree}
%\end{figure}

\subsection{Building phylogeny of hosts}
\subfile{FishPoo/figure3}

\subsection{Processing co-phylogenies with {\tt SuchTree}}

The host tree and guest tree are loaded as {\tt SuchTree} objects, and linked together through the observation table as a {\tt SuchLinkedTrees} object. The {\tt SuchTree} class allows for extremely rapid traversals of large trees, enabling distance correlations to be efficiently computed. The {\tt SuchLinkedTrees} class leverages this to compute graph adjacency and graph Laplacian matrices of subtrees of host and guest phylogenies. Spectral decomposition and kernel densities of graph Laplacians are computed using {\tt numpy}, and the Jensen-Shannon divergence is calculated between each pair of spectral densities using the {\tt entropy} function in {\tt scipy.stats}. UPGMA clustering is performed using {\tt scipy.cluster.hierarchy.linkage}.
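
The spectral comparison described above might look something like the following sketch. It does not use {\tt SuchTree} or {\tt SuchLinkedTrees}; three small hand-written adjacency matrices stand in for the subtree graphs, and the Gaussian kernel, the bandwidth, the evaluation grid and the helper names are assumptions chosen for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform
from scipy.stats import entropy


def laplacian_spectrum(adjacency):
    """Eigenvalues of the graph Laplacian L = D - A of an undirected graph."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    return np.linalg.eigvalsh(laplacian)  # ascending order


def spectral_density(eigenvalues, grid, bandwidth=0.3):
    """Gaussian kernel density of the spectrum on a grid, normalized to sum to 1."""
    z = (grid[:, None] - eigenvalues[None, :]) / bandwidth
    density = np.exp(-0.5 * z**2).sum(axis=1)
    return density / density.sum()


def jensen_shannon(p, q):
    """Jensen-Shannon divergence built from two Kullback-Leibler terms."""
    m = 0.5 * (p + q)
    return 0.5 * entropy(p, m) + 0.5 * entropy(q, m)


# Toy stand-ins for subtree graphs: a path, a star and a cycle on four nodes.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
cycle = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]

grid = np.linspace(-1.0, 6.0, 200)
densities = [spectral_density(laplacian_spectrum(g), grid) for g in (path, star, cycle)]

# Pairwise divergence matrix between spectral densities.
n = len(densities)
jsd = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        jsd[i, j] = jsd[j, i] = jensen_shannon(densities[i], densities[j])

# UPGMA corresponds to average linkage on the condensed distance matrix.
tree = linkage(squareform(jsd), method="average")
```

Evaluating every density on a shared grid is what makes the divergences comparable, since {\tt entropy} expects two distributions over the same support.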

\subsection{Correlation-based analysis}
\subsection{Correlation-based analysis}

For each clade of guest organisms, the {\tt SuchLinkedTrees.linked\_distances} function is used to calculate the pairwise distances through the host and guest trees for every pair of non-null observations in the link matrix, as described by Hommola {\em et al.} \cite{hommola2009permutation}. The Pearson correlation for these distances is computed using the {\tt pearsonr} function from {\tt scipy.stats}.
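
As a rough illustration of the distance pairing described above, the sketch below enumerates every pair of non-null links and collects the corresponding host and guest distances before correlating them. It is a plain {\tt numpy} stand-in, not the {\tt SuchLinkedTrees} implementation: the helper {\tt linked\_distance\_pairs}, the toy distance matrices and the toy link matrix are all hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.stats import pearsonr


def linked_distance_pairs(host_dist, guest_dist, links):
    """For every pair of links, collect the host-host and guest-guest distances.

    links is a boolean matrix with guests on rows and hosts on columns.
    """
    observed = np.argwhere(links)  # each row is (guest index, host index)
    host_d, guest_d = [], []
    for (g1, h1), (g2, h2) in combinations(observed, 2):
        host_d.append(host_dist[h1, h2])
        guest_d.append(guest_dist[g1, g2])
    return np.array(host_d), np.array(guest_d)


# Toy patristic distances between three hosts and between three guests.
host_dist = np.array([[0.0, 2.0, 6.0], [2.0, 0.0, 6.0], [6.0, 6.0, 0.0]])
guest_dist = np.array([[0.0, 1.0, 4.0], [1.0, 0.0, 4.0], [4.0, 4.0, 0.0]])

# Toy link matrix: each guest is seen in its "own" host, and guest 0 also in host 1.
links = np.eye(3, dtype=bool)
links[0, 1] = True

host_d, guest_d = linked_distance_pairs(host_dist, guest_dist, links)
r, p_value = pearsonr(host_d, guest_d)
```

Four links give six link pairs, so the number of distance pairs grows quadratically with the number of non-null observations.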

\subsection{Literature search for comparative analysis}

\subsection{Spectral density analysis}
\subsection{Literature search for comparative analysis}
