Software and Simulation Code

shiny-app C.O.M.I.C.S.

Calling Outlier loci from Multi-dimensional data using Invariant Coordinate Selection (COMICS)

Distinguishing loci that are under selection from those that are evolving neutrally is a common challenge in evolutionary genetics. Moreover, as sequence data have become more abundant, genomic studies have begun to combine multiple methods to identify candidate loci under selection. Composite methods usually transform the per-locus results into a multi-dimensional scatter in which outliers are identified using a distance metric, the most common being the Mahalanobis distance.
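As a rough illustration of the distance-based approach described above, the following Python sketch computes squared Mahalanobis distances for loci scored on two statistics and flags those beyond a chi-square cutoff. The function name, the simulated scores, and the 0.999 cutoff are assumptions made for this example; it is not the COMICS implementation and it does not perform ICS.

```python
import random

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = v^T S^-1 v, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical data: 500 "neutral" loci plus one planted outlier at index 500.
rng = random.Random(1)
loci = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
loci.append((6.0, 6.0))

d2 = mahalanobis_sq_2d(loci)
threshold = 13.816  # approximate 0.999 quantile of chi-square with 2 df
outliers = [i for i, d in enumerate(d2) if d > threshold]
```

With more than two statistics per locus the same idea applies, but the covariance matrix would need a general matrix inverse rather than the 2x2 closed form used here.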

We designed a wrapper shiny app that implements invariant coordinate selection (ICS) to identify outliers in multi-dimensional space; ICS has been shown to outperform other methods designed to identify outliers in multi-dimensional data. The wrapper takes genomic data and lets users load custom information for their genomes to perform analyses.

Link to GitHub repository

GBS-tools

The generation of reduced representation libraries has become a common resource for studying large numbers of individuals from model and non-model organisms. Numerous tools are available for the direct analysis of NGS data generated from reduced representation libraries, but to date no tool has taken into consideration the problem of allelic dropout. This phenomenon, already described in several publications, arises because restriction digestion is a common method for generating reduced representation libraries: when a restriction site is lost in the genome, any single nucleotide variant in linkage with that loss will not be observed.

GBS-tools, available here, is a tool that examines the distribution of variants in a multisample VCF file and computes the likelihood of the SNPs under a model that includes a mixture of restriction site presence and absence. We strongly encourage using this tool, especially when the studied organism is highly heterozygous.

Source code and documentation

The publication supporting the development of this tool can be found here:
Cooke TF, Yee M-C, Muzzio M, Adams A, Bell R, Cornejo OE, Kelley JL, Bailliet G, Bravi CM, Bustamante CD, Kenny EE. (2016) GBStools: A unified approach for reduced representation sequencing and genotyping. PLoS Genetics. 12(2): e1005631. doi:10.1371/journal.pgen.1005631

PopRange

PopRange is an “ecologically driven” population genetic simulation software for R, developed by Kimberly McManus while working under my supervision.

Among the strengths of PopRange is that it can simulate metapopulations on a grid, run Wright-Fisher models with selection, and modify the assumptions of the ecological model that governs the demography of the population of interest (for instance, populations can follow a logistic growth model).
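To make the combination of a Wright-Fisher model with selection and logistic growth concrete, here is a minimal single-population sketch in Python. It does not use PopRange or its interface: the function names, the haploid selection scheme, and all parameter values are assumptions chosen for illustration only.

```python
import random

def logistic_next_size(n, r, k):
    """Discrete logistic growth: N' = N + r*N*(1 - N/K), rounded, never below 1."""
    return max(1, round(n + r * n * (1 - n / k)))

def wright_fisher_step(count_a, n, s, n_next, rng):
    """One generation: selection shifts the allele frequency, then the next
    generation of n_next gene copies is drawn by binomial sampling."""
    p = count_a / n
    # Haploid selection: allele A has relative fitness 1 + s.
    p_sel = p * (1 + s) / (1 + s * p)
    return sum(1 for _ in range(n_next) if rng.random() < p_sel)

def simulate(n0=50, k=500, r=0.5, s=0.05, p0=0.2, generations=100, seed=42):
    """Return a list of (population size, frequency of A), one entry per generation."""
    rng = random.Random(seed)
    n = n0
    count = round(p0 * n)
    history = []
    for _ in range(generations):
        n_next = logistic_next_size(n, r, k)
        count = wright_fisher_step(count, n, s, n_next, rng)
        n = n_next
        history.append((n, count / n))
    return history

history = simulate()
final_size, final_freq = history[-1]
```

Extending this to a grid of populations would additionally require a migration step between neighboring cells, which is left out here.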

Source code and documentation

Oscillations in Continuous Culture of Single Clones

In our paper on the oscillatory behavior of Streptococcus pneumoniae in continuous culture in chemostats (Cornejo et al. 2009) we proposed a system of ordinary differential equations (ODEs) aimed at explaining the maintenance of oscillatory dynamics based on the production of a toxin that auto-regulates its own production. Mathematically oriented readers will notice the clear resemblance of the proposed model to classic Lotka-Volterra systems. In the file Oscillations.R we provide code to run and explore computer simulations of this model and produce results similar to those obtained in the paper. I hope that this code is helpful for people interested in similar problems in analogous systems.
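For readers unfamiliar with the Lotka-Volterra systems mentioned above, the sketch below integrates the textbook two-variable form with a fourth-order Runge-Kutta step. To be clear, this is the classic Lotka-Volterra model and not the toxin model from Cornejo et al. 2009, nor the contents of Oscillations.R; it is written in Python, and the parameter values and function names are illustrative assumptions.

```python
import math

def lotka_volterra(state, a, b, c, d):
    """Right-hand side of dx/dt = a*x - b*x*y, dy/dt = -c*y + d*x*y."""
    x, y = state
    return (a * x - b * x * y, -c * y + d * x * y)

def rk4_step(f, state, dt, *params):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = f(state, *params)
    k2 = f(shifted(k1, dt / 2), *params)
    k3 = f(shifted(k2, dt / 2), *params)
    k4 = f(shifted(k3, dt), *params)
    return tuple(
        s + dt / 6 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )

def invariant(state, a, b, c, d):
    """Quantity conserved along exact Lotka-Volterra trajectories."""
    x, y = state
    return d * x - c * math.log(x) + b * y - a * math.log(y)

params = (1.0, 0.5, 1.0, 0.5)  # a, b, c, d: illustrative values only
state = (1.0, 1.0)
dt = 0.01
trajectory = [state]
for _ in range(2000):
    state = rk4_step(lotka_volterra, state, dt, *params)
    trajectory.append(state)
```

Checking that `invariant` stays nearly constant along `trajectory` is a convenient sanity test for the step size: a visible drift suggests that `dt` is too large.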

In addition, in the file bacterial_recombination.R I have laid out the code for simple simulations that numerically explore the dynamics of recombination in bacteria mediated by F plasmids or by high frequency of recombination (Hfr) strains. We have explored the use of ordinary differential equations to inform experiments aimed at estimating the maximum rate of recombination in bacteria.
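As a hedged illustration of how such ODEs can be set up, the Python sketch below integrates a generic mass-action mating model in which donors and recipients grow exponentially and recombinants form at a rate proportional to the product of their densities. The equations, the function name, and the parameter values are assumptions made for this example and are not taken from bacterial_recombination.R.

```python
import math

def simulate_mating(donors, recipients, psi, chi, t_end, dt=0.001):
    """Forward-Euler integration of a simple mass-action mating model.

    dD/dt = psi*D                 (donors)
    dR/dt = psi*R - chi*D*R       (recipients)
    dX/dt = psi*X + chi*D*R       (recombinants)

    psi is the growth rate and chi the recombination rate parameter.
    """
    d, r, x = float(donors), float(recipients), 0.0
    for _ in range(int(round(t_end / dt))):
        formed = chi * d * r  # recombinants formed per unit time
        d, r, x = (
            d + dt * psi * d,
            r + dt * (psi * r - formed),
            x + dt * (psi * x + formed),
        )
    return d, r, x

# Illustrative values only: densities in cells per mL, time in hours.
d, r, x = simulate_mating(donors=1e6, recipients=1e6, psi=1.0, chi=1e-9, t_end=5.0)

# In this model the total density grows exponentially at rate psi,
# which gives a simple check on the numerical integration.
expected_total = 2e6 * math.exp(1.0 * 5.0)
recombinant_fraction = x / (r + x)
```

Forward Euler is used only to keep the example short; a higher-order or adaptive solver would be a better choice for fitting such a model to experimental data.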

This section will be updated with the R code soon.