Date of Award

Summer 8-8-2017

Degree Type

Closed Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Alexander Zelikovsky

Second Advisor

Pavel Skums

Third Advisor

Robert Harrison

Fourth Advisor

Yury Khudyakov

Abstract

The ability of Next-Generation Sequencing (NGS) to produce massive quantities of genomic data inexpensively has made it possible to study the structure of viral populations within an infected host at unprecedented resolution. Because of high rates of mutation and recombination, an RNA virus exists as a heterogeneous "swarm". Virologists and computational epidemiologists widely use NGS data to study viral populations. However, discerning rare variants is complicated by errors introduced by the sequencing technology. We develop and implement a time- and cost-efficient strategy for NGS of multiple viral samples, together with computational methods to analyze large quantities of NGS data and to handle sequencing errors. In particular, we present: (i) a combinatorial pooling strategy for massive NGS of viral samples; (ii) kGEM and 2SNV, methods for viral population haplotyping; (iii) ShotMCF, a Multicommodity Flow (MCF) based method for frequency estimation of viral haplotypes; (iv) QUASIM, an agent-based simulator of viral evolution that takes into account viral variants and immune response.