-
-
Title
-
MODELING AND PARTITIONING THE NUCLEOTIDE EVOLUTIONARY PROCESS FOR PHYLOGENETIC AND COMPARATIVE GENOMIC INFERENCE.
-
Creator
-
Castoe, Todd, Parkinson, Christopher, University of Central Florida
-
Abstract / Description
-
The transformation of genomic data into functionally relevant information about the composition of biological systems hinges critically on the field of computational genome biology, at the core of which lies comparative genomics. The aim of comparative genomics is to extract meaningful functional information from the differences and similarities observed across genomes of different organisms. We develop and test a novel framework for applying complex models of nucleotide evolution to solve phylogenetic and comparative genomic problems, and demonstrate that these techniques are crucial for accurate comparative evolutionary inferences. Additionally, we conduct an exploratory study using vertebrate mitochondrial genomes as a model to identify the reciprocal influences that genome structure, nucleotide evolution, and multi-level molecular function may have on one another. Collectively, this work represents a significant and novel contribution to accurately modeling and characterizing patterns of nucleotide evolution, a contribution that enables the enhanced detection of patterns of genealogical relationships, selection, and function in comparative genomic datasets. Our work with entire mitochondrial genomes highlights a coordinated evolutionary shift that simultaneously altered genome architecture, replication, nucleotide evolution, and molecular function (of proteins, RNAs, and the genome itself). Current research in computational biology, including the advances included in this dissertation, continues to close the gap that impedes the transformation of genomic data into powerful tools for the analysis and understanding of biological systems function.
-
Date Issued
-
2007
-
Identifier
-
CFE0001548, ucf:47138
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0001548
-
-
Title
-
ALGORITHMS FOR HAPLOTYPE INFERENCE AND BLOCK PARTITIONING.
-
Creator
-
Vijaya Satya, Ravi, Mukherjee, Amar, University of Central Florida
-
Abstract / Description
-
The completion of the Human Genome Project in 2003 paved the way for studies to better understand and catalog variation in the human genome. The International HapMap Project was started in 2002 with the aim of identifying genetic variation in the human genome and studying the distribution of genetic variation across populations of individuals. The information collected by the HapMap project will enable researchers to associate genetic variations with phenotypic variations. Single Nucleotide Polymorphisms (SNPs) are loci in the genome where two individuals differ in a single base. It is estimated that there are approximately ten million SNPs in the human genome. These ten million SNPs are not completely independent of each other: blocks (contiguous regions) of neighboring SNPs on the same chromosome are inherited together. The pattern of SNPs on a block of the chromosome is called a haplotype. Each block might contain a large number of SNPs, but a small subset of these SNPs is sufficient to uniquely identify each haplotype in the block. The haplotype map, or HapMap, is a map of these haplotype blocks. Haplotypes, rather than individual SNP alleles, are expected to affect a disease phenotype. The human genome is diploid, meaning that in each cell there are two copies of each chromosome; i.e., each individual has two haplotypes in any region of the chromosome. With current technology, empirically collecting haplotype data is prohibitively expensive. Therefore, unordered bi-allelic genotype data is collected experimentally instead. The genotype data gives the two alleles at each SNP locus in an individual, but does not give information about which allele is on which copy of the chromosome. This necessitates computational techniques for inferring haplotypes from genotype data. This computational problem is called the haplotype inference problem. Many statistical approaches have been developed for the haplotype inference problem.
Some of these statistical methods have been shown to be reasonably accurate on real genotype data. However, these techniques are very computation-intensive. With the International HapMap Project collecting information from nearly 10 million SNPs, and with association studies involving thousands of individuals being undertaken, there is a need for more efficient methods for haplotype inference. This dissertation is an effort to develop efficient perfect-phylogeny-based combinatorial algorithms for haplotype inference. The perfect phylogeny haplotyping (PPH) problem is to derive a set of haplotypes for a given set of genotypes with the condition that the haplotypes describe a perfect phylogeny. The perfect phylogeny approach to haplotype inference is applicable to the human genome due to the block structure of the human genome. An important contribution of this dissertation is an optimal O(nm) time algorithm for the PPH problem, where n is the number of genotypes and m is the number of SNPs involved. The complexity of the earlier algorithms for this problem was O(nm^2). The O(nm) complexity was achieved by applying some transformations on the input data and by making use of the FlexTree data structure, developed as part of this dissertation work, which represents all the possible PPH solutions for a given set of genotypes. Real genotype data does not always admit a perfect phylogeny, even within a block of the human genome. Therefore, it is necessary to extend the perfect phylogeny approach to accommodate deviations from perfect phylogeny. Deviations from perfect phylogeny might occur because of recombination events and repeated or back mutations (also referred to as homoplasy events). Another contribution of this dissertation is a set of fixed-parameter tractable algorithms for constructing near-perfect phylogenies with homoplasy events.
For the problem of constructing a near-perfect phylogeny with q homoplasy events, the algorithm presented here takes O(nm^2+m^(n+m)) time. Empirical analysis on simulated data shows that this algorithm produces more accurate results than PHASE (a popular haplotype inference program), while being approximately 1000 times faster than PHASE. Another important problem when dealing with real genotype or haplotype data is the presence of missing entries. The Incomplete Perfect Phylogeny (IPP) problem is to construct a perfect phylogeny on a set of haplotypes with missing entries. The Incomplete Perfect Phylogeny Haplotyping (IPPH) problem is to construct a perfect phylogeny on a set of genotypes with missing entries. Both the IPP and IPPH problems have been shown to be NP-hard. The earlier approaches for both of these problems dealt with restricted versions of the problem, where the root is either available or can be trivially reconstructed from the data, or certain assumptions were made about the data. We make some novel observations about these problems, and present efficient algorithms for unrestricted versions of these problems. The algorithms have worst-case exponential time complexity, but have been shown to be very fast on practical instances of the problem.
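The PPH problem described in the abstract above can be illustrated with a small brute-force sketch. This is not the dissertation's O(nm) algorithm and does not use the FlexTree structure; it only shows what the problem asks. The 0/1/2 genotype encoding (0 and 1 for the two homozygous states, 2 for a heterozygous site), the function names, and the toy data are assumptions made for illustration. The check used is the standard four-gamete condition: a set of binary haplotypes fits an unrooted perfect phylogeny exactly when no pair of sites shows all four combinations 00, 01, 10, and 11.

```python
from itertools import combinations, product


def haplotype_pairs(genotype):
    """Yield each unordered haplotype pair that could explain one genotype.

    Assumed encoding: 0 or 1 = homozygous site, 2 = heterozygous site.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 2]
    if not het_sites:
        hap = tuple(genotype)
        yield hap, hap
        return
    # Pin the first heterozygous site to 0 on the first haplotype so that
    # each unordered pair is produced exactly once.
    for rest in product((0, 1), repeat=len(het_sites) - 1):
        first, second = list(genotype), list(genotype)
        for site, bit in zip(het_sites, (0,) + rest):
            first[site] = bit
            second[site] = 1 - bit
        yield tuple(first), tuple(second)


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of sites shows all of 00, 01, 10 and 11."""
    n_sites = len(haplotypes[0]) if haplotypes else 0
    for i, j in combinations(range(n_sites), 2):
        if len({(h[i], h[j]) for h in haplotypes}) == 4:
            return False
    return True


def brute_force_pph(genotypes):
    """Try every resolution of the genotypes (exponential time).

    Returns one haplotype pair per genotype, or None if no resolution
    passes the four-gamete test.
    """
    options = [list(haplotype_pairs(g)) for g in genotypes]
    for choice in product(*options):
        haplotypes = [h for pair in choice for h in pair]
        if passes_four_gamete_test(haplotypes):
            return choice
    return None


if __name__ == "__main__":
    toy_genotypes = [(2, 2), (0, 1), (1, 0)]  # hypothetical data
    print(brute_force_pph(toy_genotypes))
```

A genotype with k heterozygous sites has 2^(k-1) candidate haplotype pairs, so this enumeration grows exponentially. That growth is the reason the linear-time and fixed-parameter algorithms mentioned in the abstract matter in practice.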
-
Date Issued
-
2006
-
Identifier
-
CFE0001244, ucf:46894
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0001244
-
-
Title
-
THE GLYCINE AND PROLINE REDUCTASE SYSTEMS: AN EVOLUTIONARY PERSPECTIVE AND PRESENCE IN ENTEROBACTERIACEAE.
-
Creator
-
Witt, Joshua, Self, William, University of Central Florida
-
Abstract / Description
-
The Glycine and Proline Reduction systems are two of the best characterized selenoenzymes in bacteria and have been found to occur in a wide variety of clostridia, including the pathogens C. difficile and C. botulinum [5, 8]. These enzymes are utilized to reduce glycine or D-proline to obtain energy via substrate-level phosphorylation or membrane gradients, respectively [6, 7]. Strains of C. difficile activate toxigenic pathways whenever either of these pathways is active within the cell [5, 8]. Though evolutionary studies have been conducted on ammonia-producing bacteria, none has been done to directly characterize these two systems by themselves. This includes an understanding of whether or not these systems are transferred between organisms, as many of the clostridia that are to be studied are known to have an "open genome" [8, 10]. With this information we were able to generate a phylogenetic model of the proline and glycine reduction systems. Through this analysis, we were able to account for many clostridial organisms that contain the systems, as well as many other organisms. These included Enterobacteriaceae, among them a strain of the model organism Escherichia coli. It was further concluded that Glycine Reductase was a much less centralized system found across a wide range of taxa, while Proline Reductase was largely confined to the phylum Firmicutes. It was also concluded that the strain of E. coli has a fully functional operon for Glycine Reductase.
-
Date Issued
-
2013
-
Identifier
-
CFH0004506, ucf:45149
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFH0004506
-
-
Title
-
Beyond building a tree: Phylogeny of pitvipers and exploration of evolutionary patterns.
-
Creator
-
Fenwick, Allyson, Parkinson, Christopher, Hoffman, Eric, Crampton, William, Wiens, John, University of Central Florida
-
Abstract / Description
-
As generic and higher-scale evolutionary relationships are increasingly well understood, systematists move research in two directions: 1) understanding species-level relationships with dense taxon sampling, and 2) evaluating evolutionary patterns using phylogeny. In this study I address both foci of systematic research using pitvipers, subfamily Crotalinae. For direction one, I evaluate the relationships of 96% of pitvipers by combining independent sets of molecular and phenotypic data. I find the inclusion of species with low numbers of informative characters (i.e., fewer than 100) negatively impacts resolution of the phylogeny, and the addition of independent datasets has no effect on or a small benefit to confidence in estimated evolutionary relationships. Combined evidence is extremely useful in evaluating taxonomy; I use it with South American bothropoid pitvipers. Previous work found the genus Bothrops paraphyletic, but no study had included enough species to propose a taxonomic resolution. I resolve the relationships of 90% of bothropoid pitvipers, and support the paraphyly of Bothrops as previously defined, but find it consists of three well-supported clades distinguished by distinct habitats and geographic ranges. I propose the division of Bothrops sensu lato into three genera. To address research direction two, I investigate the change in reproductive mode from egg-laying (oviparity) to livebearing (viviparity) in vipers, as well as the expansion of pitvipers through South America. I resolve the phylogeny and the divergence times for subgroups of interest, then use model comparison and ancestral character state or geographic range estimation to trace the evolution of reproductive mode or geographic range across evolutionary history. For vertebrates, the predominant explanation for the evolution of reproductive mode is Dollo's Law of unidirectional evolution.
This law has been challenged for a number of characters in different systems, but the phylogenetic methods that found those violations were criticized. I find support for unidirectional evolution in two analyses and rejection of it in others, and therefore do not reject Dollo's Law for the evolution of reproductive mode in vipers. In the case of geographic range, dozens of hypotheses have been proposed to explain the great biodiversity in South America, but tests of these hypotheses are lacking. I define specific time- and space-based predictions for seven hypotheses based on geological and climatic events: uplift of the Andes Mountains, saltwater inundation of inland areas, change in river flow, and Pleistocene climate changes. I find some support for half of the hypotheses, including one allopatric, one parapatric, and one based on climate change. I conclude that the evolution of South American pitvipers is extremely complex. Through fulfillment of both systematic research directions, I generated new knowledge about pitvipers and evolutionary processes. My methods of evaluating evolutionary patterns provide frameworks for different research questions in these areas, and I suggest that other researchers apply similar techniques to evaluate other portions of the Tree of Life.
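The kind of model comparison the abstract above mentions for testing Dollo's Law can be sketched roughly as follows. This is not the author's analysis. It assumes a two-state continuous-time Markov model (0 = oviparous, 1 = viviparous), a root fixed as oviparous, a coarse grid search in place of a proper optimizer, and a made-up four-taxon tree. All names and values are hypothetical. An irreversible model (no reversal to oviparity) is compared with a reversible one using AIC, where a lower value is preferred.

```python
import math
from itertools import product


def transition_matrix(gain, loss, t):
    """P[i][j]: probability of state j after time t, starting in state i.

    gain = rate 0 -> 1, loss = rate 1 -> 0; gain + loss must be positive.
    """
    total = gain + loss
    decay = math.exp(-total * t)
    return [
        [(loss + gain * decay) / total, gain * (1.0 - decay) / total],
        [loss * (1.0 - decay) / total, (gain + loss * decay) / total],
    ]


def partials(node, gain, loss):
    """Felsenstein pruning: likelihood of the data below `node` for each state.

    A node is (branch_length, state) for a tip or (branch_length, [children]).
    """
    _, content = node
    if isinstance(content, int):  # tip with an observed state
        return [1.0 if content == state else 0.0 for state in (0, 1)]
    result = [1.0, 1.0]
    for child in content:
        below = partials(child, gain, loss)
        p = transition_matrix(gain, loss, child[0])
        for state in (0, 1):
            result[state] *= p[state][0] * below[0] + p[state][1] * below[1]
    return result


def log_likelihood(tree, gain, loss, root_prior=(1.0, 0.0)):
    """Log-likelihood with an assumed prior on the root state."""
    l0, l1 = partials(tree, gain, loss)
    likelihood = root_prior[0] * l0 + root_prior[1] * l1
    return math.log(likelihood) if likelihood > 0 else float("-inf")


def fit(tree, gains, losses):
    """Grid search; returns (best log-likelihood, gain, loss)."""
    return max((log_likelihood(tree, g, l), g, l) for g, l in product(gains, losses))


def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik


# Hypothetical tree: one viviparous clade and one oviparous clade.
TOY_TREE = (0.0, [(1.0, [(1.0, 1), (1.0, 1)]), (1.0, [(1.0, 0), (1.0, 0)])])
RATES = [0.05 * i for i in range(1, 41)]

irreversible = fit(TOY_TREE, RATES, [0.0])       # one free rate
reversible = fit(TOY_TREE, RATES, [0.0] + RATES)  # two free rates

if __name__ == "__main__":
    print("irreversible AIC:", aic(irreversible[0], 1))
    print("reversible AIC:", aic(reversible[0], 2))
```

Because the reversible grid contains the irreversible one, its best likelihood can never be lower. The AIC penalty of two units per extra parameter is what lets the simpler unidirectional model win when reversals add little explanatory power.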
-
Date Issued
-
2012
-
Identifier
-
CFE0004535, ucf:49236
-
Format
-
Document (PDF)
-
PURL
-
http://purl.flvc.org/ucf/fd/CFE0004535