Current Search: Bioinformatics
- Title
- The microbial ecosystem of beer spoilage and souring: Competition and cooperation in the age of bioinformatics.
- Creator
-
Kettring, Andrew, Moore, Sean, Cole, Alexander, Self, William, University of Central Florida
- Abstract / Description
-
The brewing industry generates $350 billion in revenue in the US annually, representing 1.9% of the gross domestic product. Spoilage is a persistent problem throughout production and distribution that causes economic loss, and is therefore meticulously avoided. In contrast, artisanal sour beers are necessarily produced by a diverse variety of these spoilage organisms metabolically interacting in symbiosis as a microbial ecosystem. We sought to gain insight into factors driving assembly of microbial communities by testing a long-debated Darwinian hypothesis. A collection of community members was screened in co-culture and novel bioinformatics tools were developed to predict observed interactions. A fundamental understanding of these relationships is paramount to beer production and sets a precedent for the study of similar microbial communities that impact human health.
- Date Issued
- 2017
- Identifier
- CFE0007288, ucf:52147
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007288
- Title
- Efficient String Graph Construction Algorithm.
- Creator
-
Morshed, S.M. Iqbal, Yooseph, Shibu, Zhang, Shaojie, Valliyil Thankachan, Sharma, University of Central Florida
- Abstract / Description
-
In the field of genome assembly research, where assemblers are dominated by de Bruijn graph-based approaches, the string graph-based assembly approach is getting more attention because of its ability to losslessly retain information from sequence data. Despite the advantages provided by a string graph in repeat detection and in maintaining read coherence, the high computational cost of constructing a string graph hinders its usability for genome assembly. Even though different algorithms have been proposed over the last decade for string graph construction, efficiency is still a challenge due to the demand for processing the large amount of sequence data generated by NGS technologies. Therefore, in this thesis, we provide a novel, linear-time and alphabet-size-independent algorithm, SOF, which uses the properties of irreducible edges and transitive edges to efficiently construct a string graph from an overlap graph. Experimental results show that SOF is at least 2 times faster than the string graph construction algorithm provided in SGA, one of the most popular string graph-based assemblers, while maintaining almost the same memory footprint as SGA. Moreover, the availability of SOF as a subprogram in the SGA assembly pipeline will give users access to the preprocessing and postprocessing steps for genome assembly provided in SGA.
- Date Issued
- 2019
- Identifier
- CFE0007504, ucf:52635
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0007504
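The abstract above describes building a string graph by separating irreducible edges from transitive edges in an overlap graph. As a rough illustration of that idea only, the following Python sketch removes transitive edges with a naive check. It is not the SOF algorithm and is not linear time; the function name `reduce_transitive_edges` and the plain `(u, v)` edge representation are assumptions made for this example, and real assemblers also compare overlap lengths and read orientations before calling an edge transitive.

```python
from collections import defaultdict


def reduce_transitive_edges(edges):
    """Return the irreducible edges of a directed overlap graph.

    `edges` is an iterable of (u, v) pairs meaning "read u overlaps read v".
    An edge u -> w is treated as transitive, and dropped, when some other
    successor v of u also has an edge v -> w.

    Hypothetical helper for illustration: this is a naive check, not the
    linear-time SOF algorithm described in the thesis.
    """
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    irreducible = set()
    for u, targets in succ.items():
        for w in targets:
            # u -> w is implied if a different successor v of u reaches w.
            # succ.get() is used so that lookups never add keys to the dict.
            implied = any(
                w in succ.get(v, ())
                for v in targets
                if v != w and v != u
            )
            if not implied:
                irreducible.add((u, w))
    return irreducible


if __name__ == "__main__":
    overlaps = [("A", "B"), ("B", "C"), ("A", "C")]
    print(sorted(reduce_transitive_edges(overlaps)))
```

In this toy input the edge A -> C is dropped because the path A -> B -> C already implies it, leaving only the two irreducible edges.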
- Title
- Computational Methods for Comparative Non-coding RNA Analysis: From Structural Motif Identification to Genome-wide Functional Classification.
- Creator
-
Zhong, Cuncong, Zhang, Shaojie, Hu, Haiyan, Hua, Kien, Li, Xiaoman, University of Central Florida
- Abstract / Description
-
Non-coding RNA (ncRNA) plays critical functional roles, such as regulation, catalysis, and modification, in the biological system. Non-coding RNAs exert their functions based on their specific structures, which makes the thorough understanding of their structures a key step towards their complete functional annotation. In this dissertation, we will cover a suite of computational methods for the comparison of ncRNA secondary and 3D structures, and their applications to ncRNA molecular structural annotation and their genome-wide functional survey. Specifically, we have contributed the following five computational methods. First, we have developed an alignment algorithm to compare RNA structural motifs, which are recurrent RNA 3D structural fragments. Second, we have improved upon the previous alignment algorithm by incorporating base-stacking information and devising a new branch-and-bound algorithm. Third, we have developed a clustering pipeline for RNA structural motif classification using the above alignment methods. Fourth, we have generalized the clustering pipeline to a genome-wide analysis of RNA secondary structures. Finally, we have devised an ultra-fast alignment algorithm for RNA secondary structure by using the sparse dynamic programming technique. A large number of novel RNA structural motif instances and ncRNA elements have been discovered throughout these studies. We anticipate that these computational methods will significantly facilitate the analysis of ncRNA structures in the future.
- Date Issued
- 2013
- Identifier
- CFE0004966, ucf:49580
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004966
- Title
- BIOINFORMATIC ANALYSIS OF SOLANACEAE CHLOROPLAST GENOMES AND CHARACTERIZATION OF AN ARABIDOPSIS PROTEIN DISULFIDE ISOMERASE IN TRANSGENIC TOBACCO CHLOROPLASTS.
- Creator
-
Grevich, Justin, Daniell, Henry, University of Central Florida
- Abstract / Description
-
Throughout history, traditional plant breeding has been used to provide resistance to pests, disease and other forms of environmental stress, as well as to increase yield and improve upon quality and processing attributes. Over the last decade, the advancement in sequencing technology and bioinformatic analysis has unleashed a wealth of knowledge about chloroplast genetic organization and evolution. The lack of complete plastid genome sequences is one of the major limitations in advancing plastid genetic engineering to other useful crops. This is because plastid genome sequences are essential for the identification of endogenous regulatory sequences and optimal sites for homologous recombination. Analysis of four Solanaceae genomes revealed significant genetic modifications in both coding and non-coding regions. Repeat analysis with Reputer revealed 33 to 45 direct and inverted repeats ≥ 30 bp with at least 90% homology. All but five of the 42 repeats shared among all four genomes were located in the exact same genes or intergenic regions, suggesting a functional role. Intergenic analysis found four regions that are 100 percent identical in all four Solanaceae genomes. Such highly conserved intergenic regions are ideal targets for multi-species transformation cassettes. Protein disulfide isomerases (PDI) are a family of proteins known to function as molecular chaperones and aid in the formation of disulfide bonds during protein folding. They contain at least one thioredoxin domain used for the formation, isomerization, and reduction/oxidation of disulfide bonds. Bioinformatic analysis identified 13 PDI-like (PDIL) proteins found in Arabidopsis that contain at least one thioredoxin domain. In addition to the above-mentioned characteristics, PDIs have been shown to be directly involved in the translational regulation of the psbA mRNA in response to light and could potentially increase the efficiency of chloroplast engineering in plants.
Human serum albumin (HSA) requires 17 disulfide bonds to be properly folded and is an ideal candidate for assessing the disulfide bond formation, protein folding, and other chaperone-like characteristics of PDIL proteins. Therefore, I have coexpressed HSA in order to further characterize an Arabidopsis PDIL protein, atPDIL5-4, and in particular, the redox control of the psbA 5'UTR. Interestingly, the polyclonal antibody used for identifying the PDIL protein cross-reacted and identified other proteins, but not the transgenic atPDIL5-4. Results of these investigations will be presented.
- Date Issued
- 2006
- Identifier
- CFE0001083, ucf:46776
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0001083
- Title
- NEW COMPUTATIONAL APPROACHES FOR MULTIPLE RNA ALIGNMENT AND RNA SEARCH.
- Creator
-
DeBlasio, Daniel, Zhang, Shaojie, University of Central Florida
- Abstract / Description
-
In this thesis we explore the theory and history behind RNA alignment. Normal sequence alignments as studied by computer scientists can be completed in $O(n^2)$ time in the naive case. The process involves taking two input sequences and finding the list of edits that can transform one sequence into the other. This process is applied to biology in many forms, such as the creation of multiple alignments and the search of genomic sequences. When the RNA sequence structure is taken into account, the problem becomes even harder. Multiple RNA structure alignment is particularly challenging because covarying mutations make sequence information alone insufficient. Existing tools for multiple RNA alignments first generate pair-wise RNA structure alignments and then build the multiple alignment using only the sequence information. Here we present PMFastR, an algorithm which iteratively uses a sequence-structure alignment procedure to build a multiple RNA structure alignment. PMFastR also has low memory consumption, allowing for the alignment of large sequences such as 16S and 23S rRNA. Specifically, we reduce the memory consumption to $\sim O(band^2*m)$ where $band$ is the banding size. Other solutions are $\sim O(n^2*m)$ where $n$ and $m$ are the lengths of the target and query respectively. The algorithm also provides a method to utilize a multi-core environment. We present results on benchmark data sets from BRAliBase, which show that PMFastR outperforms other state-of-the-art programs. Furthermore, we regenerate 607 Rfam seed alignments and show that our automated process creates similar multiple alignments to the manually-curated Rfam seed alignments. While these methods can also be applied directly to genome sequence search, the abundance of new multiple species genome alignments presents a new area for exploration. Many multiple alignments of whole genomes are available and these alignments keep growing in size.
These alignments can provide more information to the searcher than just a single sequence. Using the methodology from sequence-structure alignment we developed AlnAlign, which searches an entire genome alignment using RNA sequence structure. While programs have been readily available to align alignments, this is the first to our knowledge that is specifically designed for RNA sequences. This algorithm is presented only in theory and is yet to be tested.
- Date Issued
- 2009
- Identifier
- CFE0002736, ucf:48166
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0002736
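The abstract above mentions that plain sequence alignment runs in $O(n^2)$ time and amounts to finding the edits that transform one sequence into the other. A minimal Python sketch of that baseline, assuming unit costs for insertion, deletion and substitution (the classic edit-distance recurrence), is shown here. It is not PMFastR or AlnAlign, it ignores RNA structure entirely, and the function name `edit_distance` is chosen for this example.

```python
def edit_distance(a, b):
    """Minimum number of single-character edits turning `a` into `b`.

    Fills the dynamic-programming table row by row, so time is
    O(len(a) * len(b)) and memory is O(len(b)). Unit edit costs are an
    assumption of this sketch, not a detail taken from the thesis.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("GGAU", "GAU"))
```

Keeping only two rows of the table is a small-scale version of the memory concern the abstract raises; the banding described there goes further by restricting which cells are computed at all.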
- Title
- Signals Delivered By Interleukin-7 Regulate The Activities Of Bim And JunD In T Lymphocytes.
- Creator
-
Ruppert, Shannon, Khaled, Annette, Self, William, Zervos, Antonis, Teter, Kenneth, University of Central Florida
- Abstract / Description
-
Interleukin-7 (IL-7) is an essential cytokine for lymphocyte growth that has the potential for promoting proliferation and survival. While the survival and proliferative functions of IL-7 are well established, the identities of IL-7 signaling components in pathways other than JAK/STAT that accomplish these tasks remain poorly defined. To this end, we used IL-7 dependent T-cells to examine those components necessary for cell growth and survival. Our studies revealed two novel signal transducers of the IL-7 growth signal: BimL and JunD. IL-7 promoted the activity of JNK (Jun N-terminal Kinase), and JNK, in turn, drove the expression of JunD, a component of the Activating Protein 1 (AP-1) transcription factors. Inhibition of JNK/JunD blocked glucose uptake and HXKII gene expression, indicating that this pathway was responsible for promoting HXKII expression. A bioinformatics survey for possible JunD-regulated genes activated early in the IL-7 signaling cascade revealed that JunD could control the expression of proteins involved in signal transduction, cell survival and metabolism, including Pim-1. Pim-1, an IL-7 induced protein, was inhibited upon JNK or JunD inhibition. Our hypothesis that JunD positively regulated proliferation was confirmed when the proliferation of primary CD8+ T-cells cultured with IL-7 was impaired upon treatment with JunD siRNA. These results show that the IL-7 signal is more complex than the JAK/STAT pathway, activating JNK and JunD to induce rapid growth through the expression of metabolic factors like HXKII and Pim-1. When metabolic activities are inhibited, cells undergo autophagy, or cell scavenging, to provide essential nutrients. Pro-apoptotic Bim was evaluated for its involvement in autophagy. Bim is a BH3-only member of the Bcl-2 family that contributes to T-cell death.
Partial rescue of T-cells occurs when Bim and the interleukin-7 receptor are deleted, implicating Bim in IL-7-deprived T-cell apoptosis. Alternative splicing results in three different isoforms: BimEL, BimL, and BimS. To study the effect of Bim deficiency and define the function of the major isoforms, Bim-containing and Bim-deficient T-cells, dependent on IL-7 for growth, were used. Loss of Bim in IL-7-deprived T-cells delayed apoptosis, but blocked the degradative phase of autophagy. The conversion of LC3-I to LC3-II was observed in Bim-deficient T-cells, but p62, which is degraded in autolysosomes, accumulated. To explain this, BimL was found to support acidification of lysosomes associated with autophagic vesicles. Key findings showed that inhibition of lysosomal acidification accelerated death upon IL-7 withdrawal only in Bim-containing T-cells, indicating that in these cells autophagy was protective. IL-7 dependent T-cells lacking Bim were insensitive to inhibition of autophagy or lysosomal acidification. BimL co-immunoprecipitated with dynein and Lamp1-containing vesicles, indicating BimL could be an adaptor for dynein to facilitate loading of lysosomes. In Bim-deficient T-cells, lysosome-tracking probes revealed vesicles of less acidic pH. Over-expression of BimL restored acidic vesicles in Bim-deficient T-cells, while other isoforms, BimEL and BimS, associated with intrinsic cell death. These results reveal a novel role for BimL in lysosomal positioning that may be required for the formation of functional autolysosomes during autophagy.
- Date Issued
- 2012
- Identifier
- CFE0004435, ucf:49331
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0004435
- Title
- Finding Consensus Energy Folding Landscapes Between RNA Sequences.
- Creator
-
Burbridge, Joshua, Zhang, Shaojie, Hu, Haiyan, Jha, Sumit, University of Central Florida
- Abstract / Description
-
In molecular biology, the secondary structure of a ribonucleic acid (RNA) molecule is closely related to its biological function. One problem in structural bioinformatics is to determine the two- and three-dimensional structure of RNA using only sequencing information, which can be obtained at low cost. This entails designing sophisticated algorithms to simulate the process of RNA folding using detailed sets of thermodynamic parameters. The set of all chemically feasible structures an RNA molecule can assume, as well as the energy associated with each structure, is called its energy folding landscape. This research focuses on defining and solving the problem of finding the consensus landscape between multiple RNA molecules. Specifically, we discuss how this problem is equivalent to the problem of Balanced Global Network Alignment, and what effect a solution to this problem would have on our understanding of RNA. Because this problem is known to be NP-hard, we instead define an approximate consensus on a landscape of reduced size, which dramatically reduces the search space associated with the problem. We use the program RNASLOpt to enumerate all stable local optimal secondary structures in multiple landscapes within a certain energy and stability range of the minimum free energy (MFE) structure. We then encode these using an extended structural alphabet and perform sequence alignment using a structural substitution matrix to find and rank the best matches between the sets based on stability, energy, and structural distance. We apply this method to twenty landscapes from four sets of riboswitches from Bacillus subtilis in order to predict their native "on" and "off" structures. We find that this method significantly reduces the size of the list of candidate structures, as well as increasing the ranking of previously obscure secondary structures, resulting in more accurate predictions overall.
Advances in the field of structural bioinformatics can help elucidate the underlying mechanisms of many genetic diseases.
- Date Issued
- 2015
- Identifier
- CFE0006210, ucf:51109
- Format
- Document (PDF)
- PURL
- http://purl.flvc.org/ucf/fd/CFE0006210