Computational Downstream Analysis of High-Throughput RNA-Sequencing Data
Sahoo, Bikram
Abstract
The advent of RNA sequencing (RNA-seq) technology has significantly advanced transcriptome research. The availability of RNA-seq data has spurred computational biologists to develop algorithms that process these data in a statistically rigorous manner and yield biologically meaningful results. Recent advances in bioinformatics algorithms make gene expression, gene fusion, and pathway information the most immediate results that can be extracted from RNA-seq data. Ongoing progress in computational biology promises to further expand the utility of RNA-seq data in transcriptome-based biological research.
In this dissertation, we introduce a method to detect retained introns in RNA-seq data, with the aim of developing a vaccine against cancers harboring p53 mutations. We discuss our approaches to generating unique gene signatures, both to elucidate the role of sensory nerve interference in the anti-melanoma immune response and to study racial disparities in triple-negative breast cancer. We propose a clustering algorithm combined with statistical methods to analyze heterogeneity in quadruple-negative breast cancer. Additionally, we present a benchmarking study that assesses the resilience of machine learning classification algorithms on SARS-CoV-2 genome sequences, particularly sequences generated with long-read-specific errors.
In summary, this research provides novel methodologies for exploring RNA-seq data and demonstrates their application to real-world biological research.
