Graduation Date

Fall 12-18-2015

Document Type


Degree Name

Doctor of Philosophy (PhD)


Pathology & Microbiology

First Advisor

Wing C. Chan

Second Advisor

Kai Fu


Follicular lymphoma (FL) is the second most common lymphoma in the United States. Although it is generally indolent, FL is not curable, and in about 30% of patients it transforms into an aggressive lymphoma (tFL) with a marked worsening of prognosis. To identify mutations preferentially present in tFL, we performed whole exome sequencing (WES) on paired FL and tFL samples arising in the same patients and developed a mutational analysis pipeline. After identifying potentially important genes mutated in our paired FL and tFL study, we constructed a custom capture platform that included these genes as well as other genes known to be mutated in B-cell lymphomas. This focused sequencing platform allowed us to analyze additional samples at greater sequencing depth. Clonal architecture and evolution can be readily identified; however, the DNA samples were fragmented using restriction enzymes, which compromised duplicate analysis. We developed a new approach based on a statistical model to address this problem. Samples from uninvolved tissue of the same patients are commonly used to distinguish germline variants from somatic mutations; however, germline DNA was often not available for our samples. We designed a filtering-based method to limit the number of germline variants that would be mistakenly called somatic mutations and validated this approach using a dataset with paired normal samples. We also introduced a novel machine learning-based approach to predict somatic mutations from paired FL and tFL samples without healthy tissue. Five machine learning algorithms were tested on datasets with known somatic mutations, and their performance was evaluated by statistical measures. The results indicated that somatic mutations can be reliably predicted. To provide complementary information, we integrated our mutation data with copy number abnormality data and found genes more frequently mutated in tFL cases.
The recurrently mutated genes are often involved in epigenetic regulation, the JAK-STAT or the NF-κB pathway, immune surveillance, and cell cycle regulation, or are transcription factors involved in B-cell development. As no entirely tFL-specific mutations were found, the transformation event must cooperate with pre-existing alterations, and future studies will focus on identifying cooperative mutations for FL transformation.
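
The abstract does not spell out the criteria behind the filtering-based method for limiting germline calls when no matched normal sample exists. The following is only a minimal sketch of one plausible filter, not the dissertation's actual pipeline. It assumes each variant carries a variant allele fraction (VAF) in the FL and tFL samples and a population allele frequency. The `Variant` fields, the function names, and both thresholds are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    vaf_fl: float    # variant allele fraction in the FL sample
    vaf_tfl: float   # variant allele fraction in the tFL sample
    pop_freq: float  # allele frequency in a population database (assumed input)


def looks_germline(v, pop_freq_cutoff=0.01, vaf_tol=0.1):
    """Heuristic guess that a variant is germline (thresholds are illustrative)."""
    # Variants common in the general population are unlikely to be somatic.
    if v.pop_freq >= pop_freq_cutoff:
        return True

    def near(x, target):
        return abs(x - target) <= vaf_tol

    # A VAF close to 0.5 (heterozygous) or 1.0 (homozygous) in BOTH samples
    # is the pattern expected of an inherited variant.
    het = near(v.vaf_fl, 0.5) and near(v.vaf_tfl, 0.5)
    hom = near(v.vaf_fl, 1.0) and near(v.vaf_tfl, 1.0)
    return het or hom


def filter_candidates(variants, **kwargs):
    """Keep only variants that do not look germline under the heuristic."""
    return [v for v in variants if not looks_germline(v, **kwargs)]
```

A heuristic of this kind will also discard clonal heterozygous somatic mutations in high-purity tumors, which is one reason such a filter needs validation against a dataset with paired normal samples, as the abstract describes.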