Moseley Laboratory Research Interests
We develop computational methods, models, and tools for analyzing and interpreting many types of biological and biophysical data, with the goal of enabling new understanding of biological systems and related disease processes.
Our approach involves:
- Leveraging relevant information from large public scientific repositories and knowledgebases.
- Developing appropriate methods to analyze specific types of biological data.
- Creating new models that facilitate the integration of diverse types of biological data.
- Implementing system-wide analyses that integrate omics-level datasets.
Most of the applications of these new methods, tools, and models are in the areas of metabolomics, systems biochemistry, and structural bioinformatics.
Membrane proteins are essential for many biological functions. They account for roughly one third of the proteins encoded in sequenced genomes and represent 70% of all current drug targets. However, as of June 2009, fewer than 1500 of the ~100,000 protein structure entries in the worldwide Protein Data Bank (PDB) involve integral membrane proteins. This is because membrane proteins are difficult to crystallize for X-ray crystallographic studies and difficult to solubilize for solution nuclear magnetic resonance (NMR) studies. Magic-angle spinning solid-state NMR (MAS SSNMR) is a fast-developing experimental method with great potential to provide structural and dynamics information on membrane proteins without the sample limitations of other techniques. We are developing automated tools that aid in the analysis of SSNMR data and are specifically tailored to SSNMR data from membrane protein samples. In particular, our lab is focusing on developing and testing algorithms that will automate all analysis steps from raw SSNMR spectral data to protein resonance assignments for uniformly 13C/15N-labeled membrane proteins. This development will provide the analysis tools needed to expand MAS SSNMR, and its application to membrane proteins, into the broader biological community.
Zinc ions bound to proteins serve a wide variety of catalytic, structural, and signal transduction purposes in biological systems. Zinc is the only metal ion seen in all six classes of enzymes. Iron and zinc are the most abundant trace elements in the human body. Roughly 2800 human proteins are predicted to be zinc-binding, which equates to about 10% of the human proteome. Changes in zinc trafficking are now associated with a variety of diseases, including Alzheimer’s, Parkinson’s, type 2 diabetes, and pathological conditions related to neural and myocardial ischemia. Recently, great strides were made in predicting zinc binding from 3D structure and from sequence across many genomes. However, classification, characterization, and sequence annotation of zinc binding lag behind. We are developing analyses that automate the classification and characterization of metal ion coordination for both annotation and functional prediction.
With improvements in mass spectrometry and nuclear magnetic resonance, there is an explosion of metabolomics data being collected on a variety of cells and tissues associated with human diseases, especially cancer. The sheer volume of data requires automated analysis methods that are truly robust. We are developing ways to combine analyses of NMR and mass spectrometry metabolomics data that can lead to robust metabolite analysis. Such new methods will allow a wealth of metabolomics data to be brought into the analysis and deconvolution of metabolic pathways. For example, we are developing a combined simulated annealing and genetic algorithm method called GAIMS to analyze NMR and FT-ICR-MS isotopomer data of uridine diphospho-N-acetylglucosamine (UDP-GlcNAc) and uridine diphospho-N-acetylgalactosamine (UDP-GalNAc) extracted from tissue culture grown on 13C-enriched media. Both metabolites are used in O-glycosylation of proteins, which serves cellular regulatory roles in nutrient sensing, protein degradation, and gene expression. We are applying these analyses to metabolomics data collected from cancer tissue cultures treated with potential cancer chemoprevention agents to better understand how these agents change cancer cell metabolism.
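To give a feel for the simulated annealing half of such an approach, here is a minimal sketch. It is not GAIMS: it fits a single mixing fraction so that a blend of two labeling patterns matches an observed intensity vector. The function name `anneal_fraction`, the cooling schedule, and all intensity values are hypothetical and chosen only for illustration.

```python
import math
import random


def anneal_fraction(observed, pool_a, pool_b, steps=2000, seed=0):
    """Estimate f in [0, 1] so that f*pool_a + (1-f)*pool_b matches observed."""
    rng = random.Random(seed)  # fixed seed keeps the toy run reproducible

    def cost(f):
        # Sum of squared residuals between the modeled and observed intensities.
        return sum((f * a + (1 - f) * b - o) ** 2
                   for a, b, o in zip(pool_a, pool_b, observed))

    current = best = 0.5
    temperature = 1.0
    for _ in range(steps):
        # Propose a small random move, clamped to the valid range.
        candidate = min(1.0, max(0.0, current + rng.gauss(0, 0.05)))
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
            if cost(current) < cost(best):
                best = current
        temperature = max(temperature * 0.995, 1e-6)  # geometric cooling
    return best


# Made-up relative intensities for two hypothetical labeling patterns.
pool_a = [0.9, 0.1, 0.0]
pool_b = [0.1, 0.3, 0.6]
observed = [0.3 * a + 0.7 * b for a, b in zip(pool_a, pool_b)]
estimate = anneal_fraction(observed, pool_a, pool_b)
```

A real isotopomer model would have many coupled parameters, which is where combining annealing with a population-based search such as a genetic algorithm becomes attractive.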
Interaction Network-centric Cancer Mutational Pattern Analyses
Lung cancer is the leading cause of cancer death worldwide, with 160,000 deaths in the US annually. The state of Kentucky ranks highest in lung cancer incidence and mortality, and the Central Appalachian region of Kentucky (AppKY) ranks highest within the state. Squamous cell carcinoma (SQCC) of the lung from AppKY has uniquely high mutation rates in the PCMTD1 and IDH1 genes in comparison to The Cancer Genome Atlas (TCGA), suggesting that pathways including these genes are likely important for cancer development in this population. Therefore, we have developed analyses for placing these genes within molecular interaction networks constructed from known protein-protein interactions and gene products with related function. In this application, we have found mutually exclusive mutational patterns between PCMTD1 and related histone methylases and between IDH1 and related histone demethylases, suggesting that mutations in these pathways directing histone methylation and demethylation are important in SQCC development and may be related to AppKY-specific environmental factors.
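One common way to quantify a mutually exclusive pattern between two genes is a one-sided Fisher exact test on how often both are mutated in the same sample. The sketch below illustrates that general idea; the text above does not state which statistical test was used, and the function name and sample identifiers are hypothetical.

```python
from math import comb


def mutual_exclusivity_p(mutated_a, mutated_b, n_samples):
    """One-sided Fisher exact p-value for seeing this few (or fewer) co-mutated samples.

    mutated_a, mutated_b: sets of sample identifiers carrying a mutation in each gene.
    n_samples: total number of samples in the cohort.
    """
    a = len(mutated_a)
    b = len(mutated_b)
    both = len(mutated_a & mutated_b)
    lowest = max(0, a + b - n_samples)  # smallest overlap that is possible at all
    total = comb(n_samples, b)
    # Lower tail of the hypergeometric distribution: P(overlap <= both).
    tail = sum(comb(a, k) * comb(n_samples - a, b - k)
               for k in range(lowest, both + 1))
    return tail / total


# Hypothetical cohort of 10 samples in which the two genes never co-occur.
gene_a_samples = {"s0", "s1", "s2", "s3", "s4"}
gene_b_samples = {"s5", "s6", "s7", "s8", "s9"}
p_value = mutual_exclusivity_p(gene_a_samples, gene_b_samples, 10)
```

A small p-value indicates less co-occurrence than expected if the two genes were mutated independently. Extending the comparison from gene pairs to groups of genes related through an interaction network would require additional handling, such as multiple-testing correction.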
Ontologies are used extensively in scientific knowledgebases to organize the wealth of available biological information. Gene Ontology (GO) is currently one of the most comprehensive ontologies for annotating genes and gene products with respect to biological function. We have several tools that use biological functional annotations such as GO for a variety of knowledge extraction and utilization purposes. GOcats is a novel tool that organizes GO into subgraphs representing user-defined concepts, while ensuring that all appropriate relations are congruent with respect to scoping semantics. categoryCompare is a flexible framework for enrichment of feature annotations and for comparing annotation enrichment across two or more experimental groups. Both tools have command line interfaces and application programming interfaces, which have been designed to work together to facilitate a range of functional annotation and omics integration analyses.
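The idea of grouping ontology terms under a user-defined concept can be illustrated with a small graph traversal. This sketch is not the GOcats API or algorithm: the function `concept_subgraph`, the toy term identifiers, and the assumption that only `is_a` and `part_of` edges place a child within its parent's scope are all illustrative.

```python
from collections import defaultdict, deque


def concept_subgraph(edges, root, allowed=("is_a", "part_of")):
    """Return the set of terms that fall under root via the allowed relations.

    edges: iterable of (child, relation, parent) triples.
    """
    children = defaultdict(set)
    for child, relation, parent in edges:
        if relation in allowed:  # ignore relations that do not imply scoping
            children[parent].add(child)

    seen = {root}
    queue = deque([root])
    while queue:  # breadth-first walk from the concept term down to its descendants
        term = queue.popleft()
        for child in children[term] - seen:
            seen.add(child)
            queue.append(child)
    return seen


# Toy ontology with made-up identifiers (not real GO terms).
toy_edges = [
    ("TOY:2", "is_a", "TOY:1"),
    ("TOY:3", "part_of", "TOY:2"),
    ("TOY:4", "regulates", "TOY:1"),  # skipped: not an allowed relation
    ("TOY:5", "is_a", "TOY:9"),       # belongs to a different branch
]
category_terms = concept_subgraph(toy_edges, "TOY:1")
```

Annotations attached to any term in `category_terms` could then be rolled up to the concept term before running an enrichment analysis.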