Moseley Laboratory Research Interests
We develop computational methods, models, and tools for analyzing, integrating, and interpreting many types of biological and biophysical data, enabling new understanding of biological systems and related disease processes.
Most of the applications of these new methods, tools, and models are in the areas of omics, systems biochemistry, and structural bioinformatics.
Our approach involves:
- Leveraging relevant information from large public scientific repositories and knowledgebases.
- Developing appropriate methods to analyze specific types of biological data.
- Creating new models that facilitate the integration of diverse types of biological data.
- Implementing system-wide analyses that integrate omics-level datasets.
Our lab provides bioinformatics and systems biology expertise for the analysis and interpretation of stable isotope resolved metabolomics (SIRM) experiments. Our goal is to develop a combination of bioinformatic, biostatistical, and systems biochemical tools implemented in an integrated data analysis pipeline. This pipeline will allow broad application of SIRM, from the discovery of specific metabolic phenotypes representing biological and disease states of interest to a mechanism-based understanding of a wide range of human disease processes with particular metabolic phenotypes. Our new tools are already providing novel metabolic pathway-specific analyses of complex SIRM datasets. For example, we have used a moiety model analysis of SIRM mass spectrometry data to quantify the relative importance of specific metabolic pathways in the biosynthesis of UDP-GlcNAc in prostate cancer cell culture. Subsequent analyses determined which pathways were impacted by potential cancer therapeutics. As we implement a complete SIRM-based data analysis pipeline, our ultimate goal is to integrate metabolomics datasets with other major omics datasets, including epigenomics, genomics, transcriptomics, and proteomics datasets, in full systems biochemical analyses that can determine which gene-regulatory, signaling, and metabolic pathways are mechanistically involved in specific human diseases.
Figure 1: (a) Chemical substructure model representing the possible numbers of 13C atoms incorporated from the 13C6-Glc tracer into UDP-GlcNAc, accounting for the observed FT-ICR-MS isotopologue peaks. (b) Structure of UDP-GlcNAc annotated by its chemical substructures and their biosynthetic pathways from 13C6-Glc. U = uracil, R = ribose, A = acetyl, G = glucose. NAc-Glucose utilizes Gln as the nitrogen donor. (c) Fit of optimized chemical substructure model parameters to FT-ICR-MS isotopologue data of UDP-GlcNAc extracted from an LN3 prostate cancer cell culture after 48 hours of growth in 13C6-Glc.
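The idea behind the chemical substructure (moiety) model in Figure 1 can be illustrated with a small Python sketch. It assumes each moiety is labeled independently and combines per-moiety 13C states into the isotopologue distribution of the whole molecule by convolution. The function name, the allowed 13C counts per moiety, and all labeling fractions are hypothetical values chosen for illustration, not fitted parameters from the publications, and natural abundance is ignored here.

```python
from itertools import product


def isotopologue_distribution(moieties):
    """Convolve independent moiety labeling states into one distribution.

    moieties maps a moiety name to {number_of_13C: probability}.
    Returns {total_number_of_13C: fraction_of_molecules}.
    """
    dist = {0: 1.0}  # start with an unlabeled, empty molecule
    for states in moieties.values():
        combined = {}
        for (n_so_far, p_so_far), (n_moiety, p_moiety) in product(
            dist.items(), states.items()
        ):
            total = n_so_far + n_moiety
            combined[total] = combined.get(total, 0.0) + p_so_far * p_moiety
        dist = combined
    return dist


# Hypothetical labeling states and fractions (assumptions for illustration only).
example_moieties = {
    "glucose": {0: 0.2, 6: 0.8},
    "ribose": {0: 0.5, 5: 0.5},
    "acetyl": {0: 0.6, 2: 0.4},
    "uracil": {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1},
}

if __name__ == "__main__":
    for n_13c, fraction in sorted(isotopologue_distribution(example_moieties).items()):
        print(f"m+{n_13c}: {fraction:.4f}")
```

Fitting such a model runs in the opposite direction: the moiety fractions are adjusted until the predicted distribution matches the observed isotopologue intensities.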
- Moseley HNB. Correcting for the Effects of Natural Abundance in Stable Isotope Resolved Metabolomics Experiments Involving Ultra-High Resolution Mass Spectrometry. BMC Bioinformatics 11:139, 2010. Citations: 91 (Google Scholar). PMCID: PMC2848236
- Moseley HNB, Lane AN, Belshoff AC, Higashi RM and Fan TW. A novel deconvolution method for modeling UDP-N-acetyl-D-glucosamine biosynthetic pathways based on (13)C mass isotopologue profiles under non-steady-state conditions. BMC Biol 9:37, 2011. Citations: 61 (Google Scholar). PMCID: PMC3126751.
- Moseley HNB. Error Analysis and Propagation in Metabolomics Data Analysis. Comput Struct Biotechnol J 4:2013. Citations: 44 (Google Scholar). PMCID: PMC3647477.
- Mitchell JM, Fan TW-M, Lane AN, and Moseley HNB. Development and in silico evaluation of large-scale metabolite identification methods using functional group detection for metabolomics. Frontiers in Genetics 5:237, 2014. Citations: 27 (Google Scholar). PMCID: PMC4112935.
- Mitchell JM, Flight RM, Wang QJ, Higashi RM, Fan TW, Lane AN, and Moseley HNB. High Peak Density Artifacts in Fourier Transform Mass Spectra and their Effects on Data Analysis. Metabolomics 14:125, 2018. Citations: 14 (Google Scholar). PMCID: PMC6153687
- Jin H and Moseley HNB. Moiety Modeling Framework for Deriving Moiety Abundances from Mass Spectrometry Measured Isotopologues. BMC Bioinformatics 20:524, 2019. Citations: 4 (Google Scholar).
- Mitchell JM, Flight RM, and Moseley HNB. Small Molecule Isotope Resolved Formula Enumeration: a Methodology for Assigning Isotopologues and Metabolites in Fourier Transform Mass Spectra. Analytical Chemistry 91:8933, 2019. Citations: 5 (Google Scholar). PMID: 31260262 DOI: 10.1021/acs.analchem.9b00748
- Jin H and Moseley HNB. Robust Moiety Model Selection Using Mass Spectrometry Measured Isotopologues. Metabolites 10:118, 2020. Citations: 3 (Google Scholar).
- Powell CD and Moseley HNB. The mwtab Python library for RESTful Access and Enhanced Quality Control, Deposition, and Curation of the Metabolomics Workbench Data Repository. Metabolites 11:163, 2021.
Structural bioinformatics of metalloproteins has historically been hampered by significant numbers of aberrant coordination geometries that prevented systematic classification. Our lab has developed combined functional and structural analyses of metalloproteins that have identified aberrant clusters of coordination geometries (CGs) of metal ion ligation in the five most abundant metalloproteins. Most of these aberrant CGs are due to multidentate ligands that create compressed ligand-metal-ligand angles below 60°. These angles cause serious deviations from canonical CG models and greatly hamper the ability to characterize metalloproteins both structurally and functionally. Our methods detect coordinating ligands in a statistically robust manner and without expectations based on canonical CGs, producing estimated false positive and false negative rates of ~0.11% and ~1.2%, respectively. Also, our improved analyses of bond-length distributions have revealed bond-length modes specific to the chemical functional groups involved in multidentation. By recognizing aberrant CGs in our clustering analyses, we achieve correlations above 0.9 between structural and functional descriptions of metal ion coordination. This work has had an impact on the field by highlighting the unexpected presence of significant numbers of non-canonical CGs and by characterizing their structural, functional, and chemical properties. Our publications made the cover of the May 2017 issue of Proteins.
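The compressed-angle criterion described above can be illustrated with a minimal Python sketch that computes ligand-metal-ligand angles from atomic coordinates and flags pairs below 60°. The function names, the atom labels, and the coordinates are hypothetical; this is not the lab's detection method, which also handles ligand detection and statistical validation.

```python
import math
from itertools import combinations


def ligand_metal_ligand_angle(metal, ligand_a, ligand_b):
    """Return the angle (degrees) at the metal between two ligand atoms."""
    vec_a = [a - m for a, m in zip(ligand_a, metal)]
    vec_b = [b - m for b, m in zip(ligand_b, metal)]
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm = math.sqrt(sum(x * x for x in vec_a)) * math.sqrt(sum(x * x for x in vec_b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def compressed_pairs(metal, ligands, cutoff_degrees=60.0):
    """List (name_a, name_b, angle) for ligand pairs with angle below the cutoff."""
    flagged = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(ligands.items(), 2):
        angle = ligand_metal_ligand_angle(metal, xyz_a, xyz_b)
        if angle < cutoff_degrees:
            flagged.append((name_a, name_b, angle))
    return flagged


# Hypothetical coordinates in angstroms: O1 and O2 mimic a bidentate ligand.
example_metal = (0.0, 0.0, 0.0)
example_ligands = {
    "O1": (2.0, 0.0, 0.0),
    "O2": (2.0 * math.cos(math.radians(55.0)), 2.0 * math.sin(math.radians(55.0)), 0.0),
    "N1": (0.0, 0.0, 2.1),
}

if __name__ == "__main__":
    for name_a, name_b, angle in compressed_pairs(example_metal, example_ligands):
        print(f"{name_a}-metal-{name_b}: {angle:.1f} degrees")
```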
- Yao S, Flight RM, Rouchka EC, Moseley HNB. A less biased analysis of metalloproteins reveals novel zinc coordination geometries. Proteins 83:1470, 2015. Citations: 22 (Google Scholar). PMCID: PMC4539273
- Yao S, Flight RM, Rouchka EC, Moseley HNB. Aberrant coordination geometries discovered in the most abundant metalloproteins. Proteins 85:885, 2017. Citations: 4 (Google Scholar). doi:10.1002/prot.25257
- Yao S, Flight RM, Rouchka EC, Moseley HNB. Perspectives and expectations in structural bioinformatics of metalloproteins. Proteins 85:938, 2017. Citations: 5 (Google Scholar). doi:10.1002/prot.25263
- Yao S, Moseley HNB. Finding high-quality metal ion-centric regions across the worldwide Protein Data Bank. Molecules 24:3179, 2019. Citations: 2 (Google Scholar). doi:10.3390/molecules24173179
- Yao S, Moseley HNB. A chemical interpretation of protein electron density maps in the worldwide protein data bank. PLOS One 15:e0236894, 2020. Citations: 2 (Google Scholar). doi:10.1371/journal.pone.0236894
Improved Utilization and Curation of the Gene Ontology
The Gene Ontology (GO) is the largest and best curated ontology in the OBO Foundry and is used extensively to precisely describe the functions, locations, and processes of gene(-product)s through specific annotations stored across many knowledgebases. However, there is a fundamental lack of tools that organize ontology terms into usable domain-specific concepts that biomedical researchers can easily interpret, leverage within statistically rigorous analyses, and integrate with other types of information. Therefore, we have developed the GO Categorization Suite (GOcats), which streamlines the slicing of GO into custom, biologically meaningful subgraphs representing emergent concepts in GO. GOcats uses a list of user-defined keywords or GO terms that describe a concept, the structure of GO, and relationship properties to automatically generate a subgraph of child terms and a mapping of these child terms to their respective concept-defining term. GOcats enables the utilization of additional GO relationship types in a manner that preserves proper scoping and scaling. Furthermore, we have demonstrated improvements in statistical power via the use of GOcats in annotation enrichment analyses performed by categoryCompare. We have also integrated GOcats-driven annotation enrichment analysis with principal component analysis and molecular interaction network analysis (see Figure 2). Moreover, we have collaborated in the development of advanced curation tools that can help detect missing and erroneous relationships in GO, which are needed due to GO’s size (over 40,000 terms) and rate of growth.
Figure 2. A) PCA plot of equine RNAseq datasets. B) Organized groups of enriched GO-terms for PC1. C) STRING interactions between high PC1 loading gene(-product)s annotated with group G1 GO terms (cartilage development).
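The mapping of child terms to concept-defining terms described above can be sketched on a toy ontology in Python. This is not the GOcats API: the function names, the term labels, and the parent-to-child edges are hypothetical, and the sketch treats every edge the same way, whereas the text notes that GOcats accounts for relationship types, scoping, and scaling.

```python
from collections import deque


def descendants(children, root):
    """Return all terms reachable from root by following parent-to-child edges."""
    seen = set()
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def map_terms_to_concepts(children, concept_roots):
    """Map each term to the set of concepts whose subgraph contains it."""
    mapping = {}
    for concept, root in concept_roots.items():
        for term in descendants(children, root) | {root}:
            mapping.setdefault(term, set()).add(concept)
    return mapping


# Hypothetical toy ontology: parent term -> list of child terms.
toy_children = {
    "nucleus": ["nuclear envelope", "nucleolus"],
    "nuclear envelope": ["nuclear membrane"],
    "membrane": ["nuclear membrane", "plasma membrane"],
}
# Hypothetical concepts, each defined by one root term.
toy_concepts = {"nucleus": "nucleus", "membrane": "membrane"}

if __name__ == "__main__":
    for term, concepts in sorted(map_terms_to_concepts(toy_children, toy_concepts).items()):
        print(f"{term}: {sorted(concepts)}")
```

A term such as "nuclear membrane" lands in both concepts here, which shows why a term-to-concept mapping is a set rather than a single label.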
- Abeysinghe R, Hinderer III EW, Moseley HNB, and Cui L. Auditing Subtype Inconsistencies among Gene Ontology Concepts. The 2nd International Workshop on Semantics-Powered Data Analytics (SEPDA 2017) -- Bioinformatics and Biomedicine (BIBM), 2017 IEEE International Conference 1242-1245, 2017. Citations: 9 (Google Scholar).
- Abeysinghe R, Zheng F, Hinderer III EW, Moseley HNB, and Cui L. A Lexical Approach to Identifying Subtype Inconsistencies in Biomedical Terminologies. Quality Assurance of Biological and Biomedical Ontologies and Terminologies Workshop -- Bioinformatics and Biomedicine (BIBM), 2018 IEEE International Conference 1982-1989, 2018. Citations: 6 (Google Scholar).
- Hinderer III EW, Flight RM, Dubey R, MacLeod JN, and Moseley HNB. Advances in Gene Ontology Utilization Improve Statistical Power of Annotation Enrichment. PLOS One 14:e0220728, 2019. Citations: 6 (Google Scholar).
- Hinderer III EW and Moseley HNB. GOcats: A tool for categorizing Gene Ontology into subgraphs of user-defined concepts. PLOS One 15:e0233311, 2020. Citations: 6 (Google Scholar).
- Abeysinghe R, Hinderer III EW, Moseley HNB, and Cui L. Subsumption-based Sub-term Inference Framework to Audit Gene Ontology. Bioinformatics 36:3207, 2020. Citations: 3 (Google Scholar).
Interaction Network-centric Cancer Mutational Pattern Analyses
Lung cancer is the leading cause of cancer death worldwide, with 160,000 deaths in the US annually. The state of Kentucky ranks highest in lung cancer incidence and mortality, with the Central Appalachian region of Kentucky (AppKY) ranking highest within the state. Squamous cell carcinoma (SQCC) of the lung from AppKY has uniquely high mutation rates in the PCMTD1 and IDH1 genes in comparison to The Cancer Genome Atlas (TCGA), suggesting that pathways including these genes are likely important for cancer development in this population. Therefore, we have developed analyses for placing these genes within molecular interaction networks constructed from known protein-protein interactions and gene-products with related function. In this application, we have found mutually exclusive mutational patterns between PCMTD1 and related histone methylases and between IDH1 and related histone demethylases, suggesting that mutations in these pathways directing histone methylation and demethylation are important in SQCC development and may be related to AppKY-specific environmental factors.
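One common way to quantify a mutually exclusive mutational pattern between two genes is a one-sided Fisher's exact test on the 2x2 table of sample counts. The Python sketch below uses that test as an assumption for illustration; the source does not state which statistical method the lab used. The function name and the example counts are hypothetical.

```python
from math import comb


def mutual_exclusivity_p_value(n_both, n_a_only, n_b_only, n_neither):
    """One-sided Fisher's exact test for fewer co-mutated samples than expected.

    Returns P(X <= n_both), where X is hypergeometric: the number of samples
    mutated in both genes when the row and column totals are held fixed.
    """
    n_total = n_both + n_a_only + n_b_only + n_neither
    n_a = n_both + n_a_only  # samples with gene A mutated
    n_b = n_both + n_b_only  # samples with gene B mutated
    smallest_overlap = max(0, n_a + n_b - n_total)
    favorable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(smallest_overlap, n_both + 1)
    )
    return favorable / comb(n_total, n_b)


if __name__ == "__main__":
    # Hypothetical counts: no sample carries mutations in both genes.
    p_value = mutual_exclusivity_p_value(n_both=0, n_a_only=10, n_b_only=10, n_neither=0)
    print(f"p = {p_value:.3g}")
```

A small p-value indicates that co-mutation occurs less often than expected by chance, which is consistent with mutual exclusivity but does not by itself establish a shared pathway.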
Membrane proteins are essential for many biological functions. They account for roughly one third of the proteins encoded in sequenced genomes and represent 70% of all current drug targets. However, as of June 2009, fewer than 1500 of the ~100,000 protein structure entries in the worldwide Protein Data Bank (PDB) involve integral membrane proteins. This is because they are difficult to crystallize for X-ray crystallographic studies and difficult to solubilize for solution nuclear magnetic resonance (NMR) studies. Magic-angle spinning solid-state NMR (MAS SSNMR) is a fast-developing experimental method that has great potential to provide structural and dynamics information on membrane proteins without the sample limitations of other techniques. We are developing automated tools for the analysis of SSNMR data that are specifically tailored to data from membrane protein samples. Specifically, our lab is focusing on developing and testing algorithms that will automate all analysis steps from raw SSNMR spectral data to protein resonance assignments for uniformly 13C/15N-labeled membrane proteins. This development will provide the analysis tools necessary to expand MAS SSNMR and its application to membrane proteins into the broader biological community.