What is Bioinformatics and Data Science?

Bioinformatics is a catch-all term for the application of computer science, statistics, and mathematics to specific biological problem domains. As a field, bioinformatics developed in the 1960s from scattered applications of computers to different problems in molecular biology, especially protein and DNA sequence comparison. Dr. Margaret Oakley Dayhoff was one of the early pioneers of the field that used computers to perform protein sequence comparisons between organisms to understand patterns of evolution. As the field matured, the exact definition changed and began to represent many different applications of computer science and mathematics to biological problems. With the introduction of other terms like computational biology and systems biology, bioinformatics is now being more narrowly defined. The ambiguous and changing definition of bioinformatics often leads to confusion about what the field is.

But the new catch-all term is data science, which represents the intersection between computer science, mathematics/statistics, and problem domain. Bioinformatics, computational biology, systems biology and biomedical informatics all fall under the larger data science umbrella with a focus on biological and biomedical problem domains.

Narrower Definitions

  • Data Science
    • An interdisciplinary data analysis field which integrates and applies computer science, mathematics, and statistics to a specific application problem domain.
  • Bioinformatics
    • The use of computer science, statistics, and biological domain knowledge to develop tools and analyses that enable the efficient access, use, management, and assessment of various types of biological information, especially genetic sequence data.
    • Structural Bioinformatics
      • Branch of bioinformatics focused on structural biology and involving the analysis and prediction of the 3D structure of biological macromolecules such as proteins, RNA, and DNA.
  • Computational Biology
    • The use of computer science, statistics, and biological domain knowledge to develop algorithms and computational methods that address specific biological problems via the analysis of biological and biophysical data.
  • Biomedical Informatics
    • The use of information science, statistics, and biomedical knowledge to optimize the acquisition, storage, management, and analysis of biomedical information for the purpose of improving human health.
  • Biophysical Informatics
    • The use of computer science, mathematics, statistics, and biophysical domain knowledge to develop algorithms and analyses to derive useful information from biophysical data.
  • Systems Biology
    • The use of computer science, mathematics, statistics, and biological domain knowledge to develop algorithms and computational methods that model complex biological systems via the integration of large biological and biophysical datasets, often to better understand the emergent properties of these systems.

Next-Generation Definitions

  • Data Science
    • The application of computational, mathematical/statistical, and other scientific methods to large datasets in a particular problem domain in order to extract information and discover new knowledge often for decision-making.
Topic revision: r12 - 25 Apr 2024, HunterMoseley
This site is powered by FoswikiCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding Moseley Bioinformatics Lab? Send feedback