BS6208 Biological Big Data

Summary of course content

In this course, we will work through case studies of how big data are assembled and used in the biomedical field. We will focus on the challenges researchers face in assembling the data and how they overcome them, and we will explore the new insights these data yield that could not have been obtained from small data. The case studies are: The Cancer Genome Atlas (TCGA), dermatologist-level classification of skin cancer with deep neural networks, and the Global Initiative on Sharing All Influenza Data (GISAID). After this course, students will be expected to “think big” in future projects and to draw on these case studies to gain new insights from biological big data.

Aims and objectives

Students should be able to articulate why biomedical research needs big data and to describe the challenges of assembling big data and their solutions. Students should also be able to apply such “big” thinking to their future biomedical projects, and are expected to become competent in the case studies, such as TCGA and GISAID.

Syllabus

Part I: Big Data in Cancer
  • Cancer genomics: background, experimental techniques to generate genomic data, small data in the past and why it didn’t work, TCGA, future data sets
  • Cancer imaging diagnostics: introduction to the various modalities of cancer imaging, survey of data sets available, successful skin cancer example

Part II: Big Data in Human Infectious Diseases
  • Infectious disease surveillance: biological background on infectious diseases, challenges and solutions in generating data for epidemiology and surveillance, GISAID

Assessment

Component                                   Mode          Weighting
Quiz (MCQ and/or short-answer questions)    Individual    30%
Assignments                                 Individual    30%
Term Paper                                  Individual    40%
Total                                                     100%