NGS for natural scientist

Experiment design

My take on this is a bit untraditional


Last updated 2 years ago

For RNA-Seq differential expression analysis, simply put, you ought to have at least 2 technical replicates for the baseline/control, because the available packages cannot compare a single sample against a single sample.
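As a quick sanity check before running any analysis, the rule above can be sketched in a few lines of Python. This is only an illustration: the sample names, the condition labels and the helper `conditions_lacking_replicates` are hypothetical, and the check is applied to every condition, not just the control.

```python
from collections import Counter


def conditions_lacking_replicates(sample_to_condition, minimum=2):
    """Return the conditions that have fewer than `minimum` samples."""
    counts = Counter(sample_to_condition.values())
    return sorted(cond for cond, n in counts.items() if n < minimum)


# Hypothetical sample sheet: sample name -> condition label.
samples = {
    "ctrl_1": "control",
    "ctrl_2": "control",
    "treated_1": "treated",
}

# Lists every condition that is still short of replicates.
print(conditions_lacking_replicates(samples))
```

An empty list means every condition has at least `minimum` samples.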

Extra information

Consider technical replication, biological replication and individual replication. Traditionally, though few people make the connection, N should count individual replication, not biological or technical replication, if what you want to control is the inter-experimental error.

Figure legend: yellow = technical replication; blue = biological replication; different occasions on which the experiment is actually performed (3 different arrows to the same green) = individual replication.

So in this case N would be 3 (instead of 9).
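The counting rule can be written down as a tiny sketch: N is the number of distinct occasions on which the experiment was performed, not the number of sequenced samples. The record layout (`sample`, `occasion`) and the function name `count_n` are made up for illustration, and the sketch assumes 3 samples per occasion.

```python
def count_n(samples):
    """N = number of distinct occasions on which the experiment was run."""
    return len({s["occasion"] for s in samples})


# Hypothetical design: 3 separate occasions, 3 samples from each occasion.
design = [
    {"sample": f"occasion{occ}_sample{i}", "occasion": occ}
    for occ in (1, 2, 3)
    for i in (1, 2, 3)
]

print(len(design))      # number of sequenced samples: 9
print(count_n(design))  # N under this definition: 3
```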

But this is just one idea. For example, what if, to save on sequencing cost, I ran all 9 samples on 3 lanes using 3 indexes in the same sequencing run? I would say it depends on how you see it. I would see that as a better way to control for the technical error of the sequencer, because in the ideal case that should never be the biologist's concern. Anyway, that is up to you (at the time of writing it is not a concern when you publish your NGS data), but I still prefer to have completely separate biological replications connected by controls.
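To make the "9 samples on 3 lanes using 3 indexes" scenario concrete, here is one possible layout as a sketch. The sample, lane and index names are placeholders, and filling the lanes in order is just an assumption; how you actually distribute replicates across lanes is a design decision this sketch does not settle.

```python
def lane_layout(samples, lanes, indexes):
    """Give each sample a (lane, index) pair; each index is used once per lane."""
    if len(samples) != len(lanes) * len(indexes):
        raise ValueError("need exactly len(lanes) * len(indexes) samples")
    layout = {}
    for pos, sample in enumerate(samples):
        lane = lanes[pos // len(indexes)]     # fill one lane before moving on
        index = indexes[pos % len(indexes)]   # cycle through the indexes
        layout[sample] = (lane, index)
    return layout


# Placeholder names for 9 samples, 3 lanes and 3 indexes.
samples = [f"sample_{i}" for i in range(1, 10)]
layout = lane_layout(samples, ["lane1", "lane2", "lane3"], ["idxA", "idxB", "idxC"])

for sample, (lane, index) in layout.items():
    print(sample, lane, index)
```

Every (lane, index) pair is unique, so reads can still be assigned back to their sample after the run.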

Figure legend: yellow = technical replication; blue = biological replication; same occasion on which the experiment is performed (1 same arrow to the same green) = individual replication.

So in this case N would be 1 (instead of 3 or 9).