5. FASTQ and quality control

Quality control of the raw reads is more important than I first thought.

FASTQ file format

FASTQ (.fastq or .fq) files hold the sequencing results (reads) generated by the sequencer. You can get them either by running your own experiment or by downloading them from the SRA/GEO databases.

These files, although huge, are quite straightforward. A FASTQ file is simply a text file holding the A/C/G/T sequences read off the DNA library prepared from your biological samples. Each read takes up four lines: a header line starting with '@' that holds an index, tag, or description identifying that particular read; the actual sequence; a line starting with a '+' sign; and the quality scores, one character per base, which say how confident the sequencer is about each base call. So all in all, the file is a series of 4-line chunks, each representing one read, that is, one DNA fragment sequenced by the sequencer.
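As a rough sketch of the 4-line layout described above, the Python snippet below walks through a FASTQ text stream one record at a time. The function names read_fastq and phred_scores and the example read are made up for illustration. It also assumes two things these notes do not state: every record is exactly four lines (no wrapped sequences), and the quality characters use the Phred+33 encoding, which is common but not universal.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from an open FASTQ text handle.

    Assumes strict 4-line records; wrapped sequences are not handled.
    """
    while True:
        chunk = [line.rstrip("\n") for line in islice(handle, 4)]
        if not chunk:
            return  # end of file
        if len(chunk) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, plus, qual = chunk
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ: " + header)
        yield header[1:], seq, qual


def phred_scores(qual, offset=33):
    """Decode a quality string into integer scores (assumes Phred+33 by default)."""
    return [ord(char) - offset for char in qual]


# A made-up single-read example, not real sequencing data.
example = "@read1 made-up example\nGATTACA\n+\nIIIIII#\n"
for read_id, seq, qual in read_fastq(io.StringIO(example)):
    print(read_id, seq, phred_scores(qual))
# prints: read1 made-up example GATTACA [40, 40, 40, 40, 40, 40, 2]
```

For a real file you would pass an open file handle instead of io.StringIO, and gzip.open(path, "rt") for .fastq.gz files. For anything beyond a quick look, an established parser is probably the safer choice.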

Or you can read this: https://compgenomr.github.io/book/fasta-and-fastq-formats.html
