1. Preface

Read me

The only aim of this writing is to help scientists who want to acquire NGS skills start this lifelong learning journey.

Welcome to Next Generation Sequencing (NGS) for natural scientists. This e-book is but an RNA-Seq starter reference manual intended for:

  • Molecular biologists

  • Beginners in Next Generation Sequencing and the Linux environment

  • Non-computer scientists with computer literacy

  • Academics who are thinking of transitioning to commercial research

  • (optional) Anyone wanting to own a private Linux terminal

For that reason, this writing will be brief, to the point, and basic but informative. Instead of handing you a bunch of code chunks, I will show you how to fish in the sea (never in your own pond), and more importantly, I will talk about the science you need in order to comprehend the code. In my book, NGS analysis using pre-built R packages has less to do with coding and more to do with understanding your own data and the statistics.

I will walk you through one of my RNA-seq experiments, an mRNA differential expression analysis, to explain the logistics and to tell the story of how I started in this field.

To be honest, once I was done with the visualization part, I didn't expect to come back (at all), even though I had planned to add biological insight between the code, because I didn't think this was useful to the world in any way. Then, half a year after I settled in the next country of my journey (the UK), I received a panel interview invitation from a single-cell sequencing biotech company for a position in workflow product development. This pushed me to study single-cell techniques for the interview, and I think I should insert that into this writing as well, no matter whom or what it is now for.

This can be a text for molecular biologists in any discipline who want to learn how to use a computer as a tool for basic science. You will need to be fluent in molecular biology and classic basic science to understand me. The next thing you will need is an open mind: to see things past what they are, and to customize their usage to suit yourself and serve what you believe in. That is the definition of a tool.

If you are expecting any crappy "because this is the future" or "anyone can be a biologist, biology is too easy" talk, then with all due respect, that is a misunderstanding of science. Science rocks because it works, and it works because we can show that it works as many times as we want. At the end of the day, the experiment is always the goal, the end point, and the answer, much like the relationship between theoretical physics and experimental physics. NGS in biology is very much like an evidence-backed theoretical answer: it is less about building a perfect digital solution that calls ATCG at 100% accuracy, and more about biologists finally getting to really do biology. Reductionism should be a thing of the past, because it was a product of technological limitations. My aspiration, which deviates a bit from the mainstream, is that these high-throughput methods are the solution to the minimum-wage situation of highly trained and well-educated molecular biologists: they free up their hands so they can truly devote their time to thinking about the biology sitting behind the data points. There is a wide gap between a technician and a scientist.

Those who want to leave academia and work in the industrial sector might also want to comb through this writing, to get an idea of why some colleagues are having a hard time getting jobs outside of universities or research institutes.
