R vs Python

So for some reason I am learning more than I planned to.

Python is more widely used in general data science, such as machine learning, while R has better support for RNA-seq in terms of packages (what R calls a package, Python calls a library). That is why, at the end of the day, I need to learn the syntax of both. I have already talked about how much R you need to know before you can do RNA-seq on your own; here I am going to show you how I see R and Python from a programming point of view.

Before we start pasting code chunks all over the place, here is the summary: I do think Python makes my life easier when plotting graphs.

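# pheatmap in R
library(DESeq2)    # provides colData(), counts() and assay()
library(pheatmap)
# dds, vsd and sampleTable are assumed to exist already from the DESeq2 steps
df <- as.data.frame(colData(dds)[, "group"])
# keep the 200 genes with the highest mean normalized count
select <- order(rowMeans(counts(dds, normalized = TRUE)), decreasing = TRUE)[1:200]
pheatmap(assay(vsd)[select, ],
         color = colorRampPalette(c("navy", "white", "red"))(100),
         cluster_rows = TRUE,
         show_rownames = FALSE,
         show_colnames = TRUE,
         cluster_cols = FALSE,
         labels_col = paste0(sampleTable$sample, sampleTable$group),
         border_color = NA)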

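# Boxplot in Python
import matplotlib.pyplot as plt
import seaborn as sns
# df_metadata is assumed to be a pandas DataFrame that was loaded earlier
plt.figure(figsize=(20, 6))
sns.boxplot(data=df_metadata, x="species", y="top_frequency")
plt.xticks(rotation=90)  # rotate the x-axis labels so they do not overlap
plt.show()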

These are two totally different graphs, but what I want to highlight is how much more intuitive it is to plot a graph in Python than in R. In R, you need to specify all the parameters in the plotting function in one go, whereas in Python you can execute the steps one by one. Of course, in R you can copy the whole function call and change the parameters each time you rerun the plot, but Python undoubtedly wins on readability.

Anonymous functions in R and Python work quite similarly, but they look different, and the R version is a bit daunting to me.

1 %>% {.+1}
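For comparison, here is a rough sketch of what the same idea could look like in Python. The R line above pipes the value 1 into an unnamed function that adds 1 (the `%>%` pipe comes from the magrittr package). The names `add_one`, `result`, `inline_result` and `incremented` are mine, made up only for this illustration.

```python
# Python counterpart of the R expression `1 %>% {.+1}`:
# an anonymous function that adds 1, applied to the value 1.
add_one = lambda x: x + 1  # named here only so it is easy to reuse
result = add_one(1)
print(result)  # 2

# The same function applied inline, without ever giving it a name.
inline_result = (lambda x: x + 1)(1)

# Anonymous functions are most often handed to another function, such as map().
incremented = list(map(lambda x: x + 1, [1, 2, 3]))
print(incremented)  # [2, 3, 4]
```

Python has no built-in pipe operator, so the value is passed as an ordinary argument instead of being piped in from the left.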

For these and many other reasons, Python is easier to learn than R, which explains its popularity. So if you want to know both, I would recommend starting with R, because that will make your Python learning journey a bit easier.

It is easier to pick up Python if you know R beforehand.

Familiarity with the Linux command line is also an advantage in RNA-seq and in data science in general.

