
Programming: An Essential Skill for Scientists

Introduction

In recent years, there’s been an admirable push to get more people to learn programming. But if I’ve never been exposed to programming, why should I invest the effort to learn? What’s in it for me? When I ran into the data limitations of spreadsheets (Brown, 2001) and found myself repeating the same process over and over again, I decided to learn to program. I started with R, a statistical and graphical software environment (James, Witten, Hastie, & Tibshirani, 2013). I then moved to Python (Langtangen, 2009) and later to Matlab (Trauth, 2015). Now I do nearly all of my data work in R. Trust me, because I have been there: the little programming skill I gained by learning R has helped me do so much more with data than I ever could with out-of-the-box software.

The value of knowing how to program became apparent recently when I submitted a manuscript to a peer-reviewed journal. After a month of review, the editors returned the reviewers' feedback with major revisions to be completed within 21 days. Because I had analysed the data and prepared the manuscript in plain-text rmarkdown, developed by Allaire and his colleagues (2018), revising it was straightforward. Although the revision affected 85 percent of the original manuscript, it took me only three days to address all the major issues. After resubmission, the journal's in-house check returned the revised version simply because I had not included the data underlying the findings. They asked me to provide these data as supporting information within three weeks. I instantly uploaded the data along with the script[1] containing the code used for the manuscript. That was easy for me because I had coded the work, and I found myself relaxed: a glimpse of what programming offers! I now realize how much more productive I have become at work with these programming skills and how they make my routine easy, consistent, and reproducible.

A programming language like R (R Core Team, 2018) can look cryptic at first. That is true of any language: you cannot immediately start a conversation. You must learn the basics first, and practice is what makes you fluent. Honestly, I found learning to program hard! There is definitely a learning curve. However, putting in a little time is rewarding in the long run because it makes routine tasks much easier. Routine tasks that used to take hours of work can now be done within a few minutes with code, and that code can be reused to solve other tasks as well. The other good thing I found about programming is that once you master one language, it is much easier to learn others because they share similar building blocks with minor adjustments.

In summary, knowing how to code can make you up to 10 times faster in your daily routine, because programming helps you automate the tedious tasks of cleaning and preparing data that you would otherwise need to do by hand. If you know how to program, computer-related tasks that used to take you a week to finish may now take only a few hours. What a relief!

Programming stretches your brain and allows you to discover more creative solutions than your colleagues who don’t know how to program. It lets you go beyond simply using the tools and data sets that everyone else around you uses, to transcend the limitations that your peers are stuck with. For example, you’ll be able to write programs to automatically acquire data from new sources, to clean, reformat, and integrate that data with your existing data, and to implement far more sophisticated analyses than your colleagues who can only use pre-existing tools.
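To make the "clean, reformat, and integrate" idea a little more concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made up for illustration: the two small data sets, the column names (station, temp_c, salinity), and the helper functions read_clean and merge are hypothetical and do not come from my manuscript. It is one possible way to do this kind of task, not a recommended recipe.

```python
import csv
import io

# Hypothetical raw data, standing in for two files exported from a spreadsheet.
RAW_TEMPERATURE = """station,temp_c
 Alpha ,26.5
beta,NA
GAMMA,28.1
"""

RAW_SALINITY = """station,salinity
alpha,35.1
Gamma,34.8
delta,35.4
"""


def read_clean(text, value_column):
    """Return {station: value}, skipping rows whose value is missing."""
    cleaned = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Normalize inconsistent spelling and stray spaces in station names.
        station = row["station"].strip().lower()
        value = row[value_column].strip()
        if value in ("", "NA"):  # drop missing measurements
            continue
        cleaned[station] = float(value)
    return cleaned


def merge(temperature, salinity):
    """Keep only stations present in both data sets, sorted by name."""
    shared = sorted(temperature.keys() & salinity.keys())
    return [(name, temperature[name], salinity[name]) for name in shared]


temperature = read_clean(RAW_TEMPERATURE, "temp_c")
salinity = read_clean(RAW_SALINITY, "salinity")
merged = merge(temperature, salinity)

for station, temp_c, sal in merged:
    print(f"{station}: {temp_c} deg C, salinity {sal}")
```

Once steps like these live in a script, rerunning them on next month's data means changing the input, not repeating the manual work.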

Cited resources

Allaire, J., Xie, Y., McPherson, J., Luraschi, J., Ushey, K., Atkins, A., … Chang, W. (2018). rmarkdown: Dynamic documents for R. Retrieved from https://CRAN.R-project.org/package=rmarkdown

Brown, A. M. (2001). A step-by-step guide to non-linear regression analysis of experimental data using a Microsoft Excel spreadsheet. Computer Methods and Programs in Biomedicine, 65(3), 191–200.

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). Springer.

Langtangen, H. P. (2009). A primer on scientific programming with Python (Vol. 2). Springer.

R Core Team. (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/

Trauth, M. (2015). MATLAB® recipes for earth sciences (4th ed.). Berlin, Heidelberg: Springer.


  1. R script that automates the execution of the analysis for the manuscript.