dygraphs for interactive time series of oceanographic data

Interactive plots help readers cope with dense or detailed plots, such as hourly temperature records plotted over a year. They also give readers tools to explore a plot in detail. R has several packages that turn static plots into interactive ones. In this post, I will show you how to create an interactive time series chart using the dygraphs package developed and maintained by Vanderkam et al. (2018). We first load the packages into the workspace in R (R Core Team 2018). Other packages I need for this task are lubridate (Grolemund and Wickham 2011) for manipulating dates and the tidyverse (Wickham 2017) for data manipulation.

# library() stops with an error if a package is missing; require() only warns
library(tidyverse)
library(lubridate)
library(dygraphs)

Once the packages are loaded, it's time to load the dataset from the working directory into R. The function read.table() is used to import the dataset.

temperature = read.table("./Temperature data/Chumbe_19Jan12_in.txt", sep = "\t", header = TRUE)

Table 1 shows the first six and last six records of the dataset. The first temperature record was measured on 19 January 2012 and the last was captured on 10 September 2012. Notice also that the temperature was recorded at 15-minute intervals.

Table 1: Sample of temperature records at Chumbe
Date Time Temperature
1/19/2012 00:00.0 29.03
1/19/2012 15:00.0 28.85
1/19/2012 30:00.0 28.67
1/19/2012 45:00.0 28.67
1/19/2012 00:00.0 28.48
1/19/2012 15:00.0 28.48
9/10/2012 15:00.0 31.28
9/10/2012 30:00.0 31.67
9/10/2012 45:00.0 29.22
9/10/2012 00:00.0 27.76
9/10/2012 15:00.0 27.22
9/10/2012 30:00.0 27.04

A close look at the Time variable in Table 1 shows that the minutes sit in the position of the hours, and the hours themselves are missing from the dataset altogether. Since we know the start time and the time interval, we can create a new date-time variable, muda, using seq() as shown in the chunk below.

temperature.15min = temperature %>% 
  mutate(date = mdy(Date)) %>% 
  filter(date <= dmy(310812)) %>% 
  mutate(muda = seq(dmy_hms("190112 00:00:00"),
                    dmy_hms("310812 23:59:00"),
                    length.out = n()))
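The chunk above spreads n() timestamps evenly between two end points, so the step is exactly 15 minutes only when the record count happens to match the span. A minimal alternative sketch, assuming the logger recorded every 15 minutes without gaps from the start time, is to fix the step instead of the end point. The record count and the UTC time zone here are placeholders for illustration, not values taken from the dataset.

```r
# Start time as used in the post; the time zone is an assumption
start <- as.POSIXct("2012-01-19 00:00:00", tz = "UTC")

# Hypothetical record count, standing in for n() in the real data
n_records <- 8

# Fix the step at 15 minutes and let the end point follow from the count
muda_exact <- seq(from = start, by = "15 min", length.out = n_records)

# Every step between consecutive timestamps is the same
unique(diff(muda_exact))
```

Inside the mutate() call, the same idea would use length.out = n() in place of the placeholder count.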

Although the dataset now contains date-times in a standard R format, we cannot plug the data frame directly into dygraphs, because the package expects time series data as an xts object (or something convertible to one). The chunk below shows how to convert the dataset into the xts format.

muda.wakati = xts::xts(x = temperature.15min$Temperature, 
                       order.by = temperature.15min$muda)
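To see what the conversion produces, here is a small sketch on a toy data frame. It uses the first six temperature values from Table 1 and assumes they are 15 minutes apart starting at midnight; the name toy and the UTC time zone are made up for this illustration.

```r
library(xts)

# Toy stand-in for temperature.15min: six readings from Table 1,
# with timestamps assumed to be 15 minutes apart
toy <- data.frame(
  muda = seq(as.POSIXct("2012-01-19 00:00:00", tz = "UTC"),
             by = "15 min", length.out = 6),
  Temperature = c(29.03, 28.85, 28.67, 28.67, 28.48, 28.48)
)

# Same call pattern as in the post: values in x, timestamps in order.by
toy.xts <- xts(x = toy$Temperature, order.by = toy$muda)

is.xts(toy.xts)        # check that the result is an xts object
range(index(toy.xts))  # first and last timestamps in the series
```

Checking is.xts() and the index range before plotting is a quick way to catch a mis-parsed date column.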

Now we have the dataset in the form needed to make the interactive chart shown in Figure 1 using the dygraphs package. The chart is built with two options. First, add a rolling-average box with the dyRoller() function. It lets the person interacting with the plot set the averaging period by entering a number in the box at the bottom left of the plot, which smooths the Y values over that number of records. Second, add a range selector to the bottom of the chart with dyRangeSelector(). This allows users to pan and zoom to various date ranges.

dygraph(data = muda.wakati, main = "Sea surface temperature", ylab = "Degrees Celsius") %>%
  dyRoller(rollPeriod = 96) %>%
  dyRangeSelector()

Figure 1: An interactive sea surface temperature time series chart recorded near Chumbe Island, Unguja, in 2012

Pretty simple. Play around with the roll period number on the chart to change the smoothing. Since the records are at 15-minute intervals, to obtain an hourly average you can enter 4 in the box, and if you want a daily smooth, multiply 24 hours by 4 (four records per hour) to obtain 96, which is the value I used in the code above to smooth the line to a daily average.
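The roll period arithmetic can be written out in a few lines of base R. The trailing mean computed with stats::filter() below is only a rough stand-in for what the roller box does in the browser; how dygraphs treats the first few points of a series may differ, so take it as an illustration of the idea rather than a reproduction of the chart.

```r
# Records arrive every 15 minutes
interval_min <- 15
records_per_hour <- 60 / interval_min       # 4 records make one hour
records_per_day  <- 24 * records_per_hour   # 96 records make one day

# First six temperature values from Table 1
x <- c(29.03, 28.85, 28.67, 28.67, 28.48, 28.48)

# Trailing mean over one hour of records (sides = 1 uses past values only);
# the first three results are NA because a full hour is not yet available
hourly <- stats::filter(x, rep(1 / records_per_hour, records_per_hour), sides = 1)
hourly
```

For a sensor that logs at a different interval, changing interval_min gives the matching roll periods.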

We can further highlight the periods with the highest and lowest temperatures with dyShading(), as shown in Figure 2.

dygraph(data = muda.wakati, main = "Sea surface temperature", ylab = "Degrees Celsius") %>%
  dyRoller(rollPeriod = 96) %>%
  dyRangeSelector() %>% 
  dyShading(from = "2012-3-15", to = "2012-4-15", color = "lightpink") %>% 
  dyShading(from = "2012-7-20", to = "2012-8-5", color = "palegreen")

Figure 2: An interactive sea surface temperature time series chart recorded near Chumbe Island, Unguja, in 2012, with the peak periods shaded

As an alternative to dyShading(), which shades a region, we can use dyEvent(), which adds a vertical line to mark the occurrence of a particular event, as shown in Figure 3.

dygraph(data = muda.wakati, main = "Sea surface temperature", ylab = "Degrees Celsius") %>%
  dyRoller(rollPeriod = 96) %>%
  dyRangeSelector() %>% 
  dyEvent(x = c("2012-2-21","2012-4-2", "2012-7-24"), 
          label = c("Kaskazi","Matarahi", "Kusi"), 
          labelLoc = "bottom", color = "red")

Figure 3: An interactive sea surface temperature time series chart recorded near Chumbe Island, Unguja, in 2012. The vertical lines mark the monsoon seasons

Sometimes we may wish to zoom back to the original state of the chart. Rather than double-clicking, you can add dyUnzoom(), which adds a button that resets the zoom whenever the chart is zoomed.

dygraph(data = muda.wakati, main = "Sea surface temperature", ylab = "Degrees Celsius") %>%
  dyRoller(rollPeriod = 96) %>%
  dyRangeSelector() %>% 
  dyEvent(x = c("2012-2-21","2012-4-2", "2012-7-24"), 
          label = c("Kaskazi","Matarahi", "Kusi"), 
          labelLoc = "bottom", color = "red") %>%
  dyUnzoom()

Figure 4: An interactive sea surface temperature time series chart recorded near Chumbe Island, Unguja, in 2012. Zoom in to see the button for unzooming

dygraphs series group

dygraphs lets you add a group of series to a chart. For this example I used primary productivity in the three channels of Tanzania (Pemba, Zanzibar and Mafia) and in the Exclusive Economic Zone (EEZ). A sample of the data is shown in Table 2.

Table 2: Sample of primary Production in Tanzania
Coastal Channels
Date EEZ Mafia Pemba Zanzibar
2009-09-15 671 1159 765 842
2013-09-15 850 1270 861 998
2016-01-15 510 1214 664 789
2004-10-15 710 1329 891 997
2004-11-15 556 1048 676 788
2008-07-15 736 1435 900 980
2015-09-15 892 1220 912 934
2017-08-15 846 1427 939 1008
2003-05-15 531 1188 721 943
2017-12-15 486 981 690 772
2010-12-15 408 906 575 683
2014-09-15 707 1326 687 950

For dygraphs to make a time series group chart, we must reshape the data into a format it recognizes. This involves making a time series object for each group with the ts() function. Once the objects are created, they are combined into a single multi-series object with the cbind() function. The chunk below contains the lines of code for the process.

pp.sites = cbind(EEZ = ts(data = pp.wide$EEZ, start = c(2003,1), frequency = 12),
                 Mafia = ts(data = pp.wide$Mafia, start = c(2003,1), frequency = 12),
                 Pemba = ts(data = pp.wide$Pemba, start = c(2003,1), frequency = 12),
                 Zanzibar = ts(data = pp.wide$Zanzibar, start = c(2003,1), frequency = 12))
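A small sketch on made-up numbers shows what cbind() returns when it is given named ts objects. The series names SiteA and SiteB and the random values are hypothetical and are not the post's data. Note that ts() ignores any date column: it assumes the rows are already sorted and run monthly without gaps from the start given, so the real data would need to meet that assumption.

```r
# Hypothetical monthly values for two made-up series, 24 months each
set.seed(1)
a <- ts(data = round(runif(24, 400, 900)),  start = c(2003, 1), frequency = 12)
b <- ts(data = round(runif(24, 900, 1400)), start = c(2003, 1), frequency = 12)

# Argument names in cbind() become the column (series) names
grp <- cbind(SiteA = a, SiteB = b)

class(grp)     # a multiple time series ("mts") object
colnames(grp)  # series names, later used by dySeries() and dyGroup()
start(grp)     # first year and month
end(grp)       # last year and month
```

Checking start() and end() against the known date range is a cheap way to confirm that the series line up with the calendar.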

The lines of code in the chunk below make the time series group chart shown in Figure 5. The chart is smoothed over twelve months, so it shows the annual variability of primary productivity.

dygraph(data = pp.sites, 
        main = "Primary Production") %>%
  dySeries(name = "EEZ", stepPlot = FALSE, color = "red") %>%
  dyGroup(c("Mafia", "Pemba", "Zanzibar"), drawPoints = FALSE, 
          color = c("blue", "green", "black")) %>%
  dyRoller(rollPeriod = 12) %>%
  dyRangeSelector()

Figure 5: Time series group of primary production in Coastal and Marine Waters of Tanzania

It is clear that the Mafia channel is more productive than the other channels and the EEZ (Figure 5). However, the primary productivity anomalies shown in Figure 6 reveal that all sites share similar patterns of rises and falls. The sea surface temperature anomalies also show an upward trend at all sites between 2003 and 2010, with no clear trend thereafter (Figure 7). Notice that these series were averaged with a roll period of 12, which gives an annual smoothing. If you want to see the detailed monthly anomaly variation, simply replace the value 12 in the lower left corner of the chart with 1. The lines of code for computing the anomalies for each site, making the time series group and plotting it with dygraphs are shown in the chunk below.

## compute anomaly: each value minus the long-term mean of its series
pp.anomaly = pp.wide %>% 
  mutate(eez_an = EEZ-mean(EEZ, na.rm = TRUE),
         mafia_an = Mafia-mean(Mafia, na.rm = TRUE),
         pemba_an = Pemba-mean(Pemba, na.rm = TRUE),
         zenji_an = Zanzibar-mean(Zanzibar, na.rm = TRUE) ) %>% 
  select(date, EEZ = eez_an, Mafia = mafia_an, Pemba = pemba_an, Zanzibar = zenji_an)

## make a group time series
pp.anomaly.sites = cbind(EEZ = ts(data = pp.anomaly$EEZ, start = c(2003,1), frequency = 12),
                         Mafia = ts(data = pp.anomaly$Mafia, start = c(2003,1), frequency = 12),
                         Pemba = ts(data = pp.anomaly$Pemba, start = c(2003,1), frequency = 12),
                         Zanzibar = ts(data = pp.anomaly$Zanzibar, start = c(2003,1), frequency = 12))

## plot anomaly chart
dygraph(data = pp.anomaly.sites, 
        main = "Anomaly of Primary Production") %>%
  dySeries(name = "EEZ", stepPlot = FALSE, color = "red") %>%
  dyGroup(c("Mafia", "Pemba", "Zanzibar"), drawPoints = FALSE, 
          color = c("blue", "green", "black")) %>%
  dyRoller(rollPeriod = 12) %>%
  dyRangeSelector() %>%
  dyCrosshair(direction = "vertical")

Figure 6: Time series group of primary production anomaly in Coastal channels and Marine Waters of Tanzania

Figure 7: Time series group of sea surface temperature anomaly in Coastal channels and Marine Waters of Tanzania
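The anomaly used in the post is simply each value minus the mean of its own series. As a worked illustration, the sketch below applies that to the twelve EEZ values sampled in Table 2. Because these are only sample rows, the mean here is not the mean of the full record, and the resulting numbers will not match Figure 6.

```r
# The twelve EEZ values listed in Table 2 (sample rows only)
eez <- c(671, 850, 510, 710, 556, 736, 892, 846, 531, 486, 408, 707)

# Anomaly: deviation of each value from the mean of the series
eez_mean <- mean(eez, na.rm = TRUE)
eez_an   <- eez - eez_mean

round(eez_an, 1)  # positive values lie above the sample mean, negative below
sum(eez_an)       # deviations from the mean cancel out, up to rounding error
```

Subtracting a single overall mean keeps the seasonal cycle in the anomaly; removing it would need a separate mean for each calendar month, which the post does not do.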

References

Grolemund, Garrett, and Hadley Wickham. 2011. “Dates and Times Made Easy with lubridate.” Journal of Statistical Software 40 (3): 1–25. http://www.jstatsoft.org/v40/i03/.

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Vanderkam, Dan, JJ Allaire, Jonathan Owen, Daniel Gromer, and Benoit Thieurmel. 2018. Dygraphs: Interface to ’Dygraphs’ Interactive Time Series Charting Library. https://CRAN.R-project.org/package=dygraphs.

Wickham, Hadley. 2017. Tidyverse: Easily Install and Load the ’Tidyverse’. https://CRAN.R-project.org/package=tidyverse.