-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Max found that the level data doesn't look quite right as the rainfall was starting an hour after the water level increased (Ferko):
Per Monica:
"Ok, this is an OW whose data was added by the OW script in August, and covers a DST boundary in November 2024. This is incorrectly handled - it's supposed to show the 1 AM hour on 11/02/2024 twice - once with a (sprung-forwards) -04 offset and once with a -05 offset to represent the fall-back of DST ending. "
"The edit to the import code that I identified earlier in the day for Max's Ferko import fixes this issue. The solution to this data problem would be for me to delete all the data from the tbl_ow_leveldata_raw table with a date_added field after the DST migration and reimport it via the OW script. The tbl_gw_depthdata_raw table doesn't have a date_added field, so for that, I would need to delete and reimport everything"
Current code
marsMaintenanceScripts/01_localscripts/02_update_ow_data.R
Lines 100 to 112 in 8e14eeb
| accessdb_newdata <- sqlQuery(accessdbCon, accessdb_query, as.is = TRUE) %>% | |
| select(dtime = 1, level_ft = ncol(.)) %>% #dtime is the first column, level is the last | |
| mutate(dtime = ymd_hms(dtime, tz = "America/New_York"), level_ft = as.numeric(level_ft)) %>% #Data comes in as plain text from RODBC | |
| filter(dtime > accessdb_latestdates$dtime[i]) %>% #We still need to filter by > the latest date because Access will treat values with fractional seconds as > values without fractional seconds. When R recieves them, though, we get them without the fractional seconds, so from our perspective, we have a value that is = the latest date. This is very silly. | |
| mutate(secondbump = (second(dtime) == 59)) %>% #Calculate whether the seconds place needs to be bumped +1 second | |
| mutate(dtime = as.POSIXct(ifelse(secondbump, dtime + dseconds(1), dtime), tz = "America/New_York")) %>% | |
| filter(dtime > accessdb_latestdates$dtime[i]) %>% #We still need to filter by > the latest date because Access will treat values with fractional seconds as > values without fractional seconds. When R recieves them, though, we get them without the fractional seconds, so from our perspective, we have a value that is = the latest date. This is very silly. | |
| arrange(dtime) %>% #Order by ascending datetime in case it's out of order in the DB | |
| mutate(ow_uid = accessdb_latestdates$ow_uid[i]) %>% #Attach OW UID to the data | |
| mutate(key = paste(ow_uid, dtime, sep = "_"), | |
| dupe = duplicated(key)) %>% #Sometimes there are duplicates in the Access DBs | |
| filter(dupe == FALSE) %>% #Remove the dupe rows | |
| select(-key, -dupe, -secondbump) #Remove the key columns |
Initial Proposed fix:
Change time zone to EST when converting char to datetime
accessdb_newdata <- sqlQuery(accessdbCon, accessdb_query, as.is = TRUE) %>%
select(dtime = 1, level_ft = ncol(.)) %>% #dtime is the first column, level is the last
mutate(dtime = ymd_hms(dtime, tz = "EST"), level_ft = as.numeric(level_ft)) %>% #Data comes in as plain text from RODBC