By now, even remote villages on uncharted islands in the Pacific know that the U.S. is in the midst of a protracted partial government shutdown. It’s having real impacts on the lives of Federal government workers, but they aren’t the only ones affected. Much of the interaction Federal agencies have with the populace takes place online, and the gateway to most of these services/information is a web site.
There are Federal standards that require U.S. government web sites to use SSL/TLS certificates, and those certificates have something in common with, say, a loaf of bread you buy at the store: they expire. In all but the best of orgs (or among we zany folks who use Let’s Encrypt and further propel internet denizens into a false sense of safety & privacy), renewing certificates involves manual labor/human intervention. For a good chunk of U.S. Federal agencies, those particular humans aren’t around. If a site’s SSL certificate expires and isn’t re-issued, it causes browsers to do funny things, like this:
Now, some of these sites are configured improperly in many ways, including serving pages on both http and https (vs redirecting to https immediately upon receiving an http connection). But, browsers like Chrome will generally try https first and scare you into not viewing the site.
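For the curious, here’s a minimal sketch of how one might check whether a site bounces plain http over to https. It assumes the httr package is installed; `ends_on_https()` and `check_redirect()` are hypothetical helper names, not part of any package, and the 5-second timeout is an arbitrary choice.

```r
# Hypothetical helper: TRUE if a URL uses the https scheme
ends_on_https <- function(final_url) {
  grepl("^https://", final_url, ignore.case = TRUE)
}

# Hypothetical helper: request the plain-http URL, follow any redirects,
# and report whether we ended up on https. Returns NA when the request
# fails outright (which includes a failed TLS handshake after the redirect).
check_redirect <- function(domain, timeout_secs = 5) {
  res <- tryCatch(
    httr::HEAD(sprintf("http://%s/", domain), httr::timeout(timeout_secs)),
    error = function(e) NULL
  )
  if (is.null(res)) return(NA)
  ends_on_https(res$url) # res$url is the final URL after redirects
}
```

Calling `check_redirect()` on a domain gives `TRUE` for a site that redirects, `FALSE` for one that happily keeps serving http, and `NA` when nothing usable came back.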
But, how big a problem could this really be? We can find out with a fairly diminutive R script that:
- grabs a list of Federal agency domains (thanks to the GSA)
- tries to make an SSL/TLS connection (via the openssl package) to the apex domain or the www. prefixed apex domain
- finds the expiration date for the cert
- does some simple date math
I’ve commented the script below pretty well so I’ll refrain from further blathering:
library(furrr)
library(openssl)
library(janitor)
library(memoise)
library(hrbrthemes)
library(tidyverse)

# fetch the GSA CSV:
read_csv(
  file = "https://raw.githubusercontent.com/GSA/data/master/dotgov-domains/current-federal.csv",
  col_types = "ccccccc"
) %>%
  janitor::clean_names() -> xdf

# make openssl::download_ssl_cert calls safer in the event there
# are network/connection issues
.dl_cert cdf
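The script as reproduced here stops at `.dl_cert`, so the per-domain lookup isn’t shown. What follows is a minimal sketch of how the bulleted steps could be done, not a reconstruction of the original code. `cert_expiry()` is a hypothetical helper, base R’s `tryCatch()` is swapped in for whatever safety wrapper the script used, and the sketch assumes the server’s own certificate comes first in the chain returned by `openssl::download_ssl_cert()`.

```r
# Hypothetical helper (not the author's code): return the "not after"
# validity string for a domain, trying the apex first and then the
# www. prefixed name. `fetch` is injectable so the logic can be tried
# out without touching the network.
cert_expiry <- function(domain, fetch = openssl::download_ssl_cert) {
  # Return NULL instead of raising when the connection or handshake fails
  safe_fetch <- function(host) {
    tryCatch(fetch(host, port = 443), error = function(e) NULL)
  }

  for (host in c(domain, paste0("www.", domain))) {
    chain <- safe_fetch(host)
    if (length(chain) > 0) {
      # Assumption: the first cert in the chain is the server's own;
      # its validity field is c(not_before, not_after)
      return(as.list(chain[[1]])$validity[2])
    }
  }

  NA_character_ # neither name answered with a certificate
}
```

Something like `vapply(xdf$domain_name, cert_expiry, character(1))` would then give one expiry string (or `NA`) per domain, at the cost of doing the lookups one at a time rather than in parallel.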
Now, let’s make strings into proper dates, count only the dates starting with the date of the shutdown to the end of 2019 (b/c the reckless human at the helm is borderline insane enough to do that) and plot the timeline:
filter(cdf, !is.na(expires)) %>%
  mutate(
    expires = as.Date(
      as.POSIXct(expires, format="%b %d %H:%M:%S %Y")
    )
  ) %>%
  arrange(expires) %>%
  count(expires) %>%
  filter(
    expires >= as.Date("2018-12-22"),
    expires <= as.Date("2019-12-31")
  ) %>%
  ggplot(aes(expires, n)) +
  geom_vline(
    xintercept = Sys.Date(), linetype="dotted", size=0.25, color = "white"
  ) +
  geom_label(
    data = data.frame(),
    aes(x = Sys.Date(), y = Inf, label = "Today"),
    color = "black", vjust = 1
  ) +
  geom_segment(aes(xend=expires, yend=0), color = ft_cols$peach) +
  scale_x_date(name=NULL, date_breaks="1 month", date_labels="%b") +
  scale_y_comma("# Federal Agency Certs") +
  labs(title = "2019 Federal Agency Shutdown Certpocalypse") +
  theme_ft_rc(grid="Y")
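To see the date handling on its own, here’s a small sketch using made-up expiry strings in the format the certificates report. `expires_to_date()` is a hypothetical helper, and pinning the timezone to UTC is my assumption rather than something the script does: `as.POSIXct()` parses in the local timezone by default while `as.Date()` converts using UTC, so a late-evening expiry could otherwise land on the next day.

```r
# Hypothetical helper: turn a certificate validity string into a Date.
# The trailing " GMT" is ignored by the parser, and %b assumes English
# month abbreviations in the current locale.
expires_to_date <- function(x) {
  as.Date(
    as.POSIXct(x, format = "%b %d %H:%M:%S %Y", tz = "UTC"),
    tz = "UTC"
  )
}

# Made-up sample values, including a domain with no certificate found
expires <- expires_to_date(
  c("Jan 31 23:59:59 2019 GMT", "Mar 15 12:00:00 2020 GMT", NA)
)

# The same window used for the plot: shutdown start through end of 2019
in_window <- !is.na(expires) &
  expires >= as.Date("2018-12-22") &
  expires <= as.Date("2019-12-31")
```

Only the first sample value falls inside the window; the second expires in 2020 and the third never had a date to begin with.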
Now, I’m unwarrantedly optimistic that this debacle could be over by the end of January. How many certs (by agency) could go bad by then?
left_join(cdf, xdf, by=c("who"="domain_name")) %>%
  mutate(
    expires = as.Date(
      as.POSIXct(expires, format="%b %d %H:%M:%S %Y")
    )
  ) %>%
  filter(
    expires >= as.Date("2018-12-22"),
    expires <= as.Date("2019-01-31")
  ) %>%
  count(agency, sort = TRUE)
## # A tibble: 10 x 2
##    agency                                          n
##    <chr>                                       <int>
##  1 Government Publishing Office                    8
##  2 Department of Commerce                          4
##  3 Department of Defense                           3
##  4 Department of Housing and Urban Development     3
##  5 Department of Justice                           3
##  6 Department of Energy                            1
##  7 Department of Health and Human Services         1
##  8 Department of State                             1
##  9 Department of the Interior                      1
## 10 Department of the Treasury                      1
Not every agency is fully shut down, and not all workers in charge of cert renewals are furloughed (or being forced to work without pay). But, this is one other area that shows the possible unintended consequences of making rash, partisan decisions (something both Democrats & Republicans excel at).
You can find the contiguous R code at 2018-01-10-shutdown-certpocalypse.R and definitely try to explore the contents of those certificates.