Namespace issues
Most of the functionality in R comes from additional packages that you load. Sometimes two packages have a function with the same name that does different things. When several loaded packages share a function name, R uses the function from the package you loaded most recently, which masks the earlier ones. As you can imagine, this can create problems. If you are lucky, you get an error message; if you are unlucky, your code runs but gives an unexpected result.
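The lookup rule can be sketched with base R alone. This is only an illustration under an assumption of mine: fake_pkg is a hypothetical stand-in environment, not a real package, and I attach it by hand to mimic what library() does when it puts a newly loaded package at the front of the search path.

```r
# Hypothetical stand-in for a package: an environment with its own median()
fake_pkg <- new.env()
fake_pkg$median <- function(x, ...) "median from fake_pkg"

# attach() places it at position 2 of the search path, ahead of package:stats,
# which is also where library() puts a newly loaded package
attach(fake_pkg, name = "fake_pkg")

found_in <- find("median")             # every attached entry that defines median
masked   <- median(c(1, 5, 9))         # bare name: the most recently attached wins
explicit <- stats::median(c(1, 5, 9))  # explicit namespace: the stats version

print(found_in)
print(masked)
print(explicit)

# Clean up so the stand-in no longer masks stats::median
detach("fake_pkg", character.only = TRUE)
```

The bare call returns the stand-in's string while the stats:: call still computes the real median, which is the same situation two real packages create when they share a function name.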
Let me give you an example. I always load the dplyr package. Look what happens when I use summarize to calculate the mean sepal length by species.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
##     filter, lag
## The following objects are masked from 'package:base':
##
##     intersect, setdiff, setequal, union

iris %>%
  group_by(Species) %>%
  summarize(sepal_length_mean = mean(Sepal.Length))
## # A tibble: 3 x 2
##   Species    sepal_length_mean
##   <fct>                  <dbl>
## 1 setosa                  5.01
## 2 versicolor              5.94
## 3 virginica               6.59
Say that I then realise that I need the Hmisc package and load it. Look what happens when I rerun the same code as above.
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
##
##     src, summarize
## The following objects are masked from 'package:base':
##
##     format.pval, units

iris %>%
  group_by(Species) %>%
  summarize(sepal_length_mean = mean(Sepal.Length))
## Error in summarize(., sepal_length_mean = mean(Sepal.Length)): argument "by" is missing, with no default
R is now using the summarize function from the Hmisc package, and I get an error because that function expects different arguments (it requires a by argument that my code does not supply). The best way to solve this problem is to use the :: operator. Writing packagename::functionname tells R which package to get the function from.
iris3 <- iris %>%
  group_by(Species) %>%
  dplyr::summarize(sepal_length_mean = mean(Sepal.Length))
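If you are not sure which package a bare function name currently points to, you can ask R before deciding where to add ::. Here is a minimal sketch using only base R; where_from is a hypothetical helper name of my own, not something from dplyr or Hmisc, and it will not give a useful answer for primitive functions such as sum.

```r
# Hypothetical helper: report the namespace a bare function name resolves to
where_from <- function(fn_name) {
  # Look the name up along the search path, keeping only functions
  fn <- get(fn_name, mode = "function")
  # Functions defined in a package carry that package's namespace
  environmentName(environment(fn))
}

print(where_from("median"))
print(where_from("head"))
```

In the session above, with Hmisc attached after dplyr, I would expect where_from("summarize") to report Hmisc, which is the signal to write dplyr::summarize instead.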