-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path06-extras.Rmd
90 lines (53 loc) · 5.64 KB
/
06-extras.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
---
title: "Extras"
author: "Data Carpentry contributors"
layout: topic
---
------------
> ## Learning Objectives
>
> * Describe what a R add-on package is
> * Install a package from CRAN
> * Install a package from Bioconductor
> * Learn how to manage package conflicts
> * Learn about other ways to practice R
------------
# Add-on packages
## CRAN
In the last two lessons, we used an add-on package called `dplyr`. R is open source, so anyone can write their own functions for particular tasks. When you want to make these functions available to other people, you can develop them into a package with extra documentation like help pages for each function and a guide on how to put all the functions together in a typical analysis. If your package meets certain requirements, you can submit it to the Comprehensive R Archive Network (CRAN), who will host it and make it available to anyone in the world. CRAN currently has over [15,000 packages](http://cran.mtu.edu/web/packages/index.html)! To get any one of them, use R's built in install.packages function:
```{r, eval = FALSE, purl = FALSE}
install.packages("swirl") ## install a package to help learn R
```
You might get asked to choose a CRAN mirror -- this is basically asking you to choose a site to download the package from. The choice doesn't matter too much. I'd recommend choosing the RStudio mirror or one that is geographically close. You might also notice that more than one package gets downloaded. `install.packages()` checks to see if the package depends upon additional packages and downloads them as well.
## Bioconductor
Besides CRAN, there are other repositories for R packages. [Bioconductor](http://bioconductor.org/) provides packages for analyzing genomic data. They currently have over [1,800 software packages](http://bioconductor.org/packages/release/BiocViews.html#___Software) and 900 Annotation packages. To install a package from Bioconductor, you cannot use install.packages with default options. Instead, Bioconductor submitted their own small package to CRAN with a function to download packages from either Bioconductor or CRAN.
```{r, eval = FALSE, purl = FALSE}
install.packages("AnnotationDbi") #this should throw an error
install.packages("BiocManager") ## Bioconductor's CRAN package
BiocManager::install("AnnotationDbi")
```
## Package conflicts
What was that `::` in the line of code above? It is a way to call the function `install` that is specifically from the `BiocManager` package. You can imagine that with 13,000 + 1,500 software packages, the same simple function names get used often. It is also a way to use a function without loading its package first. Remember from the [dplyr lesson](04-dplyr.html), in order to use the functions we had to first load the dplyr package with the `library` function. *Every* time you open a new R session, you have to load packages from your library to use the functions unless you use the `::` designation. The `::` can also come in handy when you have two packages loaded that share a function name. Try this:
```{r, eval = FALSE, purl = FALSE}
library(dplyr)
library(AnnotationDbi)
```
If you already had the dplyr package loaded this session, then the first line may not produce any output in the console. But the second line should produce a lot of output, much of it similar to this:
```
The following object is masked from ‘package:dplyr’:
select
```
"Masked" means that since `AnnotationDbi` was loaded second, using the function `select` will typically result in using the one from `AnnotationDbi` instead of `dplyr`. Ask for help on `select` in the following ways:
```{r, eval = FALSE, purl = FALSE}
?select #R gives both options
?dplyr::select #takes you to dplyr's help page
?AnnotationDbi::select #takes you to AnnotationDbi's help page
```
If the order of loading packages had been reversed, then the `select` function from `dplyr` would mask the one from `AnnotationDbi`. This package confict becomes an issue when you need more and more packages for an analysis. Package maintainers can mitigate conflicts by writing their functions in a way that tells R to check the class of object given to the function and then select the correction function.
# More R practice
We have covered many of the basics of R, but because R is a language, you will not become fluent unless you get lots of practice. In addition to this workshop, there are many other ways to learn R. One excellent way is using the `swirl` package that we installed above. `swirl` uses self-paced, interactive lessons directly in R to cover many more topics that we have time to cover in this workshop. The only thing you have to remember to use swirl is to load the package with the `library` function:
```{r, eval = FALSE, purl = FALSE}
library(swirl)
```
Just read the output and do what it says. It will wait for you to type in a response and hit enter. It then covers some basics of how it works. There are several different courses and I suggest to start with `1: R Programming: The basics of programming in R`. Once the course installs, choose `1: R Programming`, then from the lessons start with `1: Basic Building Blocks`. Each lesson is interactive, giving you some information or asking a question and then waiting for your response. It checks to make sure your response is correct, and if not, will ask you to try again. These are great to work through on your own, at your own pace.
Another excellent way to get more R practice is to run through Software Carpentry's [Programming with R](https://swcarpentry.github.io/r-novice-inflammation/) lessons. It's focus is more on general programming, writing functions, for loops, if/else statements, writing packages and writing dynamic reports.