---
title: "Data Management Setup Project"
author: "Lauren Yee"
date: "15/07/2020"
output: word_document
---


```{r setup, include=FALSE}
# Set figure options globally so they apply to every chunk, not only this one.
knitr::opts_chunk$set(echo = TRUE, fig.path = "Figs/", dev = "png", dpi = 300)
```

# Setting up an R Project

One of the first steps of every workflow should be to set up a Project within RStudio. A Project is the home for all of the files, images, reports, and code that are used in any given project. Note that when we capitalize the word Project, we’re referring to a specific setup within RStudio, while we refer to general projects that you might work on with the lowercase project.

We use Projects because they create a self-contained folder for a given analysis in R. This means that if you want to share your Project with a colleague, they will not have to reset file paths (or even know anything about file paths!) in order to re-run your analysis.

Furthermore, even if the only person you ever collaborate with is a future version of yourself, using a Project for each of your analyses will mean that you can move the Project folder around on your computer, or even move it to a new computer, and remain confident that the analysis will run in the future (at least in terms of file path structures).
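
To make the point about file paths concrete, here is a minimal sketch. It assumes the working directory is the Project folder (which is what RStudio sets when you open a Project) and uses a hypothetical `data` folder containing a hypothetical file called `survey.csv`.

```r
# Relative path: resolved from the Project folder, wherever that folder lives.
data_path <- file.path("data", "survey.csv")

# The absolute path R would use right now. This part changes when the
# Project folder is moved; the relative path above does not.
full_path <- file.path(getwd(), data_path)

# Read the file only if it exists, so the sketch runs anywhere.
if (file.exists(data_path)) {
  survey <- read.csv(data_path)
}
```

Because the script only ever mentions `data/survey.csv`, a colleague who receives the whole Project folder can run it without editing any paths.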


Creating a Project is one of the first steps in working on an R-based data science project in RStudio. To create a Project you will need to first open RStudio.

From within RStudio, follow these steps:

1. Click on File.
2. Select New Project.
3. Choose New Directory.
4. Click on New Project.
5. Enter your Project’s name in the box that says “Directory name.” We recommend choosing a Project name that helps you to remember that this is a project that involves data management and cleaning. Avoid using spaces in your Project name, and instead separate words with hyphens or underscore characters.
6. Choose where to save your Project by clicking on “Browse” next to the box labeled “Create project as a subdirectory of:”. If you are just using this to learn and to test out creating a Project, consider placing it in your downloads or another temporary directory so that you remember to remove it later.
7. Click “Create Project”.
    
    
![](./Figures/rstudio-project-1.png)    
    

At this point, you should have a Project that will serve as a place to store any .R scripts that you create as you work through this text. If you’d like more practice, take a few moments to set up a couple of additional Projects by following the steps listed above. Within each Project, add and save .R scripts. Since this is just for practice, feel free to delete these Projects once you have the hang of the procedure.


# Arranging your Markdown document

- Start each program with a description of what it does.
- Then load all required packages.
- Consider what working directory you are in when sourcing a script.
- Use comments to mark off sections of code.
- Put function definitions at the top of your file, or in a separate file if there are many.
- Name and style code consistently.
- Break code into small, discrete pieces.
- Factor out common operations rather than repeating them.
- Keep all of the source files for a project in one directory and use relative paths to access them.
- Keep track of the memory used by your program.
- Always start with a clean environment instead of saving the workspace.
- Keep track of session information in your project folder.
- Have someone else review your code.
- Use version control.
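
Several of these points can be seen together in a short script skeleton. The sketch below is only an illustration: the data, the `standard_error()` helper, and the file name `session-info.txt` are all made up for this example.

```r
# Purpose: summarise a small set of example measurements.
# (Hypothetical script; the data and names are made up for illustration.)

# ---- Packages ----
library(stats)  # ships with R; list the packages your own script needs here

# ---- Functions ----
# Standard error of the mean for a numeric vector.
standard_error <- function(x) {
  sd(x) / sqrt(length(x))
}

# ---- Analysis ----
measurements <- c(2, 4, 4, 4, 5, 5, 7, 9)
measurement_mean <- mean(measurements)
measurement_se <- standard_error(measurements)

# ---- Session information ----
# Written to a temporary folder here; in a real Project you might use a
# relative path such as "session-info.txt" instead.
session_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(sessionInfo()), session_file)
```

The description comes first, packages and functions are defined before they are used, and each section is marked with a comment so the file is easy to scan.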
    

# Style / Naming Conventions

Object names must start with a letter and can only contain letters, numbers, `_`, and `.`.
You want your object names to be descriptive, so you’ll need a convention for multiple words. I recommend snake_case, where you separate lowercase words with `_`.


    i_use_snake_case
    
    otherPeopleUseCamelCase
    
    some.people.use.periods
    
    And_aFew.People_RENOUNCEconvention
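
One way to check whether a name follows R's rules is to compare it with the output of `make.names()`, which rewrites anything that is not a valid name. This is a small sketch; the last two candidate names are made-up examples of names that break the rules.

```r
candidate_names <- c(
  "i_use_snake_case",
  "otherPeopleUseCamelCase",
  "some.people.use.periods",
  "2nd_attempt",  # starts with a number
  "has space"     # contains a space
)

# make.names() changes any name that is not valid, so a name is
# valid as written only if it comes back unchanged.
is_valid_name <- candidate_names == make.names(candidate_names)
names(is_valid_name) <- candidate_names
is_valid_name
```

The first three names pass, while the last two are flagged because `make.names()` has to alter them.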

# R Markdown

Open this document in RStudio to examine how it is structured in R Markdown and how it is rendered to Word. I recommend viewing RStudio and the Word document side by side.

All R Markdown documents end in **.Rmd** as opposed to **.R**. The block at the start of an R Markdown file is called the YAML header. Here you can specify the title and other metadata for your file, as well as the output format to generate, such as a PDF, Word file, or HTML document.

Each "chunk" of R code, delimited by three backticks, can be executed independently, and visualizations are generated inline.

```{r default,echo=FALSE,out.width="90%",out.extra=""}
knitr::include_graphics('./Figures/markdown.png')


```

# Adding chunks

To add a new code chunk, press *Cmd+Option+I* (*Ctrl+Alt+I* on Windows), or click the *Insert* button at the top of this document, then select *R*. R Markdown will add a new, empty chunk at your cursor's location.

Try making a code chunk below:



Examine the chunks below:


```{r}
# Sometimes you might want to run only some of the code 
# in a code chunk. To do that, highlight the code to 
# run and then press Cmd + Enter (Control + Enter on 
# Windows). If you do not highlight any code, R will 
# run the line of code that your cursor is on.
# Try it now. Run mean(1:5) but not the line below.
mean(1:5)
warning("You shouldn't run this!")
```

```{r}
# You can click the downward facing arrow to the left of the play button to run
# every chunk above the current code chunk. This is useful if the code in your
# chunk uses objects that you made in previous chunks.
# Sys.Date()
```

Did you notice the green lines in the code chunk above? They are *code comments*, lines of text that R ignores when it runs the code. R will treat everything that appears after `#` on a line as a code comment. As a result, if you run the chunk above, nothing will happen—it is all code comments (and that's fine)!

Remove the `#` on the last line of the chunk above and then rerun the chunk. Can you tell what `Sys.Date()` does?

By the way, you only need to use code comments _inside_ of code chunks. R knows not to try to run the text that you write outside of code chunks.


# Chunk Options

`eval = FALSE` prevents code from being evaluated. This is useful for displaying example code, or for disabling a large block of code without commenting each line.

```{r, eval = FALSE}
mean(1:5)
```


`include = FALSE` runs the code, but doesn’t show the code or results in the final document. 

```{r, include = FALSE}
mean(1:5)
```


`echo = FALSE` prevents the code, but not the results, from appearing in the finished file. Use this when writing reports aimed at people who don’t want to see the underlying R code, or to show a figure generated by `ggplot2` without the code that produced it.

```{r, echo=FALSE}
mean(1:5)
```
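
The options above are set one chunk at a time. They can also be set once for every chunk with `knitr::opts_chunk$set()`, which is what the setup chunk at the top of this file does for `echo`. The sketch below assumes the `knitr` package is installed and is guarded so that it does nothing otherwise.

```r
# Guarded so the sketch also runs where knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Hide code by default in every chunk that follows.
  knitr::opts_chunk$set(echo = FALSE)

  # Look up the current default to confirm the change.
  default_echo <- knitr::opts_chunk$get("echo")
}
```

An option written in an individual chunk header still overrides the global default for that chunk.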

# Themes in R Markdown

There are built-in themes you can explore in R Markdown that can be changed through the YAML header (more on this in future sessions).
Note that there are many ways to embed images in R Markdown; this is just one of them. Also note the **relative path** used to reach our Figures directory.

Below is the default R markdown theme.

```{r markdown-output, out.width="90%"}

knitr::include_graphics("./Figures/default_markdown.PNG")

```

# Change Theme

By changing the YAML header theme to 'darkly' we get a different look.

```{r markdown-output2}

knitr::include_graphics("./Figures/darkly_yaml.png")

```

```{r darkly}
knitr::include_graphics("./Figures/darkly.png")
```
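
In text form, a theme change of this kind might look like the fragment below. This is a sketch that assumes the document is knit to HTML with `html_document`, since `theme` is an option of that output format rather than of `word_document`.

```yaml
output:
  html_document:
    theme: darkly
```

Indentation matters in YAML: `theme` is nested under `html_document`, which is nested under `output`.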


# Text formatting

Have you noticed the funny highlighting that appears in this document? R Markdown treats text surrounded by *asterisks*, **double asterisks**, and `backticks` in special ways. It is R Markdown's way of saying that these words are in

- _italics_
- *also italics*
- **bold**, and
- `code font`

`*`, `**`, and \` are signals used by a text editing format known as `markdown`. R Markdown uses `markdown` to turn your plain-looking .Rmd documents into polished reports. Let's give that a try.

# Reports

When you click the `knit` button at the top of an R Markdown file (like this one), R Markdown generates a polished copy of your report. R Markdown:

1. Transforms all of your markdown cues into actual formatted text (e.g. bold text, italic text, etc.).
2. Reruns all of your code chunks in a clean R session and appends the results to the finished report.
3. Saves the finished report alongside your .Rmd file.

Click the *Knit* button at the top of this document or press *Cmd+Shift+K* (*Ctrl+Shift+K* on Windows) to render the finished report. The RStudio IDE will open the report so you can see its contents. Because the YAML header of this file sets `output: word_document`, the report will be a Word file. Try clicking *Knit* now.
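
Rendering can also be started from the R console instead of the button. The sketch below assumes the `rmarkdown` package is installed and uses a hypothetical file name, `my-report.Rmd`; it does nothing if either is missing.

```r
# Hypothetical file name; replace it with the path to your own .Rmd file.
report_source <- "my-report.Rmd"

# Render only when rmarkdown is available and the file exists,
# so the sketch is safe to run anywhere.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(report_source)) {
  rmarkdown::render(report_source)
}
```

With no other arguments, `rmarkdown::render()` uses the output format named in the file's YAML header.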