forked from mccallpitcher/LearnR-Part2
---
title: "Learn R Part II In-class Exercises"
format: html
editor: visual
---
Welcome to Quarto! This is where you will try out all of the hands-on exercises in the workshop. Begin by running these first two code chunks:
```{r}
# load packages
library(tidyverse)
```
```{r}
# load data
tswift <- read_csv("data/taylor_swift_spotify.csv")
nat_parks <- read_csv("data/nat_parks_visitors.csv")
```
## 00. Quick review
Using `if_else()`, create a new variable in `tswift` that indicates whether a song is "long" or "short". Name the variable `long_short`.
Songs are considered "long" if `duration_ms` is greater than 250000.
```{r}
# create long_short variable
tswift <- tswift |>
  mutate(long_short = if_else(duration_ms > 250000,
                              "long",
                              "short"))
```
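One detail the exercise does not cover is what `if_else()` does when the condition is `NA`. The sketch below uses a small made-up tibble (the `songs` data and the `"unknown"` label are assumptions for illustration, not part of the workshop data) to show the optional `missing` argument:

```r
library(dplyr)

# Hypothetical toy data, not the real tswift data frame
songs <- tibble(
  name = c("Song A", "Song B", "Song C"),
  duration_ms = c(300000, 200000, NA)
)

# A missing duration makes the comparison NA;
# `missing` sets the label used in that case
songs <- songs |>
  mutate(long_short = if_else(duration_ms > 250000,
                              "long",
                              "short",
                              missing = "unknown"))

songs
```

Without `missing`, the third row would simply get `NA` in `long_short`.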
## 01. How do I aggregate by collapsing?
Using `group_by()` and `summarise()`, calculate average `danceability` by `long_short`.
On average, are Taylor Swift's longer or shorter songs more "danceable"?
***Shorter***
```{r}
tswift |>
  group_by(long_short) |>
  summarise(avg_danceability = mean(danceability))
```
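To see what "collapsing" means, it may help to run the same pattern on a tiny made-up tibble. The `songs` data and the extra `n_songs` column are assumptions for illustration only:

```r
library(dplyr)

# Hypothetical toy data: five songs, two groups
songs <- tibble(
  long_short = c("long", "long", "short", "short", "short"),
  danceability = c(0.50, 0.60, 0.70, 0.80, 0.90)
)

# summarise() returns one row per group, so five rows collapse to two
by_length <- songs |>
  group_by(long_short) |>
  summarise(avg_danceability = mean(danceability),
            n_songs = n())

by_length
```

Adding `n()` alongside the mean is a cheap way to check how many rows each average is based on.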
## 02. How do I aggregate *without* collapsing?
Alter the `tswift` data frame to add a variable that calculates average acousticness by album (without collapsing).
Bonus: Can you determine if the song "Cruel Summer" is more or less acoustic than the Lover album average?
***Less acoustic (.12 vs. .33)***
```{r}
# add average acousticness by album
tswift <- tswift |>
  group_by(album) |>
  mutate(avg_acoustic = mean(acousticness)) |>
  # ungroup so later steps are not silently grouped by album
  ungroup()
# find Cruel Summer
tswift |>
  select(name, acousticness, avg_acoustic) |>
  filter(name == "Cruel Summer")
```
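The contrast with `summarise()` is easier to see on a small example. The sketch below uses made-up albums and values (all names here are hypothetical) and adds a comparison column so each song can be checked against its own album average:

```r
library(dplyr)

# Hypothetical toy data: two albums, two songs each
songs <- tibble(
  album = c("A", "A", "B", "B"),
  name = c("a1", "a2", "b1", "b2"),
  acousticness = c(0.10, 0.30, 0.60, 0.80)
)

# mutate() after group_by() keeps every row and repeats
# the group average on each row of that group
songs <- songs |>
  group_by(album) |>
  mutate(avg_acoustic = mean(acousticness),
         vs_album = if_else(acousticness < avg_acoustic,
                            "below average",
                            "at or above average")) |>
  ungroup()

songs
```

The result still has four rows, one per song, which is what "without collapsing" refers to.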
## 03. How do I tidy data?
Pivot the `nat_parks` data frame longer so that year and visitors each get their own column.
Hint: Pivot the year columns only. To specify them, you can use either of these structures:
`cols = start:stop`
`cols = -c(column1, column2)`
```{r}
# pivot data long
nat_parks_long <- nat_parks |>
  pivot_longer(cols = 3:7,
               names_to = "year",
               values_to = "visitors")
nat_parks_long
```
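Both ways of specifying `cols` from the hint can be compared on a small made-up table. The `parks_wide` data below is hypothetical, and the `names_transform` step is an optional extra (not part of the exercise) that converts the year labels from text to integers:

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one column per year
parks_wide <- tibble(
  park = c("Park A", "Park B"),
  state = c("XX", "YY"),
  `2019` = c(100, 200),
  `2020` = c(110, 210)
)

# Structure 1: a start:stop range of the year columns
by_range <- parks_wide |>
  pivot_longer(cols = `2019`:`2020`,
               names_to = "year",
               values_to = "visitors")

# Structure 2: everything except the identifier columns,
# with year converted from character to integer
by_exclusion <- parks_wide |>
  pivot_longer(cols = -c(park, state),
               names_to = "year",
               values_to = "visitors",
               names_transform = list(year = as.integer))

by_exclusion
```

Both calls select the same columns. By default `year` comes out as a character column because it is built from column names, which matters if you later want to plot or filter by year as a number.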