---
title: "Package 'psych'"
subtitle: "Procedures for Psychological, Psychometric, and Personality Research"
author: "Zoren Degtyarev"
date: "2019/05/15 (updated: `r Sys.Date()`)"
output:
  xaringan::moon_reader:
    lib_dir: libs
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---


```{r setup, include=FALSE}
options(htmltools.dir.version = FALSE)
```


---

# Introduction
--

### A toolbox for personality, psychometric theory and experimental psychology

--
- Functions are primarily for multivariate analysis and scale construction using factor analysis, principal component analysis, cluster analysis and reliability analysis, although others provide basic descriptive statistics

---
# Introduction

### Functions for analyzing data at multiple levels include within and between group statistics, including correlations and factor analysis
--

- Several functions serve as a useful front end for structural equation modeling

--

- Graphical displays of path diagrams, factor analysis and structural equation models are created using basic graphics
--

- Some of the functions are written to support a book on psychometric theory as well as publications in personality research

---
# Overview of the psych package
--

###  Developed at Northwestern University to include functions most useful for personality and psychological research

--

- Some of the functions (e.g., read.file, read.clipboard, describe, pairs.panels, error.bars and error.dots) are helpful for basic data entry and descriptive analyses

--
- Use `help(package="psych")` or `objects("package:psych")` for a list of all functions. Three vignettes are included as part of the package: the intro vignette explains how to install psych, and the overview vignette provides examples of using psych in many applications.

--
- Additionally, there are sets of tutorials available on the https://personality-project.org/r webpages.
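
A minimal sketch of browsing the package from the console, assuming psych is already installed. The variable name `fns` is only illustrative.

```r
library(psych)

# ls() on the attached package environment returns the exported object names
# (objects() is the same function under another name)
fns <- ls("package:psych")
length(fns)      # how many objects the package exports
head(fns, 10)    # the first few names, alphabetically

# confirm that some of the functions named on these slides are present
c("describe", "pairs.panels", "error.bars") %in% fns
```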

---

# Overview of the psych package

--
- To load the package, use `library(psych)`

--

### Psychometric applications

--

- Schmid-Leiman transformations (schmid) transform a hierarchical factor structure into a bifactor solution

--

- Functions for determining the number of factors in a data matrix include Very Simple Structure (VSS) and Minimum Average Partial correlation (MAP)
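
As a rough sketch of how VSS and MAP might be used, the chunk below simulates six congeneric items with sim.congeneric and asks for up to three factors. The loadings, sample size and seed are arbitrary choices for illustration, not values from the package documentation.

```r
library(psych)

set.seed(42)
# six items that all load on one latent variable (loadings are made up)
sim <- sim.congeneric(loads = c(.8, .7, .6, .5, .6, .7), N = 500, short = FALSE)

# try 1 to 3 factors; plot = FALSE suppresses the VSS plot
v <- VSS(sim$observed, n = 3, plot = FALSE)

v$map      # MAP criterion for 1, 2 and 3 factors (smaller is better)
v$cfit.1   # VSS fit at complexity 1 for each number of factors
```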


### An alternative approach to factor analysis is Item Cluster Analysis (ICLUST)

--

- Reliability coefficients alpha (scoreItems, score.multiple.choice), beta (ICLUST) and McDonald’s omega (omega and omega.diagram)

--

### Multilevel analyses

--

- May be done with statsBy and multilevel.reliability

---
### The scoreItems and score.multiple.choice functions

--

- Used to form single or multiple scales from sets of dichotomous, multilevel, or multiple choice items by specifying scoring keys
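
A small sketch of scoring one scale with a keys list. The items are simulated with sim.congeneric, and the item names, loadings and the scale name `total` are hypothetical.

```r
library(psych)

set.seed(42)
# simulate 500 respondents on four items that measure one construct
sim <- sim.congeneric(loads = c(.8, .7, .6, .5), N = 500, short = FALSE)
items <- as.data.frame(sim$observed)
names(items) <- paste0("item", 1:4)

# a scoring key is a named list: one element per scale, listing its items
keys <- list(total = c("item1", "item2", "item3", "item4"))

scored <- scoreItems(keys, items)
scored$alpha          # coefficient alpha for the scale
head(scored$scores)   # one scale score per respondent
```

A reverse-keyed item would be written with a leading minus sign inside the key, for example "-item2".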

--

### Functions to apply various standard statistical tests

--

- Include p.rep and its variants for testing the probability of replication, r.con for the confidence intervals of a correlation, and r.test to test single, paired, or sets of correlations
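
For instance, a single correlation could be examined as sketched below. The values of `r` and `n` are invented for the example.

```r
library(psych)

r <- 0.30   # hypothetical sample correlation
n <- 50     # hypothetical sample size

ci <- r.con(r, n)               # lower and upper 95% confidence limits
ci

res <- r.test(n = n, r12 = r)   # test a single correlation against zero
res$t
res$p
```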

--

### Studying diurnal or circadian variations in mood

--

- It is helpful to use circular statistics
- circadian.mean finds the circular mean
- circadian.cor finds circular (phasic) correlations, and circadian.linear.cor finds the correlation between linear variables and circular variables
- These supplement cosinor, which finds the best fitting phase angle for measures taken with a fixed period (e.g., 24 hours)
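
To see why circular statistics matter, the sketch below compares the arithmetic mean of three clock times with a circular mean computed by hand and with circadian.mean. The times are made up, and the hand computation is only an illustration of the idea, not the package's own code.

```r
library(psych)

# hypothetical times of peak mood on a 24-hour clock: 10pm, 11pm and 1am
hours <- c(22, 23, 1)

mean(hours)   # arithmetic mean lands in mid-afternoon, far from any observation

# circular mean by hand: average the unit vectors, then convert back to hours
theta <- hours * 2 * pi / 24
manual <- atan2(mean(sin(theta)), mean(cos(theta))) * 24 / (2 * pi)
manual <- manual %% 24   # map negative hours onto the 0-24 clock
manual

circadian.mean(hours)    # the psych function; hours = TRUE is its default
```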

---

# Functions continued

--

### Functions to generate simulated data with particular structures

--
- These include sim.circ (for circumplex structures),

--
- sim.item (for general structures) and sim.congeneric (for a specific demonstration of congeneric measurement)

--
- The functions sim.congeneric and sim.hierarchical can be used to create data sets with particular structural properties

--
- A more general form for all of these is sim.structural for generating general structural models

--
- More detail in the vignette (psych_for_sem)
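
A brief sketch of sim.congeneric, with illustrative loadings: called without a sample size it returns the population correlation matrix, and with N and short = FALSE it also returns simulated raw scores.

```r
library(psych)

loads <- c(.9, .8, .7, .6)   # illustrative loadings on one latent variable

# with no N, the result is the population correlation matrix,
# whose off-diagonal entries are products of the loadings
R <- sim.congeneric(loads)
round(R, 2)

# with N and short = FALSE, the result also holds the simulated raw scores
set.seed(42)
sim <- sim.congeneric(loads, N = 200, short = FALSE)
dim(sim$observed)
round(cor(sim$observed), 2)
```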

---
### Useful functions for basic data entry and descriptive analyses include read.file, read.clipboard, describe, pairs.panels, error.bars and error.dots

--

- When given a set of items from a personality inventory,

--

- it is useful to combine these into higher-level item composites

--

- What are the basic properties of the data?

--
- describe reports basic summary statistics (mean, sd, median, mad, range, minimum, maximum, skew, kurtosis, standard error) for vectors, columns of matrices, or data.frames

--
- describeBy provides descriptive statistics, organized by one or more grouping variables

--
- statsBy provides even more detail for data structured by groups, including within and between correlation matrices

--
- pairs.panels shows scatter plot matrices (SPLOMs), histograms and the Pearson correlations for scales or items
- error.bars plots variable means with associated confidence intervals

--
- errorCircles will plot confidence intervals for both the x and y coordinates

--
- corr.test will find the significance values for a matrix of correlations

--
- error.dots creates a dot chart with confidence intervals
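
The descriptive functions above can be tried on any data frame. The sketch below uses the iris data that ships with base R purely as convenient example data; it is not a psych data set.

```r
library(psych)

measures <- iris[, 1:4]   # four numeric columns

d <- describe(measures)   # one row of summary statistics per variable
d[, c("n", "mean", "sd", "skew", "kurtosis", "se")]

# the same statistics, broken down by a grouping variable
by_species <- describeBy(measures, group = iris$Species)
by_species

ct <- corr.test(measures)   # correlations together with their p values
round(ct$r, 2)
round(ct$p, 3)
```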

---
### Functions to generate simulated data

--

- For teaching basic statistics,

--
- it is useful to be able to generate examples suitable for analysis of variance or simple linear models

--

- sim.anova will generate the design matrix of three independent variables (IV1, IV2, IV3) with an arbitrary number of levels and effect sizes for each main effect and interaction

--

- IVs can be either continuous or categorical and can have linear or quadratic effects

--

- Either a single dependent variable or multiple (within subject) dependent variables are generated according to the specified model

--

- The repeated measures are assumed to be tau equivalent with a specified reliability

---

```{r, eval=TRUE}
library(psych)

set.seed(42)
data.df <- psych::sim.anova(es1=1,es2=.5,es13=1) # two main effects (IV1, IV2) and one interaction (IV1 x IV3)
describe(data.df)
```
---
```{r}
pairs.panels(data.df) # show how the design variables are orthogonal
```

---
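```{r}
# fit the full factorial model as a regression and as an ANOVA
summary(lm(DV~IV1*IV2*IV3,data=data.df))
summary(aov(DV~IV1*IV2*IV3,data=data.df))
set.seed(42) # reseed so the next chunk is reproducible
```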

---

```{r, echo=FALSE}
# demonstrate the effect of not centering the data on the regression
data.df <- sim.anova(es1=1,es2=.5,es13=1,center=FALSE)
describe(data.df)
# this model is incorrect because the IVs are not centered
summary(lm(DV~IV1*IV2*IV3,data=data.df))
summary(aov(DV~IV1*IV2*IV3,data=data.df)) # compare with the lm model
set.seed(42) # reseed before the next example (multiple levels and quadratic terms)
```

---
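```{r, echo=FALSE}
# multiple levels (3 for IV2, 4 for IV3) and a quadratic effect of IV2
data.df <- sim.anova(es1=1,es13=1,n2=3,n3=4,es22=1)
summary(lm(DV~IV1*IV2*IV3,data=data.df))
summary(aov(DV~IV1*IV2*IV3,data=data.df))
pairs.panels(data.df)
# three repeated measures with means -1, 0 and 1; n = 10 per cell
data.df <- sim.anova(es1=1,es2=-.5,within=c(-1,0,1),n=10)
pairs.panels(data.df)
```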

```{r}
print(data.df$IV1)

```

---
### Thank You!