Accounting for "ties" in rank-biserial corr in paired/one-sample cases #502

@arcaldwell49

For paired samples (or the one-sample case), the rank-biserial correlation gives a seemingly odd result when some differences are tied with the mu argument, i.e. exactly equal to it. In the example below, the sleep data set has 1 tie among the paired differences, so in my opinion the effect size should not be -1.00 (equivalently, the probability of superiority should not be 0%). Is there any particular reason for this?

> # example of function
> rank_biserial(extra ~ group, data = sleep, paired = TRUE,
+               parametric = FALSE, mu = 0)
r (rank biserial) |         95% CI
----------------------------------
-1.00             | [-1.00, -1.00]
> # take paired differences
> z = subset(sleep, group == 1)$extra - subset(sleep, group == 2)$extra
> # calculate sum less than zero
> # not equal to 100% less!
> sum(z < 0) / length(z)
[1] 0.9
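
To make the tie visible, the paired differences can be tabulated by sign. This is a small base-R sketch using only the built-in sleep data; the names `g1`, `g2`, `z`, and `counts` are illustrative and not part of effectsize.

```r
# Paired differences for the built-in sleep data (group 1 minus group 2)
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]
z <- g1 - g2

# Count negative, zero (tied with mu = 0), and positive differences
counts <- c(negative = sum(z < 0), zero = sum(z == 0), positive = sum(z > 0))
counts
```

Nine differences are negative and one is exactly zero, which is the tie in question.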

My feeling is that .r_rbs should actually do the following for paired samples.

# paired differences, shifted by mu
z <- stats::na.omit((x - y) - mu)
# signed ranks of the differences
Ry <- effectsize:::.safe_ranktransform(z, sign = TRUE, verbose = verbose)
# flag ties (differences equal to mu), which show up as NA in Ry
Ry0 <- ifelse(is.na(Ry), 1, 0)
Ry <- stats::na.omit(Ry)

# n counts all non-missing differences, ties included
n <- length(z)
S <- (n * (n + 1) / 2)

U1 <- sum(Ry[Ry > 0], na.rm = TRUE) + 0.5 * sum(Ry0[Ry0 == 1], na.rm = TRUE)
U2 <- -sum(Ry[Ry < 0], na.rm = TRUE) + -0.5 * sum(Ry0[Ry0 == 1], na.rm = TRUE)

u_ <- U1 / S
f_ <- U2 / S
u_ - f_

For this exact example, it provides the correct result (again, just my opinion).
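
The snippet above depends on an internal effectsize helper and on variables from inside .r_rbs, so here is a self-contained base-R sketch of the same arithmetic. It assumes the signed-rank helper ranks the absolute non-zero differences, attaches their sign, and returns NA for zeros. The functions `signed_rank()` and `rbs_ties()` are hypothetical names for illustration and do not exist in effectsize.

```r
# Hypothetical stand-in for effectsize:::.safe_ranktransform(z, sign = TRUE).
# Assumption: non-zero values get signed average ranks of |z|; zeros become NA.
signed_rank <- function(z) {
  out <- rep(NA_real_, length(z))
  nz <- z != 0
  out[nz] <- sign(z[nz]) * rank(abs(z[nz]), ties.method = "average")
  out
}

# Mirrors the arithmetic proposed above (hypothetical function name).
rbs_ties <- function(x, y, mu = 0) {
  z <- (x - y) - mu
  z <- z[!is.na(z)]
  Ry <- signed_rank(z)
  n_ties <- sum(is.na(Ry))          # differences exactly equal to mu
  Ry <- Ry[!is.na(Ry)]
  n <- length(z)                    # ties stay in the denominator
  S <- n * (n + 1) / 2
  U1 <- sum(Ry[Ry > 0]) + 0.5 * n_ties
  U2 <- -sum(Ry[Ry < 0]) - 0.5 * n_ties
  (U1 - U2) / S
}

x <- sleep$extra[sleep$group == 1]
y <- sleep$extra[sleep$group == 2]
res <- rbs_ties(x, y)
res
```

Under these assumptions the sleep example gives about -0.8 rather than -1.00, because the single tie is kept in the denominator and credited half to each side.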
