# Anova-Freak and Bayesian Hipster #rstats

**R – Strenge Jacke!**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I’m pleased to announce an update of my sjstats-package. New features are specifically implemented for the Anova and Bayesian statistic and summary functions. Here’s a short overview of what’s new…

# Anova statistics

Beside the already implemented functions to calculate eta-squared, partial eta-squared and omega-squared, it is now also possible to calculate *partial omega-squared* statistics for Anova-tables.

library(sjstats) # load sample data data(efc) # fit linear model fit <- aov( c12hour ~ as.factor(e42dep) + as.factor(c172code) + c160age, data = efc ) omega_sq(fit, partial = TRUE) #> # A tibble: 3 x 2 #> term partial.omegasq #> 1 as.factor(e42dep) 0.278 #> 2 as.factor(c172code) 0.00547 #> 3 c160age 0.0649

Furthermore, `eta_sq()`

and `omega_sq()`

have a `ci.lvl`

-argument, which – if not `NULL`

– also computes a confidence interval.

For eta-squared, i.e. `eta_sq()`

with `partial = FALSE`

, due to non-symmetry, confidence intervals are based on bootstrap-methods. Confidence intervals for partial omega-squared, i.e. `omega_sq()`

with `partial = TRUE`

– are also based on bootstrapping. In these cases, `n`

indicates the number of bootstrap samples to be drawn to compute the confidence intervals.

eta_sq(fit, partial = TRUE, ci.lvl = .8) #> # A tibble: 3 x 4 #> term partial.etasq conf.low conf.high #> 1 as.factor(e42dep) 0.281 0.247 0.310 #> 2 as.factor(c172code) 0.00788 0.00109 0.0160 #> 3 c160age 0.0665 0.0467 0.0885

# Bayiasian statistics and summaries

Some of the summary-functions for Bayesian models were polished in the current update, e.g. `hdi()`

. Let’s fit a (non-sense) model with the great **brms**-package first (you’ll see later why I don’t use **rstanarm** here):

library(lme4) library(brms) library(dplyr) data(sleepstudy) sleepstudy$mygrp <- sample(1:5, size = 180, replace = TRUE) sleepstudy % group_by(mygrp) %>% mutate(mysubgrp = sample(1:30, size = n(), replace = TRUE)) m <- brm( Reaction ~ Days + (1 | mygrp / mysubgrp) + (1 | Subject), data = sleepstudy )

## Highest Density Intervals

`hdi()`

calculates by default the 89% HDI for Bayesian regression models:

hdi(m) #> # A tibble: 3 x 3 #> term hdi.low hdi.high #> 1 b_Intercept 232.00 268.0 #> 2 b_Days 9.15 11.8 #> 3 sigma 27.20 33.5

Now you can calculate the HDI for multiple tresholds (probabilities) at once, using the `prob`

-argument:

hdi(m, prob = c(.5, .8, .95)) #> # A tibble: 3 x 7 #> term hdi.low_0.5 hdi.high_0.5 hdi.low_0.8 hdi.high_0.8 hdi.low_0.95 hdi.high_0.95 #> 1 b_Intercept 244.00 258.0 237.00 265.0 229.00 274.0 #> 2 b_Days 9.97 11.1 9.51 11.6 8.90 12.1 #> 3 sigma 28.90 31.5 27.60 32.5 26.60 33.8

## Tidy summary

A tidy summary, similar to the `tidy()`

-function from the **broom**-package, can be obtained with `tidy_stan()`

. By default, for multilevel models, this function only shows the fixed effects. Use the different options from the `type`

-argument if you want random effects only or both random and fixed effects to be printed.

tidy_stan(m) #> # A tibble: 3 x 8 #> term estimate std.error hdi.low hdi.high n_eff Rhat mcse #> 1 b_Intercept 250.0 10.900 233.00 268.0 0.326 1.00 0.312 #> 2 b_Days 10.6 0.835 9.18 11.8 1.000 1.00 0.013 #> 3 sigma 30.2 1.930 27.30 33.4 1.000 1.00 0.202

## Intraclass Correlation Coefficient (ICC)

The ICC is a useful statistic for random-intercept models, which describes the proportion of variance that is explained by the random effects structure (see `?icc`

for more details). The ICC is based on the variance and correlation components of the model, which you typically get with the `VarCorr`

function. According to the **brms**-package, there’s a good and a bad news: from a developers perspective, the „bad“ news is that `VarCorr.brmsfit`

has recently changed in its internal code structure, so I had to revise my `icc()`

-function to work with **brms**-models again, with success:

icc(m) #> Bayesian mixed model #> Family: gaussian (identity) #> Formula: list() Reaction ~ Days + (1 | mygrp/mysubgrp) + (1 | Subject) list() #> #> ICC (mygrp): 0.032317 #> ICC (mygrp:mysubgrp): 0.013529 #> ICC (Subject): 0.598439

The good news is, that `VarCorr.brmsfit`

has recently changed in its internal code structure. Now it is also possible to get the variance and correlation components *for each sample* of the model, which allows us to compute the ICC *for each sample* of the model, which again in turn allows us to compute an ICC including uncertainty intervals very quickly – simply add `posterior = TRUE`

to the `icc()`

-function call:

icc(m, posterior = TRUE) #> # Random Effect Variances and ICC #> #> Family: gaussian (identity) #> Formula: Reaction ~ Days + (1 | mygrp/mysubgrp) + (1 | Subject) #> #> ## mygrp #> ICC: 0.022 (HDI 89%: 0.000-0.104) #> Between-group: 57.406 (HDI 89%: 0.000-277.588) #> #> ## mygrp:mysubgrp #> ICC: 0.011 (HDI 89%: 0.000-0.050) #> Between-group: 28.535 (HDI 89%: 0.000-127.271) #> #> ## Subject #> ICC: 0.578 (HDI 89%: 0.419-0.737) #> Between-group: 1449.446 (HDI 89%: 682.362-2413.380) #> #> ## Residuals #> Within-group: 909.039 (HDI 89%: 722.154-1085.644)

By default, the 89% interval around the ICC „point estimate“ is shown in the output, however, the `print()`

-method has an argument to change the interval range to any value, e.g.

`print(icc(m, posterior = TRUE), prob = .5, digits = 2)`

.

# Final words

The colored output above is not just playing with CSS, the R console output is indeed colored!

**leave a comment**for the author, please follow the link and comment on their blog:

**R – Strenge Jacke!**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.