In this assignment, you’ll build and evaluate two mixed-effects regression models using MCMC in JAGS. The primary quantity of interest is the effect of word frequency on two measures from a lexical decision task: (1) reaction times (RTs) and (2) correctness. The first model will assume that (log) reaction times are normally distributed (i.e., linear regression). The second model will assume that whether a trial is correct follows a Bernoulli distribution and that the regression is in log-odds space (i.e., logistic regression). Both models should include random intercepts for each subject and for each word, as well as random slopes of frequency for each subject. (Random slopes of frequency for each word would not make sense, since a given word cannot vary in frequency.) The main goal is to use 95% central posterior intervals to describe the likely size of the effects of frequency on RTs and on responding correctly.
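
As a notational sketch of the two models described above (one possible way to write them down, not a required parameterization): let \(i\) index trials, let \(s[i]\) and \(w[i]\) be the subject and word for trial \(i\), and let \(x_i\) be the frequency predictor, however you choose to transform it. Only \(\beta_0\) and \(\beta_1\) come from the assignment text. The other symbols, and the choice to treat the random effects as independent of one another, are assumptions made for illustration. Each model has its own set of parameters, and priors are left unspecified here.

\[
\begin{aligned}
\eta_i &= \beta_0 + a_{s[i]} + c_{w[i]} + (\beta_1 + b_{s[i]})\, x_i \\
\log(\mathrm{RT}_i) &\sim \mathrm{Normal}(\eta_i,\ \sigma^2) && \text{(RT model)} \\
\mathrm{Correct}_i &\sim \mathrm{Bernoulli}(p_i), \quad \log\frac{p_i}{1 - p_i} = \eta_i && \text{(Correct model)} \\
a_s &\sim \mathrm{Normal}(0,\ \sigma_a^2), \quad b_s \sim \mathrm{Normal}(0,\ \sigma_b^2), \quad c_w \sim \mathrm{Normal}(0,\ \sigma_c^2)
\end{aligned}
\]

When translating a specification like this into JAGS, keep in mind that dnorm is parameterized by precision (1/variance) rather than by variance or standard deviation.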

You’ll turn in both your R script (which I must be able to run) and a PDF write-up.

Further details

Write-up

For this assignment, instead of answering specific questions, you’ll be producing a coherent write-up describing what you did, the results you found, and what they mean. Your write-up should be divided into two main parts: the first part for the RT model and the second part for the Correct model. Each of these two parts should contain the following three sections (with section headings):

  1. Model specification. Here, describe the probabilistic model you did inference with, assuming that the reader has not read these homework instructions. You can assume the reader is familiar with mixed-effects regression, but you must explain all design decisions (such as what your independent variables and dependent variables were, how you transformed your variables, which random effects you included, and what priors you used).

  2. Inference and diagnostics. Here, describe how many chains and iterations of MCMC you used, why you think the model has converged, how you know that the boundaries of any uniform priors you used weren’t relevant to the posterior, and what the effective sample sizes are for each of your unknown model parameters.

  3. Results and discussion. Give the central 95% posterior intervals for the two regression coefficients (\(\beta_0\) and \(\beta_1\)) and describe precisely what each one means (e.g., for a slope: for each [unit] increase in [X], there is an [increase/decrease] of [Z] [units] in [Y]). For interpreting intercepts in the Correct model, note that the plogis() function in R will convert a log-odds value back to a probability.
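
As a rough illustration of the interval and plogis() steps mentioned in the Results and discussion section, here is a minimal base-R sketch. The draws are simulated stand-ins, not real MCMC output, and the names beta0_draws, beta1_draws, and central_interval are hypothetical. In your own script the draws would come from your JAGS samples.

```r
# Hypothetical posterior draws standing in for real MCMC output
# (simulated here only so the example is self-contained).
set.seed(1)
beta0_draws <- rnorm(4000, mean = 2.0, sd = 0.10)  # intercept, log-odds scale
beta1_draws <- rnorm(4000, mean = 0.3, sd = 0.05)  # frequency slope, log-odds scale

# Central posterior interval: cut equal probability from each tail.
central_interval <- function(draws, level = 0.95) {
  tail_prob <- (1 - level) / 2
  quantile(draws, probs = c(tail_prob, 1 - tail_prob), names = FALSE)
}

ci_beta0 <- central_interval(beta0_draws)
ci_beta1 <- central_interval(beta1_draws)

# plogis() is monotonically increasing, so applying it to the interval
# endpoints gives the corresponding interval on the probability scale.
prob_ci_beta0 <- plogis(ci_beta0)

print(ci_beta0)
print(ci_beta1)
print(prob_ci_beta0)
```

The back-transformed intercept is the probability of a correct response when the frequency predictor equals zero and all random effects are at zero, so what "zero" means depends on how you transformed or centered frequency. The slope interval stays on the log-odds scale, and exponentiating its endpoints expresses it as an odds ratio per unit of the predictor.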