Note: This document adapts examples from the legacy R-INLA documentation, authored by the R-INLA team.

1 Introduction

We consider the salamander mating experiments reported and analysed by (Section 14.5, McCullagh and Nelder 1989). Data consist of three separate experiments. The first one, conducted in the summer of 1986, used two groups of 20 salamanders. Each group comprised five male Rough Butt, five male Whiteside, five female Rough Butt and five female Whiteside salamanders. Within each group, each animal was paired to three animals of the opposite sex from the same and from the other population. Therefore, 60 male-female pairs were formed per group leading to 120 observations in one experiment for the two groups. Two further experiments with equal design were conducted in the fall of 1987. The animals used for the first and the second experiment were identical. A new set of salamanders was utilised for the third experiment.

The main scientific question addressed in the study was whether the mating of both geographically isolated species of salamanders was as successful as the one between the animals from the same population. Moreover, there was some interest if a seasonal effect could be identified. To this end, the following dummy variables were coded:

  • fW Whiteside female “yes”: 1, “no”: 0
  • mW Whiteside male “yes”: 1, “no”: 0
  • WW interaction effect of wsf and wsm
  • fall experiment conducted in fall “yes”: 1, “no”: 0

2 Model

Let \(Y_{ijk}\) denote the binary outcome (0 = failure, 1 = success) of the mating for female \(i=1, \ldots, 20\) and male \(j=1,\ldots, 20\) in experiment \(k=1, \ldots, 3\). Thus \(Y_{ijk} \mid \pi_{ijk} \sim \operatorname{Bern}(\pi_{ijk})\), with \[ \operatorname{logit}({\pi_{ijk}}) = \mathbf{x}_{ijk}^\top \boldsymbol{\beta} + b_{ik}^F + b_{jk}^M \] where \(x_{ijk}\) is a vector comprising an intercept, \({\tt wsf}_{ik}\), \({\tt wsm}_{jk}\), \({\tt ww}_{ijk}\) and \({\tt fall}_{ijk}\) variables, \(\boldsymbol{\beta}\) is the corresponding vector of the fixed effects parameters, and \(b_{ik}^F\) and \(b_{jk}^M\) are female and male random effects respectively.

2.1 Prior distributions

We assume that \(b_{i3}^F \sim \operatorname{N}(0, \kappa^F)\), \(b_{i3}^M \sim \operatorname{N}(0, \kappa^M)\) and \[ \begin{align*} {b_{i1}^F}\choose{b_{i2}^F} &\sim \operatorname{N}_2(\mathbf{0}, \mathbf{W}^{-1}_F)& {b_{i1}^M}\choose{b_{i2}^M} &\sim \operatorname{N}_2(\mathbf{0}, \mathbf{W}^{-1}_M) \end{align*} \] where the covariance matrices \(\mathbf{W}^{-1}_F\) and \(\mathbf{W}^{-1}_M\) are parameterised in terms of precisions and correlations leading to a total of eight hyperparameters in the model.

2.2 Hyperprior distributions

We assume that both \(\kappa^F\) and \(\kappa^M\) follow a gamma prior distribution with shape equal to \(1\) and rate equal to \(0.622\). The precision matrices \(\mathbf{W}_M\) and \(\mathbf{W}_F\) are Wishart distributed: \[ \begin{align*} \mathbf{W}_M &\sim \text{Wishart}_{p}(r, \mathbf{R}^{-1})& \mathbf{W}_F &\sim \text{Wishart}_{p}(r, \mathbf{R}^{-1}) \end{align*} \] with the following settings: \[ \begin{align*} p = 2,& &r = 3,& &\mathbf{R} = \begin{pmatrix} 1.244 &0 \\ 0 & 1.244 \end{pmatrix}. \end{align*} \]

3 Data

# Load data and libraries
library(tidyverse)
library(INLA)
here::i_am("salamander/salamander.Rmd")
load(here::here("salamander/salam.RData"))
str(salam)
## List of 4
##  $ x: num [1:120, 1:4] 1 1 1 1 1 0 0 0 0 0 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:4] "R/R" "R/W" "W/R" "W/W"
##  $ y: num [1:120, 1:3] 1 1 1 1 1 1 0 0 0 0 ...
##  $ z: num [1:120, 1:40] 1 0 0 0 0 0 0 0 0 0 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:40] "fR1" "fR2" "fR3" "fR4" ...
##  $ i: num [1:40] 1 1 1 1 1 1 1 1 1 1 ...

Currently the responses of the mating experiments are stored in a \(120 \times 3\) matrix y, where the columns correspond to the three experiments. The design matrix x is also stored in a separate matrix. We will have to organise the data into a form that is suitable for logistic regression, and also add all of the necessary indices for the random effects.

dat <-
  with(salam, {
    data.frame(
      exp = y,
      fW = as.integer(x[, "W/R"] == 1 | x[, "W/W"] == 1),  # female Whiteside
      mW = as.integer(x[, "R/W"] == 1 | x[, "W/W"] == 1),  # male Whiteside
      # Matrix of female and male indices
      t(apply(z, 1, \(x) {
        tmp <- which(x == 1)
        tmp[2] <- tmp[2] - 20
        names(tmp) <- c("female", "male")
        tmp
      }))
    )
  }) |>
  pivot_longer(
    cols = starts_with("exp"),
    names_to = "exp",
    values_to = "y"
  ) |>
  mutate(
    WW = fW * mW,  # interaction effect
    fall = ifelse(exp == "exp.1", 0, 1),  # fall indicator
    fid_e1e2 = case_when(
      exp == "exp.1" ~ female,
      exp == "exp.2" ~ female + 20,
      TRUE ~ NA
    ),
    mid_e1e2 = case_when(
      exp == "exp.1" ~ male,
      exp == "exp.2" ~ male + 20,
      TRUE ~ NA
    ),
    fid_e3 = ifelse(exp == "exp.3", female, NA),
    mid_e3 = ifelse(exp == "exp.3", male, NA)
  ) |>
  select(y, fW, mW, WW, fall, exp, fid_e1e2, mid_e1e2, fid_e3, mid_e3)
print(dat)
## # A tibble: 360 × 10
##        y    fW    mW    WW  fall exp   fid_e1e2 mid_e1e2 fid_e3 mid_e3
##    <dbl> <int> <int> <int> <dbl> <chr>    <dbl>    <dbl>  <dbl>  <dbl>
##  1     1     0     0     0     0 exp.1        1        1     NA     NA
##  2     1     0     0     0     1 exp.2       21       21     NA     NA
##  3     1     0     0     0     1 exp.3       NA       NA      1      1
##  4     1     0     0     0     0 exp.1        2        5     NA     NA
##  5     1     0     0     0     1 exp.2       22       25     NA     NA
##  6     0     0     0     0     1 exp.3       NA       NA      2      5
##  7     1     0     0     0     0 exp.1        3        2     NA     NA
##  8     0     0     0     0     1 exp.2       23       22     NA     NA
##  9     1     0     0     0     1 exp.3       NA       NA      3      2
## 10     1     0     0     0     0 exp.1        4        4     NA     NA
## # ℹ 350 more rows

4 Preliminary plot

dat |>
  mutate(cross = interaction(fW, mW, sep = "", lex.order = TRUE),
         cross = recode(cross, "00"="R♀×R♂", "10"="W♀×R♂",
                               "01"="R♀×W♂", "11"="W♀×W♂"),
         season = if_else(fall==1, "Fall", "Summer")) |>
  group_by(season, cross) |>
  summarise(p_hat = mean(y), .groups = "drop") |>
  ggplot(aes(x = cross, y = p_hat, col = season, group = season)) +
  geom_point() + geom_line() +
  labs(x = NULL, y = "Success rate", linetype = NULL, color = NULL) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  theme_minimal()

The exploratory plot illustrates the observed mating success rates for the four combinations of male–female salamander pairings, separated by season. A clear pattern emerges: matings between individuals of the same population (R♀×R♂ and W♀×W♂) show considerably higher success rates than those between different populations, particularly W♀×R♂, which exhibits the lowest success of all. This suggests some degree of reproductive isolation between the Rough Butt and Whiteside salamanders.

In addition, the overall success rates during the fall are consistently lower than in the summer, hinting at a possible seasonal effect—perhaps linked to environmental or physiological conditions that reduce mating success. These visual patterns motivate the logistic mixed model that follows, where we will test whether the female and male population origins, their interaction, and a fall indicator significantly affect the probability of successful mating, while allowing for random effects to account for individual-level variability among females and males.

5 INLA fit

frm <- y ~ fW + mW + WW + fall +
  f(fid_e1e2, model = "iid2d", n = 40, param = c(3, 1.244, 1.244, 0)) +
  f(mid_e1e2, model = "iid2d", n = 40, param = c(3, 1.244, 1.244, 0)) +
  f(fid_e3, model = "iid", param = c(1, 0.622)) +
  f(mid_e3, model = "iid", param = c(1, 0.622))

# Note: To get smooth posterior marginals for the hyperparameters you need to
# increase the number of maximum function evaluations: numint.maxfeval (default:
# 10000)
res <- inla(
  formula = frm, 
  data = dat, 
  family = "binomial", 
  Ntrials = rep(1, 360),
  control.inla = list(numint.maxfeval = 200000), 
  verbose = !TRUE
)
summary(res)
## Time used:
##     Pre = 0.707, Running = 2.67, Post = 0.157, Total = 3.53 
## Fixed effects:
##               mean    sd 0.025quant 0.5quant 0.975quant   mode kld
## (Intercept)  1.389 0.508      0.415    1.379      2.418  1.379   0
## fW          -2.916 0.508     -3.945   -2.905     -1.949 -2.905   0
## mW          -0.697 0.475     -1.647   -0.692      0.223 -0.692   0
## WW           3.561 0.576      2.453    3.554      4.713  3.554   0
## fall        -0.564 0.466     -1.498   -0.560      0.343 -0.560   0
## 
## Random effects:
##   Name     Model
##     fid_e1e2 IID2D model
##    mid_e1e2 IID2D model
##    fid_e3 IID model
##    mid_e3 IID model
## 
## Model hyperparameters:
##                                        mean    sd 0.025quant 0.5quant
## Precision for fid_e1e2 (component 1)  1.276 1.105      0.238    0.963
## Precision for fid_e1e2 (component 2)  1.083 0.847      0.234    0.852
## Rho1:2 for fid_e1e2                  -0.102 0.419     -0.809   -0.124
## Precision for mid_e1e2 (component 1)  1.550 1.177      0.324    1.235
## Precision for mid_e1e2 (component 2)  1.122 0.797      0.261    0.914
## Rho1:2 for mid_e1e2                   0.700 0.207      0.155    0.753
## Precision for fid_e3                  2.540 1.959      0.510    2.013
## Precision for mid_e3                  0.788 0.518      0.213    0.657
##                                      0.975quant   mode
## Precision for fid_e1e2 (component 1)      4.214  0.568
## Precision for fid_e1e2 (component 2)      3.332  0.540
## Rho1:2 for fid_e1e2                       0.708 -0.254
## Precision for mid_e1e2 (component 1)      4.659  0.781
## Precision for mid_e1e2 (component 2)      3.221  0.608
## Rho1:2 for mid_e1e2                       0.945  0.855
## Precision for fid_e3                      7.713  1.250
## Precision for mid_e3                      2.150  0.463
## 
## Marginal log-Likelihood:  -228.19 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

The INLA results confirm the main patterns seen in the exploratory plot. The intercept indicates a high baseline mating probability for within-population pairs. The strong negative effect for fW (–2.9) and the smaller negative effect for mW (–0.7) suggest that cross-population matings—particularly those involving Whiteside females—are substantially less successful. The large positive interaction term WW (+3.6) implies that when both partners are from the Whiteside population, mating success is restored to a level comparable to the Rough Butt pairs, consistent with reproductive isolation between the two groups. The seasonal term fall (–0.56) indicates a modest decline in success during the fall, though its credible interval includes zero.

The random-effect precisions are moderate, showing some heterogeneity among individuals, with a strong positive correlation among male random effects across the first two experiments. Overall, the model supports the hypothesis that cross-population matings are less successful and that season exerts a weaker secondary influence.

References

McCullagh, P., and John A. Nelder. 1989. Generalized Linear Models. 2nd ed. Monographs on Statistics and Applied Probability 37. London ; New York: Chapman and Hall.