This tutorial is about fitting multi-rate Brownian motion models (using phytools) and multi-regime OU models (using the OUwie and bayou packages).
The first exercise is designed to explore a model developed by O’Meara et al. (2006; Evolution) in which the rate of Brownian evolution takes different values in different parts of a phylogeny.
For this analysis we will use the function brownie.lite in the phytools package. This function allows us to input a tree with painted rate “regimes” - for instance, from a stochastically mapped discrete character - and then fit a model in which the rate differs depending on the mapped regime.
First, let’s make sure we have the anole tree & svl dataset that we used earlier in the course. You will need svl.csv and Anolis.tre.
Make sure you have these files in your working directory, and then read them in.
anoleTree<-read.tree("Anolis.tre")
anoleData<-read.csv("svl.csv")
The analyses will we run in this exercise focus on one particular anole ecomorph, the crown giant. These large anoles have many unique features. We can create a vector that identifies whether each species is a crown-giant or not.
crownGiantSpecies<-c("equestris", "luteogularis", "smallwoodi", "noblei", "baracoae", "baleatus", "barahonae", "ricordii", "cuvieri", "garmani")
isCrownGiant <- anoleData[,1] %in% crownGiantSpecies
ecomorph <- sapply(isCrownGiant, function(x) if(x) "cg" else "other")
ecomorph <- as.factor(ecomorph)
names(ecomorph) <- anoleData[,1]
We can then use stochastic character mapping, as we learned earlier, to reconstruct the evolution of this character.
ecomorphTree <- make.simmap(anoleTree, ecomorph, model="ER")
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## cg other
## cg -0.09903616 0.09903616
## other 0.09903616 -0.09903616
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## cg other
## 0.5 0.5
## Done.
plot(ecomorphTree)
## no colors provided. using the following legend:
## cg other
## "black" "red"
The data we will analyse is body size, which appears in our matrix as snout-vent length (svl).
svl<-anoleData[,2]
names(svl)<-anoleData[,1]
We will fit both a single-rate and a two-rate Brownian motion model to the data using phytools.
fitBrownie<-brownie.lite(ecomorphTree, svl)
fitBrownie
## ML single-rate model:
## s^2 se a k logL
## value 0.1362 0.02 4.0659 2 -4.7004
##
## ML multi-rate model:
## s^2(cg) se(cg) s^2(other) se(other) a k logL
## value 1.3136 0.685 0.0882 0.0141 4.0219 3 5.1587
##
## P-value (based on X^2): 0
##
## R thinks it has found the ML solution.
This result shows that the two-rate model fits the data significantly better than a one-rate model. The reconstructed rate of evolution in the fitted model is much higher for crown-giant anoles than for other ecomorphs.
Now let’s try to fit a multi-regime OU model using OUwie.
OUwie requires a special data frame with three columns: species name, the regime coding, and the variable being analysed.
svl_df<-data.frame(anoleData[,1], ecomorph, svl)
We can run single models in OUwie - for example, fitting a model where each regime has a different mean but all other parameters are shared.
fitOUM<-OUwie(ecomorphTree, svl_df,model="OUM",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUM
##
## Fit
## lnL AIC AICc model ntax
## 0.7684505 6.463099 6.884152 OUM 100
##
##
## Rates
## cg other
## alpha 0.01397992 0.01397992
## sigma.sq 0.12290692 0.12290692
##
## Optima
## cg other
## estimate 7.0642968 4.0419351
## se 0.8851028 0.1016899
##
## Arrived at a reliable solution
To really test this model, though, we can fit a whole set of models using OUwie. Note that the first two models here match the two models that we just compared using brownie.lite - and, in fact, the two programs obtain identical likelihoods and parameter estimates.
fitBM1<-OUwie(ecomorphTree, svl_df,model="BM1",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitBM1
##
## Fit
## lnL AIC AICc model ntax
## -4.700436 13.40087 13.52458 BM1 100
##
## Rates
## cg other
## alpha NA NA
## sigma.sq 0.1361648 0.1361648
##
## Optima
## cg other
## estimate 4.0659176 0
## se 0.1079716 0
##
## Arrived at a reliable solution
fitBMS<-OUwie(ecomorphTree, svl_df,model="BMS",simmap.tree=TRUE, root.station=FALSE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitBMS
##
## Fit
## lnL AIC AICc model ntax
## 5.158688 -4.317375 -4.067375 BMS 100
##
## Rates
## cg other
## alpha NA NA
## sigma.sq 1.313627 0.08820534
##
## Optima
## cg other
## estimate 4.02195048 4.02195048
## se 0.08733427 0.08733427
##
## Arrived at a reliable solution
fitOUM<-OUwie(ecomorphTree, svl_df,model="OUM",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUM
##
## Fit
## lnL AIC AICc model ntax
## 0.7684505 6.463099 6.884152 OUM 100
##
##
## Rates
## cg other
## alpha 0.01397992 0.01397992
## sigma.sq 0.12290692 0.12290692
##
## Optima
## cg other
## estimate 7.0642968 4.0419351
## se 0.8851028 0.1016899
##
## Arrived at a reliable solution
fitOUMV<-OUwie(ecomorphTree, svl_df,model="OUMV",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUMV
##
## Fit
## lnL AIC AICc model ntax
## 8.824992 -7.649983 -7.011685 OUMV 100
##
##
## Rates
## cg other
## alpha 2.061154e-09 2.061154e-09
## sigma.sq 6.718058e-01 8.723383e-02
##
## Optima
## cg other
## estimate 8.775604 4.01467757
## se 1.566700 0.08691293
##
## Arrived at a reliable solution
fitOUMA<-OUwie(ecomorphTree, svl_df,model="OUMA",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUMA
##
## Fit
## lnL AIC AICc model ntax
## 21.71809 -33.43617 -32.79787 OUMA 100
##
##
## Rates
## cg other
## alpha 4.666379e-07 2.347238e-09
## sigma.sq 8.027581e-02 8.027581e-02
##
## Optima
## cg other
## estimate 4.9236096 4.00984731
## se 0.1320604 0.08317477
##
## Arrived at a reliable solution
fitOUMVA<-OUwie(ecomorphTree, svl_df,model="OUMVA",simmap.tree=TRUE)
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUMVA
##
## Fit
## lnL AIC AICc model ntax
## 26.43487 -40.86974 -39.96651 OUMVA 100
##
##
## Rates
## cg other
## alpha 3.2439389439 0.22073152
## sigma.sq 0.0001726062 0.09703856
##
## Optima
## cg other
## estimate 4.99844996 4.00166003
## se 0.05673701 0.07524586
##
## Arrived at a reliable solution
Now we can compare the fit of all of these models using AIC weights.
ouwie_aicc<-c(fitBM1$AICc, fitBMS$AICc, fitOUM$AICc, fitOUMV$AICc, fitOUMA$AICc, fitOUMVA$AICc)
names(ouwie_aicc)<-c("fitBM1", "fitBMS", "fitOUM", "fitOUMV", "fitOUMA", "fitOUMVA")
aic.w(ouwie_aicc)
## fitBM1 fitBMS fitOUM fitOUMV fitOUMA fitOUMVA
## 0.00000000 0.00000002 0.00000000 0.00000007 0.02700601 0.97299390
The crown-giant anoles do seem to have their own adaptive regime, and that regime has a different level of constraint compared to the other ecomorphs as a group.
Bayou uses Bayesian rjMCMC to identify regimes, and can do so in the absence of particular hypotheses. Let’s see if bayou can detect the unique selective regime of the crown-giant anoles.
First we need to set up priors for all model parameters. In general, it is worth thinking about this a bit. But I am lazy and will just use the defaults from the bayou tutorial on github.
prior <- make.prior(anoleTree, dists=list(dalpha="dhalfcauchy", dsig2="dhalfcauchy",dsb="dsb", dk="cdpois", dtheta="dnorm"), param=list(dalpha=list(scale=1), dsig2=list(scale=1), dk=list(lambda=15, kmax=200), dsb=list(bmax=1,prob=1), dtheta=list(mean=mean(svl), sd=2)))
Now, run the MCMC
fit1 <- bayou.mcmc(anoleTree, svl, SE=0, model="OU", prior=prior, ngen=50000, new.dir=getwd(), plot.freq=5000, ticker.freq=1000)
## gen lnL prior half.life Vy K .alpha .sig2 .theta birth.k D0.slide D1.slide death.k R1.slide U0.slide U1.slide U2.slide
## 1000 -7.25 -103.63 44.94 3.56 19 0.83 0.45 0.83 0.06 1 0.58 0.72 1 0.9 0.78 0.74
## 2000 -4.98 -82.13 1.95 0.21 15 0.86 0.44 0.85 0.06 1 0.61 0.82 0.33 0.94 0.79 0.77
## 3000 -6.95 -73.75 2.26 0.26 13 0.86 0.45 0.85 0.06 1 0.65 0.88 0.2 0.95 0.79 0.77
## 4000 -6.56 -61.42 11.43 1.36 10 0.87 0.47 0.85 0.06 1 0.65 0.89 0.2 0.96 0.77 0.81
## 5000 -5.35 -61.53 5.02 0.42 10 0.87 0.47 0.86 0.05 1 0.64 0.91 0.2 0.97 0.81 0.84
## 6000 -6.99 -52.17 2.07 0.28 8 0.85 0.46 0.86 0.06 0.99 0.63 0.92 0.2 0.96 0.81 0.86
## 7000 -1.14 -75.78 1.86 0.22 13 0.84 0.46 0.85 0.06 0.99 0.62 0.93 0.2 0.96 0.83 0.85
## 8000 -6.6 -86.68 8.81 1.1 16 0.85 0.46 0.85 0.05 0.99 0.62 0.93 0.14 0.96 0.84 0.86
## 9000 -7.36 -65.68 10.91 0.84 11 0.85 0.46 0.85 0.06 0.99 0.63 0.94 0.14 0.96 0.85 0.86
## 10000 -5.94 -49.29 49.17 4.01 7 0.85 0.46 0.85 0.06 0.99 0.64 0.94 0.14 0.97 0.85 0.85
## 11000 -6.24 -70.27 5.73 0.53 12 0.86 0.45 0.84 0.06 0.99 0.62 0.94 0.14 0.97 0.86 0.86
## 12000 -4.12 -48.53 2.49 0.27 7 0.85 0.46 0.84 0.06 0.99 0.62 0.95 0.14 0.97 0.86 0.87
## 13000 -6.45 -67.71 1.21 0.15 11 0.84 0.45 0.84 0.05 0.99 0.63 0.95 0.14 0.97 0.82 0.87
## 14000 -6.99 -39.7 38.93 5.12 5 0.84 0.45 0.84 0.05 0.99 0.63 0.95 0.14 0.96 0.81 0.87
## 15000 -9.53 -55.79 1.41 0.2 8 0.84 0.45 0.84 0.05 0.98 0.63 0.96 0.14 0.97 0.82 0.87
## 16000 -5.55 -67.09 4.65 0.48 11 0.84 0.46 0.84 0.05 0.98 0.62 0.96 0.14 0.97 0.82 0.86
## 17000 -7.66 -65.57 12.54 1.28 11 0.84 0.45 0.84 0.05 0.98 0.63 0.96 0.14 0.97 0.83 0.86
## 18000 -4.6 -65.57 18.46 1.87 11 0.85 0.45 0.84 0.05 0.98 0.63 0.96 0.14 0.97 0.83 0.86
## 19000 -6.75 -40.5 2.13 0.21 5 0.84 0.45 0.84 0.05 0.98 0.62 0.97 0.14 0.97 0.83 0.86
## 20000 -6.37 -75.4 35.17 4.53 13 0.85 0.45 0.84 0.05 0.98 0.63 0.97 0.14 0.97 0.84 0.87
## 21000 -4.86 -61.2 22.9 2.39 10 0.85 0.45 0.84 0.05 0.98 0.63 0.97 0.14 0.97 0.84 0.87
## 22000 -7.75 -70.62 2.67 0.27 12 0.85 0.45 0.84 0.05 0.97 0.62 0.97 0.14 0.97 0.84 0.88
## 23000 -4.98 -53.7 42.69 3.89 8 0.85 0.45 0.84 0.05 0.98 0.62 0.97 0.14 0.97 0.84 0.88
## 24000 -6.93 -47.33 11.16 1.14 7 0.85 0.45 0.83 0.05 0.97 0.63 0.97 0.14 0.97 0.84 0.88
## 25000 -6.82 -49.27 2.25 0.34 7 0.85 0.45 0.83 0.05 0.98 0.62 0.97 0.14 0.97 0.85 0.89
## 26000 -4.07 -70.44 5.26 0.59 12 0.85 0.45 0.83 0.05 0.98 0.61 0.97 0.14 0.97 0.85 0.88
## 27000 -5.52 -71.06 6.44 0.7 12 0.85 0.45 0.83 0.05 0.98 0.61 0.97 0.14 0.97 0.85 0.88
## 28000 -6.2 -48.51 17.2 1.93 7 0.85 0.45 0.83 0.05 0.98 0.61 0.97 0.1 0.97 0.85 0.88
## 29000 -5.72 -43.94 9.51 1.01 6 0.85 0.45 0.83 0.05 0.98 0.61 0.97 0.1 0.97 0.85 0.88
## 30000 -6.6 -84.83 13.42 1.62 15 0.85 0.44 0.83 0.05 0.98 0.62 0.97 0.1 0.97 0.86 0.88
## 31000 -3.25 -71.29 2.73 0.27 12 0.85 0.44 0.83 0.05 0.98 0.62 0.97 0.1 0.97 0.86 0.88
## 32000 -5.73 -69.34 24.72 3.01 12 0.85 0.44 0.83 0.05 0.98 0.62 0.98 0.08 0.97 0.86 0.88
## 33000 0.57 -55.25 1.5 0.19 8 0.85 0.45 0.83 0.05 0.98 0.61 0.97 0.07 0.97 0.87 0.88
## 34000 -5.99 -57.7 2.12 0.2 9 0.84 0.45 0.83 0.05 0.98 0.61 0.97 0.06 0.96 0.86 0.87
## 35000 20.99 -82.72 0.86 0.09 14 0.83 0.45 0.82 0.05 0.97 0.61 0.96 0.06 0.95 0.84 0.86
## 36000 23.63 -89.2 0.63 0.05 16 0.82 0.45 0.82 0.05 0.97 0.6 0.95 0.06 0.95 0.81 0.85
## 37000 17.23 -84.95 0.55 0.07 15 0.81 0.44 0.81 0.05 0.98 0.59 0.94 0.06 0.94 0.81 0.84
## 38000 3.5 -53.29 1.42 0.16 8 0.8 0.44 0.81 0.05 0.97 0.59 0.93 0.06 0.94 0.8 0.82
## 39000 16.46 -70.18 0.4 0.07 11 0.8 0.44 0.81 0.05 0.95 0.58 0.92 0.06 0.94 0.79 0.81
## 40000 11.57 -66.62 0.75 0.09 11 0.79 0.44 0.81 0.05 0.95 0.58 0.92 0.06 0.93 0.79 0.8
## 41000 8.22 -63.05 1.17 0.14 10 0.78 0.44 0.8 0.05 0.95 0.57 0.91 0.06 0.92 0.78 0.78
## 42000 9.82 -76.94 1.71 0.16 13 0.78 0.45 0.8 0.05 0.95 0.57 0.91 0.06 0.92 0.78 0.78
## 43000 0.44 -57.42 0.75 0.12 9 0.77 0.45 0.8 0.05 0.95 0.56 0.91 0.06 0.92 0.76 0.77
## 44000 2.15 -69.7 1.74 0.2 12 0.77 0.45 0.8 0.05 0.95 0.56 0.9 0.06 0.92 0.77 0.77
## 45000 -3.24 -66.69 2.69 0.32 11 0.76 0.45 0.8 0.05 0.95 0.56 0.9 0.06 0.91 0.76 0.77
## 46000 -5.65 -52.47 4.46 0.46 8 0.76 0.45 0.8 0.05 0.95 0.56 0.9 0.06 0.92 0.76 0.77
## 47000 -5.35 -56.62 3.55 0.37 9 0.76 0.45 0.8 0.05 0.95 0.56 0.9 0.06 0.92 0.77 0.77
## 48000 -5.89 -48.21 3.74 0.44 7 0.76 0.45 0.8 0.05 0.95 0.56 0.91 0.06 0.92 0.77 0.77
## 49000 -6.61 -57.33 4.35 0.46 9 0.77 0.45 0.8 0.05 0.95 0.56 0.91 0.06 0.92 0.77 0.78
## 50000 -6.24 -53.4 5.2 0.46 8 0.77 0.45 0.8 0.05 0.96 0.56 0.91 0.06 0.92 0.77 0.78
fit1
## bayou modelfit
## OU parameterization
##
## Results are stored in directory
## /Users/lukeharmon/Documents/teaching/rcourse_argentina/multi-regime/IGWZRDETON/bayou.*
## To load results, use 'load.bayou(bayouFit)'
##
## 50000 generations were run with the following acceptance probabilities:
## .alpha .sig2 .theta birth.k D0.slide D1.slide death.k R1.slide
## 0.77 0.45 0.80 0.05 0.96 0.56 0.91 0.06
## U0.slide U1.slide U2.slide
## 0.92 0.77 0.78
## Total number of proposals of each type:
## .alpha .sig2 .theta birth.k D0.slide D1.slide death.k R1.slide
## 9107 4499 9128 21473 1153 1179 1201 18
## U0.slide U1.slide U2.slide
## 1216 427 599
Bayou stores results in a set of files in your working directory. To explore convergence and see your results you have to read them back in.
chain <- load.bayou(fit1, save.Rdata=FALSE, cleanup=TRUE)
## Warning: You have selected to delete all created MCMC files and not to save them as an .rds file.
## Your mcmc results will not be saved on your hard drive. If you do not output to a object, your results will be lost.
##
chain <- set.burnin(chain, 0.3)
out <- summary(chain)
## bayou MCMC chain: 50000 generations
## 5000 samples, first 1500 samples discarded as burnin
##
##
## Summary statistics for parameters:
## Mean SD Naive SE Time-series SE
## lnL -0.3954114 9.00820297 0.152244674 4.108149509
## prior -63.8423053 13.93439375 0.235500603 2.303039722
## alpha 0.3748011 0.38054788 0.006431514 0.135774751
## sig2 0.1538762 0.02613831 0.000441755 0.001644224
## k 10.3710368 3.09732710 0.052346906 0.498502382
## ntheta 11.3710368 3.09732710 0.052346906 0.498502382
## root 4.0217136 0.10726936 0.001812924 0.013006172
## all theta 4.1818798 1.42646302 NA NA
## Effective Size
## lnL 4.808215
## prior 36.607772
## alpha 7.855622
## sig2 252.716315
## k 38.604655
## ntheta 38.604655
## root 68.022467
## all theta NA
##
##
## Branches with posterior probabilities higher than 0.1:
## pp magnitude.of.theta2 naive.SE.of.theta2 rel.location
## 25 0.3033419 5.424547 0.05001458 1.65625501
## 116 0.2824907 5.986222 0.03093638 1.14375191
## 94 0.2547843 2.660055 0.04273313 0.68447820
## 91 0.1970865 6.446337 0.05591287 0.46267909
## 187 0.1408169 2.994196 0.09293359 0.66819562
## 98 0.1336761 5.573826 0.08103935 0.73985778
## 159 0.1282491 5.774159 0.06628641 0.35326955
## 6 0.1165381 4.558123 0.05768824 0.73826974
## 141 0.1162525 4.696853 0.08389168 1.04302176
## 184 0.1151100 3.850612 0.08128017 0.09815843
## 180 0.1136818 5.314325 0.06663091 2.17682561
## 28 0.1128249 5.633666 0.07390165 1.13315364
## 68 0.1091117 3.787517 0.06196159 0.22754712
## 14 0.1059697 3.978408 0.06711496 3.88538970
## 82 0.1059697 4.137644 0.07996407 0.37823302
## 47 0.1056841 3.322822 0.05544484 1.29346973
## 106 0.1051128 3.442847 0.08679734 1.41716504
## 70 0.1033990 4.604663 0.06668637 0.44752346
## 2 0.1028278 3.891608 0.03547618 2.70508628
## 51 0.1019709 3.312506 0.09266617 0.55295436
#plot(chain)
Now let’s visualize shift locations on the tree
par(mfrow=c(1,1))
plotSimmap.mcmc(chain, burnin=0.3, lwd=2, edge.type="theta", pal=colorRampPalette(c("black", "gray90")), show.tip.label=FALSE)
How do these shifts look in a phenogram?
phenogram.density(anoleTree, svl, chain=chain, burnin=0.3, pp.cutoff=0.3)
Results may vary - but for these it does look like there is something unique about the crown giants! Note that many features of these results suggest that we should have run our Bayou analyses for a much longer time.