This tutorial is about fitting multi-rate Brownian motion models (using phytools), multi-regime OU models (using the OUwie package), and multivariate Brownian models (using phytools).
The first exercise is designed to explore a model developed by O'Meara et al. (2006; Evolution) in which the rate of Brownian evolution (σ2) takes different values in different parts of a phylogeny.
For this analysis we will use the function brownie.lite in the phytools package. This
function allows us to input a tree with painted rate “regimes” - for instance, from a stochastically
mapped discrete character - and then fit a model in which the rate differs depending on the mapped regime.
First, let's read simulated tree & dataset from file. These data will be used to illustrate fitting a
two-rate Brownian model using brownie.lite in the phytools package. The tree is
simulated.tre and the data vector is in
simulated.csv.
## load phytools
library(phytools)
## Loading required package: ape
## Loading required package: maps
Now, let's load the single tree with painted regimes from file. We can do this by downloading
the file from the web, and the loading it using the phytools function read.simmap.
## read simulated tree from file
tree<-read.simmap("simulated.tre",format="phylip")
tree
##
## Phylogenetic tree with 26 tips and 25 internal nodes.
##
## Tip labels:
## Z, Y, X, W, V, U, ...
##
## The tree includes a mapped, 2-state discrete character with states:
## aqua, terr
##
## Rooted; includes branch lengths.
Now, let's plot the tree with painted regimes:
## plot tree with regimes
colors=setNames(c("brown","blue"),c("terr","aqua"))
plot(tree,colors,lwd=4)
add.simmap.legend(colors=colors,prompt=FALSE,
x=0.05*max(nodeHeights(tree)),y=0.1*Ntip(tree))
Next, we can read our data from file. These are data for a single continuous character:
## read data from file
x<-read.csv("simulated.csv",header=TRUE,row.names=1)
## convert to vector with names
x<-setNames(x[,1],rownames(x))
x
## A B C D E F
## 1.32028474 0.61662698 -1.00911612 -0.70287046 1.56931705 1.64972830
## G H I J K L
## 0.94406092 1.23534571 2.05738145 3.16707622 -0.39868954 -1.40144281
## M N O P Q R
## -0.89405959 -0.76103676 -0.76097347 -0.81459239 -0.19720860 -0.06206985
## S T U V W X
## 0.06946520 -0.18486903 -1.77787925 -0.79620907 -0.39166148 -0.63321831
## Y Z
## -0.48235455 -0.46151922
Now we are ready to fit our multi-rate model of trait evolution:
## fit multi-rate Brownian model
fitBM<-brownie.lite(tree,x)
fitBM
## ML single-rate model:
## s^2 se a k logL
## value 2.1036 0.5834 -0.1898 2 -31.5992
##
## ML multi-rate model:
## s^2(aqua) se(aqua) s^2(terr) se(terr) a k logL
## value 5.7837 2.7182 0.6659 0.2398 -0.4222 3 -26.2556
##
## P-value (based on X^2): 0.0011
##
## R thinks it has found the ML solution.
This result shows (1) that a two-rate model fits the data highly significantly better than
a one-rate model; and (2) the rate of evolution in the fitted model is much higher for state
"aqua" than for state "terr".
In addition to this approach in which the rate of evolution differs between different parts of the tree, we can also fit an Ornstein-Uhlenbeck model in which the pull or selection regimes different (i.e., have different optimums) in different parts of the phylogeny.
To explore this model, let's read in the Anolis tree & data for gross morphology, and then fit a multi-optimum OU model in which the regime shifts are associated with the ecomorph state of different anole species.
In this case, our tree is ecomorph.tre and our data is ecomorph-data.csv. Finally, we need the file ecomorph.csv, which contains the ecomorph identities of each tip.
First, let's load the package “OUwie”. If you do not have OUwie, then you should first install it from CRAN.
## load OUwie
library(OUwie)
## Loading required package: nloptr
## Loading required package: lattice
Now let's read our data from the input file:
## read anole data from file
X<-read.csv("ecomorph-data.csv",row.names=1)
## read anole tree
anolis.tree<-read.tree("ecomorph.tre")
plotTree(anolis.tree,type="fan",fsize=0.8)
Since these data are multivariate and we are interested in how body shape evolves on the tree, we
will using phylogenetic PCA (phyl.pca; Revell 2009), and then analyze only principal
component #2.
To work in OUwie we need to make a special data frame for our analysis, as follows:
## make analysis input data.frame
ecomorph<-read.csv("ecomorph.csv",row.names=1)
ecomorph<-setNames(ecomorph[,1],rownames(X))
pca<-phyl.pca(anolis.tree,X)
data<-data.frame(Genus_species=rownames(pca$S),Reg=ecomorph,
X=as.numeric(pca$S[,"PC2"]))
data
## Genus_species Reg X
## ahli ahli TG -0.199891250
## allogus allogus TG 0.527960834
## rubribarbus rubribarbus TG -0.115836988
## imias imias TG 0.290340349
## sagrei sagrei TG 0.120114358
## bremeri bremeri TG -0.197087755
## quadriocellifer quadriocellifer TG 0.649040055
## ophiolepis ophiolepis GB -0.288600958
## mestrei mestrei TG -0.153272532
## jubar jubar TG 0.368795489
## homolechis homolechis TG 0.031292300
## confusus confusus TG 0.296371325
## guafe guafe TG -0.207781117
## garmani garmani CG -0.289574985
## opalinus opalinus TC -0.151377395
## grahami grahami TC -0.002085284
## valencienni valencienni Tw -0.162296673
## lineatopus lineatopus TG 0.273931501
## evermanni evermanni TC -0.371551504
## stratulus stratulus TC -0.104831873
## krugi krugi GB -0.105707852
## pulchellus pulchellus GB -0.348958448
## gundlachi gundlachi TG 0.161000659
## poncensis poncensis GB 0.137710164
## cooki cooki TG 0.186435289
## cristatellus cristatellus TG 0.347174328
## brevirostris brevirostris Tr 0.145163964
## caudalis caudalis Tr 0.003387502
## marron marron Tr -0.166924957
## websteri websteri Tr 0.050526610
## distichus distichus Tr -0.032782137
## alumina alumina GB -0.178942773
## semilineatus semilineatus GB -0.078800012
## olssoni olssoni GB -0.189198658
## insolitus insolitus Tw -0.270609917
## whitemani whitemani TG -0.340436818
## haetianus haetianus TG -0.241641895
## breslini breslini TG 0.365174916
## armouri armouri TG 0.132675968
## cybotes cybotes TG -0.209221831
## shrevei shrevei TG 0.042709969
## longitibialis longitibialis TG -0.334457794
## strahmi strahmi TG -0.104429857
## marcanoi marcanoi TG 0.126260217
## baleatus baleatus CG -0.355758090
## barahonae barahonae CG -0.188028600
## ricordii ricordii CG 0.091095536
## cuvieri cuvieri CG 0.021582983
## altitudinalis altitudinalis TC 0.289185847
## oporinus oporinus TC 0.731506634
## isolepis isolepis TC -0.405960955
## allisoni allisoni TC -0.360368083
## porcatus porcatus TC -0.253448010
## loysiana loysiana Tr -0.031981277
## guazuma guazuma Tw -0.169497642
## placidus placidus Tw -0.214186709
## sheplani sheplani Tw 0.367067252
## alayoni alayoni Tw 0.643884525
## angusticeps angusticeps Tw 0.005581478
## paternus paternus Tw -0.238386253
## alutaceus alutaceus GB -0.287622843
## inexpectatus inexpectatus GB -0.385330860
## clivicola clivicola GB 0.352896666
## cupeyalensis cupeyalensis GB 0.015798807
## cyanopleurus cyanopleurus GB 0.032180634
## alfaroi alfaroi GB -0.225393493
## macilentus macilentus GB -0.191481218
## vanidicus vanidicus GB 0.109855359
## baracoae baracoae CG -0.083759565
## noblei noblei CG -0.356554156
## smallwoodi smallwoodi CG 0.022301944
## luteogularis luteogularis CG 0.726829715
## equestris equestris CG -0.403867780
## bahorucoensis bahorucoensis GB 0.129802491
## dolichocephalus dolichocephalus GB -0.141701844
## hendersoni hendersoni GB 0.592711943
## darlingtoni darlingtoni Tw 0.287361238
## aliniger aliniger TC 0.651834177
## singularis singularis TC -0.274546646
## chlorocyanus chlorocyanus TC 0.234044367
## coelestinus coelestinus TC 0.273913424
## occultus occultus Tw -0.166643024
PC1 is dominated by body size, so PC2 is the leading body shape dimension controlling for overall size:
pca
## Phylogenetic pca
## Standard deviations:
## PC1 PC2 PC3 PC4 PC5 PC6
## 3.28688577 0.63224737 0.45592902 0.26299049 0.13689004 0.12759614
## PC7 PC8 PC9 PC10 PC11 PC12
## 0.09485649 0.06067835 0.04251493 0.03806602 0.03156784 0.02462736
## Loads:
## PC1 PC2 PC3 PC4 PC5
## AVG.SVL -0.9748640 0.18974036 0.02933597 0.103743393 0.035895854
## AVG.femur -0.9915421 -0.08374563 -0.05220067 -0.014113706 0.029034269
## AVG.tibia -0.9791706 -0.17114508 -0.04210164 -0.068848997 0.050713457
## AVG.met -0.9710972 -0.22243809 -0.03729140 -0.054894738 0.045566993
## AVG.ltoe.IV -0.9795775 -0.17342745 -0.05304229 -0.014810559 -0.002578176
## AVG.humerus -0.9889959 0.10049341 -0.08869036 0.029134013 -0.029329471
## AVG.radius -0.9863856 0.04804003 -0.13631327 -0.027397249 -0.047654596
## AVG.lfing.IV -0.9910488 0.00703555 -0.11225364 -0.006360021 -0.044172689
## Foot.Lam.num -0.9017083 0.27040903 0.26646899 -0.177218988 0.059040368
## Hand.Lam.num -0.8680079 0.41557919 0.18138347 -0.186020687 -0.045455811
## Avg.lnSVL2 -0.9701822 0.20486441 0.03896326 0.115759839 0.035140828
## Avg.ln.t1 -0.9131016 -0.25850810 0.30597752 0.057306719 -0.047269858
## PC6 PC7 PC8 PC9
## AVG.SVL -0.0097349401 0.0062145050 -0.004036242 -0.0005191164
## AVG.femur 0.0421184809 0.0552420910 0.033021641 -0.0065577985
## AVG.tibia 0.0425192315 -0.0088018515 -0.019650729 -0.0092363800
## AVG.met -0.0080056606 -0.0098685365 -0.017092971 0.0105251157
## AVG.ltoe.IV -0.0833608176 0.0002676749 -0.006334308 -0.0014541944
## AVG.humerus 0.0137671163 -0.0359708390 -0.000356065 -0.0257998809
## AVG.radius 0.0433241805 -0.0251441167 0.005598060 0.0232167295
## AVG.lfing.IV -0.0450034306 0.0250518258 0.009682767 0.0010062529
## Foot.Lam.num -0.0433033660 -0.0593618895 0.049583197 0.0028608600
## Hand.Lam.num 0.0099804242 0.0517722316 -0.037766348 -0.0037610392
## Avg.lnSVL2 -0.0002494187 0.0027880405 -0.008351120 0.0110454124
## Avg.ln.t1 0.0162998757 -0.0027156314 0.001548307 0.0002211933
## PC10 PC11 PC12
## AVG.SVL 0.0156630318 -0.016740370 7.177182e-03
## AVG.femur 0.0067566925 0.007689344 5.750757e-05
## AVG.tibia -0.0191972312 -0.006928281 8.736136e-03
## AVG.met 0.0139868658 -0.002312382 -1.567133e-02
## AVG.ltoe.IV 0.0054961724 0.012997493 1.099462e-02
## AVG.humerus 0.0043761608 0.004443894 -5.012747e-03
## AVG.radius 0.0043973244 0.001143342 5.709126e-03
## AVG.lfing.IV -0.0150557540 -0.013001805 -6.062955e-03
## Foot.Lam.num -0.0056065475 -0.002396990 -3.361791e-04
## Hand.Lam.num 0.0066631930 0.004633844 -1.300028e-04
## Avg.lnSVL2 -0.0151097759 0.013247665 -4.514422e-03
## Avg.ln.t1 -0.0007261601 -0.001473285 -1.518054e-04
This data frame contains our trait and the regimes (in this case, "ecomorph") for the tips,
but we also need a reconstruction of the regimes across the branches and nodes of the tree. For this,
we will use the method of stochastic mapping. Here, I will just use one stochastic map - but normally
we would want to integrate across a set of stochastic maps.
## perform & plot stochastic maps (we would normally do this x100)
smap.tree<-make.simmap(anolis.tree,ecomorph)
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## CG GB TC TG Tr Tw
## CG -0.3436331 0.14412189 0.1995112 0.0000000 0.00000000 0.0000000
## GB 0.1441219 -1.20200454 0.2542848 0.3605536 0.08047684 0.3625674
## TC 0.1995112 0.25428482 -0.8151594 0.0000000 0.14565234 0.2157110
## TG 0.0000000 0.36055356 0.0000000 -0.3605536 0.00000000 0.0000000
## Tr 0.0000000 0.08047684 0.1456523 0.0000000 -0.22612918 0.0000000
## Tw 0.0000000 0.36256743 0.2157110 0.0000000 0.00000000 -0.5782785
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## CG GB TC TG Tr Tw
## 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
## Done.
plot(smap.tree,type="fan",fsize=0.8,ftype="i")
## no colors provided. using the following legend:
## CG GB TC TG Tr Tw
## "black" "red" "green3" "blue" "cyan" "magenta"
Now we are finally ready to fit our models. We will first fit a single-rate Brownian model and then we will also fit a multi-rate OU model for comparison:
## fit Brownian & multi-optimum OU models
fitBM<-OUwie(smap.tree,data,model="BM1",simmap.tree=TRUE) ## single rate
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitBMS<-OUwie(smap.tree,data,model="BMS",simmap.tree=TRUE) ## multiple rates
## Warning: You might not have enough data to fit this model well
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitOUM<-OUwie(smap.tree,data,model="OUM",simmap.tree=TRUE) ## multiple optima
## Initializing...
## Finished. Begin thorough search...
## Finished. Summarizing results.
fitBM
##
## Fit
## lnL AIC AICc model ntax
## -47.39086 98.78172 98.93362 BM1 82
##
## Rates
## CG GB TC TG Tr Tw
## alpha NA NA NA NA NA NA
## sigma.sq 0.3948603 0.3948603 0.3948603 0.3948603 0.3948603 0.3948603
##
## Optima
## CG GB TC TG Tr Tw
## estimate 2.421034e-08 0 0 0 0 0
## se 1.919428e-01 0 0 0 0 0
##
## Arrived at a reliable solution
fitBMS
##
## Fit
## lnL AIC AICc model ntax
## -34.31987 92.63974 97.16148 BMS 82
##
## Rates
## CG GB TC TG Tr Tw
## alpha NA NA NA NA NA NA
## sigma.sq 1.249673 0.1427856 0.3591133 0.3820528 0.04872764 0.2821856
##
## Optima
## CG GB TC TG Tr Tw
## estimate -0.4350754 -0.08917378 0.0886928 0.2249008 -0.05914502 0.1187033
## se 1.0433771 0.16621203 0.3298219 0.3430986 0.51620046 0.4425138
##
## Arrived at a reliable solution
fitOUM
##
## Fit
## lnL AIC AICc model ntax
## -13.37309 42.74618 44.71878 OUM 82
##
##
## Rates
## CG GB TC TG Tr Tw
## alpha 87.26695 87.26695 87.26695 87.26695 87.26695 87.26695
## sigma.sq 14.16029 14.16029 14.16029 14.16029 14.16029 14.16029
##
## Optima
## CG GB TC TG Tr
## estimate -0.07189924 -0.06418331 0.01434924 0.07252000 -0.004668489
## se 0.09403094 0.06691154 0.07911825 0.05739473 0.121911016
## Tw
## estimate 0.01847646
## se 0.09389608
##
## Arrived at a reliable solution
Note that the "BM1" and "BMS" models are exactly the same as the models
fit using brownie.lite in phytools:
brownie.lite(smap.tree,pca$S[,2])
## ML single-rate model:
## s^2 se a k logL
## value 0.3949 0.0616 0 2 -47.3908
##
## ML multi-rate model:
## s^2(CG) se(CG) s^2(GB) se(GB) s^2(TC) se(TC) s^2(TG) se(TG) s^2(Tr) se(Tr) s^2(Tw) se(Tw) a k logL
## value 1.2625 0.5556 0.1443 0.0436 0.3644 0.1277 0.3884 0.1122 0.0484 0.0332 0.2887 0.1432 -0.0194 7 -34.8595
##
## P-value (based on X^2): 1e-04
##
## R thinks it has found the ML solution.
Finally, let's compare among the models:
aic<-setNames(c(fitBM$AIC,fitBMS$AIC,fitOUM$AIC),
c("BM1","BMS","OUM"))
aic
## BM1 BMS OUM
## 98.78172 92.63974 42.74618
aic.w(aic)
## BM1 BMS OUM
## 0 0 1
This shows that almost all the weight of evidence favors the "OUM", multi-optimum OU,
model.
The last thing we are going to do is fit a multivariate Brownian model in which the evolutinary covariance (and thus correlation) between characters can be different in different parts of the tree. This is a method based on Revell & Collar (2009; Evolution).
For this example we will use data and a phylogeny for centrarchid fishes. The data are in Centrarchidae.csv and the tree is in Centrarchidae.tre.
Let's read our tree & data:
## read centrarchid tree
fish.tree<-read.tree("Centrarchidae.tre")
## read centrarchid data
fish.data<-read.csv("Centrarchidae.csv",header=TRUE,row.names=1) ## or
fish.data
## feeding.mode gape.width buccal.length
## Acantharchus_pomotis pisc 0.114 -0.009
## Lepomis_gibbosus non -0.133 -0.009
## Lepomis_microlophus non -0.151 0.012
## Lepomis_punctatus non -0.103 -0.019
## Lepomis_miniatus non -0.134 0.001
## Lepomis_auritus non -0.222 -0.039
## Lepomis_marginatus non -0.187 -0.075
## Lepomis_megalotis non -0.073 -0.049
## Lepomis_humilis non 0.024 -0.027
## Lepomis_macrochirus non -0.191 0.002
## Lepomis_gulosus pisc 0.131 0.122
## Lepomis_symmetricus non 0.013 -0.025
## Lepomis_cyanellus pisc -0.002 -0.009
## Micropterus_coosae pisc 0.045 -0.009
## Micropterus_notius pisc 0.097 -0.009
## Micropterus_treculi pisc 0.056 0.001
## Micropterus_salmoides pisc 0.056 -0.059
## Micropterus_floridanus pisc 0.096 0.051
## Micropterus_punctulatus pisc 0.092 0.002
## Micropterus_dolomieu pisc 0.035 -0.069
## Centrarchus_macropterus non -0.007 -0.055
## Enneacantus_obesus non 0.016 -0.005
## Pomoxis_annularis pisc -0.004 -0.019
## Pomoxis_nigromaculatus pisc 0.105 0.041
## Archolites_interruptus pisc -0.024 0.051
## Ambloplites_ariommus pisc 0.135 0.123
## Ambloplites_rupestris pisc 0.086 0.041
## Ambloplites_cavifrons pisc 0.130 0.040
Now let's pull out the feeding mode from this data frame. This is the character that we are going to map on the tree for our different regimes. We can then use this character to generate a set of stochastic maps on the phylogeny:
fmode<-setNames(fish.data[,1],rownames(fish.data))
fmode
## Acantharchus_pomotis Lepomis_gibbosus Lepomis_microlophus
## pisc non non
## Lepomis_punctatus Lepomis_miniatus Lepomis_auritus
## non non non
## Lepomis_marginatus Lepomis_megalotis Lepomis_humilis
## non non non
## Lepomis_macrochirus Lepomis_gulosus Lepomis_symmetricus
## non pisc non
## Lepomis_cyanellus Micropterus_coosae Micropterus_notius
## pisc pisc pisc
## Micropterus_treculi Micropterus_salmoides Micropterus_floridanus
## pisc pisc pisc
## Micropterus_punctulatus Micropterus_dolomieu Centrarchus_macropterus
## pisc pisc non
## Enneacantus_obesus Pomoxis_annularis Pomoxis_nigromaculatus
## non pisc pisc
## Archolites_interruptus Ambloplites_ariommus Ambloplites_rupestris
## pisc pisc pisc
## Ambloplites_cavifrons
## pisc
## Levels: non pisc
## stochastic mapping of feeding mode on the tree (we would normally do x100)
fish.tree<-make.simmap(fish.tree,fmode,model="ARD")
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## non pisc
## non -6.087789 6.087789
## pisc 3.048905 -3.048905
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## non pisc
## 0.5 0.5
## Done.
Plot our tree & mapped regimes:
cols<-setNames(c("blue","red"),c("non","pisc"))
plot(fish.tree,colors=cols,ftype="i")
add.simmap.legend(colors=cols,prompt=FALSE,x=0,y=Ntip(fish.tree))
Next, we can use the two continuous characters to test our hypothesis that the feeding mode
affects the evolutionary correlation/covariance between traits. For this analysis we will use
evol.vcv in the phytools package.
## data
fish.X<-as.matrix(fish.data[,2:3])
## fit model
fitMV<-evol.vcv(fish.tree,fish.X)
fitMV
## ML single-matrix model:
## R[1,1] R[1,2] R[2,2] k log(L)
## fitted 0.114 0.033 0.0556 5 72.1893
##
## ML multi-matrix model:
## R[1,1] R[1,2] R[2,2] k log(L)
## non 0.1502 0.0049 0.0177 8 79.5949
## pisc 0.0628 0.0627 0.1052
##
## P-value (based on X^2): 0.002
##
## R thinks it has found the ML solution.
This shows us that the two covariance model fits significantly better than the one covariance model. We can also easily extract the evolutionary correlations from these two alernative models:
## now let's look at the correlation matrices
cov2cor(fitMV$R.single)
## gape.width buccal.length
## gape.width 1.000000 0.414274
## buccal.length 0.414274 1.000000
cov2cor(fitMV$R.multiple[,,"non"])
## gape.width buccal.length
## gape.width 1.00000000 0.09487582
## buccal.length 0.09487582 1.00000000
cov2cor(fitMV$R.multiple[,,"pisc"])
## gape.width buccal.length
## gape.width 1.000000 0.771306
## buccal.length 0.771306 1.000000
Finally, this is a pattern that we should be able to visualize using a phylomorphospace plot:
phylomorphospace(fish.tree,fish.X,ftype="off",colors=cols,
node.by.map=TRUE,xlab="gape width",ylab="buccal length")
add.simmap.legend(colors=cols,prompt=FALSE,x=-0.24,y=0.12)
The increased correlation between gape width & buccal length is evident in the plot.
That's it!
Written by Liam J. Revell. Last updated 3 December 2016.