For this exercise we will fit models of discrete character evolution. We can use a dataset from a paper by Brandley et al. (2008).
First, load the data files into R. You can download the files here:
library(ape)
library(phytools)
sqData<-read.csv("brandley_table.csv")
sqTree<-read.nexus("squamate.tre")
plotTree(sqTree,type="fan",lwd=1,fsize=0.5)
The data set that we have is composed of measurements for continuous characters. E.g.:
head(sqData)
## Species Family SVL TL SE FLL HLL Fingers Toes N
## 1 Agamodon anguliceps Amphisbaenia 74.7 8.3 3.1 0.0 0 0 0 1
## 2 Amphisbaena alba Amphisbaenia 474.0 40.1 12.6 0.0 0 0 0 6
## 3 Bipes biporus Amphisbaenia 167.2 16.0 3.6 6.1 0 5 0 6
## 4 Bipes caniculatus Amphisbaenia 182.5 29.6 4.5 6.5 0 4 0 3
## 5 Bipes tridactylus Amphisbaenia 127.5 36.0 3.7 6.0 0 3 0 2
## 6 Blanus cinereus Amphisbaenia 174.9 21.8 4.1 0.0 0 0 0 5
## Ecology Source PC1 PC2 PC3 Morph Ecology.1
## 1 burrowing Zug et al. (2001) -0.880 -0.410 -2.726 1 1
## 2 burrowing Zug et al. (2001) -1.078 2.057 -1.055 1 1
## 3 burrowing Zug et al. (2001) -0.021 0.566 -2.780 1 1
## 4 burrowing Zug et al. (2001) -0.220 0.634 -1.932 1 1
## 5 burrowing Zug et al. (2001) -0.418 0.028 -1.522 1 1
## 6 burrowing Zug et al. (2001) -1.132 0.296 -1.673 1 1
To get a character for “limbed” versus “limbless” we will have to create it by finding all species with all limbs of length 0.
hindLimbs<-sqData[,"HLL"]!=0
foreLimbs<-sqData[,"FLL"]!=0
limbless<-!hindLimbs & !foreLimbs
limbless
## [1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
## [12] TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE TRUE
## [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## [133] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [144] FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
## [155] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [166] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [177] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE
## [188] TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [199] FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
## [210] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE
## [221] TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE
## [232] TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE TRUE TRUE FALSE
## [243] TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE
## [254] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Now we need to set our species names to match our tree, and assign these names to our trait vector:
speciesNames<-sqData[,1]
speciesNames
## [1] Agamodon anguliceps Amphisbaena alba
## [3] Bipes biporus Bipes caniculatus
## [5] Bipes tridactylus Blanus cinereus
## [7] Chirindia swimmertoni Diplometopon zarudnyi
## [9] Geocalamus acutus Monopeltis capensis
## [11] Rhineura floridana Trogonophis wiegmanni
## [13] Abronia graminea Anguis fragilis
## [15] Anniella geronimensis Anniella pulchra
## [17] Barisia imbricata Celestus enneagrammus
## [19] Diploglossus bilobatus Diploglossus pleei
## [21] Elgaria coerulea Elgaria kingii
## [23] Elgaria multicarinata Elgaria panamintina
## [25] Elgaria paucicarinata Gerrhonotus liocephalus
## [27] Mesaspis moreleti Ophiodes striatus
## [29] Ophisaurus apodus Ophisaurus attenuatus
## [31] Ophisaurus harti Ophisaurus koellikeri
## [33] Ophisaurus ventralis Sauresia agasepsoides
## [35] Wetmorena haetiana Chamaesaura anguina
## [37] Cordylus cataphractus Cordylus cordylus
## [39] Cordylus jordani Cordylus warreni
## [41] Platysaurus rhodiseinsis Pseudocordylus microlepidotus
## [43] Dibamus novaeguineae Coleonyx elegans
## [45] Diplodactylus damaeus Gekko gecko
## [47] Gonatodes albogularis Oedura coggeri
## [49] Teratoscincus scincus Angolosaurus skoogi
## [51] Cordylosaurus trivittata Gerrhosaurus major
## [53] Gerrhosaurus nigrolineatus Tetradactylus africanus
## [55] Tetradactylus seps Tetradactylus tetradactylus
## [57] Tracheloptychus madagascariensis Zonosaurus ornatus
## [59] Alopoglossus atriventris Alopoglossus carinicaudatus
## [61] Alopoglossus copii Arthrosaura kockii
## [63] Arthrosaura reticulata Bachia bresslaui
## [65] Bachia dorbignyi Bachia flavescens
## [67] Calyptommatus leiolepis Calyptommatus nicterus
## [69] Calyptommatus sinebrachiatus Cercosaura argulus
## [71] Cercosaura eigenmanni Cercosaura ocellata
## [73] Cercosaura quadrilineata Cercosaura schreibersii
## [75] Colobodactylus dalcyanus Colobodactylus taunayi
## [77] Colobosaura modesta Ecpleopus affinis
## [79] Gymnophthalmus leucomystax Heterodactylus imbricatus
## [81] Iphisa elegans Leposoma percarinatum
## [83] Micrablepharus maximiliani Neusticurus ecpleopus
## [85] Neusticurus rudis Nothobachia ablephara
## [87] Pholidobolus montium Placosoma glabellum
## [89] Procellosaurinus erythrocercus Procellosaurinus tetradactylus
## [91] Proctoporus bolivianus Proctoporus simoterus
## [93] Psilophthalmus paeminosus Ptychoglossus brevifrontalis
## [95] Rhachisaurus brachylepis Tretioscincus agilis
## [97] Vanzosaura rubricauda Heloderma suspectum
## [99] Basiliscus basiliscus Calotes versicolor
## [101] Dipsosaurus dorsalis Enyaloides laticeps
## [103] Gambelia wislizenii Leiocephalus carinatus
## [105] Leiolepis belliana Phrynocephalus versicolor
## [107] Polychrus marmoratus Urosaurus ornatus
## [109] Meroles cuneirostris Podarcis sicula
## [111] Psammodromus hispanicus Takydromus smaragdinus
## [113] Lanthanotus borneensis Aprasia pseudopulchella
## [115] Aprasia pulchella Aprasia repens
## [117] Aprasia striolata Delma australis
## [119] Delma borea Delma fraseri
## [121] Delma grayi Delma impar
## [123] Delma inornata Delma molleri
## [125] Delma nasuta Delma tincta
## [127] Lialis burtoni Lialis jicari
## [129] Pletholax gracilis Pygopus lepidopus
## [131] Pygopus nigriceps Acontias litoralis
## [133] Acontias meleagris Acontias percivali
## [135] Acontophiops lineatus Amphiglossus astrolabi
## [137] Amphiglossus igneocaudatus Amphiglossus intermedius
## [139] Amphiglossus macrocercus Amphiglossus melanopleura
## [141] Amphiglossus melanurus Amphiglossus mouroundavae
## [143] Amphiglossus ornaticeps Amphiglossus punctatus
## [145] Amphiglossus stumpffi Amphiglossus tsaratananensis
## [147] Amphiglossus waterloti Anomalopus mackayi
## [149] Anomalopus swansoni Brachymeles gracilis
## [151] Brachymeles talinis Calyptotis scutirostrum
## [153] Chalcides chalcides Chalcides mionecton
## [155] Chalcides ocellatus Chalcides polylepis
## [157] Coeranoscincus reticulatus Ctenotus leonhardii
## [159] Ctenotus pantherinus Ctenotus robustus
## [161] Egernia whitii Eremiascincus richardsonii
## [163] Eugongylus rufescens Eulamprus amplus
## [165] Eulamprus murrayi Eulamprus quoyii
## [167] Eumeces schneideri Feylinia polylepis
## [169] Glaphyromorphus gracilipes Glaphyromorphus isolepis
## [171] Gnypetoscincus queenslandiae Gongylomorphus bojeri
## [173] Hakaria simonyi Hemiergis peroni
## [175] Janetaescincus braueri Lamprolepis smaragdina
## [177] Lerista bipes Lerista bougainvillii
## [179] Melanoseps occidentalis Mesoscincus managuae
## [181] Mesoscincus schwartzei Morethia butleri
## [183] Nangura spinosa Notoscincus ornatus
## [185] Ophiomorus punctatissimus Ophioscincus ophioscincus
## [187] Pamelaescincus gardineri Paracontias brocchii
## [189] Paracontias hildebrandti Paracontias holomelas
## [191] Plestiodon egregius Plestiodon elegans
## [193] Plestiodon fasciatus Plestiodon inexpectatus
## [195] Plestiodon laticeps Plestiodon longirostris
## [197] Plestiodon lynxe Plestiodon obsoletus
## [199] Plestiodon reynoldsi Prasinohaema virens
## [201] Proscelotes eggeli Pygomeles braconnieri
## [203] Saiphos equalis Scelotes anguineus
## [205] Scelotes arenicolus Scelotes bipes
## [207] Scelotes caffer Scelotes gronovii
## [209] Scelotes kasneri Scelotes mirus
## [211] Scelotes sexlineatus Scincopus fasciatus
## [213] Scincus scincus Sepsina angolensis
## [215] Sphenops boulengeri Sphenops sphenopsiformis
## [217] Tiliqua adelaidensis Tribolonotus gracilis
## [219] Typhlacontias brevipes Typhlacontias punctatissimus
## [221] Typhlosaurus caecus Typhlosaurus lineatus
## [223] Voeltzkowia fierinensis Voeltzkowia lineata
## [225] Acrochordus granulatus Agkistrodon contortrix
## [227] Anilius scytale Calabaria reinhardtii
## [229] Candoia carinata Casarea dussumieri
## [231] Causus maculatus Cereberus rhynchops
## [233] Charina bottae Chondropython viridis
## [235] Cylindrophis maculatus Elaphe guttata
## [237] Epicrates striatus Eryx johni
## [239] Exiliboa plicata Farancia abacura
## [241] Leptotyphlops humilis Loxocemus bicolor
## [243] Micrurus fulvius Nerodia sipedon
## [245] Pareas kuangtungensis Trachyboa boulengeri
## [247] Tropidophis haitianus Typhlops angolensis
## [249] Ungaliophis continentalis Uropeltis ceylonicus
## [251] Xenopeltis unicolor Shinisaurus crocodiluris
## [253] Sphenodon punctatus Ameiva ameiva
## [255] Aspidoscelis sexlineatus Dicrodon guttulatum
## [257] Teius teyou Varanus griseus
## [259] Lepidophyma flavimaculatum Xantusia vigilis
## [261] Xenosaurus grandis
## 261 Levels: Abronia graminea Acontias litoralis ... Zonosaurus ornatus
species<-sub(" ","_",speciesNames)
names(limbless)<-species
Now, we can change what we are calling our character values (although
TRUE
& FALSE
would be fine):
limbs<-sapply(limbless,function(x) if(x) "limbless" else "limbed")
Now we're ready to fit our models. We will do this using the function
fitDiscrete
in the R package 'geiger'.
library(geiger)
The simplest model is one in which there is just one rate of evolution
from limblessness to having limbs, and vice versa. This is called the
'equal-rates' or "ER"
model.
fitER<-fitDiscrete(sqTree,limbs,model="ER")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
fitER
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## limbed limbless
## limbed -0.001850204 0.001850204
## limbless 0.001850204 -0.001850204
##
## model summary:
## log-likelihood = -80.487176
## AIC = 162.974353
## AICc = 162.989978
## free parameters = 1
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
It's possible to visualize this model - although in this case the visualization is quite simple!
plot(fitER,signif=5)
title(main="Fitted 'ER' model\nfor the evolution of limblessness")
Next, let's fit a model with different backward & forward rates. In this
case this would be a "ARD"
('all-rates-different') model:
fitARD<-fitDiscrete(sqTree,limbs,model="ARD")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
fitARD
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## limbed limbless
## limbed -0.001610658 0.001610658
## limbless 0.003824678 -0.003824678
##
## model summary:
## log-likelihood = -79.383812
## AIC = 162.767624
## AICc = 162.814682
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 0.09
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitARD,signif=5)
title(main="Fitted 'ARD' model\nfor the evolution of limblessness")
Finally, we can fit a model in which transitions to limblessness are permitted, but not the reverse:
model<-matrix(c(0,0,1,0),2,2)
colnames(model)<-rownames(model)<-c("limbed","limbless")
model
## limbed limbless
## limbed 0 1
## limbless 0 0
fitIrr<-fitDiscrete(sqTree,limbs,model=model)
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
## Warning in fitDiscrete(sqTree, limbs, model = model): Parameter estimates appear at bounds:
## q21
fitIrr
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## limbed limbless
## limbed -0.002115104 0.002115104
## limbless 0.000000000 0.000000000
##
## model summary:
## log-likelihood = -83.898819
## AIC = 169.797637
## AICc = 169.813262
## free parameters = 1
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitIrr,signif=5)
title(main="Fitted non-reversible model\nfor the evolution of limblessness")
Now, let's compare the AIC weights of each model. This tells us the weight of evidence in our data in support of each hypothesis:
aicc<-setNames(
c(fitER$opt$aicc,fitIrr$opt$aicc,fitARD$opt$aicc),
c("ER","Irr","ARD"))
aic.w(aicc)
## ER Irr ARD
## 0.47067894 0.01552628 0.51379478
This tells us that the weight of evidence is split between "ER"
and "ARD"
, but there is little support for the irreversible
model.
Last, we can try a trait with more than two states - in this case the
number of digits in the hindfoot. In class, we can try to use fitMk
,
just because it runs a little faster (though it can be less accurate as
a consequence):
toes<-setNames(as.character(round(sqData$Toes)),
species)
fitER.toes<-fitDiscrete(sqTree,toes,model="ER")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
fitER.toes
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## 0 1 2 3
## 0 -0.0045795845 0.0009159169 0.0009159169 0.0009159169
## 1 0.0009159169 -0.0045795845 0.0009159169 0.0009159169
## 2 0.0009159169 0.0009159169 -0.0045795845 0.0009159169
## 3 0.0009159169 0.0009159169 0.0009159169 -0.0045795845
## 4 0.0009159169 0.0009159169 0.0009159169 0.0009159169
## 5 0.0009159169 0.0009159169 0.0009159169 0.0009159169
## 4 5
## 0 0.0009159169 0.0009159169
## 1 0.0009159169 0.0009159169
## 2 0.0009159169 0.0009159169
## 3 0.0009159169 0.0009159169
## 4 -0.0045795845 0.0009159169
## 5 0.0009159169 -0.0045795845
##
## model summary:
## log-likelihood = -235.614300
## AIC = 473.228600
## AICc = 473.244225
## free parameters = 1
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## frequency of best fit = 1.00
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
plot(fitER.toes)
title(main="Fitted 'ER' model\nfor number of digits in hindfoot")
Or, we can try an ordered model:
model<-matrix(c(0,1,0,0,0,0,
2,0,3,0,0,0,
0,4,0,5,0,0,
0,0,6,0,7,0,
0,0,0,8,0,9,
0,0,0,0,10,0),6,6)
rownames(model)<-colnames(model)<-0:5
fitOrdered<-fitDiscrete(sqTree,toes,model=model)
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
## Warning in fitDiscrete(sqTree, toes, model = model): Parameter estimates appear at bounds:
## q13
## q14
## q15
## q16
## q24
## q25
## q26
## q31
## q35
## q36
## q41
## q42
## q46
## q51
## q52
## q53
## q61
## q62
## q63
## q64
plot(fitOrdered,show.zeros=FALSE,signif=4)
title(main="Fitted ordered model\nfor number of digits in hindfoot")
Finally, the most parameter rich model:
fitARD.toes<-fitDiscrete(sqTree,toes,model="ARD")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
## Gonatodes_albogularis
## Lepidophyma_flavimaculatum
## Trachyboa_boulengeri
plot(fitARD.toes,show.zeros=FALSE,signif=4)
title(main="Fitted 'ARD' model\nfor number of digits in hindfoot")
And, as before:
aicc.toes<-setNames(
c(fitER.toes$opt$aicc,fitOrdered$opt$aicc,
fitARD.toes$opt$aicc),
c("ER","Ordered","ARD"))
aic.w(aicc.toes)
## ER Ordered ARD
## 0.00000000 0.99999949 0.00000051
Unlike our first example, this case makes more sense because a model in which digits can be lost & regained, but in an ordered fashion, fits best by far.
Using the same data sets as in exercise 12, fit a directional ordered model for hindfoot toe loss (i.e., 5→4→3→2→1→0). How does this model compare to the previous three models?
The data are from Brandley et al. (2008) and are as follows:
Written by Liam J. Revell & Luke J. Harmon. Last updated 1 August 2017.