Exercise 6: Fitting models of discrete character evolution

For this exercise we will fit models of discrete character evolution. We can use a dataset from a paper by Brandley et al. (2008).

First, load the data files into R. You can download the files here:

  1. brandley_table.csv
  2. squamate.tre
library(ape)
library(phytools)
sqData<-read.csv("brandley_table.csv")
sqTree<-read.nexus("squamate.tre")
plotTree(sqTree,type="fan",lwd=1,fsize=0.5)

plot of chunk unnamed-chunk-1

The data set that we have is composed of measurements for continuous characters. E.g.:

head(sqData)
##               Species       Family   SVL   TL   SE FLL HLL Fingers Toes N
## 1 Agamodon anguliceps Amphisbaenia  74.7  8.3  3.1 0.0   0       0    0 1
## 2    Amphisbaena alba Amphisbaenia 474.0 40.1 12.6 0.0   0       0    0 6
## 3       Bipes biporus Amphisbaenia 167.2 16.0  3.6 6.1   0       5    0 6
## 4   Bipes caniculatus Amphisbaenia 182.5 29.6  4.5 6.5   0       4    0 3
## 5   Bipes tridactylus Amphisbaenia 127.5 36.0  3.7 6.0   0       3    0 2
## 6     Blanus cinereus Amphisbaenia 174.9 21.8  4.1 0.0   0       0    0 5
##     Ecology            Source    PC1    PC2    PC3 Morph Ecology.1
## 1 burrowing Zug et al. (2001) -0.880 -0.410 -2.726     1         1
## 2 burrowing Zug et al. (2001) -1.078  2.057 -1.055     1         1
## 3 burrowing Zug et al. (2001) -0.021  0.566 -2.780     1         1
## 4 burrowing Zug et al. (2001) -0.220  0.634 -1.932     1         1
## 5 burrowing Zug et al. (2001) -0.418  0.028 -1.522     1         1
## 6 burrowing Zug et al. (2001) -1.132  0.296 -1.673     1         1

To get a character for “limbed” versus “limbless” we will have to create it by finding all species with all limbs of length 0.

hindLimbs<-sqData[,"HLL"]!=0
foreLimbs<-sqData[,"FLL"]!=0
limbless<-!hindLimbs & !foreLimbs
limbless
##   [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
##  [12]  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
##  [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE
##  [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [45] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
## [133]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [144] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
## [155] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [166] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [177] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE
## [188]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [199] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE
## [210] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
## [221]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
## [232]  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE
## [243]  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE FALSE
## [254] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Now we need to set our species names to match our tree, and assign these names to our trait vector:

speciesNames<-sqData[,1]
speciesNames
##   [1] Agamodon anguliceps              Amphisbaena alba                
##   [3] Bipes biporus                    Bipes caniculatus               
##   [5] Bipes tridactylus                Blanus cinereus                 
##   [7] Chirindia swimmertoni            Diplometopon zarudnyi           
##   [9] Geocalamus acutus                Monopeltis capensis             
##  [11] Rhineura floridana               Trogonophis wiegmanni           
##  [13] Abronia graminea                 Anguis fragilis                 
##  [15] Anniella geronimensis            Anniella pulchra                
##  [17] Barisia imbricata                Celestus enneagrammus           
##  [19] Diploglossus bilobatus           Diploglossus pleei              
##  [21] Elgaria coerulea                 Elgaria kingii                  
##  [23] Elgaria multicarinata            Elgaria panamintina             
##  [25] Elgaria paucicarinata            Gerrhonotus liocephalus         
##  [27] Mesaspis moreleti                Ophiodes striatus               
##  [29] Ophisaurus apodus                Ophisaurus attenuatus           
##  [31] Ophisaurus harti                 Ophisaurus koellikeri           
##  [33] Ophisaurus ventralis             Sauresia agasepsoides           
##  [35] Wetmorena haetiana               Chamaesaura anguina             
##  [37] Cordylus cataphractus            Cordylus cordylus               
##  [39] Cordylus jordani                 Cordylus warreni                
##  [41] Platysaurus rhodiseinsis         Pseudocordylus microlepidotus   
##  [43] Dibamus novaeguineae             Coleonyx elegans                
##  [45] Diplodactylus damaeus            Gekko gecko                     
##  [47] Gonatodes albogularis            Oedura coggeri                  
##  [49] Teratoscincus scincus            Angolosaurus skoogi             
##  [51] Cordylosaurus trivittata         Gerrhosaurus major              
##  [53] Gerrhosaurus nigrolineatus       Tetradactylus africanus         
##  [55] Tetradactylus seps               Tetradactylus tetradactylus     
##  [57] Tracheloptychus madagascariensis Zonosaurus ornatus              
##  [59] Alopoglossus atriventris         Alopoglossus carinicaudatus     
##  [61] Alopoglossus copii               Arthrosaura kockii              
##  [63] Arthrosaura reticulata           Bachia bresslaui                
##  [65] Bachia dorbignyi                 Bachia flavescens               
##  [67] Calyptommatus leiolepis          Calyptommatus nicterus          
##  [69] Calyptommatus sinebrachiatus     Cercosaura argulus              
##  [71] Cercosaura eigenmanni            Cercosaura ocellata             
##  [73] Cercosaura quadrilineata         Cercosaura schreibersii         
##  [75] Colobodactylus dalcyanus         Colobodactylus taunayi          
##  [77] Colobosaura modesta              Ecpleopus affinis               
##  [79] Gymnophthalmus leucomystax       Heterodactylus imbricatus       
##  [81] Iphisa elegans                   Leposoma percarinatum           
##  [83] Micrablepharus maximiliani       Neusticurus ecpleopus           
##  [85] Neusticurus rudis                Nothobachia ablephara           
##  [87] Pholidobolus montium             Placosoma glabellum             
##  [89] Procellosaurinus erythrocercus   Procellosaurinus tetradactylus  
##  [91] Proctoporus bolivianus           Proctoporus simoterus           
##  [93] Psilophthalmus paeminosus        Ptychoglossus brevifrontalis    
##  [95] Rhachisaurus brachylepis         Tretioscincus agilis            
##  [97] Vanzosaura rubricauda            Heloderma suspectum             
##  [99] Basiliscus basiliscus            Calotes versicolor              
## [101] Dipsosaurus dorsalis             Enyaloides laticeps             
## [103] Gambelia wislizenii              Leiocephalus carinatus          
## [105] Leiolepis belliana               Phrynocephalus versicolor       
## [107] Polychrus marmoratus             Urosaurus ornatus               
## [109] Meroles cuneirostris             Podarcis sicula                 
## [111] Psammodromus hispanicus          Takydromus smaragdinus          
## [113] Lanthanotus borneensis           Aprasia pseudopulchella         
## [115] Aprasia pulchella                Aprasia repens                  
## [117] Aprasia striolata                Delma australis                 
## [119] Delma borea                      Delma fraseri                   
## [121] Delma grayi                      Delma impar                     
## [123] Delma inornata                   Delma molleri                   
## [125] Delma nasuta                     Delma tincta                    
## [127] Lialis burtoni                   Lialis jicari                   
## [129] Pletholax gracilis               Pygopus lepidopus               
## [131] Pygopus nigriceps                Acontias litoralis              
## [133] Acontias meleagris               Acontias percivali              
## [135] Acontophiops lineatus            Amphiglossus astrolabi          
## [137] Amphiglossus igneocaudatus       Amphiglossus intermedius        
## [139] Amphiglossus macrocercus         Amphiglossus melanopleura       
## [141] Amphiglossus melanurus           Amphiglossus mouroundavae       
## [143] Amphiglossus ornaticeps          Amphiglossus punctatus          
## [145] Amphiglossus stumpffi            Amphiglossus tsaratananensis    
## [147] Amphiglossus waterloti           Anomalopus mackayi              
## [149] Anomalopus swansoni              Brachymeles gracilis            
## [151] Brachymeles talinis              Calyptotis scutirostrum         
## [153] Chalcides chalcides              Chalcides mionecton             
## [155] Chalcides ocellatus              Chalcides polylepis             
## [157] Coeranoscincus reticulatus       Ctenotus leonhardii             
## [159] Ctenotus pantherinus             Ctenotus robustus               
## [161] Egernia whitii                   Eremiascincus richardsonii      
## [163] Eugongylus rufescens             Eulamprus amplus                
## [165] Eulamprus murrayi                Eulamprus quoyii                
## [167] Eumeces schneideri               Feylinia polylepis              
## [169] Glaphyromorphus gracilipes       Glaphyromorphus isolepis        
## [171] Gnypetoscincus queenslandiae     Gongylomorphus bojeri           
## [173] Hakaria simonyi                  Hemiergis peroni                
## [175] Janetaescincus braueri           Lamprolepis smaragdina          
## [177] Lerista bipes                    Lerista bougainvillii           
## [179] Melanoseps occidentalis          Mesoscincus managuae            
## [181] Mesoscincus schwartzei           Morethia butleri                
## [183] Nangura spinosa                  Notoscincus ornatus             
## [185] Ophiomorus punctatissimus        Ophioscincus ophioscincus       
## [187] Pamelaescincus gardineri         Paracontias brocchii            
## [189] Paracontias hildebrandti         Paracontias holomelas           
## [191] Plestiodon egregius              Plestiodon elegans              
## [193] Plestiodon fasciatus             Plestiodon inexpectatus         
## [195] Plestiodon laticeps              Plestiodon longirostris         
## [197] Plestiodon lynxe                 Plestiodon obsoletus            
## [199] Plestiodon reynoldsi             Prasinohaema virens             
## [201] Proscelotes eggeli               Pygomeles braconnieri           
## [203] Saiphos equalis                  Scelotes anguineus              
## [205] Scelotes arenicolus              Scelotes bipes                  
## [207] Scelotes caffer                  Scelotes gronovii               
## [209] Scelotes kasneri                 Scelotes mirus                  
## [211] Scelotes sexlineatus             Scincopus fasciatus             
## [213] Scincus scincus                  Sepsina angolensis              
## [215] Sphenops boulengeri              Sphenops sphenopsiformis        
## [217] Tiliqua adelaidensis             Tribolonotus gracilis           
## [219] Typhlacontias brevipes           Typhlacontias punctatissimus    
## [221] Typhlosaurus caecus              Typhlosaurus lineatus           
## [223] Voeltzkowia fierinensis          Voeltzkowia lineata             
## [225] Acrochordus granulatus           Agkistrodon contortrix          
## [227] Anilius scytale                  Calabaria reinhardtii           
## [229] Candoia carinata                 Casarea dussumieri              
## [231] Causus maculatus                 Cereberus rhynchops             
## [233] Charina bottae                   Chondropython viridis           
## [235] Cylindrophis maculatus           Elaphe guttata                  
## [237] Epicrates striatus               Eryx johni                      
## [239] Exiliboa plicata                 Farancia abacura                
## [241] Leptotyphlops humilis            Loxocemus bicolor               
## [243] Micrurus fulvius                 Nerodia sipedon                 
## [245] Pareas kuangtungensis            Trachyboa boulengeri            
## [247] Tropidophis haitianus            Typhlops angolensis             
## [249] Ungaliophis continentalis        Uropeltis ceylonicus            
## [251] Xenopeltis unicolor              Shinisaurus crocodiluris        
## [253] Sphenodon punctatus              Ameiva ameiva                   
## [255] Aspidoscelis sexlineatus         Dicrodon guttulatum             
## [257] Teius teyou                      Varanus griseus                 
## [259] Lepidophyma flavimaculatum       Xantusia vigilis                
## [261] Xenosaurus grandis              
## 261 Levels: Abronia graminea Acontias litoralis ... Zonosaurus ornatus
species<-sub(" ","_",speciesNames)
names(limbless)<-species

Now, we can change what we are calling our character values (although TRUE & FALSE would be fine):

limbs<-sapply(limbless,function(x) if(x) "limbless" else "limbed")

Now we're ready to fit our models. We will do this using the function fitDiscrete in the R package 'geiger'.

library(geiger)

The simplest model is one in which there is just one rate of evolution from limblessness to having limbs, and vice versa. This is called the 'equal-rates' or "ER" model.

fitER<-fitDiscrete(sqTree,limbs,model="ER")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
fitER
## GEIGER-fitted comparative model of discrete data
##  fitted Q matrix:
##                    limbed     limbless
##     limbed   -0.001850204  0.001850204
##     limbless  0.001850204 -0.001850204
## 
##  model summary:
##  log-likelihood = -80.487176
##  AIC = 162.974353
##  AICc = 162.989978
##  free parameters = 1
## 
## Convergence diagnostics:
##  optimization iterations = 100
##  failed iterations = 0
##  frequency of best fit = 1.00
## 
##  object summary:
##  'lik' -- likelihood function
##  'bnd' -- bounds for likelihood search
##  'res' -- optimization iteration summary
##  'opt' -- maximum likelihood parameter estimates

It's possible to visualize this model - although in this case the visualization is quite simple!

plot(fitER,signif=5)
title(main="Fitted 'ER' model\nfor the evolution of limblessness")

plot of chunk unnamed-chunk-8

Next, let's fit a model with different backward & forward rates. In this case this would be a "ARD" ('all-rates-different') model:

fitARD<-fitDiscrete(sqTree,limbs,model="ARD")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
fitARD
## GEIGER-fitted comparative model of discrete data
##  fitted Q matrix:
##                    limbed     limbless
##     limbed   -0.001610658  0.001610658
##     limbless  0.003824678 -0.003824678
## 
##  model summary:
##  log-likelihood = -79.383812
##  AIC = 162.767624
##  AICc = 162.814682
##  free parameters = 2
## 
## Convergence diagnostics:
##  optimization iterations = 100
##  failed iterations = 0
##  frequency of best fit = 0.09
## 
##  object summary:
##  'lik' -- likelihood function
##  'bnd' -- bounds for likelihood search
##  'res' -- optimization iteration summary
##  'opt' -- maximum likelihood parameter estimates
plot(fitARD,signif=5)
title(main="Fitted 'ARD' model\nfor the evolution of limblessness")

plot of chunk unnamed-chunk-9

Finally, we can fit a model in which transitions to limblessness are permitted, but not the reverse:

model<-matrix(c(0,0,1,0),2,2)
colnames(model)<-rownames(model)<-c("limbed","limbless")
model
##          limbed limbless
## limbed        0        1
## limbless      0        0
fitIrr<-fitDiscrete(sqTree,limbs,model=model)
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
## Warning in fitDiscrete(sqTree, limbs, model = model): Parameter estimates appear at bounds:
##  q21
fitIrr
## GEIGER-fitted comparative model of discrete data
##  fitted Q matrix:
##                    limbed    limbless
##     limbed   -0.002115104 0.002115104
##     limbless  0.000000000 0.000000000
## 
##  model summary:
##  log-likelihood = -83.898819
##  AIC = 169.797637
##  AICc = 169.813262
##  free parameters = 1
## 
## Convergence diagnostics:
##  optimization iterations = 100
##  failed iterations = 0
##  frequency of best fit = 1.00
## 
##  object summary:
##  'lik' -- likelihood function
##  'bnd' -- bounds for likelihood search
##  'res' -- optimization iteration summary
##  'opt' -- maximum likelihood parameter estimates
plot(fitIrr,signif=5)
title(main="Fitted non-reversible model\nfor the evolution of limblessness")

plot of chunk unnamed-chunk-10

Now, let's compare the AIC weights of each model. This tells us the weight of evidence in our data in support of each hypothesis:

aicc<-setNames(
    c(fitER$opt$aicc,fitIrr$opt$aicc,fitARD$opt$aicc),
    c("ER","Irr","ARD"))
aic.w(aicc)
##         ER        Irr        ARD 
## 0.47067894 0.01552628 0.51379478

This tells us that the weight of evidence is split between "ER" and "ARD", but there is little support for the irreversible model.

Last, we can try a trait with more than two states - in this case the number of digits in the hindfoot. In class, we can try to use fitMk, just because it runs a little faster (though it can be less accurate as a consequence):

toes<-setNames(as.character(round(sqData$Toes)),
    species)
fitER.toes<-fitDiscrete(sqTree,toes,model="ER")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
fitER.toes
## GEIGER-fitted comparative model of discrete data
##  fitted Q matrix:
##                   0             1             2             3
##     0 -0.0045795845  0.0009159169  0.0009159169  0.0009159169
##     1  0.0009159169 -0.0045795845  0.0009159169  0.0009159169
##     2  0.0009159169  0.0009159169 -0.0045795845  0.0009159169
##     3  0.0009159169  0.0009159169  0.0009159169 -0.0045795845
##     4  0.0009159169  0.0009159169  0.0009159169  0.0009159169
##     5  0.0009159169  0.0009159169  0.0009159169  0.0009159169
##                   4             5
##     0  0.0009159169  0.0009159169
##     1  0.0009159169  0.0009159169
##     2  0.0009159169  0.0009159169
##     3  0.0009159169  0.0009159169
##     4 -0.0045795845  0.0009159169
##     5  0.0009159169 -0.0045795845
## 
##  model summary:
##  log-likelihood = -235.614300
##  AIC = 473.228600
##  AICc = 473.244225
##  free parameters = 1
## 
## Convergence diagnostics:
##  optimization iterations = 100
##  failed iterations = 0
##  frequency of best fit = 1.00
## 
##  object summary:
##  'lik' -- likelihood function
##  'bnd' -- bounds for likelihood search
##  'res' -- optimization iteration summary
##  'opt' -- maximum likelihood parameter estimates
plot(fitER.toes)
title(main="Fitted 'ER' model\nfor number of digits in hindfoot")

plot of chunk unnamed-chunk-12

Or, we can try an ordered model:

model<-matrix(c(0,1,0,0,0,0,
    2,0,3,0,0,0,
    0,4,0,5,0,0,
    0,0,6,0,7,0,
    0,0,0,8,0,9,
    0,0,0,0,10,0),6,6)
rownames(model)<-colnames(model)<-0:5
fitOrdered<-fitDiscrete(sqTree,toes,model=model)
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
## Warning in fitDiscrete(sqTree, toes, model = model): Parameter estimates appear at bounds:
##  q13
##  q14
##  q15
##  q16
##  q24
##  q25
##  q26
##  q31
##  q35
##  q36
##  q41
##  q42
##  q46
##  q51
##  q52
##  q53
##  q61
##  q62
##  q63
##  q64
plot(fitOrdered,show.zeros=FALSE,signif=4)
title(main="Fitted ordered model\nfor number of digits in hindfoot")

plot of chunk unnamed-chunk-13

Finally, the most parameter rich model:

fitARD.toes<-fitDiscrete(sqTree,toes,model="ARD")
## Warning in treedata(phy, dat): The following tips were not found in 'phy' and were dropped from 'data':
##  Gonatodes_albogularis
##  Lepidophyma_flavimaculatum
##  Trachyboa_boulengeri
plot(fitARD.toes,show.zeros=FALSE,signif=4)
title(main="Fitted 'ARD' model\nfor number of digits in hindfoot")

plot of chunk unnamed-chunk-14

And, as before:

aicc.toes<-setNames(
    c(fitER.toes$opt$aicc,fitOrdered$opt$aicc,
    fitARD.toes$opt$aicc),
    c("ER","Ordered","ARD"))
aic.w(aicc.toes)
##         ER    Ordered        ARD 
## 0.00000000 0.99999949 0.00000051

Unlike our first example, this case makes more sense because a model in which digits can be lost & regained, but in an ordered fashion, fits best by far.

Challenge problem 4: Fitting models of discrete character evolution

Using the same data sets as in exercise 12, fit a directional ordered model for hindfoot toe loss (i.e., 5→4→3→2→1→0). How does this model compare to the previous three models?

The data are from Brandley et al. (2008) and are as follows:

  1. brandley_table.csv
  2. squamate.tre

Written by Liam J. Revell & Luke J. Harmon. Last updated 1 August 2017.