Here are a few more details about Brownian motion

BMlk is function that calculates the likelihood of BM. There are two steps to follow here .

- We can get the ML estimate of sigma^2 following the equation from OāMeara 2006.

\[\hat{\sigma}^2 = \frac{[X - E(X)]'C^{-1}[X- E(X)]}{N}\]

Where \(X\) is a vector of the tip values, \(E(X)\) is the expected values at the root, \(C\) is the phylogenetic variance-covariance matrix, and \(N\) is the number of tips. Alsoā¦

- \([X - E(X)]'\) is the
*transpose*of the tip values minus the root value - \(C^{-1}\) is the inverse of the phylogenetic variance-covariance matrix

- We can get the
**likelihood**of any value of \(\sigma^2\) by using equation 1:

\[log(L) = log\left [ \frac{exp\left \{ -\frac{1}{2}[X - E(X)]'V^{-1}[X- E(X)] \right \}}{((2\pi)^N*det(V)} \right ]\]

\(V\) here is called the **rate matrix**. It is simply the proposed value of \(\sigma^2\) multiplied by the phylogenetic variance-covariance matrix. So, **Eq. 2** is saying that the likelihood (or rather the log of the likelihood) is a function of the difference between the **tip** and **root** values and the **rate matrix**, \(V\) (which is simply the vcv matrix multiplied by some value of \(\sigma^2\) that we wish to evaluate).

We use can use **Eq 2** to write a function in R that will allow us to compare likelihoods given some tree and and a \(\sigma^2\). Our function will take the following argumentsā¦

- the phylogenetic variance-covariance matrix
- the inverse of the vcv matrix
- \(\sigma^2\)
- the root state, and
- a vector of the tips values for the trait

`require(phytools)`

`## Loading required package: phytools`

`## Loading required package: ape`

`## Loading required package: maps`

```
##
## # maps v3.1: updated 'world': all lakes moved to separate new #
## # 'lakes' database. Type '?world' or 'news(package="maps")'. #
```

```
BMlk <- function(C, inv.C, sigmasq, root.state, data) {
N <- length(data); # the number of tips
EX <- rep(root.state, N) # creates a vector of the expected trait value - which under BM is the root state
V <- C * sigmasq; # multiply the entries in C by the BM rate
inv.V <- inv.C * sigmasq ^-1; # do the same for the inverted matrix using the inverse of the rate
lnlNum<- -0.5*(data - EX) %*% inv.V %*% (data - EX)
lnlDen<- log(sqrt((2*pi)^N*det(V)))
L<-lnlNum-lnlDen
return(L);
}
```

Note that we are going to have to do a little work before we can use our function. We need to pass it a vcv matrix and the inverse of the matrix, we need to pass in a root state value, and we need to pass a value of \(\sigma^2\). Letās try a fast example. We willā¦

- simulate a 5 taxon tree
- simulate BM on that tree with a known \(\sigma^2\) of 0.5
- compare the likelihood of the generating \(\sigma^2\) to a smaller \(\sigma^2\) of 0.005.

```
# we can check this works for a really simple example - let's simulate a tree and some data and try it out.
require(TreeSim);
```

`## Loading required package: TreeSim`

`## Loading required package: laser`

`## Loading required package: geiger`

```
require(phytools);
t <- 10 # total time
n <- 100 # total taxa
b <- (log(n) - log(2))/t
tree <- pbtree(b = b, n = n, t = t)
```

```
## simulating with both taxa-stop (n) and time-stop (t) is
## performed via rejection sampling & may be slow
##
## 99 trees rejected before finding a tree
```

```
data <- fastBM(tree, sig2 = 0.5);
plotTree(ladderize(tree))
```

```
#data
# we'll need the covariance matrix
c <- vcvPhylo(tree, anc.nodes = F);
#c
inv.c <- solve(c);
```

Lets compare the likelihoods of the generating sigma square and a much lower value

```
#inv.c
## now we'll compute the likelihood of the true value and a different value ##
BMlk(c, inv.c, sigmasq = 0.5, root.state = 0, data = data); # the true value for sigsq
```

```
## [,1]
## [1,] -150.7476
```

```
#BMlk(c, inv.c, sigmasq = 0.05, root.state = 0, data = data); #
BMlk(c, inv.c, sigmasq = 0.005, root.state = 0, data = data); #
```

```
## [,1]
## [1,] -5899.365
```

Remember when comparing likelihoods, the higher the number, the better the likelihood (because the likelihood of a hypothesis is proportional to the probability of the data given that the hypothesis is true). **Smaller** (more negative) log likelihoods are also smaller likelihoods (raise \(e\) to the power of the competing likelihoods to check this) so we can see that our data prefers the the generating \(\sigma^2\) value to the alternative.

We can also look at the likelihood surface around the generating value

```
vals <- rep(0,100)
sigmas <- 1:100/100
thetas <- 1:100/1000
for(ii in 1:100){
vals[ii] <- BMlk(c, inv.c, sigmasq = sigmas[ii], root.state = sigmas[ii], data = data)
}
vals
```

```
## [1] -2913.9403 -1438.7831 -955.7866 -718.5371 -578.7152 -487.1794
## [7] -422.9934 -375.7506 -339.7030 -311.4221 -288.7390 -270.2161
## [13] -254.8642 -241.9808 -231.0539 -221.7016 -213.6338 -206.6262
## [19] -200.5028 -195.1237 -190.3762 -186.1688 -182.4263 -179.0865
## [25] -176.0975 -173.4156 -171.0037 -168.8305 -166.8689 -165.0957
## [31] -163.4909 -162.0369 -160.7186 -159.5225 -158.4370 -157.4516
## [37] -156.5572 -155.7455 -155.0093 -154.3421 -153.7381 -153.1920
## [43] -152.6992 -152.2552 -151.8564 -151.4992 -151.1804 -150.8972
## [49] -150.6469 -150.4271 -150.2357 -150.0705 -149.9299 -149.8120
## [55] -149.7154 -149.6385 -149.5801 -149.5389 -149.5138 -149.5037
## [61] -149.5077 -149.5249 -149.5545 -149.5956 -149.6475 -149.7096
## [67] -149.7812 -149.8618 -149.9508 -150.0476 -150.1519 -150.2631
## [73] -150.3808 -150.5047 -150.6344 -150.7695 -150.9097 -151.0547
## [79] -151.2042 -151.3579 -151.5157 -151.6771 -151.8421 -152.0104
## [85] -152.1818 -152.3562 -152.5332 -152.7128 -152.8948 -153.0791
## [91] -153.2655 -153.4539 -153.6441 -153.8360 -154.0296 -154.2247
## [97] -154.4211 -154.6189 -154.8179 -155.0180
```

```
#filled.contour(sigmas, sigmas, vals)
plot(sigmas, vals,type = "l", ylab = "ln(L)")
```

`plot(sigmas,vals, xlim = c(0.2, 0.8), type = "l", ylab = "ln(L)")`

```
# BMlk(c, inv.c, sigmasq = 0.5, root.state = 0, data = data); # the true value for sigsq
# #BMlk(c, inv.c, sigmasq = 0.05, root.state = 0, data = data); #
# BMlk(c, inv.c, sigmasq = 0.49, root.state = 0, data = data); #
# BMlk(c, inv.c, sigmasq = 0.48, root.state = 0, data = data); #
```

The likelihood surface is quite flat around the generating value!

Another look at contrasts andBM rateā¦

```
library(phytools)
## simulate a coalescent shaped tree
tree<-rcoal(n=50)
plot(tree,ftype="off")
```

`## Warning in plot.window(...): "ftype" is not a graphical parameter`

`## Warning in plot.xy(xy, type, ...): "ftype" is not a graphical parameter`

`## Warning in title(...): "ftype" is not a graphical parameter`