Abstract (deu)
Genes can be grouped into gene families, which are defined by homology and therefore suggest that their members evolved from a common ancestral gene. In this context, the size of a gene family is subject to evolutionary change.
We model gene duplications and deletions using a birth-and-death process. Based on the number of gene copies in a set of extant species, we apply a maximum likelihood approach to infer the birth rate and the death rate, which correspond to duplications and deletions, respectively. Furthermore, we estimate the number of gene copies in the most recent common ancestor.
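As an illustration only: the abstract does not state which birth-and-death process is used, so the following assumes the common linear form, in which each gene copy duplicates at rate \lambda and is deleted at rate \mu independently of the others. The symbols \lambda, \mu and P_n(t) (the probability of observing n copies at time t) are notation introduced here, not taken from the source. Under that assumption the copy number evolves according to:

```latex
\frac{\mathrm{d}P_n(t)}{\mathrm{d}t}
  = \lambda\,(n-1)\,P_{n-1}(t)
  + \mu\,(n+1)\,P_{n+1}(t)
  - (\lambda+\mu)\,n\,P_n(t),
  \qquad n \ge 0,\quad P_{-1}(t) \equiv 0 .
```

In this form a family with zero copies stays at zero, so n = 0 is an absorbing state. A likelihood for the observed copy numbers would then be built from the transition probabilities of this process along the branches of the species tree.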
To validate this strategy, we performed simulation studies. Assuming a fixed number of gene copies for the ancestor and specific rates for duplication and deletion of genes, we simulated the evolution of the number of gene copies along a phylogenetic tree. Using our maximum likelihood framework, we subsequently estimated the rates and the ancestral copy number used for the simulation. A collection of different simulation studies showed that the maximum likelihood approach recovers the given parameters well.
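The simulation step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: it assumes a linear birth-and-death process with per-copy rates, simulates it with the Gillespie algorithm, and represents the tree as hypothetical nested tuples `(name, branch_length, children)`. All function names, the tree layout, and the parameter values are assumptions made for the example.

```python
import random


def simulate_branch(n0, birth_rate, death_rate, branch_length, rng):
    """Simulate the copy number along one branch (Gillespie algorithm).

    Assumes a linear process: every gene copy duplicates at `birth_rate`
    and is deleted at `death_rate`, independently of the other copies.
    """
    n, t = n0, 0.0
    per_copy_rate = birth_rate + death_rate
    while n > 0 and per_copy_rate > 0:
        # Waiting time to the next event is exponential in the total rate.
        t += rng.expovariate(n * per_copy_rate)
        if t >= branch_length:
            break
        # Choose duplication or deletion in proportion to the rates.
        if rng.random() < birth_rate / per_copy_rate:
            n += 1
        else:
            n -= 1
    return n


def simulate_tree(node, n_parent, birth_rate, death_rate, rng, counts=None):
    """Propagate copy numbers from the root to the leaves.

    `node` is a hypothetical (name, branch_length, children) tuple.
    Returns a dict mapping each leaf name to its simulated copy number.
    """
    if counts is None:
        counts = {}
    name, branch_length, children = node
    n = simulate_branch(n_parent, birth_rate, death_rate, branch_length, rng)
    if not children:
        counts[name] = n
    for child in children:
        simulate_tree(child, n, birth_rate, death_rate, rng, counts)
    return counts


if __name__ == "__main__":
    # Toy three-species tree; branch lengths and rates are arbitrary.
    toy_tree = ("root", 0.0, [
        ("A", 2.0, []),
        ("inner", 1.0, [("B", 1.0, []), ("C", 1.0, [])]),
    ])
    rng = random.Random(1)
    print(simulate_tree(toy_tree, 5, 0.2, 0.1, rng))
```

Repeating such a simulation for many gene families yields leaf copy numbers with known rates and ancestral size, against which the estimates of an inference method can be compared.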
We further applied our method to biological gene family data from vertebrates, taken from the Inparanoid and Ensembl databases. Compared to previously reported rates, our estimates are about one order of magnitude lower. The data were also examined with regard to model violations, since large-scale duplications, such as whole-genome duplications, are assumed to have occurred during evolution. Accordingly, extensions of our method and future work are discussed.