Analysis and Classification of Human Herpesviridae Proteins Using Domain-Architecture Aware Inference of Orthologs (DAIO)

Herpesviridae are a large and diverse family of dsDNA viruses that have been implicated in numerous animal and human diseases. Mammalian Herpesviridae are divided into three subfamilies: Alphaherpesvirinae, Betaherpesvirinae, and Gammaherpesvirinae. In contrast to many other human viruses, Herpesviridae have a long evolutionary history. The split into the three subfamilies likely occurred before, or around, the time placental mammals appeared (approximately 180 million to 220 million years ago). This long history makes herpesviruses an attractive model for studying viral genome evolution at the levels of gene duplication and protein domain rearrangement.

In this work, we developed a computational approach called Domain-architecture Aware Inference of Orthologs (DAIO), which analyzes entire genomes by combining phylogenetic and protein domain architecture information. Using this approach, we performed a systematic phylogenetic and protein domain architecture-based study encompassing the entire proteomes of all human Herpesviridae, as well as of selected non-human herpesviruses, to define Strict Ortholog Groups (SOGs). Besides assessing the taxonomic distribution of each herpesvirus protein, we computationally inferred gene duplication events and performed a comparative protein domain architecture analysis for every protein family.
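The idea of combining the two kinds of evidence can be illustrated with a minimal sketch. It is not the DAIO implementation; it assumes, for illustration only, that each protein has already been assigned to an orthologous clade by a separate gene-tree analysis and annotated with an ordered list of domains, and that a strict group requires agreement on both. The names `Protein`, `clade`, `domains`, and `strict_ortholog_groups` are hypothetical, and the toy identifiers are placeholders rather than real herpesvirus proteins.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class Protein(NamedTuple):
    name: str                 # placeholder identifier, not a real protein
    clade: str                # assumed: orthologous clade label from a gene tree
    domains: Tuple[str, ...]  # assumed: domains in N- to C-terminal order


def strict_ortholog_groups(proteins):
    """Group proteins that share both a clade label and a domain architecture.

    Proteins in the same clade but with different domain architectures
    end up in different groups.
    """
    groups = defaultdict(list)
    for p in proteins:
        groups[(p.clade, p.domains)].append(p.name)
    return dict(groups)


if __name__ == "__main__":
    toy = [
        Protein("p1", "c1", ("D1",)),
        Protein("p2", "c1", ("D1",)),
        Protein("p3", "c1", ("D1", "D2")),  # same clade, extra domain
        Protein("p4", "c2", ("D1",)),       # same architecture, other clade
    ]
    for key, members in strict_ortholog_groups(toy).items():
        print(key, members)
```

In this toy input, `p3` is separated from `p1` and `p2` because it carries an additional domain, which mirrors the kind of domain gain the study looks for within a protein family.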

The results indicate that while many herpesvirus proteins evolved without any detectable gene duplication or domain rearrangement events, numerous herpesvirus protein families do exhibit relatively complex evolutionary histories. Some acquired additional domains during evolution (e.g., DNA polymerase, mRNA export factor), whereas others show a combination of domain rearrangements and gene duplications (e.g., G-protein coupled receptor homologs, US22 domain proteins), and thus approach the complexity of eukaryotic protein family evolution. We expect this classification to serve as a stepping stone towards experimental investigations of domain functions and towards experimental reconstructions of minimal genomes.

Funding

This work is funded by the National Institute of Allergy and Infectious Diseases (NIH/DHHS) under contract no. HHSN272201400028C.
