Comparative Sequencing of Entamoeba Spp.

This project is intended to close the E. histolytica genome, perform comparative sequencing of E. invadens and E. dispar, and identify pathogenicity determinants and host-interacting genes. It will cover twenty-five megabases spread across three genomes.

Amebiasis, caused by the parasitic protist Entamoeba histolytica, is one of the most important human parasitic diseases worldwide [1,2], causing between 40,000 and 100,000 deaths annually. Analysis of the sequence data from the recently completed genome project revealed much about the basic biology and pathogenesis of the organism and has been a useful resource for the community. However, the extreme repetitiveness and high AT content of the genome resulted in an incomplete draft assembly. Here we propose to increase the value of the E. histolytica genome sequence by comparative analysis with sequence data from the non-pathogenic species Entamoeba dispar and the reptilian parasite Entamoeba invadens. We also intend to significantly improve the E. histolytica genome assembly by generating a complete physical map and closing the remaining gaps in the genome.

Project Information

Despite advances in health care over the past century, diarrheal diseases continue to exert a staggering disease burden worldwide. In Bangladesh, one in every 30 children dies of diarrhea or dysentery by his or her fifth birthday. A prospective study of a cohort of Bangladeshi preschool children found that the occurrence of E. histolytica infection and amebic dysentery was 55% and 4%, respectively [3]. Amebiasis is seen predominantly in developing countries, where a high prevalence of infection is due to fecal contamination of food and water supplies, factors that cannot be immediately remedied because of limited financial resources in these countries. There is no vaccine against amebiasis, and although it can be treated with nitroimidazole, side effects are common and resistance has been observed in other protozoa [4]. E. histolytica has been listed by the NIAID as a category B priority pathogen due to its low infectious dose and potential for dissemination through compromised food and water supplies in the United States.

The E. histolytica genome sequencing project has led to numerous publications that use the genome sequence data [5-10]. The sequence has already provided a better understanding of the diversity of host-interacting molecules, parasite metabolism and evolution (Loftus et al. unpublished). However, there are limits to what we can glean from the sequence as it stands, for several reasons. First, the E. histolytica genome is incomplete and is known to contain unresolvable misassemblies, resulting in missed genes and the absence of a defined genome structure. Second, there is a lack of sequence from related species to guide gene prediction. Third, many of the predicted genes have unknown functions, and without complete comparative data from the related model species the roles of these genes are difficult to predict through bioinformatics or to test in the lab.

The goal of this project is to use comparative sequencing of two additional Entamoeba species to identify rapidly evolving genes and species-specific gene families. In addition, the sequence data will provide a resource for improving the annotation of the current E. histolytica genome. Finally, to make full use of the genomic information generated, we intend to close the E. histolytica genome by creating a high-resolution physical map [11] and, once the genome assembly has been pinned to the map, performing directed sequencing to close gaps in the genome sequence. Completing the genome in this way will allow us to identify complete biochemical pathways and so-called ‘missing genes’, and will also provide a picture of spatial genome structure that is currently absent. The complete E. histolytica genome will also serve as a reference genome for the comparative analysis with E. invadens and E. dispar.


This project, which started in March 2005, has been split into two parts: the closure of the E. histolytica genome and comparative shotgun sequencing of E. dispar and E. invadens.

Entamoeba dispar progress by library

Entamoeba histolytica progress by library

Entamoeba invadens progress by library

Data produced from deposits in the NCBI Trace Archive.



  1. WHO/PAHO/UNESCO. Report of a consultation of experts on amoebiasis. Weekly Epidemiological Record 72, 97-99 (1997).
  2. Walsh, J.A. Amebiasis in the world. Arch Invest Med (Mex) 17 Suppl 1, 385-9 (1986).
  3. Haque, R. et al. Innate and acquired resistance to amebiasis in Bangladeshi children. J Infect Dis 186, 547-52 (2002).
  4. Upcroft, P. & Upcroft, J.A. Drug targets and mechanisms of resistance in the anaerobic protozoa. Clin Microbiol Rev 14, 150-64 (2001).
  5. Beck, D.L. et al. Entamoeba histolytica: sequence conservation of the Gal/GalNAc lectin from clinical isolates. Exp Parasitol 101, 157-63 (2002).
  6. Bruchhaus, I., Loftus, B.J., Hall, N. & Tannich, E. The intestinal protozoan parasite Entamoeba histolytica contains 20 cysteine protease genes, of which only a small subset is expressed during in vitro cultivation. Eukaryot Cell 2, 501-9 (2003).
  7. Cheng, X.J. et al. Intermediate subunit of the Gal/GalNAc lectin of Entamoeba histolytica is a member of a gene family containing multiple CXXC sequence motifs. Infect Immun 69, 5892-8 (2001).
  8. Nixon, J.E. et al. Evidence for lateral transfer of genes encoding ferredoxins, nitroreductases, NADH oxidase, and alcohol dehydrogenase 3 from anaerobic prokaryotes to Giardia lamblia and Entamoeba histolytica. Eukaryot Cell 1, 181-90 (2002).
  9. Nixon, J.E. et al. Iron-dependent hydrogenases of Entamoeba histolytica and Giardia lamblia: activity of the recombinant entamoebic enzyme and evidence for lateral gene transfer. Biol Bull 204, 1-9 (2003).
  10. Van Dellen, K., Field, J., Wang, Z., Loftus, B. & Samuelson, J. LINEs and SINE-like elements of the protist Entamoeba histolytica. Gene 297, 229-39 (2002).
  11. Dear, P.H. & Cook, P.R. Happy mapping: linkage mapping using a physical analogue of meiosis. Nucleic Acids Res 21, 13-20 (1993).
  12. Brumpt, E. Étude sommaire de l'“Entamoeba dispar” n. sp. Amibe à kystes quadrinucléés, parasite de l'homme. Bull. Acad. Med. Paris 94, 943 (1925).
  13. Sargeaunt, P.G., Williams, J.E. & Grene, J.D. The differentiation of invasive and non-invasive Entamoeba histolytica by isoenzyme electrophoresis. Trans R Soc Trop Med Hyg 72, 519-21 (1978).
  14. Willhoeft, U., Hamann, L. & Tannich, E. A DNA sequence corresponding to the gene encoding cysteine proteinase 5 in Entamoeba histolytica is present and positionally conserved but highly degenerated in Entamoeba dispar. Infect Immun 67, 5925-9 (1999).
  15. Jacobs, T., Bruchhaus, I., Dandekar, T., Tannich, E. & Leippe, M. Isolation and molecular characterization of a surface-bound proteinase of Entamoeba histolytica. Mol Microbiol 27, 269-76 (1998).
  16. Pillai, D.R., Kobayashi, S. & Kain, K.C. Entamoeba dispar: molecular characterization of the galactose/N-acetyl-d-galactosamine lectin. Exp Parasitol 99, 226-34 (2001).
  17. Wang, Z. et al. Gene discovery in the Entamoeba invadens genome. Mol Biochem Parasitol 129, 23-31 (2003).
  18. Donaldson, M., Heyneman, D., Dempster, R. & Garcia, L. Epizootic of fatal amebiasis among exhibited snakes: epidemiologic, pathologic, and chemotherapeutic considerations. Am J Vet Res 36, 807-17 (1975).


Barbara J. Mann
Depts. of Internal Medicine, Microbiology, University of Virginia Health System
