(Copyright 1997, Ross Hardison)
The taxonomic relationship of mycoplasmas to other microbes has been controversial (Razin, 1992). Prior to the 1930's, mycoplasmas were considered to be viruses because they were so small that they passed through filters that blocked passage of ordinary bacteria. Later they were thought to be symbionts growing with the Streptobacillus bacteria, and then they were proposed to be ordinary bacteria that had lost their cell wall (L form bacteria). By the 1960's, both base composition and hybridization analysis of the genomic DNA showed that mycoplasmas were not related to stable L forms of ordinarily walled bacteria. However, current explanations for the evolution of mycoplasmas argue for "degenerative" evolution from walled bacteria; thus the induction of L forms may be a present-day recapitulation of one step in mycoplasma evolution.
Mycoplasmas are the smallest and simplest self-replicating organisms. Their genome sizes range from about 540 to 1300 kb, with a G+C content of 23-41 mol%. Although the small size of both the mycoplasma and its genome initially suggested that they were the most primitive extant organisms, nucleic acid hybridization and sequencing studies now indicate that they are derived from the gram-positive branch of walled eubacteria. Their evolution from these walled bacteria required a substantial reduction in genome size, including loss of the functions required for synthesis and maintenance of a bacterial cell wall.
A total of 92 species were known within the genus Mycoplasma in 1992, and several related genera, families and orders are now recognized within the class Mollicutes. The name "mycoplasma" was derived from the Greek words mykes, for fungus, and plasma, for something formed or molded. The class Mollicutes is the only class in the division Tenericutes (wall-less bacteria), which is at an equivalent taxonomic level to the gram-positive eubacteria, the gram-negative eubacteria, and the archaebacteria (together these comprised the four divisions of the kingdom Procaryotae in 1992). The term mycoplasma is widely used to refer to any member of the class Mollicutes. Phylogenetic data place the mycoplasmas as a monophyletic cluster within the gram-positive bacteria, specifically the family Bacillaceae, which includes the genera Clostridium, Lactobacillus, Bacillus, and Streptococcus (Maniloff, 1983; Maniloff, 1992). If mycoplasmas and the Lactobacillus group do share a common ancestor, then it is likely that mycoplasma genomes evolved by "attrition" or "degenerative evolution" from an organism with a genome of about 2,200 to 2,500 kb (current sizes of genomes in the Lactobacillus group) (Maniloff, 1992). Thus the independent taxonomic status of mycoplasmas remains at best controversial. Mycoplasmas are evolving more rapidly than are eubacteria (Maniloff, 1992). The smallest mycoplasmal genomes, of about 600 kb, may represent the minimal genetic complexity for a living organism, since this is the minimal size reached on at least three independent lines of mycoplasmas (Maniloff, 1992).
"Mycoplasmas usually exhibit a rather strict host and tissue specificity, probably reflecting their nutritionally exacting nature and obligate parasite mode of life." (Razin, 1992). M. pneumoniae was long regarded as a pathogen strictly restricted to the tracheal epithelium of humans, but it has been recovered from joints of immunocompromised patients and other tissues even in immunocompetent patients. M. genitalium was first isolated from the urethral discharge of two men with non-gonococcal urethritis (NGU) and was actively studied as a possible pathogen. However, subsequent efforts to isolate more M. genitalium strains from the genital tract have not been successful (at least up until 1992). M. genitalium has been found in several throat isolates in a mixture with M. pneumoniae. "The two organisms share genomic sequences and epitopes." M. genitalium is a "fastidious" bacterium, which prevents its primary cultivation. (So maybe it is not surprising that investigators have not been able to culture it from genitals or from throat very easily). Application of PCR does show M. genitalium in about 10 of 150 genital tract specimens from 100 patients (with NGU?).
Earlier characterization of mycoplasmal genomes gave a size and G+C content of about 840 kb and 41 mol% for M. pneumoniae (the highest G+C value known for mycoplasmas) and of about 590 kb and 32.5 mol% for M. genitalium (Herrmann, 1992). There are numerous repetitive DNA sequences, but these do not lead to substantial genome instability. This contrasts with the "observation that an unusually high number of spontaneously arising mutants can be isolated."
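When a genome sequence is available, a G+C content such as the 41 mol% and 32.5 mol% values quoted above is simply the share of G and C among all bases. A minimal sketch of that calculation in Python follows, assuming the sequence is held as a plain string; the function name gc_content_percent and the toy sequence are illustrative only and are not taken from the cited studies.

```python
def gc_content_percent(sequence: str) -> float:
    """Return G+C content as a percentage of the unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored (an assumption made for this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


# Toy sequence, made up for illustration: 4 G/C bases out of 12.
toy_sequence = "ATGCATTTAAGC"
print(f"G+C content: {gc_content_percent(toy_sequence):.1f}%")
```

Applied to a whole genome read from a sequence file, the same function would give the genome-wide value; here it is only exercised on a toy string.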
Mycoplasmal repeated DNA sequences include both multiple copies of protein-coding regions, such as that for portions of the P1 adhesin of M. pneumoniae and M. genitalium, and insertion sequences such as RS-1, which belongs to the IS3 class of insertion sequences initially characterized in E. coli (McIntosh et al., 1992).
Mycoplasma pneumoniae is the leading cause of pneumonia in older children and young adults (Krause and Taylor-Robinson, 1992). M. pneumoniae causes tracheobronchitis and primary atypical pneumonia. The chronic, flu-like nature of the mycoplasma-caused disease, contrasting with the abrupt, rigorous onset of most bacterial pneumonias, has led to its nickname "walking pneumonia". M. pneumoniae has been isolated from the oropharynx and lower respiratory tract of infected humans. It clearly is a pathogen, in contrast to several other mycoplasmal species that appear to be part of the normal flora. "M. pneumoniae can catabolize glucose or mannose but not utilize arginine as a carbon and energy source. It is capable of reduction reactions under both anaerobic or aerobic conditions, and both hydrogen peroxide and the superoxide anion are by-products of glucose metabolism. It lacks many common enzyme systems that use iron as a cofactor, such as tricarboxylic acid cycle enzymes or a complete electron transfer chain containing cytochromes. Iron has been detected in the membrane, and mycoplasmas are thought to use a truncated electron transport system to generate energy."
Ureaplasma urealyticum is a mycoplasmal species that can cause non-gonococcal urethritis (NGU) in men and possibly in women (Krause and Taylor-Robinson, 1992). It can also cause respiratory disease in newborns. The ureaplasmas are "unique in their ability to metabolize urea through the enzyme urease."
Mycoplasma genitalium was initially recovered from the urethra of males with NGU. Subsequently, it has been found in specimens from the respiratory tract. Its possible role in urogenital disease remains poorly defined (Krause and Taylor-Robinson, 1992).
The complete sequence of the M. genitalium genome (Fraser et al., 1995) shows that it is a circular duplex DNA of 580,070 bp. The presumptive origin of replication is in an A+T rich region between dnaA and dnaN. A total of 470 open reading frames (ORFs) were identified, with about half the genome transcribed predominantly in one direction from the origin and the other half transcribed predominantly in the opposite direction. Thus this genome shows a symmetrical strand bias for transcription reminiscent of some circular viral genomes. Of the 470 ORFs, 374 were identified by sequence matches to entries in a non-redundant bacterial protein database or in the complete set of translated sequences from the Haemophilus influenzae genome, or by GeneMark. The remaining 96 ORFs have no sequence matches in the databases. The predicted coding regions that could be compared with both Escherichia coli and Bacillus subtilis showed higher similarity to the B. subtilis sequence, strongly supporting the deduced evolutionary relationship between Mycoplasma and the Lactobacillus-Clostridium branch of the gram-positive eubacteria.
The 470 ORFs in M. genitalium show an average size of 1040 bp and comprise 88% of the genome, giving an average of one gene every 1235 bp. This is similar to the gene size and density seen with the 1727 predicted coding regions in H. influenzae, with an average gene size of 900 bp comprising 85% of the genome at a density of one gene every 1042 bp. Thus the reduction in genome size for M. genitalium did not result from an increase in gene density or a reduction in gene size. One major factor in reducing the genome size of this parasitic microbe is the substantial loss of genes encoding enzymes needed for many anabolic pathways, in particular biosynthesis of amino acids, purines and pyrimidines, and fatty acids. Also, the mycoplasmas have lost the capacity to make a cell wall.
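The gene-density figure for M. genitalium follows directly from the genome length and the ORF count. As a quick check, sketched here in Python using only the numbers reported by Fraser et al. (1995) as quoted in this text (the helper name bp_per_gene is illustrative):

```python
def bp_per_gene(genome_bp: int, n_genes: int) -> float:
    """Average genome length per gene: total base pairs divided by gene count."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return genome_bp / n_genes


# Values quoted in the text for M. genitalium (Fraser et al., 1995).
M_GENITALIUM_GENOME_BP = 580_070
M_GENITALIUM_ORFS = 470

spacing = bp_per_gene(M_GENITALIUM_GENOME_BP, M_GENITALIUM_ORFS)
print(f"M. genitalium: one gene every {spacing:.0f} bp on average")
```

The result is roughly 1234 bp per gene, in line with the figure of about one gene every 1235 bp given above.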
Not surprisingly for a parasitic organism that must acquire most of its cellular building blocks from its host, a substantial number of transport proteins are encoded, with a fraction of total genes in this category similar to that of H. influenzae. As expected, genes encoding enzymes needed for replication and repair of DNA, transcription, translation and "cellular processes" such as cell division, cell killing, and protein secretion are present, again at a fraction of the total genes similar to (or greater than) that of H. influenzae. Catabolic metabolism to produce energy is largely anaerobic, with an intact glycolytic pathway but no tricarboxylic acid cycle or electron transport system (Fraser et al., 1995).
It has been argued that the "degenerative" mode of evolution of M. genitalium by reduction in genome size from that of an ancestral gram-positive bacterium has led to the minimal set of genes needed for a living organism. One might expect that the "necessary" genes that are retained will provide a similar function and hence be conserved in other bacteria or other organisms. If so, then the removal of "unnecessary genes" will increase the fraction of proteins conserved in other species. The similar fraction of total genes devoted to replication, transcription and other retained processes in both H. influenzae and M. genitalium (Fraser et al., 1995) shows that the complement of proteins devoted to these tasks has been reduced by about the same proportion as the gene set as a whole in M. genitalium. By searching for "bacterial conserved regions" (or BCRs) conserved in distantly related bacteria and for "ancient conserved regions" (or ACRs) shared with eukaryotic or archaeal homologs, Koonin et al. (1996) showed that the fraction of proteins containing BCRs and ACRs is nearly the same in E. coli, H. influenzae, and M. genitalium. This argues against the model that only a core of critical genes is retained in M. genitalium and suggests instead that in all three bacteria there is a balance between highly conserved genes and more variable genes. This has been rationalized as possibly reflecting an equilibrium between the stability of major physiological processes and the need for environmental adaptability (Koonin et al., 1996).
The complete sequence of the M. pneumoniae genome (Himmelreich et al., 1996) shows that it is a circular duplex DNA of 816,394 bp. The presumptive origin of replication is in an A+T rich region between dnaA and dnaN. A total of 677 open reading frames (ORFs) and 39 genes coding for various RNAs were identified. Only 67 ORFs (9.9% of the total) had no significant similarity to sequences in the databases. The predicted coding regions that could be compared with both Escherichia coli and Bacillus subtilis showed higher similarity to the B. subtilis sequence, strongly supporting the deduced evolutionary relationship between Mycoplasma and the Lactobacillus-Clostridium branch of the gram-positive eubacteria.
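The share of ORFs without database matches can be compared between the two genomes with a one-line calculation. The sketch below, in Python, uses only the counts quoted in this text (67 of 677 ORFs for M. pneumoniae; 96 of 470 ORFs for M. genitalium, from Fraser et al., 1995); the function name percent_unmatched is illustrative.

```python
def percent_unmatched(unmatched_orfs: int, total_orfs: int) -> float:
    """Percentage of ORFs with no significant database match."""
    if total_orfs <= 0:
        raise ValueError("total_orfs must be positive")
    return 100.0 * unmatched_orfs / total_orfs


# (unmatched ORFs, total ORFs) as quoted in the text.
counts = {
    "M. pneumoniae": (67, 677),
    "M. genitalium": (96, 470),
}

for species, (unmatched, total) in counts.items():
    print(f"{species}: {percent_unmatched(unmatched, total):.1f}% of ORFs unmatched")
```

This gives about 9.9% for M. pneumoniae and about 20.4% for M. genitalium. The two analyses were done at different times against different databases, so the difference should be read with that in mind.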
Similar to the case for M. genitalium, one would expect the evolution of M. pneumoniae as a parasitic bacterium to allow the loss of biosynthetic pathways that were no longer needed, while retaining the transport proteins needed for acquisition of essential building blocks for macromolecular synthesis (e.g., amino acids, purines, pyrimidines, fatty acids) from the host. Indeed, the reductive evolution of M. pneumoniae from ancestral bacteria resulted in a smaller genome. The main causes of this reduction in genome size are (1) the loss of several anabolic pathways, such as biosynthesis of amino acids, purines and pyrimidines, and fatty acids, (2) the absence of a capacity to make a cell wall, and (3) a reduction of the number of proteins involved in essential processes such as DNA replication, repair, recombination, cell division, and protein secretion. A total of 44 transport proteins have been predicted from the genome sequence. This modest number, compared to the number of substrates expected to be transported, suggests that many of these transporters may not be highly specific.
The coding regions in M. pneumoniae comprise a total length of 724,174 bp, or 88.7% of the genome. The average size of a gene is 1011 bp, giving an average of one gene every 1140 bp. This is similar to the gene size and density seen with both the smaller M. genitalium genome and the H. influenzae genome, which is more than twice as large.
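The three summary figures for M. pneumoniae can be reproduced from the genome length, the total coding length, and the gene count. The Python sketch below assumes that the 724,174 bp total covers all 716 annotated genes (677 ORFs plus 39 RNA genes); that reading is an assumption, chosen because it reproduces the quoted averages. The class name GenomeSummary is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeSummary:
    genome_bp: int  # total genome length in base pairs
    coding_bp: int  # summed length of annotated genes
    n_genes: int    # number of annotated genes

    @property
    def coding_percent(self) -> float:
        """Share of the genome occupied by annotated genes, in percent."""
        return 100.0 * self.coding_bp / self.genome_bp

    @property
    def mean_gene_bp(self) -> float:
        """Average gene length in base pairs."""
        return self.coding_bp / self.n_genes

    @property
    def bp_per_gene(self) -> float:
        """Average genome length per gene (inverse of gene density)."""
        return self.genome_bp / self.n_genes


# Figures quoted in the text (Himmelreich et al., 1996).
# Assumption: the gene count is 677 ORFs + 39 RNA genes = 716.
m_pneumoniae = GenomeSummary(genome_bp=816_394, coding_bp=724_174, n_genes=677 + 39)

print(f"coding fraction: {m_pneumoniae.coding_percent:.1f}%")
print(f"mean gene size:  {m_pneumoniae.mean_gene_bp:.0f} bp")
print(f"one gene every:  {m_pneumoniae.bp_per_gene:.0f} bp")
```

Under that assumption the sketch gives about 88.7%, 1011 bp, and 1140 bp, matching the values stated above.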