Tuesday, August 27, 2013

The roots of genetics lie in the peptide bond!

Proteins sustain life on our planet, from major biogeochemical cycles necessary for planetary stability to crucial signaling in brain activities important for cognition and behavior. Their misfolding results in aggregation and diseases such as Alzheimer’s or Creutzfeldt-Jakob disease. Their disruption and deregulation cause pathogenesis and cancer. Despite their importance, how this sophisticated machinery was selected to carry out biological functions, the rationale for molecular change, the mysterious origin of the ‘vocabulary’ that shapes genetics (the genetic code) and the evolutionary drivers of protein structure have yet to be uncovered. These are important gaps in biological knowledge that need to be urgently addressed. In a remarkable breakthrough published in PLoS ONE [1], we reveal that the fundamental molecular principle lies conspicuously not in the nucleic acids but in the chemical bonds of proteins. We uncover a new and more primitive code in pairs of amino acid constituents of proteins that enable protein folding and flexibility. These dipeptides were initially produced by archaic synthetases that with time transformed into a yin-yang of modern aminoacyl-tRNA synthetases, the modern safekeepers of the genetic code. The new structural code that we have uncovered appears responsible for molecular innovations. This changes the focus of molecular biology, from replicators and genetics to molecular dynamics, emergence and the chemistries of function.
  1. Caetano-Anollés G, Wang M, Caetano-Anollés D (2013) Structural phylogenomics retrodicts the origin of the genetic code and uncovers the evolutionary impact of protein flexibility. PLoS ONE 8(8): e72225. doi:10.1371/journal.pone.0072225
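
To make the idea of a code in "pairs of amino acid constituents" concrete, here is a minimal Python sketch that tallies the overlapping dipeptides (adjacent residue pairs) in a protein sequence. It does not reproduce the analysis in [1]; the function names and the example sequence are made up for illustration, and the choice to skip non-standard residue symbols is an assumption.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def dipeptide_counts(sequence):
    """Count overlapping dipeptides (adjacent residue pairs) in one sequence."""
    seq = sequence.upper()
    counts = Counter()
    for first, second in zip(seq, seq[1:]):
        # Assumption: pairs containing ambiguous or non-standard symbols
        # (e.g. X, B, Z) are skipped rather than counted.
        if first in AMINO_ACIDS and second in AMINO_ACIDS:
            counts[first + second] += 1
    return counts


def dipeptide_frequencies(sequence):
    """Return the relative frequency of each dipeptide found in the sequence."""
    counts = dipeptide_counts(sequence)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    example = "MKVLAAGIVKV"  # hypothetical sequence, for illustration only
    for pair, n in dipeptide_counts(example).most_common(3):
        print(pair, n)
```

A sequence of n standard residues yields n - 1 overlapping dipeptides, drawn from 400 possible pairs (20 x 20).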

Thursday, June 6, 2013

The origin of viruses revealed!

Explaining the origin of viruses remains an important challenge for evolutionary biology. Previous explanatory frameworks described viruses as founders of cellular life, as parasitic reductive products of ancient cellular organisms or as escapees of modern genomes [1]. Each of these frameworks endows viruses with distinct molecular, cellular, dynamic and emergent properties that carry broad and important implications for many disciplines, including biology, ecology and epidemiology. A recent genome-wide structural phylogenomic analysis shows that large-to-medium-sized viruses coevolved with cellular ancestors and followed a reductive evolutionary route [2].
  1. Nasir et al. (2012) Viral evolution: Primordial cellular origins and late adaptation to parasitism. Mob. Genet. Elements 2(5): 247-252.
  2. Nasir et al. (2012) Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya. BMC Evol. Biol. 12: 156.


The origin of the ribosome is in mechanics, not protein synthesis

The origin and evolution of modern biochemistry is a complex problem that has puzzled scientists for almost a century. While comparative, functional and structural genomics have unraveled considerable complexity at the molecular level, there is very little understanding of the origin, evolution and structure of the molecules responsible for cellular or viral features in life. The ribosome is the most central macromolecular complex of the cell. It is responsible for protein synthesis, and its biosynthetic functions set cells apart from viruses. The origin of this ensemble is mysterious. Proponents of the ancient 'RNA world' postulate that the ribosome was originally an RNA enzyme (a ribozyme) that was responsible for genetics. A recent paper by Harish and Caetano-Anollés (PLoS ONE 7 (3): e32776, 2012) challenges this scenario and the possible existence of an RNA world. Deep historical signal was retrieved from a census of molecular structures and functions in thousands of nucleic acid and protein structures and hundreds of genomes using powerful phylogenomic methods. Together with structural, chemical and cell biology considerations, this information reveals that the ribosome is the result of the gradual and coordinated evolutionary appearance of molecular parts of RNA and ribosomal proteins. These coevolutionary patterns comply with the principle of continuity and falsify the existence of an ancient RNA world. Instead, they are compatible with a model of gradually co-evolving nucleic acids and proteins in interaction with increasingly complex cofactors, lipid membrane structures and other cellular components (Caetano-Anollés et al., J. Mol. Evol. 74 (1-2): 1-34, 2012). This changes the perception we have of the rise of modern biochemistry and prompts further analysis of the emergence of biological complexity in an ever-expanding coevolving world of macromolecules (Caetano-Anollés and Seufferheld, J. Mol. Microbiol. Biotechnol. 23 (1-2): 152-177, 2013).

Monday, August 22, 2011

Unraveling LUCA

The Last Universal Common Ancestor (LUCA) is the primordial organism that gave rise to diversified life. Its makeup is embedded in all the genomes that exist today on Earth. A paper by Kyung Mo Kim and Gustavo Caetano-Anollés, "The proteomic complexity and rise of the primordial ancestor of diversified life", was recently published in BMC Evolutionary Biology (11:140, 2011). The study describes the proteome of LUCA and gives an account of the structures and functions that are associated with this ancient cellular organism, which populated our world about 3 billion years ago.

Origins of translation and cellular life

A paper by Derek Caetano-Anollés, Kyung Mo Kim, Jay E. Mittenthal and Gustavo Caetano-Anollés, "Proteome evolution and the metabolic origins of translation and cellular life", was recently published in the Journal of Molecular Evolution (72: 14-32, 2011). The paper describes phylogenomic efforts to unravel the history of the translation apparatus using information in the structure of protein domains at the fold family level of structural abstraction. Results suggest translation started with metabolic rather than biosynthetic roles and its emergence was linked to the appearance of aminoacyl-tRNA synthetases and translation factors. The ribosomal machinery was a later addition. It appeared together with domains needed for the specificity of the genetic code.


Monday, June 28, 2010

The catalytic origin of modern molecular functions inferred from phylogenomic analysis of ontological data

A paper by Kyung Mo Kim and Gustavo Caetano-Anollés that describes the origin and evolution of molecular functions in biology was published today in Molecular Biology and Evolution (27: 1710-1733, 2010).

The biological processes that characterize the phenotypes of a living system are embodied in the function of molecules and hold the key to evolutionary history, delimiting natural selection and change [1]. They provide direct insight into the emergence, development, and organization of cellular life. However, molecular functions make up a network-like hierarchy of relationships that tell little of evolutionary links between structure and function. For example, Gene Ontology terms represent widely used vocabularies of processes and functions with evolutionary relationships that are implicit but not defined [2]. This new publication uncovers patterns of global evolutionary history in ontological terms associated with the sequences of 38 genomes. These patterns unfold the metabolic origins of molecular functions and major biological transitions that are evident in the evolutionary progression toward complex life [3]. Phylogenies reveal the primordial appearance of hydrolases, transferases, and other enzymatic activities, indicating that ancient catalysts were crucial for binding and transport, the emergence of nucleic acids and protein biopolymers, and the communication of primordial cells with the environment. ATPase, GTPase, and helicase activities were the most ancient molecular functions at lower hierarchical levels of ontological complexity. Furthermore, the history of biological processes showed that cellular biopolymer metabolic processes preceded biopolymer biosynthesis and essential processes related to macromolecular formation, energy generation, and signaling.

The phylogenomic approach that is described takes the structure and function paradigm to a completely new level of abstraction, demonstrating a ‘metabolic first’ origin of life and the progressive development of protein biosynthetic machinery, transport systems, and regulation. The fact that chemosynthesis precedes biosynthesis is remarkable and challenges the existence of an ancient RNA world [4]. Phylogenetic statements are reliable, especially because they are congruent with progressive evolutionary change and phylogenomic inferences derived from protein structure [5]. Ultimately, the procedure uncovers patterns in the morphing of function that are unprecedented and necessary for systematic views in biology.

  1. Darwin, C.R. 1859. On the origin of species by means of natural selection. Murray, London.
  2. Ashburner, M., Ball, C.A., Blake, J.A. et al. 2000. Gene Ontology: tool for the unification of biology. Nat Genet 25: 25-29.
  3. Szathmáry, E., Maynard Smith, J. 1995. The major evolutionary transitions. Nature 374: 227-232.
  4. Gesteland, R.F., Atkins, J.F. 1993. The RNA world. Cold Spring Harbor Press, New York.
  5. Caetano-Anollés, G., Kim, H.S., Mittenthal, J.E. 2007. The origin of modern metabolic networks inferred from phylogenomic analysis of protein architecture. Proc Natl Acad Sci USA 104: 9358-9363.

Saturday, April 24, 2010

The ancient history of the structure of ribonuclease P and the early origins of Archaea

A new paper by Feng-Jie Sun and Gustavo Caetano-Anollés was published today in BMC Bioinformatics (11: 153, 2010) that describes the ancient history of ribonuclease P. Ribonuclease P is an ancient endonuclease that cleaves precursor tRNA and generally consists of a catalytic RNA subunit (RPR) and one or more proteins (RPPs). It represents an important macromolecular complex and model system that is universally distributed in life. Its putative origins have inspired fundamental hypotheses, including the proposal of an ancient RNA world.

To study the evolution of this complex, rooted phylogenetic trees of RPR molecules and substructures were constructed and RPP age was estimated using a cladistic method that embeds structure directly into phylogenetic analysis. The general approach was used previously to study the evolution of tRNA, SINE RNA and 5S rRNA, the origins of metabolism, and the evolution and complexity of the protein world, and here it revealed remarkable evolutionary patterns. Trees of molecules uncovered the tripartite nature of life and the early origin of archaeal RPRs. Trees of substructures showed that molecules originated in stem P12 and were accessorized with a catalytic P1-P4 core structure before the first substructure was lost in Archaea. This core currently interacts with RPPs and ancient segments of the tRNA molecule. Finally, a census of protein domain structure in hundreds of genomes established that RPPs appeared after the rise of metabolic enzymes at the onset of the protein world.

The study provides a detailed account of the history and early diversification of a fundamental ribonucleoprotein and offers further evidence in support of the existence of a tripartite organismal world that originated by the segregation of archaeal lineages from an ancient community of primordial organisms.