The nucleocytoplasmic large DNA viruses (NCLDVs) are a diverse group that currently contain the largest known virions and genomes, also called giant viruses. The first giant virus was isolated and described nearly 20 years ago. Their genome sizes were larger than for any other known virus at the time and it contained a number of genes that had not been previously described in any virus. The origin and evolution of these unusually complex viruses has been puzzling, and various mechanisms have been put forward to explain how some NCLDVs could have reached genome sizes and coding capacity overlapping with those of cellular microbes. Here we critically discuss the evidence and arguments on this topic. We have also updated and systematically reanalysed protein families of the NCLDVs to further study their origin and evolution. Our analyses further highlight the small number of widely shared genes and extreme genomic plasticity among NCLDVs that are shaped via combinations of gene duplications, deletions, lateral gene transfers and de novo creation of protein-coding genes. The dramatic expansions of the genome size and protein-coding gene capacity characteristic of some NCLDVs is now increasingly understood to be driven by environmental factors rather than reflecting relationships to an ancient common ancestor among a hypothetical cellular lineage. Thus, the evolution of NCLDVs is writ large viral, and their origin, like all other viral lineages, remains unknown.
- 1182 Biokemia, solu- ja molekyylibiologia