We interviewed John Mattick a few years back as we were getting ready to launch EpiGenie, but couldn’t manage to get the transcripts all edited and ready for show time. So, when we ran across this interview in what was Invitrogen’s Quest magazine we were excited, then bummed because we didn’t run it first, then excited again because the folks up in Carlsbad were cool enough to let us run it. You can find a copy of John Mattick’s full interview in Life Technologies’ Quest Magazine (Issue 5, Volume 1), which is only available electronically these days.
Mattick’s work on how non-coding RNA and introns control the development and evolution of complex organisms is as groundbreaking as it is controversial. His discoveries link non-coding RNA with the evolution of complex organisms, biological diversity, and cognition, and tie the field of epigenetics to the roughly 99 percent of the genome that does not code for proteins.
John Mattick Interview
How are you involved in epigenetics research?
Mattick: For a long time it has been evident that the vast majority of genetic information expressed from the genome is conveyed by RNA, primarily non-coding RNA (ncRNA). So, there are two choices: either this RNA is functional or it’s not. Most scientists have assumed the latter. But I’m now convinced that this RNA forms a previously hidden layer of regulation that controls the settings to direct differentiation and development, primarily by controlling gene expression at multiple levels, all the way from chromatin structure through to translation and messenger RNA (mRNA) turnover. The evidence for this has been building steadily over the years. In the last few years, a number of studies have emerged indicating that genomes increase in size as their complexity increases. However, this increase is not in the number of conventional genes encoding proteins, which has stayed remarkably constant all the way from tiny worms to humans. All animals have very similar sets of proteins. Where they differ is in the amount of non-coding sequences in their genomes, which are in fact transcribed into RNA, apparently in a developmentally regulated way.
From a hierarchical standpoint, at the top of the list for the function of these RNAs is modulation of chromatin architecture and epigenetic memory, which is central to embryogenesis and brain development, including learning. There are a limited number of enzymes involved in chromatin modification: histone methylases, acetylases, deacetylases, DNA methylases, etc. The patterns of modification of DNA and histones are quite varied and very precise at thousands of different loci around the genome in different cell lineages. The question then arises: Since there is a limited number of these enzymes, what directs them to their sites of action?
The logical answer, now well supported by evidence, is that this is directed by RNA signaling, which recruits these enzymes to the appropriate site of action. Thus, these modifications are protein-mediated, but RNA-directed, a key point. When you look at this in more detail, even the term “epigenetic memory” has two components. The main one is internal. During ontogeny from the point of fertilization as the embryo grows, divides, and differentiates, each cell maintains an epigenetic memory of its history. This memory is mainly held through modifications to chromatin, and is critical to the correct unfolding of the development of the organism. But it’s also clear that this system can intersect with environmental signaling, mainly positional information from neighboring cells, but also including other sorts of things.
So the epigenetic state of the cell is not just a feed-forward consequence of RNA signaling from the genome, but also the modification of this ribotype, and hence the epigenome, by environmental signals. Many of the people working in the area of epigenetics have focused on environmental influences that have resulted in inherited effects. The effects of poor nutrition on the size of babies born over generations in Holland is one example. It’s clear that environmental circumstances can intersect with epigenetic pathways. But my thesis is that this system is essential to the whole ontogeny of differentiation and development in the first place.
It’s not that environment is controlling the epigenetic modification; rather, it’s modifying what is largely an already preprogrammed pathway. There’s another aspect of this that many people are not aware of. It’s also clear that RNA is itself altered by environmental signaling, primarily through a process called RNA editing, which is the enzymatically imposed modification of the RNA to alter its nucleotide sequence. There are two classes of enzymes involved in this, one of which contains a molecule called inositol hexaphosphate in its active site, strongly implying a link to cell signaling pathways. Of course, if you reasonably assume that these modifications aren’t random, there must be some input control of the pathway.
RNA editing occurs in all tissues of the body, but is particularly active in the brain. There are iconic examples of how RNA editing is employed to change the nucleotide sequence of mRNA, or to change its splice pattern, in order to alter the electrophysiological properties of neural receptors and neurotransmitters, presumably to tune the strength of synaptic connections and networks. This makes a lot of intrinsic sense. It’s also clear that humans have about a hundred times more RNA editing than rodents, which is probably related to the development of higher neural function and cognition. If the sequence and structure of mRNA can be modified directly by changing its nucleotide sequence, then it’s reasonable to expect that this may also occur in regulatory RNAs that control the state of chromatin architecture, thus also feeding into memory.
Did you ever believe that there was junk DNA?
Mattick: Not really. However, it’s surprising how many people have just accepted this proposition, and how few people have actually thought about it. It’s also been surprising to me that the field of epigenetics has focused only on changes to histones and DNA, without giving much thought to the underlying circuitry that’s directing these modifications, as opposed to enzymatically imposing them. Most molecular biologists accept the traditional view that most of the genome, apart from that which codes for proteins and their immediately adjacent regulatory elements, is accumulated evolutionary debris. I actually started thinking this might not be the case back in 1977 when I was a postdoc in Houston, Texas, when the very surprising discovery was made that genes of higher organisms contained extensive internal tracts of non–protein-coding sequences. These sequences became known as intervening sequences or introns. At that point everybody assumed that the intronic RNA sequences, which are transcribed but apparently cut out and discarded, must be junk, with their presence being rationalized as the hangover of the early evolution of genes from smaller pieces.
It occurred to me that just maybe, something more important was going on here. I started playing with the possibility that these transcribed intronic RNAs were themselves functional in transmitting some type of information to the system, in which case the structure, information content, and regulatory framework of gene expression in higher organisms was different and much more complex than we had thought. This idea became sort of an intellectual hobby. But in those days we didn’t have much information about the genome. However, by the early 1990s I had accumulated enough evidence, albeit most of it circumstantial, to suggest that there was a very high chance that these RNAs were functional and, therefore, that the genome was not junk but largely comprises information that intellectually and biochemically had slipped under the radar of biochemists.
At least one of your papers has compared RNA regulatory systems to an advanced computer system. Can you tell us about that comparison?
Mattick: One has to be careful in making these analogies, because some people take them too literally. It’s very hard to map biological space into computational space and vice versa. The analogies are really intended to introduce people to a new way of looking at the system. On the first level, you could make a strong case that RNA has a capacity for sequence-specific interactions and, therefore, to encode regulatory signals in very short, precise sequences—miRNAs being a classic example. This is in contrast to proteins, which are rather big and clunky. It’s clear that in other domains we have now moved to what we call digital communications control systems. I don’t mean “on–off”—that definition of digital—but rather in the sense that we can convey sequence-specific information that’s then received and interpreted by a receptive infrastructure to convert it into a meaningful action. To make another analogy, if you order a book online, you enter a product code and a credit card number, which are just a series of digits that get transmitted to somewhere in California or China, and the book gets put in the bag. In this way the relevant information is transmitted digitally and converted to an analog function on the other end.
miRNAs do exactly the same thing. They are 21- to 22-nucleotide sequences that have no catalytic activity, as far as we know, and that simply bind to a target RNA that encodes a protein. This is then recognized by a generic protein complex, irrespective of the particular mRNA or miRNA, that prevents production of the protein and accelerates the destruction of its mRNA. So in this way you don’t need to have a big clunky protein for every regulatory event that you want to impose on mRNA in different cells, but rather a generic infrastructure and tiny RNA sequences that are very flexible as well. It’s quite easy to modulate the regulatory circuitry to change the phenotype, which is probably the major basis of adaptive radiation in complex organisms. In that sense it looks as if evolution discovered the power of digital communication and control systems a billion years before we did.
RNA can act as an adapter between other RNAs and DNA, and recruit the appropriate infrastructure to impose a relevant action. There’s a matrix of different forms of RNA signals that interact with different types of protein complexes to impart different actions. It’s a very sophisticated regulatory system. At the next level, when you think about the complexity of these networks, you can start to appreciate that RNA interactions and RNA–DNA interactions—as well as interactions with small molecules, proteins, and other analog inputs—can comprise a very powerful computational device. This system computes and directs the trajectories of differentiation and development from a preprogrammed, feed-forward set of instructions that is particular to the species and the individual, while at the same time responding to and integrating signals from the environment. You can start to see through the mist that this network is an extraordinarily sophisticated system sitting in the background directing traffic and controlling expression of the genome and its integration with the environment in different cells and lineages at different stages of development.
Where do you see this field going in the next few years?
Mattick: I think in the short term we’re going to see the delineation of the RNA signaling systems that control chromosome modification and epigenetic memory over the next few years. That’ll be a very vibrant big field. That’s number one.
Number two, I think we’re going to find other levels of control of gene regulation by RNA, of which perhaps the most interesting will be alternative splicing. There’s a lot of circumstantial evidence that alternative splicing is being directed by RNA signaling. So I think that’s a big prize.
The third one, which could be related to either of the first two, is to start to dig into the function of the tens, if not hundreds, of thousands of long ncRNAs that are expressed dynamically during differentiation and development. We documented this in the brain by showing that many of these ncRNAs are expressed in different regions of the cortex, hippocampus, or olfactory bulb. In some cases, we have sufficient resolution to tell that most are also trafficked to precise subcellular locations. We have also shown that many ncRNAs are dynamically expressed in embryonic stem cells and during early development, and are writing up these results for publication now. In very few cases, almost fewer than the fingers on two hands, do we have information on the function of these ncRNAs.
The functional analysis of these RNAs will be a major field. In fact, it’s already starting to happen. It’s also going to yield lots of surprises. A number of these ncRNAs are being trafficked to the cytoplasm and seemingly to different compartments of the cell. A number of these compartments have not yet been defined. So analyzing the cell biology and function of these tens of thousands of ncRNAs may make major contributions to our understanding of cell biology as well.
The fourth one is RNA editing. I think the role of RNA editing in genome–environment interactions, particularly in the development of the brain and learning and memory, is going to become an extraordinarily exciting field over the next 5 or 10 years. We’ll then start to have some real understanding of how gene regulation occurs in humans and other complex organisms, and how that plays into our individual idiosyncrasies, including our susceptibility to complex diseases and our understanding of the molecular basis of cognition, which is perhaps the holy grail of biological science.
Download a copy of John Mattick’s full interview in Life Technologies’ Quest Magazine (Issue 5, Volume 1), which is no longer in print, so it’s a collector’s item.