Dr. Paul Soloway discusses the limitations with some of the main methods used in epigenetics research and shares a glimpse into what the next generation of tools will look like. This interview took place at the Keystone Symposia’s Epigenomics and Chromatin Dynamics joint meeting in January, 2012.
Getting Past ChIP and Bisulfite Applications
So, chromatin immuno-precipitation and bisulfite sequencing are probably the two most widely used methods currently for epigenomic analyses. With chromatin immuno-precipitation there are many limitations inherent to it: first of all, people are using a population of cells or tissues that’re complex as a source of their chromatin, and so clearly there’re gonna be inputs from multiple independent cell sources.
Furthermore, even if you’re dealing with a homogeneous cell line, there are gonna be cells at different stages of the cell cycle, and there are many data out there that describe heterogeneity even within a cell line from one phase of the cell cycle to another in terms of their epigenetic profiles. And when you scale up to a more complex tissue, which has many independent cell types, then when you do epigenomic profiling you never know which specific cell type you’re looking at, even if you’re looking at a tissue that is fairly homogeneous, for example like liver.
Additional limitations of chromatin immuno-precipitation include the fact that you typically query one epigenetic feature at a time. People do re-ChIP analysis, in which the immunoprecipitate from one antibody is then used in a subsequent immuno-precipitation with another antibody, and in those kinds of experiments they can get at whether two independent epigenetic marks truly reside on the same piece of DNA. But those are very difficult experiments to do, primarily because you need quite a large input of material to have enough after the second immuno-precipitation to do sequencing.
But what people typically do is a single chromatin immuno-precipitation with one antibody. Then they’ll take another antibody against their second most favorite epigenetic mark and then superimpose the data sets, and that introduces an inherent ambiguity. You never are completely confident you have both epigenetic features on the same piece of chromatin at one time, so that provides a second limitation of chromatin immuno-precipitation.
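A minimal sketch of that ambiguity, using made-up toy populations rather than real ChIP data: two hypothetical pools of chromatin molecules give identical per-mark signals, yet in one the two marks always sit on the same molecule and in the other they never do. The names `marginals` and `joint_fraction` are illustrative only.

```python
# Each molecule is a pair (has_mark_a, has_mark_b), with 1 = present, 0 = absent.
# Both populations below are invented for illustration.

def marginals(molecules):
    """Fraction of molecules carrying each mark, as two separate
    single-antibody experiments on the bulk population would report."""
    n = len(molecules)
    frac_a = sum(a for a, _ in molecules) / n
    frac_b = sum(b for _, b in molecules) / n
    return frac_a, frac_b

def joint_fraction(molecules):
    """Fraction of molecules carrying both marks at once, which only a
    per-molecule (or sequential) measurement can report."""
    return sum(1 for a, b in molecules if a and b) / len(molecules)

# Population 1: the two marks always occur together.
co_occurring = [(1, 1)] * 50 + [(0, 0)] * 50
# Population 2: the two marks are mutually exclusive.
exclusive = [(1, 0)] * 50 + [(0, 1)] * 50

if __name__ == "__main__":
    for name, pop in [("co-occurring", co_occurring), ("exclusive", exclusive)]:
        print(name, "marginals:", marginals(pop), "joint:", joint_fraction(pop))
```

Superimposing the two single-mark results corresponds to looking only at `marginals`, which cannot tell these two populations apart; `joint_fraction` is the quantity that re-ChIP or single-molecule approaches aim to recover.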
“When you’re trying to associate epigenetic variations with those disease states, the analysis is not truly digital, as it is with genetic analysis.”
There’s really a third limitation as well, and that is that chromatin IPs typically require a lot of input material. There are some protocols that have been described that use very small inputs of cells for genome-wide ChIP analysis. Some protocols have been implemented that use even smaller inputs for just a few locus-specific assays, but being able to get down to just a handful of cells, maybe even single-cell analyses, is something that would be very desirable for future work.
I should probably mention a couple of other limitations that are emerging with chromatin immuno-precipitation. There have been some recent data that show that when you throw formaldehyde on a cell to cross-link proteins to the DNA, that actually induces a DNA damage response, which can generate local placement of poly-ADP ribose, which can then end up influencing the accessibility of your favorite chromatin epitope to antibody. So, our efforts to perform chromatin immuno-precipitation are clearly influenced by the procedures we use.
Bisulfite sequencing, another very commonly used method for epigenomic analysis to evaluate DNA methylation, also has some limitations. For example, it becomes difficult to discern methylcytosine from hydroxymethylcytosine from carboxylcytosine from formylcytosine. And these independent features may be things that we wish to discern in the future. There’ve also been some recent reports using bisulfite sequencing that show that the degree of purity of your DNA is absolutely critical.
So, for example, if you don’t completely de-proteinize your DNA by very extensive proteinase treatments and subsequent purification processes, that can end up influencing the conversion rate of the cytosine, so, again, some other challenging features. And in terms of integrating both of those common methods, chromatin immuno-precipitation with DNA methylation analyses, the methods are really not well developed for having truly integrated data sets in which you can identify a nucleosome that has a range of modifications and then simultaneously know the underlying DNA methylation states.
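The difficulty of telling cytosine variants apart with bisulfite sequencing can be sketched as a simple lookup. The readout table below is an assumption based on the commonly described behaviour of standard bisulfite treatment (unmodified C, 5fC and 5caC read as T after conversion, while 5mC and 5hmC both read as C); it is not taken from the interview, and the function names are hypothetical.

```python
# Assumed base call after standard bisulfite conversion, PCR and sequencing.
# This table is an illustrative assumption, not a protocol specification.
BISULFITE_READOUT = {
    "C": "T",     # unmodified cytosine is deaminated and reads as T
    "5mC": "C",   # methylcytosine resists conversion and reads as C
    "5hmC": "C",  # hydroxymethylcytosine also reads as C
    "5fC": "T",   # formylcytosine is converted and reads as T
    "5caC": "T",  # carboxylcytosine is converted and reads as T
}

def bisulfite_read(cytosine_states):
    """Return the base calls observed for a list of cytosine states."""
    return "".join(BISULFITE_READOUT[state] for state in cytosine_states)

def indistinguishable_groups():
    """Group cytosine states by the base call they produce."""
    groups = {}
    for state, base in BISULFITE_READOUT.items():
        groups.setdefault(base, []).append(state)
    return groups

if __name__ == "__main__":
    # Two different molecules that give the same read.
    print(bisulfite_read(["5mC", "C", "5hmC"]))
    print(bisulfite_read(["5hmC", "5fC", "5mC"]))
    print(indistinguishable_groups())
```

Under these assumptions every state collapses onto one of only two base calls, so a read alone cannot say which member of each group was present.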
Moving to Single Molecule Epigenetic Analysis
So, in thinking about the various shortcomings that exist for chromatin immuno-precipitation and DNA methylation methods, my lab and many other labs are also trying to develop methods to overcome these limitations. Some of my own interests have focused on performing single-molecule analysis using procedures very analogous to flow cytometry but rather than looking at individual cells and surface features on them, we’re looking at individual chromatin molecules and the epigenetic features on them.
The reagents that we use for our detection are very similar to the reagents that are currently used for chromatin immuno-precipitation or methyl-DNA precipitation, another method that I did not talk about earlier. But by using some of these same probes with different fluorescent labels on them, and by querying individual molecules under a flow setting, we can identify molecules that carry combinations of these epigenetic features, which we hope will overcome one of the limitations of epigenomic analyses that people are appreciating now.
There are many other advances that we need to achieve to make this ready for prime time. Among them is really being able to ascertain that we have reliable, accurate and precise quantitation of the features and combinations of features that we’re querying. Additionally, we need to implement some robust preparative methods that would allow us to isolate from a population those molecules that have the features and combinations that we’re interested in for downstream analysis, such as sequencing. So, throughput would become an issue for some of these technologies we’re trying to develop.
So, I think some of the big advances that we are beginning to see right now, and that’ll continue to be seen in the future, would be to really very reliably quantify some of these epigenetic changes and where they reside in the genome. So, we know from the true genetic analyses that people are doing using genome-wide association studies that SNPs, which are essentially digitally identified (you either have it or you don’t), can or cannot be associated with a given disease state.
When you’re trying to associate epigenetic variations with those disease states, the analysis is not truly digital, as it is with genetic analysis. It becomes very dependent upon identifying the absolute levels of an epigenetic feature to really be able to measure those precisely, combinations of features as well, and furthermore to be able to have very robust analytical methods that really minimize the technical contribution to variation that we see in these kinds of analyses.
We know that there’s gonna be a lot of inter-individual variation, so if we were to take a look at one individual who might be affected by Type 2 diabetes and another individual that might be affected by Type 2 diabetes, there may be inter-individual variation that underlies the disease but it becomes very important to distinguish that variation from variation that truly contributes to the disease. So being able to have very high throughput with large numbers of individuals is important.
Being able to have very robust and very precise measurements, to minimize the technical variation that’s introduced, and then to be able to really appreciate what variation truly exists at the population level and to correlate that with the disease state is something that is ongoing and I think will be dramatically improved over the coming few years. So, that’s probably one of the immediate advances that I would envision and hope to see as we begin to apply epigenomic methods to identify the epigenomic basis of human disease.