Proteins have been binding DNA for some time now, but only recently has ChIP-sequencing (ChIP-seq) pointed out exactly where. As the technique becomes more popular, more researchers are wondering: what’s the best way to analyze the mounds of data they’ve just created with their ChIP-seq experiments? To tackle this problem, a group of bioinformaticians from the University of Cambridge created BayesPeak, an analysis method that lets you map ChIP-enriched genomic fragments faster than a GPS device.
For those of you deep in the ChIP-seq game, you know there are already a bunch of analytical methods for ChIP-seq. Most work well overall, but (according to the Cambridge team) they usually come with some shortfalls depending on your data scenario. The Brits designed BayesPeak to take the best features from current models and eliminate most of the drawbacks by adapting Bayesian statistical methods and hidden Markov models (HMMs). Testing BayesPeak on both transcription factor binding sites and histone modifications, they showed that it’s a flexible tool that gives high-confidence calls with few false positives.
Some other advantages include:
- The ability to include an input control sample for normalization
- Accounting for the strandedness and orientation of genomic fragments
- Precise location data, making it easier to ID binding sites
- Adaptability: you can change read windows, adjust for different sizes of fragments and binding sites, and use paired-end data
- Simultaneous analysis of multiple modifications and binding sites for the same sample and controls
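For the curious, here is a rough feel for what "HMM-based peak calling" means in practice. The sketch below is a toy, not the BayesPeak model itself: it assumes reads have already been counted in fixed-size bins, uses a two-state hidden Markov model (background vs. enriched) with Poisson emissions, and runs the standard forward-backward algorithm to get a posterior probability of enrichment for each bin. The function names, the rates, and the transition probabilities are all made up for illustration; the real model and its fitting procedure are described in the paper.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of seeing k reads when the expected count is lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def enrichment_posteriors(counts, background_rate=2.0, enriched_rate=12.0,
                          stay_prob=0.9, start_enriched=0.1):
    """Posterior probability that each bin is enriched.

    Toy two-state HMM (state 0 = background, state 1 = enriched) solved with
    the scaled forward-backward algorithm. Every default value here is an
    illustrative assumption, not a parameter taken from BayesPeak.
    """
    n = len(counts)
    if n == 0:
        return []

    rates = (background_rate, enriched_rate)
    trans = ((stay_prob, 1.0 - stay_prob),
             (1.0 - stay_prob, stay_prob))
    start = (1.0 - start_enriched, start_enriched)
    emit = [[poisson_pmf(c, r) for r in rates] for c in counts]

    # Forward pass, rescaling each step so the values do not underflow.
    fwd = []
    for t in range(n):
        if t == 0:
            row = [start[s] * emit[0][s] for s in (0, 1)]
        else:
            row = [emit[t][s] * sum(fwd[t - 1][p] * trans[p][s] for p in (0, 1))
                   for s in (0, 1)]
        total = sum(row)
        fwd.append([v / total for v in row])

    # Backward pass with the same kind of rescaling.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [sum(trans[s][q] * emit[t + 1][q] * bwd[t + 1][q] for q in (0, 1))
               for s in (0, 1)]
        total = sum(row)
        bwd[t] = [v / total for v in row]

    # Combine both passes and normalise to get P(enriched | all counts).
    posteriors = []
    for t in range(n):
        joint = [fwd[t][s] * bwd[t][s] for s in (0, 1)]
        posteriors.append(joint[1] / sum(joint))
    return posteriors


if __name__ == "__main__":
    # Hypothetical read counts per bin, with an obvious bump in the middle.
    example_counts = [1, 2, 0, 3, 14, 11, 13, 2, 1, 0]
    for index, prob in enumerate(enrichment_posteriors(example_counts)):
        print(index, round(prob, 3))
```

Bins with a high posterior would be the candidate peaks. Reporting a probability per bin, rather than a hard yes/no call, is what lets you pick your own confidence cutoff.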
So if you’re in the market for a new ChIP-seq algorithm, download the BayesPeak code and instructions at the University of Cambridge site.
To learn more details, or satisfy your mathematical curiosity, check out the full article in BMC Bioinformatics, Sept 2009.