Back in the early days, miRNAs were these interesting “genes” with names like “lin” and “let” that appeared so heavily involved in developmental timing that they were called small temporal RNAs (stRNAs). A lot more than naming conventions has changed in the field of miRNA discovery since then, when the first handful of miRNAs were popped into a vector (we’re omitting the ligations, gels, and precipitations, of course) and sequenced into stardom.
Today’s miRNA discovery methods have come a long way in performance and output, but they’re still anchored in sequencing and bioinformatics.
A miRNA Prediction Pattern Emerging
The early bioinformatics approaches for miRNA discovery were useful, but it was difficult to “teach” these algorithms when the miRNA rulebook was still being written. Even today we see substantial variation in algorithmic approaches, but unlike the earlier prediction methods, today’s approaches are less constrained and benefit from the collective knowledge of a research community with a few more years of miRNA studies under its belt. Considering this, it makes sense that current efforts are not coming up short on miRNA candidate predictions.
IBM’s Bioinformatics and Pattern Discovery group kicked things off a few years back when their approaches suggested that considerably more miRNAs and miRNA targets existed than previously thought. Led by Isidore Rigoutsos, Big Blue took a slightly different route to miRNA discovery with their RNA22 algorithm, which notably didn’t depend on sequence conservation across species. Dropping that requirement alone was enough to expand the number of predictions significantly and shake things up a bit.
More recently, San Diego-based Natural Selection, Inc. has been applying pattern recognition methods to miRNA discovery, making use of evolutionary algorithms and artificial neural networks to identify putative miRNA sequences. “Given a database of experimentally verified miRNA sequences and non-miRNA sequences, it is possible to train pattern recognition algorithms to identify features that separate miRNA sequences from non-miRNA sequences. Once trained, these same algorithms can be used to identify genomic regions that are putative miRNAs,” explains Gary Fogel, CEO of Natural Selection.
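The train-then-scan idea Fogel describes can be sketched in a few lines of Python. This is only a toy illustration and not Natural Selection’s method: it swaps in a plain logistic regression on dinucleotide frequencies for their evolutionary algorithms and neural networks, the training sequences are invented strings rather than real miRNAs, and names such as `features`, `train`, and `scan` are hypothetical.

```python
import math
from itertools import product

ALPHABET = "ACGU"
DINUCS = ["".join(p) for p in product(ALPHABET, repeat=2)]

def features(seq):
    """Dinucleotide frequencies plus a constant bias term (an assumed feature set)."""
    counts = {d: 0 for d in DINUCS}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = max(len(seq) - 1, 1)
    return [counts[d] / total for d in DINUCS] + [1.0]

def predict(weights, seq):
    """Probability-like score that seq belongs to the positive class."""
    z = sum(w * x for w, x in zip(weights, features(seq)))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, lr=1.0):
    """Fit a logistic regression by simple stochastic gradient ascent."""
    weights = [0.0] * (len(DINUCS) + 1)
    for _ in range(epochs):
        for seq, label in examples:
            x = features(seq)
            error = label - predict(weights, seq)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights

# Invented toy strings, NOT real miRNA or genomic sequences.
# Label 1 stands in for "verified miRNA", label 0 for "non-miRNA".
TOY_TRAINING_SET = [
    ("GUGUGCGUGUGCGUGUGCGUGU", 1),
    ("UGUGCGUGUGUGCGUGUGUGCG", 1),
    ("GCGUGUGUGCGUGUGUGUGCGU", 1),
    ("AAACAAAUAAACAAAUAAACAA", 0),
    ("ACAAAUAAACAAAAUAACAAAU", 0),
    ("CAAAUAAAACAAAUAAACAAAA", 0),
]

weights = train(TOY_TRAINING_SET)

def scan(genome, window=22, threshold=0.5):
    """Slide a window along a sequence and report windows scoring above threshold."""
    hits = []
    for start in range(len(genome) - window + 1):
        score = predict(weights, genome[start:start + window])
        if score >= threshold:
            hits.append((start, round(score, 3)))
    return hits
```

Calling `scan` on a longer sequence returns the start positions of windows the trained model scores as miRNA-like. A real pipeline would use far richer features (hairpin structure, folding energy, and so on) and a much larger verified training set.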
Today, massive datasets are streaming off powerful sequencing platforms and are starting to back up what were once thought of as just in silico predictions, yielding tens to hundreds of novel miRNA candidates at a time in a wide range of sample types, from human embryonic stem cells to rice seedlings. Many of these have yet to graduate to the official ranks of Sanger’s miRBase database, which currently houses about 700 human miRNA sequences.
Covering Outside miRBases
Historically, the technology companies have tied their miRNA products (arrays, qRT-PCR assays, inhibitors, etc.) to Sanger’s miRBase, updating their products whenever there was a substantial change in the Sanger content. But now that a lot of the action is going on outside of miRBase, the technology companies are diving into predicted content to offer a simple way to profile additional miRNAs.
When developing their latest miRNA microarray, the development team at Invitrogen opted to cover their miRBases and then some. The crew teamed up with Natural Selection, which produced thousands of high-confidence miRNA predictions and validated them with deep sequencing. Invitrogen further validated the dataset with additional deep sequencing, array profiling, and qRT-PCR. The result? 373 novel miRNA candidates included on the array in addition to those in miRBase. How’s that for teamwork? Check out the NCode™ Human miRNA Microarray V3 and put those predictions to work!