A third approach to gene prediction suggests thousands of additional human transcribed regions
Permanent link
https://hdl.handle.net/10037/964
Date
2006-03
Type
Journal article
Peer reviewed
Author
El-Gewely, M. Raafat; Glusman, Gustavo; Qin, Shizhen; Siegel, Andrew F.; Roach, Jared C.; Hood, Leroy; Smit, Arian F.A.
Abstract
The identification and characterization of the complete ensemble of genes is a main goal of deciphering the digital
information stored in the human genome. Many algorithms for computational gene prediction have been described,
ultimately derived from two basic concepts: (1) modeling gene structure and (2) recognizing sequence similarity.
Successful hybrid methods combining these two concepts have also been developed. We present a third orthogonal
approach to gene prediction, based on detecting the genomic signatures of transcription, accumulated over
evolutionary time. We discuss four algorithms based on this third concept: Greens and CHOWDER, which quantify
mutational strand biases caused by transcription-coupled DNA repair, and ROAST and PASTA, which are based on
strand-specific selection against polyadenylation signals. We combined these algorithms into an integrated method
called FEAST, which we used to predict the location and orientation of thousands of putative transcription units not
overlapping known genes. Many of the newly predicted transcriptional units do not appear to code for proteins. The
new algorithms are particularly apt at detecting genes with long introns and lacking sequence conservation. They
therefore complement existing gene prediction methods and will help identify functional transcripts within many
apparent “genomic deserts.”
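The abstract describes ROAST and PASTA only as being based on strand-specific selection against polyadenylation signals. As a rough illustration of that idea, and not the published algorithms, the sketch below compares how often the canonical hexamer AATAAA occurs on each strand of a sequence window. The function names and the simple normalized score are assumptions made for this example.

```python
# Illustrative sketch only: this is NOT the published ROAST or PASTA method.
# It assumes polyadenylation signals are depleted on the transcribed strand.

POLYA_SIGNAL = "AATAAA"  # canonical polyadenylation signal hexamer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def polya_strand_asymmetry(seq: str) -> float:
    """Hypothetical score in [-1, 1] comparing signal counts on the two strands.

    A positive value means fewer AATAAA hexamers on the forward strand than
    on the reverse strand, which under the stated assumption would hint at
    forward-strand transcription. Returns 0.0 when no signal is found.
    """
    seq = seq.upper()
    forward = count_overlapping(seq, POLYA_SIGNAL)
    # A signal on the reverse strand appears as its reverse complement here.
    reverse = count_overlapping(seq, reverse_complement(POLYA_SIGNAL))
    total = forward + reverse
    if total == 0:
        return 0.0
    return (reverse - forward) / total
```

In practice such a score would need to be computed over long windows and compared against a background expectation before any conclusion could be drawn. The abstract does not specify how the published methods do this.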
Publisher
Public Library of Science
Series
PLoS Computational Biology, 2 (2006), no. 3, pp. 160-173