DOI: 10.5176/2251-2489_BioTech15.44

Authors: Rashmi Tripathi, Vandana Kumari, Sunil Patel, Yashbir Singh and Dr. Pritish Varadwaj

Abstract: Non-coding RNAs are highly abundant in the human genome, comprising thousands of functionally and structurally important families of tRNAs, rRNAs, snoRNAs, siRNAs, piRNAs, snRNAs, miRNAs and lncRNAs (long non-coding RNAs). Recent high-throughput sequencing technologies have led to the generation and annotation of a large number of lncRNAs. The proposed approach is an effort to build an automated prediction tool that identifies lncRNAs using machine learning. We used several features based on the entropic information content of k-mer sequences and fed them into a Deep Neural Network classifier. We demonstrate the accuracy of our classifier on known datasets comprising human lncRNAs and transcripts. This tool can be used for the prediction and annotation of unknown lncRNAs, which play a crucial role in almost every process in cell biology, from nuclear organisation to genetic and epigenetic regulation. A server for online prediction of lncRNAs from FASTA sequences is implemented, and the database is located at http://www.deeplnc.iiita.ac.in
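
The abstract does not spell out how the entropy-based k-mer features are computed, so the following is only a minimal sketch of one plausible reading: the Shannon entropy of the k-mer frequency distribution of a sequence, computed for a few values of k. The function names (`kmer_counts`, `kmer_entropy`, `entropy_features`) and the choice of k = 1, 2, 3 are assumptions made for illustration, not the authors' actual feature set.

```python
from collections import Counter
from math import log2


def kmer_counts(seq, k):
    """Count overlapping k-mers in a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_entropy(seq, k):
    """Shannon entropy (in bits) of the k-mer frequency distribution.

    Returns 0.0 when the sequence is shorter than k.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_features(seq, ks=(1, 2, 3)):
    """Hypothetical feature vector: one entropy value per k (assumed values of k)."""
    return [kmer_entropy(seq, k) for k in ks]


if __name__ == "__main__":
    # Toy sequence, not taken from any real dataset.
    print(entropy_features("ACGTACGTTTGACA"))
```

A vector like this could serve as part of the input to a classifier; the network architecture and the full feature list would have to come from the paper itself.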

Keywords: non-coding RNAs; miRNAs; lncRNA; high throughput sequencing; machine learning; deep learning technique
