Question: A basic question regarding an lncRNA identification pipeline
rajeev.vikram (Taiwan) wrote, 23 months ago:

Hi,

I have been analyzing RNA-Seq data sets from some breast cancer cell lines to create a high-confidence list of expressed lncRNAs. However, as I am new to NGS, I cannot figure out how to filter the known expressed genes/protein-coding transcripts out of my annotation file after Cufflinks assembly. Are there any specific tools for this filtering? I would really appreciate any help.

Thanks

R

Tags: lncrna, rna-seq, pipeline

geek_y wrote, 23 months ago:

http://cole-trapnell-lab.github.io/cufflinks/cuffcompare/

rajeev.vikram (Taiwan) wrote, 23 months ago:

Thanks, but my question is slightly different. After TopHat alignment with Bowtie2, I used RABT assembly in Cufflinks and then merged all transcripts (using a GTF file of annotated transcripts), then ran cuffmerge on the replicates. After running cuffcompare with -r set to the annotated GENCODE assembly, I got the transfrags labelled with the different class codes (=, c, x, etc.). Now I want to extract all transfrags with class code 'i', 'j', 'o', 'u' or 'x', while also making a separate file of known lncRNAs (by matching against the Body Map annotated lncRNA.gtf). I am curious whether I can do all that on the command line in one command, something like:

awk '$22 ~ /[ijoux]/ { print }' ..
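
A minimal sketch of that class-code filter, assuming the code is stored in the GTF attributes column (column 9) as class_code "x", as in a cuffcompare combined GTF. Matching on the attribute text avoids relying on a fixed field number such as $22, which can shift when a record lacks an attribute like nearest_ref. The file names and records below are made up for illustration and do not come from any real run.

```shell
#!/bin/sh
# Hypothetical helper: write one tab-separated GTF exon record.
rec() {
  printf 'chr1\tCufflinks\texon\t%s\t%s\t.\t+\t.\t%s\n' "$1" "$2" "$3"
}

# Toy stand-in for a cuffcompare combined GTF (invented records).
{
  rec 100 200 'gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; oId "CUFF.1.1"; nearest_ref "REF_A"; class_code "="; tss_id "TSS1";'
  rec 300 400 'gene_id "XLOC_000002"; transcript_id "TCONS_00000002"; exon_number "1"; oId "CUFF.2.1"; nearest_ref "REF_B"; class_code "j"; tss_id "TSS2";'
  rec 500 600 'gene_id "XLOC_000003"; transcript_id "TCONS_00000003"; exon_number "1"; oId "CUFF.3.1"; class_code "u"; tss_id "TSS3";'
  rec 700 800 'gene_id "XLOC_000004"; transcript_id "TCONS_00000004"; exon_number "1"; oId "CUFF.4.1"; nearest_ref "REF_C"; class_code "c"; tss_id "TSS4";'
  rec 900 950 'gene_id "XLOC_000005"; transcript_id "TCONS_00000005"; exon_number "1"; oId "CUFF.5.1"; nearest_ref "REF_D"; class_code "x"; tss_id "TSS5";'
} > toy.combined.gtf

# Keep only records whose class code is one of i, j, o, u, x.
# [ijoux] is a character class; the original /j,i,o,u,x/ would only
# match the literal text "j,i,o,u,x".
awk -F'\t' '$9 ~ /class_code "[ijoux]"/' toy.combined.gtf > candidates.gtf

# Tally the retained records per class code as a quick sanity check.
awk -F'\t' '
  match($9, /class_code "[^"]*"/) {
    # Skip the 12-character prefix (class_code ") and the closing quote.
    code = substr($9, RSTART + 12, RLENGTH - 13)
    n[code]++
  }
  END { for (c in n) print c, n[c] }
' candidates.gtf | sort > class_code_counts.txt
```

On real data, toy.combined.gtf would be replaced by the combined GTF written by cuffcompare. Splitting off the known lncRNAs would still need a separate comparison against the lncRNA annotation, which this sketch does not attempt.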