Question: A basic question regarding an lncRNA identification pipeline
rajeev.vikram (Taiwan) wrote, 20 months ago:

Hi,

I have been analyzing RNA-Seq data sets from some breast cancer cell lines to create a high-confidence list of expressed lncRNAs. However, as I am new to NGS, I cannot figure out how to filter the known expressed genes/protein-coding transcripts out of my annotation file after Cufflinks assembly. Are there any specific tools for this filtering? I would really appreciate any help.

Thanks

R

Tags: lncrna, rna-seq, pipeline
geek_y wrote, 20 months ago:

http://cole-trapnell-lab.github.io/cufflinks/cuffcompare/

rajeev.vikram wrote, 20 months ago:

Thanks, but my question is slightly different. After aligning the reads with TopHat (using Bowtie2), I ran RABT assembly in Cufflinks (guided by a GTF file of annotated transcripts) and then merged the transcripts from the replicates with cuffmerge. After running cuffcompare with -r set to the annotated GENCODE assembly, I got the transfrags labelled with different class codes (=, c, x, etc.). Now I want to pull out all transfrags with class code 'i', 'j', 'o', 'u' or 'x', while also making a separate file of known lncRNAs (by matching against the Body Map annotated lncRNA.gtf). I am curious whether I can do all that on the command line in one command, something like:

awk '$22 ~ /"[jioux]"/ { print }' ..
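
A slightly more defensive version of the same idea, as a minimal sketch: it looks up the class_code attribute by name in the ninth (attributes) column instead of relying on it being whitespace field 22, since the position can shift when optional attributes are absent. It assumes a tab-separated GTF with cuffcompare-style attributes of the form class_code "j"; the function name filter_class_codes and the file names in the usage note are hypothetical, not part of Cufflinks.

```shell
# Sketch: print GTF records whose class_code attribute is one of the given codes.
# filter_class_codes is a hypothetical helper name, not a Cufflinks tool.
filter_class_codes() {
    local codes="$1"   # set of one-letter class codes to keep, e.g. "ijoux"
    local gtf="$2"     # assumed: tab-separated GTF with a class_code attribute
    awk -F'\t' -v codes="$codes" '
        # Find the attribute by name in column 9 rather than by field position.
        match($9, /class_code "[^"]+"/) {
            # The matched text has the form: class_code "X"
            # so the code itself begins 12 characters after the match start.
            code = substr($9, RSTART + 12, RLENGTH - 13)
            if (length(code) == 1 && index(codes, code) > 0) print
        }
    ' "$gtf"
}
```

For example, filter_class_codes ijoux cuffcmp.combined.gtf > candidates.gtf would write the matching records to a new file (both file names are placeholders). Records without a class_code attribute are skipped. Splitting off the known lncRNAs would need a second pass against the lncRNA annotation and is not covered by this sketch.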