Question: A basic question regarding an lncRNA identification pipeline
2.2 years ago by
Taiwan
rajeev.vikram20 wrote:

Hi,

I have been analyzing RNA-Seq data sets from some breast cancer cell lines to create a high-confidence list of expressed lncRNAs. However, as I am new to NGS, I cannot figure out how to filter out the known expressed genes/protein-coding transcripts from my annotation file after Cufflinks assembly. Are there any specific tools for this filtering? I would really appreciate any help.

Thanks

R

lncrna rna-seq pipeline
2.2 years ago by
geek_y8.7k
geek_y8.7k wrote:

http://cole-trapnell-lab.github.io/cufflinks/cuffcompare/

2.2 years ago by
Taiwan
rajeev.vikram20 wrote:

Thanks, but my question is slightly different. Basically, after TopHat alignment with Bowtie2, I used RABT assembly in Cufflinks and then merged all transcripts (elegant= gtf file of annotated transcripts), then ran cuffmerge on the replicates. After running cuffcompare with -r set to the annotated GENCODE assembly, I got the transfrags labelled with different class codes (=, c, x, etc.). Now I want to filter out all transfrags with class codes ‘i’, ‘j’, ‘o’, ‘u’ and ‘x’, while making an extra file of known lncRNAs (by matching against the Body Map annotated lncRNA.gtf). I am curious whether I can do all that on the command line in one command, something like:

awk '$22 ~ /[ijoux]/ { print }' ..
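
A positional test such as $22 may be fragile, because the attributes in column 9 of a GTF file are not always in the same order or number. As a rough sketch only, the class code could instead be looked up by name. This assumes the cuffcompare combined GTF is tab-separated and carries a class_code "..." attribute in column 9. The function name filter_class_codes is made up for illustration.

```shell
# Hypothetical helper: keep only GTF records whose class_code is one of CODES.
# Usage: filter_class_codes CODES < input.gtf > output.gtf
filter_class_codes() {
    awk -F'\t' -v codes="$1" '
        # Column 9 holds the attributes; find class_code by name so the
        # result does not depend on the order of the other attributes.
        match($9, /class_code "[^"]+"/) {
            # Skip the 12 characters of the prefix and drop the closing quote.
            code = substr($9, RSTART + 12, RLENGTH - 13)
            # Print the record if its code appears in the requested set.
            if (index(codes, code) > 0) print
        }
    '
}
```

It might then be called as filter_class_codes ijoux < cuffcmp.combined.gtf > candidates.gtf, where both file names are assumptions. Records without a class_code attribute are dropped. Matching the result against a known lncRNA annotation would still need a separate step.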