Question: The same results with different datasets from cufflinks
2.5 years ago by
seta1.2k wrote:

Hi everybody,

I'm working on a genome-guided transcriptome assembly of some human Illumina data. I used STAR to map the reads to hg19 and cufflinks for the transcriptome assembly. I ran the analysis separately for two independent datasets (one single-end, 36 bp, and the other paired-end, 100 bp). After converting the "transcripts.gtf" file produced by cufflinks to a fasta file, I noticed that the number of sequences in the fasta files from the two independent datasets is the same. Is this normal, or is something wrong?

Thanks in advance

written 2.5 years ago by seta (1.2k)

No, it is not normal.

But as we can easily see from the command line you provided, you ran it both times with the same dataset.

written 2.5 years ago by h.mon (28k)

No, I definitely did not run it with the same dataset. I checked all the commands again. What should I do?!

Actually, the second dataset is the one in which the two read files of a single paired-end sample had different lengths. I asked about it in this post (enter link description here), and you kindly suggested removing ftl=20 ftr=90 from the bbduk command used for read trimming, which I did. The mapping percentage was fairly good, about 82-84% for all samples. How can I check the accuracy of the results?

modified 2.5 years ago • written 2.5 years ago by seta (1.2k)
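
One possible first check, offered only as a minimal sketch and not as anything from the thread itself, is to compare the two cufflinks "transcripts.gtf" files directly. If the files are byte-identical, the second run most likely read the same input or an output was overwritten; if only the counts match, looking at which transcript coordinates are shared may show whether that is a coincidence. The function names and the example paths below are hypothetical, and the sketch assumes the usual tab-separated GTF layout with "transcript" in the third column.

```python
import hashlib
import sys


def file_md5(path):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transcript_records(gtf_path):
    """Collect (chrom, start, end, strand) for every 'transcript' line.

    Assumes a tab-separated GTF with at least 9 columns; 'exon' lines
    and comment lines are skipped. Transcripts with identical
    coordinates and strand collapse into a single entry.
    """
    records = set()
    with open(gtf_path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9 or fields[2] != "transcript":
                continue
            records.add((fields[0], int(fields[3]), int(fields[4]), fields[6]))
    return records


def compare_gtfs(path_a, path_b):
    """Summarise how two GTF files differ at the transcript level."""
    a = transcript_records(path_a)
    b = transcript_records(path_b)
    return {
        "identical_files": file_md5(path_a) == file_md5(path_b),
        "n_a": len(a),
        "n_b": len(b),
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage:
    #   python compare_gtfs.py single_end/transcripts.gtf paired_end/transcripts.gtf
    for key, value in compare_gtfs(sys.argv[1], sys.argv[2]).items():
        print(f"{key}\t{value}")
```

If "identical_files" comes back true, the input paths and output directories of the two STAR and cufflinks runs are the first thing to re-check. If the files differ but the counts agree, the "only_a" and "only_b" numbers give a rough idea of how different the two assemblies really are.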