Question: STAR+Cufflinks for RNA-Seq analysis
10 weeks ago by
wangdp123140
Oxford
wangdp123140 wrote:

Hi there,

I am using the STAR + Cufflinks combination to process unstranded paired-end RNA-Seq datasets.

  1. Is there a set of typical parameters for both STAR and Cufflinks?

For STAR:

STAR --runThreadN 1 --runMode alignReads --genomeDir index --readFilesIn sample_r1.fq sample_r2.fq --outFileNamePrefix sample_ --outSAMtype BAM SortedByCoordinate --outSAMattributes All --outSAMstrandField intronMotif

For Cufflinks:

cufflinks -p 1 -G sample.gtf -o sample_clout sample_Aligned.sortedByCoord.out.bam

By running the above two command lines, I encountered a warning message:

"Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided."

Does this warning matter? Has anything gone wrong?

  2. I noticed from the STAR manual that for unstranded RNA-Seq data, we should pass the parameter "--outSAMstrandField intronMotif" to STAR. I tested the following three scenarios:

(1) Without --outSAMattributes and without --outSAMstrandField: Cufflinks fails with the error message "BAM record error: found spliced alignment without XS attribute".

(2) With --outSAMattributes All --outSAMstrandField intronMotif: no error message, and I can see the XS attribute in the BAM file.

(3) With --outSAMattributes Standard --outSAMstrandField intronMotif: no error message, but I can see that there is NO XS attribute in the BAM file.

Does this mean that using either "--outSAMattributes All" or "--outSAMattributes Standard" leads to the same result? Does Cufflinks treat them in the same way?
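One way to check the three scenarios directly is to count how many spliced alignments actually carry an XS tag. The following is only a sketch: `xs_check` is a hypothetical helper name, it assumes SAM text on stdin (for example from `samtools view`), and it treats any record whose CIGAR contains an N operation as spliced.

```shell
# Hypothetical helper: read SAM records on stdin and report how many
# spliced alignments (CIGAR contains N) do or do not carry an XS:A tag.
xs_check() {
  awk -F'\t' '
    /^@/ { next }                 # skip SAM header lines
    $6 ~ /N/ {                    # column 6 is CIGAR; N marks a skipped region (intron)
      spliced++
      has_xs = 0
      # optional tags start at column 12
      for (i = 12; i <= NF; i++) if ($i ~ /^XS:A:[+-]$/) has_xs = 1
      if (has_xs) with_xs++
    }
    END {
      printf "spliced=%d with_XS=%d without_XS=%d\n", spliced, with_xs, spliced - with_xs
    }'
}

# Example usage (assumes samtools is installed and on PATH):
# samtools view sample_Aligned.sortedByCoord.out.bam | xs_check
```

A non-zero `without_XS` count is the situation in which Cufflinks reports "found spliced alignment without XS attribute" for unstranded data.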

Thanks for your help,

Tom

rna-seq star cufflinks • 201 views
10 weeks ago by
Istvan Albert ♦♦ 80k
University Park, USA
Istvan Albert ♦♦ 80k wrote:

The warning has nothing to do with the strandedness of the data.

It is about the estimated fragment sizes for the read pairs. The 9th column of the BAM file contains the TLEN field (template length). The --frag-len-mean and --frag-len-std-dev options ask for the mean value of TLEN and its standard deviation. I am not sure why there seems to be insufficient data there.

Either ignore the warning or figure out the mean and stdev for your data from that column.
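A rough sketch of the second option, computing the mean and standard deviation from column 9. `tlen_stats` is a hypothetical helper name. It assumes SAM text on stdin, counts each pair once by keeping only positive TLEN values, and applies a cutoff (default 1000, an arbitrary assumption) because TLEN for pairs that span an intron includes the intron length and would inflate the estimate.

```shell
# Hypothetical helper: read SAM records on stdin and print the mean and
# (population) standard deviation of TLEN. Optional argument: maximum TLEN to keep.
tlen_stats() {
  awk -F'\t' -v max="${1:-1000}" '
    /^@/ { next }                       # skip SAM header lines
    $9 > 0 && $9 <= max {               # positive TLEN = leftmost mate, so each pair counts once
      n++; sum += $9; sumsq += $9 * $9
    }
    END {
      if (n == 0) { print "no usable pairs"; exit 1 }
      mean = sum / n
      var = sumsq / n - mean * mean     # population variance
      if (var < 0) var = 0              # guard against rounding error
      printf "n=%d mean=%.1f sd=%.1f\n", n, mean, sqrt(var)
    }'
}

# Example usage (assumes samtools is installed; -f 2 keeps properly paired reads):
# samtools view -f 2 sample_Aligned.sortedByCoord.out.bam | tlen_stats 1000
```

The resulting values could then be passed to Cufflinks through --frag-len-mean and --frag-len-std-dev. Treat them as estimates, since the cutoff affects both numbers.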
