Question: Running STAR aligner with paired-end and single-end reads simultaneously
8 days ago, nanoide wrote:

Hi all

So, I recently got some RNA-seq raw reads, both paired-end (2 x 150 bp) and single-end (1 x 75 bp), and I want to map them using the STAR aligner. My main questions are: how would you deal with these? Can STAR take both paired-end and single-end .fq files simultaneously? Or is mapping them separately and then merging the BAM files also possible?

Any ideas?

Thank you for your advice

Written 8 days ago by nanoide, modified 7 days ago

Thank you all for your thoughts and useful comments!

Reply written 7 days ago by nanoide
8 days ago, swbarnes2 (United States) wrote:

I don't think STAR can take them both together. I'd process the two separately, and merge results at the end if it looks like the two experiments are telling you the same thing.
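As a minimal sketch of "process the two separately", the snippet below builds one STAR command line per library: two FASTQ files after `--readFilesIn` for the paired-end run, one for the single-end run. The index directory, file names, output prefixes and thread count are hypothetical placeholders, and the flags should be checked against the manual for your STAR version.

```python
import shlex
from typing import List, Optional


def star_command(genome_dir: str, read1: str, read2: Optional[str] = None,
                 prefix: str = "sample_", threads: int = 8) -> List[str]:
    """Build a STAR argv list for one library (paired-end if read2 is given)."""
    reads = [read1] if read2 is None else [read1, read2]
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outFileNamePrefix", prefix,
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]
    # Gzipped input has to be decompressed on the fly.
    if all(r.endswith(".gz") for r in reads):
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; one run per library type.
    pe = star_command("star_index", "pe_R1.fq", "pe_R2.fq", prefix="pe_")
    se = star_command("star_index", "se.fq", prefix="se_")
    for cmd in (pe, se):
        print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```

Each run writes its own sorted BAM (`pe_Aligned.sortedByCoord.out.bam` and `se_Aligned.sortedByCoord.out.bam` with these prefixes), so the two libraries stay separate until you decide whether to combine them.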

And I would suggest holding off on merging the results until you have looked at a PCA of the data, to make sure there is no batch effect from the sequencing type, etc.

Reply written 8 days ago by lshepard, modified 8 days ago

Thank you both for your answers. So I guess I will map separately. Then, with the BAM files, I will use deepTools plotPCA (maybe plotCorrelation too?) to check, and then samtools merge. Does that sound good? Any thoughts?
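The plan above could be sketched roughly as follows. This only assembles the command lines; it assumes samtools and deepTools are installed, and the BAM names and output file names are hypothetical. deepTools plotPCA and plotCorrelation read the matrix written by multiBamSummary, which needs indexed BAM files. Please check the flags against the versions you have installed.

```python
import shlex
from typing import List


def qc_and_merge_commands(bams: List[str],
                          merged_bam: str = "merged.bam") -> List[List[str]]:
    """Return the commands for indexing, the deepTools checks, and the final merge."""
    summary = "coverage.npz"  # hypothetical intermediate file name
    cmds = [["samtools", "index", bam] for bam in bams]
    cmds.append(["multiBamSummary", "bins", "--bamfiles", *bams, "-o", summary])
    cmds.append(["plotPCA", "-in", summary, "-o", "pca.png"])
    cmds.append(["plotCorrelation", "-in", summary,
                 "--corMethod", "spearman", "--whatToPlot", "heatmap",
                 "-o", "correlation.png"])
    # Only run this last step if the PCA/correlation plots look consistent.
    cmds.append(["samtools", "merge", merged_bam, *bams])
    return cmds


if __name__ == "__main__":
    bams = ["pe_Aligned.sortedByCoord.out.bam", "se_Aligned.sortedByCoord.out.bam"]
    for cmd in qc_and_merge_commands(bams):
        print(shlex.join(cmd))
```

With only one paired-end and one single-end BAM a PCA has very little to work with, so the check is more informative when every sample and replicate is included.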

Thanks!

Reply written 8 days ago by nanoide

Hi, please let me ask for a clarification. When you wrote 'merge the results at the end if it looks like the two experiments are telling you the same thing', did you mean merging the output from STAR (i.e. the BAM files), or also counting independently and then summing the counts if they are correlated, cluster together, etc.? Thank you!
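To make the second option concrete, here is a minimal sketch of summing per-gene counts from two independently counted runs of the same sample. The gene IDs and numbers are made up, and the count tables are assumed to have already been read into plain dictionaries.

```python
from typing import Dict


def sum_counts(*tables: Dict[str, int]) -> Dict[str, int]:
    """Add per-gene counts across tables; a gene missing from a table counts as 0."""
    total: Dict[str, int] = {}
    for table in tables:
        for gene, count in table.items():
            total[gene] = total.get(gene, 0) + count
    return total


if __name__ == "__main__":
    # Made-up counts for one sample sequenced both ways.
    paired_end = {"geneA": 10, "geneB": 0}
    single_end = {"geneA": 5, "geneC": 7}
    print(sum_counts(paired_end, single_end))
```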

Reply written 4 hours ago by nanoide
7 days ago, Charles Warden (Duarte, CA) wrote:

I typically use single-end 50 bp reads for gene expression analysis.

If you are just interested in getting counts for differential expression (and FPKM/CPM for visualization), perhaps trim the longer R1 from the PE experiment to 75 bp?
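A minimal sketch of the trimming idea, assuming standard four-line FASTQ records: keep the first 75 bases of each R1 read, and cut the quality string to the same length. A dedicated trimming tool would normally do this job; the function name and the default length here are only illustrative.

```python
from typing import Iterable, Iterator


def trim_fastq(lines: Iterable[str], max_len: int = 75) -> Iterator[str]:
    """Yield FASTQ lines with sequence and quality truncated to max_len bases."""
    it = iter(lines)
    # Consume four lines at a time; an incomplete trailing record is dropped.
    for header, seq, plus, qual in zip(it, it, it, it):
        yield header.rstrip("\n") + "\n"
        yield seq.rstrip("\n")[:max_len] + "\n"
        yield plus.rstrip("\n") + "\n"
        yield qual.rstrip("\n")[:max_len] + "\n"


if __name__ == "__main__":
    # One made-up 150 bp record.
    record = ["@read1\n", "A" * 150 + "\n", "+\n", "I" * 150 + "\n"]
    trimmed = list(trim_fastq(record))
    print(len(trimmed[1].strip()), len(trimmed[3].strip()))
```

The trimmed R1 file can then be mapped as single-end data, the same way as the 1 x 75 bp library.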

To be safe, I probably would start by processing them separately and seeing how well the replicates cluster. If they really look like technical replicates, I think you could justify combined analysis with the trimmed reads in your Supplemental Materials.

7 days ago, amulyashastry wrote:

Hello,

I would also recommend producing a correlation (Spearman or Pearson) distance matrix to see how well the samples correlate within their group. DESeq2 also has an option to produce heatmaps of the distance matrix.
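For illustration, a small pure-Python sketch of a sample-by-sample correlation matrix (this is not DESeq2, which is an R package). The sample names and counts are made up, each sample is assumed to be a list of per-gene counts in the same gene order, and a distance can be taken as 1 minus the correlation.

```python
import math
from typing import Callable, Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; fails with ZeroDivisionError if a vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(samples: Dict[str, Sequence[float]],
                       method: Callable = spearman) -> Dict[str, Dict[str, float]]:
    """Pairwise correlations between all samples."""
    names = list(samples)
    return {a: {b: method(samples[a], samples[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Made-up counts for four genes in three samples.
    counts = {"PE_rep1": [10, 200, 35, 0], "SE_rep1": [8, 150, 40, 1],
              "SE_rep2": [12, 180, 30, 2]}
    for name, row in correlation_matrix(counts).items():
        print(name, {other: round(r, 3) for other, r in row.items()})
```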
