Question: Alignment of seq reads to a genome, process after STAR?
Biogeek (240) wrote, 15 months ago:

Hey guys,

Just a quick question, and I'd appreciate some advice. I've indexed my target organism's genome and am now aligning my cleaned reads back to it with STAR. My reads were cleaned with Trimmomatic.

I've read that people regularly use the Cufflinks package for all-in-one analysis; however, I would be keen on using edgeR. Once my reads have been aligned, is there any way I can use the SAM/converted BAM files to calculate counts, then feed them into R and edgeR? Most of my experience so far has been in de novo assembly.

Are there any good tutorials I can visit online?

Thanks.

Modified 15 months ago by Devon Ryan (73k) • written 15 months ago by Biogeek (240)

Yes, you can. STAR can now generate counts during alignment, or you could run featureCounts on the aligned files to generate the count matrix.
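As a rough illustration of the STAR route: when STAR is run with `--quantMode GeneCounts` (which needs an annotation supplied to STAR), it writes a `ReadsPerGene.out.tab` file per sample. That file has four tab-separated columns (gene ID, unstranded counts, counts for forward-stranded libraries, counts for reverse-stranded libraries) and begins with four summary rows. The sketch below merges several such files into one matrix. The function names and the strandedness labels are made up for this example, and it assumes you already know which strandedness column suits your library prep.

```python
import csv

# Column index in ReadsPerGene.out.tab for each library type (labels are
# this sketch's own): 1 = unstranded, 2 = forward-stranded, 3 = reverse-stranded.
STRAND_COLUMN = {"unstranded": 1, "forward": 2, "reverse": 3}

# Summary rows that STAR writes before the per-gene rows.
SUMMARY_ROWS = {"N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous"}


def read_star_counts(lines, strandedness="unstranded"):
    """Parse the lines of one ReadsPerGene.out.tab file into {gene_id: count}."""
    col = STRAND_COLUMN[strandedness]
    counts = {}
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and the four summary rows.
        if not row or row[0] in SUMMARY_ROWS:
            continue
        counts[row[0]] = int(row[col])
    return counts


def build_matrix(per_sample):
    """Combine {sample: {gene: count}} into a header and one row per gene."""
    samples = sorted(per_sample)
    genes = sorted(set().union(*(per_sample[s] for s in samples)))
    header = ["gene_id"] + samples
    # Genes missing from a sample are filled with 0.
    rows = [[g] + [per_sample[s].get(g, 0) for s in samples] for g in genes]
    return header, rows


def write_matrix(path, header, rows):
    """Write the matrix as a tab-separated file that R can read directly."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
```

To use it, open each sample's file, pass the handle to `read_star_counts`, collect the results in a dictionary keyed by sample name, and hand that to `build_matrix`. The written file can then be loaded in R with `read.delim` and passed to edgeR's `DGEList`.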

Written 15 months ago by genomax (37k)

Hi Genomax2,

This presumably does away with the need for the Cufflinks software? I have multiple replicates per treatment, and I read that Cuffmerge is good for this. Are there any obvious advantages to using Cufflinks over going straight from STAR?

Thanks

Written 15 months ago by Biogeek (240)

Yes. You would want to use DESeq2 or edgeR anyway. It sounds like you are all set with replicates etc. See the paper Devon linked below. The DESeq2 vignette would be similarly useful.

Written 15 months ago by genomax (37k)
Devon Ryan (73k), Freiburg, Germany, wrote 15 months ago:

This F1000 article has commands for generating counts (near the end, note that they use featureCounts from within R, though you can use it at the command line too) and using edgeR. That'll be a good tutorial to base your analysis on.
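For the command-line featureCounts route, the output is a tab-separated table: a leading `#` comment line, then a header whose first six columns are `Geneid`, `Chr`, `Start`, `End`, `Strand` and `Length`, followed by one count column per BAM file. A minimal sketch of reading that table before handing it to edgeR might look like the following. The function names are illustrative, and shortening the column names to the BAM base name is just one convention, not something featureCounts does for you.

```python
import csv
import os

ANNOTATION_COLUMNS = 6  # Geneid, Chr, Start, End, Strand, Length


def read_featurecounts(lines):
    """Return (samples, {gene_id: [count per sample]}) from featureCounts output."""
    # Drop the '#' comment line that records the program and command used.
    data = (line for line in lines if not line.startswith("#"))
    reader = csv.reader(data, delimiter="\t")
    header = next(reader)
    samples = []
    for bam_path in header[ANNOTATION_COLUMNS:]:
        # Count columns are named after the input BAM paths; keep the base name.
        name = os.path.basename(bam_path)
        if name.endswith(".bam"):
            name = name[: -len(".bam")]
        samples.append(name)
    counts = {}
    for row in reader:
        if row:
            counts[row[0]] = [int(value) for value in row[ANNOTATION_COLUMNS:]]
    return samples, counts


def library_sizes(samples, counts):
    """Total assigned reads per sample, a quick sanity check before modelling."""
    totals = [0] * len(samples)
    for values in counts.values():
        for i, value in enumerate(values):
            totals[i] += value
    return dict(zip(samples, totals))
```

Checking the library sizes first is a cheap way to spot a failed or badly under-sequenced replicate before any differential expression testing.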

Modified 15 months ago • written 15 months ago by Devon Ryan (73k)

Thanks for the article, Devon, much appreciated. I've had a read and, whilst it is appealing, I am going to try STAR first, with the new transcript counts feature in version 2.4.2a. I'll then feed the BAM into RSEM and follow my usual pipeline from there. I am trying to determine whether de novo assembly gives better coverage than the organism's draft genome. I may venture into using Rsubread down the line. Thanks.

Written 15 months ago by Biogeek (240)

If you want to go that route you might appreciate that Salmon or Kallisto will get you similar results in a fraction of the time.

Written 15 months ago by Devon Ryan (73k)

I would second Salmon or kallisto in that case, since both run faster and generate counts and TPM for each replicate; you can then aggregate the results to build the matrix. If I am not wrong, the latest version of Salmon already has transcript-to-gene summarisation if you are keen on a gene count matrix; otherwise you will have transcript counts. Good luck!
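To make the aggregation step concrete, here is a minimal sketch that reads per-sample Salmon `quant.sf` files (tab-separated, with `Name`, `Length`, `EffectiveLength`, `TPM` and `NumReads` columns) and sums transcripts up to genes. The transcript-to-gene mapping `tx2gene` is assumed to come from your annotation, and the function names are invented for this example. Plain summation ignores transcript-length effects, so treat it as a simplification; in R, the tximport package is commonly used for this step instead.

```python
import csv
from collections import defaultdict


def read_quant(lines):
    """Parse the lines of one quant.sf file into {transcript_id: (tpm, num_reads)}."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row["Name"]: (float(row["TPM"]), float(row["NumReads"])) for row in reader}


def summarise_to_gene(quant, tx2gene):
    """Sum transcript TPM and estimated read counts for each gene."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for transcript, (tpm, reads) in quant.items():
        # Transcripts missing from the mapping are kept under their own ID.
        gene = tx2gene.get(transcript, transcript)
        totals[gene][0] += tpm
        totals[gene][1] += reads
    return {gene: tuple(values) for gene, values in totals.items()}


def gene_count_matrix(per_sample_quant, tx2gene):
    """Build {gene: [estimated counts per sample]} with samples in sorted order."""
    samples = sorted(per_sample_quant)
    per_gene = {s: summarise_to_gene(per_sample_quant[s], tx2gene) for s in samples}
    genes = sorted(set().union(*per_gene.values()))
    matrix = {
        gene: [per_gene[s].get(gene, (0.0, 0.0))[1] for s in samples]
        for gene in genes
    }
    return samples, matrix
```

kallisto's `abundance.tsv` carries the same kind of information under different column names (`target_id`, `est_counts`, `tpm`), so `read_quant` would need those names swapped in. Note that the estimated counts are not integers, so round them or use an importer that handles this before passing them to edgeR.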

Written 15 months ago by vchris_ngs (4.1k)

Thanks guys. I've already completed the de novo analysis using RSEM and edgeR, so I guess it would be most appropriate to stick with RSEM and edgeR again, so as not to stray off the beaten track. The reason I'm doing this analysis in addition to the de novo work is so that I can compare coverage of the genome, in case I'm asked when defending my thesis why I didn't use the reference.

A bit of a general question: have any of you attempted a hybrid assembly, or is that very time-consuming and does it require a lot of knowledge?

Written 15 months ago by Biogeek (240)