Question: Trinity: Merged assembly, Is it enough?
Prasad (India) wrote 5.0 years ago:

Hello,

I am new to NGS analysis. I have 2 samples (2x150) for de novo assembly and am using Trinity for it. According to the Trinity DGE pipeline, I have to merge the reads from both samples, generate the transcripts, and align the reads back to the assembly to get read counts. My requirement is to annotate each sample's transcripts and to calculate DGE.

Q: To get the transcripts for each sample without assembling sample-wise, can I separate each sample's transcripts from the merged assembly based on read counts?

Thanks

 

Renesh (United States) wrote 5.0 years ago:

I don't think you can separate each sample's transcripts from the merged assembly. Reads that are shared by the two samples will map to one particular location only, so a transcript built from them cannot be assigned to a single sample. This is especially problematic for alternative transcripts, so you need to be careful.
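
To make the point concrete, here is a small sketch of what splitting by read count would look like. The function name, the `min_reads` threshold, and the toy counts are all made up for illustration; none of this is part of Trinity.

```python
def partition_by_counts(counts, min_reads=1):
    """Split transcript IDs into sample-1-only, sample-2-only, and shared sets.

    counts maps transcript ID -> (reads_sample1, reads_sample2).
    min_reads is an arbitrary illustrative threshold, not a recommended value.
    """
    only1, only2, shared = set(), set(), set()
    for tid, (c1, c2) in counts.items():
        in1 = c1 >= min_reads
        in2 = c2 >= min_reads
        if in1 and in2:
            shared.add(tid)   # supported by reads from both samples
        elif in1:
            only1.add(tid)
        elif in2:
            only2.add(tid)
        # transcripts below the threshold in both samples are dropped
    return only1, only2, shared


# Toy counts, invented purely for illustration.
toy = {"t1": (120, 95), "t2": (40, 0), "t3": (0, 17), "t4": (3, 250)}
only1, only2, shared = partition_by_counts(toy)
print(sorted(only1), sorted(only2), sorted(shared))
```

Any transcript with reads from both samples ends up in the shared set, so it cannot be attributed to one sample. A transcript with zero reads in one sample may also simply be lowly expressed there rather than absent.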

st.ph.n (Philadelphia, PA) wrote 5.0 years ago:

You can assemble each sample individually if that is your goal. However, to assess DE genes/transcripts between the samples, you need to combine the reads to create a single assembly composed of both samples. As PyPerl said, there will be common reads across the two samples.

After aligning the raw reads back to the single assembly, you'll be able to create matrix files by following the downstream pipeline outlined by Trinity. The first column holds the transcript/gene IDs, and the following columns hold the normalized read counts, one column per sample. If you use edgeR in the DE pipeline, there will be a "sample1_vs_sample2.UP" file that lists the up-regulated genes, based on the FDR and FC you supply and on the normalized read-count matrix. From this file you can pull transcript IDs and cross-reference them against the assembled file to find the up-regulated sequences for each sample.

Ultimately, though, you won't be able to "separate" all transcripts between samples. Again, if you're looking for individual transcripts by sample, you're better off assembling individually. If you're looking for a DE analysis with read counts and up-regulated transcripts by sample, then you need a consensus assembly.
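
The cross-referencing step could be sketched roughly like this. It assumes the DE results file is tab-delimited with one header line and the transcript ID in the first column, and that the first whitespace-delimited token of each FASTA header is the transcript ID. The function names and the file names in the usage note are hypothetical, so check them against your own output before relying on this.

```python
def read_ids(de_results_path):
    """Collect transcript IDs from the first column of a tab-delimited file."""
    ids = set()
    with open(de_results_path) as handle:
        next(handle, None)  # skip the header line (assumed to be present)
        for line in handle:
            line = line.rstrip("\n")
            if line:
                ids.add(line.split("\t")[0])
    return ids


def _record_id(header):
    """Return the first whitespace-delimited token of a FASTA header."""
    parts = header.split()
    return parts[0] if parts else ""


def extract_fasta(fasta_path, wanted_ids):
    """Yield (header, sequence) for FASTA records whose ID is in wanted_ids."""
    header, chunks = None, []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one if it was wanted.
                if header is not None and _record_id(header) in wanted_ids:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    # Emit the final record, which has no following ">" line to trigger it.
    if header is not None and _record_id(header) in wanted_ids:
        yield header, "".join(chunks)
```

For example, `read_ids("sample1_vs_sample2.UP")` followed by `extract_fasta("Trinity.fasta", ids)` would give the up-regulated sequences, where "Trinity.fasta" stands in for whatever your assembly file is called.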
