Trinity: Merged assembly, Is it enough?
9.5 years ago
Prasad ★ 1.6k

Hello,

I am new to NGS analysis. I have 2 samples (2x150) for de novo assembly and am using Trinity for it. According to the Trinity DGE pipeline, I have to merge the reads from both samples, generate the transcripts, and align the reads back to the assembly to get the read counts. My requirement is to annotate the transcripts of each sample and to calculate DGE.

Q: To get the transcripts of each individual sample without assembling sample by sample, can I separate the transcripts for each sample from the merged assembly above, based on read count?

Thanks

RNA-Seq Assembly trinity transcriptome • 3.0k views
9.5 years ago
st.ph.n ★ 2.7k

You can assemble each sample individually if that is your goal. However, in order to assess DE genes/transcripts between the samples, you need to combine the reads to create a single assembly composed of both samples. As PyPerl said, there will be common reads across the two samples.

After aligning the raw reads back to the single assembly, you'll be able to create matrix files by following the downstream pipeline outlined by Trinity. The first column holds the transcript/gene IDs, and the remaining columns hold the normalized read counts, one column per sample. If you use edgeR in the DE pipeline, there will be a "sample1_vs_sample2.UP" file that lists the up-regulated genes, based on the FDR and FC you supply and the normalized read-count matrix. From this file you can pull transcript IDs and cross-reference them against the assembled FASTA file to find the up-regulated sequences for each sample.

Ultimately, though, you won't be able to "separate" all transcripts between samples. Again, if you're looking for individual transcripts per sample, you're better off assembling individually. If you're looking for a DE analysis with read counts and up-regulated transcripts per sample, then you need a consensus assembly.
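The cross-referencing step can be sketched in a few lines of Python. This is only a minimal illustration, not part of the Trinity pipeline: it assumes a tab-delimited matrix with transcript IDs in the first column and one column per sample, a plain FASTA assembly, and a list of up-regulated IDs that you have already pulled from the DE output. The function names, the toy IDs, and the toy sequences are all made up for the example.

```python
import csv
import io


def read_count_matrix(handle):
    """Parse a tab-delimited matrix: a header row of sample names,
    then one row per transcript (ID first, one value per sample)."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first header cell is the (often empty) ID column
    counts = {}
    for row in reader:
        if not row:
            continue
        counts[row[0]] = dict(zip(samples, (float(v) for v in row[1:])))
    return samples, counts


def read_fasta(handle):
    """Return {transcript_id: sequence}; the ID is the first word of the header."""
    chunks = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {tid: "".join(parts) for tid, parts in chunks.items()}


def select_sequences(ids, seqs):
    """Keep only the sequences whose ID appears in the given list."""
    return {tid: seqs[tid] for tid in ids if tid in seqs}


# Toy inputs standing in for the real matrix, assembly, and DE result list.
matrix_text = (
    "\tsample1\tsample2\n"
    "TRINITY_DN1_c0_g1_i1\t120\t95\n"
    "TRINITY_DN2_c0_g1_i1\t0\t15\n"
)
fasta_text = (
    ">TRINITY_DN1_c0_g1_i1 len=8\nACGTACGT\n"
    ">TRINITY_DN2_c0_g1_i1 len=6\nGGCC\nTT\n"
)
up_ids = ["TRINITY_DN2_c0_g1_i1"]  # hypothetical up-regulated IDs

samples, counts = read_count_matrix(io.StringIO(matrix_text))
seqs = read_fasta(io.StringIO(fasta_text))
up_seqs = select_sequences(up_ids, seqs)

for tid, seq in up_seqs.items():
    print(f">{tid}\n{seq}")
```

With real data you would pass open file handles instead of the `io.StringIO` objects, and check the actual column layout of your matrix first.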

9.5 years ago
Renesh ★ 2.2k

I think you cannot separate the individual transcripts from the merged assembly, because there will be common reads that are shared by the two samples and that map to one particular location only. This will be problematic in the case of alternative transcripts. You need to be careful.
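A toy sketch of why splitting by read count is unreliable. It labels each transcript by which samples have at least `min_reads` mapped reads. The counts, the IDs, and the `min_reads` threshold are invented for illustration and do not come from any real run.

```python
def classify_by_support(counts, min_reads=1.0):
    """Label each transcript with the samples that have >= min_reads reads."""
    labels = {}
    for tid, per_sample in counts.items():
        supported = sorted(s for s, n in per_sample.items() if n >= min_reads)
        labels[tid] = "+".join(supported) if supported else "unsupported"
    return labels


# Hypothetical per-sample read counts for three assembled transcripts.
toy_counts = {
    "TRINITY_DN1_c0_g1_i1": {"sample1": 120.0, "sample2": 95.0},
    "TRINITY_DN1_c0_g1_i2": {"sample1": 40.0, "sample2": 0.0},
    "TRINITY_DN2_c0_g1_i1": {"sample1": 0.0, "sample2": 15.0},
}

labels = classify_by_support(toy_counts)
for tid, label in labels.items():
    print(tid, label)
```

The first transcript is supported by both samples, so it cannot be assigned to either one. The second isoform looks specific to sample1 here, but if reads shared between isoforms were all placed on the first isoform, a zero count in sample2 would not prove that the isoform is absent from that sample.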
