Question: NGS Multi-Dataset Analysis
9.0 years ago, Diogo Santos (0) wrote:

Hello,

I have some doubts about my analysis of a 454 EST assembly built from 3 different datasets. Each dataset comes from the same organism, but under a different condition (resistance to fungi). I assembled them separately (I also tried the all-together approach, but my question doesn't apply to that) and BLASTed all of them. After that, I ran Blast2GO on the BLAST results. How can I discover what is important for the conditions being tested?
I have tried this approach:
- I compared the best BLAST hits from each dataset with each other and assumed that if two contigs have the same best BLAST hit, they are the same thing.
- That way, I defined one group of sequences that exists in all conditions and others that exist only in some conditions.
- I ran the Blast2GO enrichment analysis on these groups of sequences, but because Blast2GO doesn't use only the best BLAST hit, the sequences that I had assumed to be the same don't have the same GO terms.
Any suggestions on how to continue?
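The grouping step described above can be sketched in Python. This is only an illustration under assumptions: the BLAST results are taken to be already parsed into (contig, subject, bit score) tuples, "best hit" is taken to mean the highest bit score, and the condition names, contig IDs and subject IDs are made up.

```python
from collections import defaultdict


def best_hits(rows):
    """Return {contig_id: subject_id}, keeping the highest-bitscore hit per contig."""
    best = {}
    for contig, subject, bitscore in rows:
        if contig not in best or bitscore > best[contig][1]:
            best[contig] = (subject, bitscore)
    return {contig: subject for contig, (subject, _) in best.items()}


def conditions_per_subject(hits_by_condition):
    """Map each best-hit subject to the set of conditions in which it appears."""
    seen = defaultdict(set)
    for condition, rows in hits_by_condition.items():
        for subject in best_hits(rows).values():
            seen[subject].add(condition)
    return dict(seen)


# Hypothetical toy data: (contig_id, subject_id, bitscore) per condition.
hits_by_condition = {
    "no_fungi": [("c1", "P1", 200.0), ("c1", "P2", 150.0), ("c2", "P3", 90.0)],
    "fungus_a": [("a1", "P1", 180.0), ("a2", "P4", 120.0)],
    "fungus_b": [("b1", "P1", 210.0), ("b2", "P4", 80.0)],
}

presence = conditions_per_subject(hits_by_condition)
all_conditions = set(hits_by_condition)

# Subjects hit in every condition, and subjects never hit in the baseline.
shared = sorted(s for s, c in presence.items() if c == all_conditions)
resistance_only = sorted(s for s, c in presence.items() if "no_fungi" not in c)
```

Because only the single best hit is kept per contig, secondary hits (such as the hypothetical P2) are ignored here, which is exactly where this grouping can diverge from the Blast2GO annotation.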

assembly analysis est blast

Do you have a reference sequence or reference transcripts for your organism?

written 9.0 years ago by Sean Davis (26k)

It's not clear to me what the point of your experiment is. You have three conditions, and you want to assemble a transcriptome from the piles of reads measured under each condition, in order to quantify which genes are expressed in response to that condition? If you have no reference (as Sean asked), then I think you have a distillation problem. You'll likely have to assemble them all together to create a kind of reference, then pull sequences from that for your Blast2GO analysis. The key is your statement "sequences that I had assumed to be the same": you have to resolve this ambiguity.
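As a rough sketch of how a combined assembly could serve as a shared reference: once every read is assigned to a contig of the joint assembly, reads can be counted per contig and per condition. The two lookup tables below (read to contig, read to condition) and all IDs are hypothetical; how they are obtained depends on the assembler and is not covered here.

```python
from collections import Counter, defaultdict


def count_reads(read_to_contig, read_to_condition):
    """Count reads per contig, split by the condition each read came from."""
    counts = defaultdict(Counter)
    for read, contig in read_to_contig.items():
        counts[contig][read_to_condition[read]] += 1
    return counts


def fraction_of_condition(counts, conditions):
    """Express each count as a fraction of all assigned reads from that condition."""
    totals = Counter()
    for per_contig in counts.values():
        totals.update(per_contig)
    return {
        contig: {
            c: (per_contig[c] / totals[c] if totals[c] else 0.0)
            for c in conditions
        }
        for contig, per_contig in counts.items()
    }


# Hypothetical toy data: six reads, two contigs from the combined assembly.
read_to_condition = {
    "r1": "no_fungi", "r2": "no_fungi",
    "r3": "fungus_a", "r4": "fungus_a",
    "r5": "fungus_b", "r6": "fungus_b",
}
read_to_contig = {
    "r1": "k1", "r2": "k2", "r3": "k1",
    "r4": "k1", "r5": "k2", "r6": "k2",
}

conditions = ["no_fungi", "fungus_a", "fungus_b"]
counts = count_reads(read_to_contig, read_to_condition)
fractions = fraction_of_condition(counts, conditions)
```

With one library per condition and no replicates, such fractions are descriptive only; they do not replace a proper differential expression test.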

written 9.0 years ago by seidel (7.1k)

Your question is too broad; try to narrow it down and simplify it. No one here can easily advise you, in general terms, on what might be wrong with your data.

written 9.0 years ago by Istvan Albert ♦♦ (84k)

No, I don't have a reference for my organism. My 3 conditions are: no fungi, resistance to fungus A, and resistance to fungus B. I think the dataset with no fungi will be my baseline. The phrase "sequences that I had assumed to be the same" means that, when I compare the datasets to see what they have in common, I classify a sequence in dataset A as the same as a sequence in dataset B if their best hit is the same.

written 8.9 years ago by Diogo Santos (0)