Question

NGS Multi-Dataset Analysis

12.7 years ago

Diogo Santos • 0

Hello, I have some doubts about my analysis of a 454 EST assembly built from 3 different datasets. Each dataset comes from the same organism, but under different conditions (resistance to fungi). I assembled them separately (I also tried assembling everything together, but my question doesn't apply to that approach) and ran BLAST on all of them. After that, I ran blast2go on the BLAST results. How can I find out what is important for the conditions being tested?

I have tried this approach:
- I compared the best BLAST hits from each dataset with each other and assumed that if two contigs have the same best BLAST hit, they are the same thing.
- That way, I defined a group of sequences that exists in all conditions and others that exist only in some conditions.
- I ran the blast2go enrichment analysis on these groups of sequences, but because blast2go doesn't use only the best BLAST hit, the sequences that I had assumed to be the same don't have the same GO terms.

Any suggestions on how to continue?
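The best-hit comparison described above could be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: it assumes BLAST was run with tabular output (`-outfmt 6`, 12 tab-separated columns with the bit score last), and the function names and condition labels (`control`, `fungus_a`) are hypothetical.

```python
import io
from collections import defaultdict


def best_hits(blast_tabular):
    """Return {query_id: subject_id}, keeping the highest-bit-score hit per query.

    Assumes BLAST tabular output (-outfmt 6): query id in column 1,
    subject id in column 2, bit score in column 12.
    """
    best = {}
    for line in blast_tabular:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def group_by_hit(hits_per_condition):
    """Map each best-hit subject to the set of conditions where it was seen."""
    presence = defaultdict(set)
    for condition, hits in hits_per_condition.items():
        for subject in hits.values():
            presence[subject].add(condition)
    return dict(presence)


def shared_and_specific(presence, conditions):
    """Split subjects into those seen in every condition and those seen in only one."""
    all_conditions = set(conditions)
    shared = {s for s, seen in presence.items() if seen == all_conditions}
    specific = {
        cond: {s for s, seen in presence.items() if seen == {cond}}
        for cond in conditions
    }
    return shared, specific


# Tiny made-up example: two contigs per condition, hypothetical subject ids.
cols = "\t99.0\t100\t0\t0\t1\t100\t1\t100\t1e-50\t"
control = io.StringIO("c1\tP1" + cols + "200\n" + "c2\tP3" + cols + "80\n")
fungus_a = io.StringIO("a1\tP1" + cols + "150\n" + "a2\tP4" + cols + "90\n")

presence = group_by_hit(
    {"control": best_hits(control), "fungus_a": best_hits(fungus_a)}
)
shared, specific = shared_and_specific(presence, ["control", "fungus_a"])
print("shared:", shared)
print("specific:", specific)
```

Note that this only groups contigs by a single best hit, which is exactly the simplification that clashes with blast2go's use of multiple hits per sequence.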

est blast assembly analysis • 2.3k views

ADD COMMENT • link 12.7 years ago by Diogo Santos • 0

Do you have a reference sequence or reference transcripts for your organism?

ADD REPLY • link 12.7 years ago by Sean Davis 26k

It's not clear to me what the point of your experiment is. You have three conditions, and you want to assemble a transcriptome from piles of reads measured under each condition, for the purpose of quantifying which genes are expressed in response to each condition? If you have no reference (as Sean asked), then I think you have a distillation problem. You'll likely have to assemble all the reads together to create a kind of reference, then pull sequences from that for your blast2go. The key is your statement "sequences that I had assume to be the same": you have to resolve this ambiguity.
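One way to picture the combined-assembly idea is a per-contig table of read counts by condition. The sketch below is only an illustration under stated assumptions: it presumes you can already say which contig each read was assembled into and which condition each read came from, and both input dictionaries, along with the function names, are hypothetical.

```python
from collections import Counter, defaultdict


def count_reads_per_contig(read_to_contig, read_to_condition):
    """Count, for every contig, how many reads each condition contributed.

    read_to_contig:    {read_id: contig_id} from the combined assembly (assumed input)
    read_to_condition: {read_id: condition_label} from the original datasets (assumed input)
    """
    counts = defaultdict(Counter)
    for read_id, contig in read_to_contig.items():
        condition = read_to_condition.get(read_id)
        if condition is None:
            continue  # read of unknown origin; skip it
        counts[contig][condition] += 1
    return {contig: dict(per_cond) for contig, per_cond in counts.items()}


def read_fractions(counts):
    """Divide each count by the total assembled reads of its condition.

    This is a crude normalization so that conditions with different
    sequencing depth can be compared side by side.
    """
    totals = Counter()
    for per_cond in counts.values():
        totals.update(per_cond)
    return {
        contig: {cond: n / totals[cond] for cond, n in per_cond.items()}
        for contig, per_cond in counts.items()
    }


# Made-up example with two contigs and two conditions.
read_to_contig = {"r1": "contig1", "r2": "contig1", "r3": "contig2", "r4": "contig1"}
read_to_condition = {"r1": "control", "r2": "fungus_a", "r3": "fungus_a", "r4": "fungus_a"}

counts = count_reads_per_contig(read_to_contig, read_to_condition)
print(counts)
print(read_fractions(counts))
```

Because every read is placed on a single shared set of contigs, there is no need to guess which contigs from separate assemblies correspond to each other.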

ADD REPLY • link 12.7 years ago by seidel 11k

Your question is too broad; try to narrow it down and simplify it. No one here can easily advise you in general terms on what might be wrong with your data.

ADD REPLY • link 12.7 years ago by Istvan Albert 100k

No, I don't have a reference for my organism. My 3 conditions are no fungi, resistance to fungus A, and resistance to fungus B. I think the dataset with no fungi will be my baseline. The phrase "sequences i had assume to be the same" means that, when I compare the datasets to see what they have in common, I classify a sequence in dataset A as the same as a sequence in dataset B if their best hit is the same.

ADD REPLY • link 12.6 years ago by Diogo Santos • 0