Double peaks on FASTQC
7.1 years ago
User 4014 ▴ 40

Hi folks,

I just got HiSeq results for fungal genomes back from our sequencing facility. After adapter and quality trimming, I ran FastQC and found double peaks, as shown in the pictures. Previous threads attributed this to contamination from other species, but in my case I sequenced a pure culture. Should I just ignore the peaks and continue with assembly, or is there an extra step I should try to make sure nothing is wrong with the data?

Thanks in advance and have a great weekend!

Best, Vin

forward and reverse reads
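For readers who want to look at the distribution outside of FastQC, here is a minimal Python sketch that bins reads by GC percentage and reports local maxima. It assumes the double peaks are in the per-sequence GC content plot and that the FASTQ file uses the usual four lines per record; the function names and the bin width are only illustrative.

```python
from collections import Counter


def gc_percent(seq):
    """Return the GC percentage of one read, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(fastq_handle, bin_width=5):
    """Count reads per GC bin from a four-line-per-record FASTQ stream."""
    hist = Counter()
    for i, line in enumerate(fastq_handle):
        if i % 4 != 1:  # the sequence is the second line of each record
            continue
        gc = gc_percent(line.strip())
        if gc is not None:
            # Clamp 100% GC into the top bin so the bins cover 0 to 100.
            start = min(int(gc // bin_width) * bin_width, 100 - bin_width)
            hist[start] += 1
    return hist


def local_peaks(hist, bin_width=5):
    """Return bin starts whose count is higher than both neighbouring bins."""
    peaks = []
    for start in sorted(hist):
        left = hist.get(start - bin_width, 0)
        right = hist.get(start + bin_width, 0)
        if hist[start] > left and hist[start] > right:
            peaks.append(start)
    return peaks
```

Calling `gc_histogram(open("reads.fastq"))` (a hypothetical file name) and then `local_peaks` on the result gives the approximate GC value of each peak, which is useful later for deciding which contigs belong to which peak. With narrow bins and few reads, noise will produce spurious peaks, so a wider bin may be needed.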

next-gen Assembly genome • 2.2k views

Thanks so much, Brian. :) I was planning to do a de novo assembly first and then compare it with the reference genome, since I feel the reference needs a lot of improvement. In this case, it's very likely that my sample is contaminated. Do you think it would be better to map the reads to the reference genome, set aside the unmapped ones, and just work with what's left?
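As a rough illustration of the "set aside the unmapped reads" idea, the sketch below splits read names from a SAM file by the unmapped flag bit (0x4, as defined in the SAM specification). The function name is made up for this example, and it assumes the alignments have already been produced by some mapper and written as plain-text SAM.

```python
UNMAPPED = 0x4  # SAM flag bit: this read is unmapped


def split_by_mapping(sam_lines):
    """Return (mapped_names, unmapped_names) from SAM text lines.

    A read pair counts as unmapped only if neither mate aligned.
    """
    mapped, unmapped = set(), set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and headers
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            unmapped.add(name)
        else:
            mapped.add(name)
    # Pairs with one aligned mate stay with the mapped set.
    return mapped, unmapped - mapped
```

One caveat: if the reference is incomplete, genuine sequence from the organism that is missing from the reference will land in the unmapped set together with any contaminant, so discarding unmapped reads can throw away exactly the regions a new assembly was meant to recover.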

I was also wondering whether the culture is not as pure as I thought, and the contamination comes from a different isolate of the same species. But if that were the case, the GC content should be very similar or even identical, right?

Best, Vin


Contamination from a different strain would not be obvious on a GC graph; the two distributions would overlap completely, so it is virtually impossible to detect that way. Instead, you can assemble, map the reads to the assembly, call variants, and view them in IGV to see whether you have two strains in a library.
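Before opening IGV, the variant calls can be summarized numerically. The sketch below is only an illustration under two assumptions: the VCF has a single sample whose FORMAT column includes an `AD` field (ref,alt read depths, which not every caller writes by default), and the genome is haploid, so a clean single-strain library should show alt-allele fractions near 1 at real variant sites. The function names and cutoffs are hypothetical.

```python
def alt_fractions(vcf_lines):
    """Yield (chrom, pos, alt_fraction) for biallelic sites of the first sample."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10 or "," in cols[4]:  # need a sample column; skip multiallelic
            continue
        keys = cols[8].split(":")
        values = cols[9].split(":")
        if "AD" not in keys or keys.index("AD") >= len(values):
            continue
        depths = values[keys.index("AD")].split(",")
        if len(depths) != 2 or "." in depths:
            continue
        ref_depth, alt_depth = int(depths[0]), int(depths[1])
        total = ref_depth + alt_depth
        if total > 0:
            yield cols[0], int(cols[1]), alt_depth / total


def intermediate_sites(vcf_lines, low=0.2, high=0.8):
    """Return sites whose alt-allele fraction lies strictly between low and high."""
    return [(c, p, f) for c, p, f in alt_fractions(vcf_lines) if low < f < high]
```

Many sites with intermediate fractions spread across the genome would be consistent with a mixture of two strains, although collapsed repeats and paralogs produce the same signal locally, and in a diploid or dikaryotic fungus heterozygous sites sit near 0.5 anyway.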


Please use ADD COMMENT to reply to an earlier answer, so that the thread remains logically structured and easy to follow. I have now moved your reaction, but as you can see, the result is not optimal.

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

7.1 years ago

I suggest you simply go ahead with assembly, then BLAST the contigs. You probably do have contamination by something, but it's much easier to detect and remove contamination from contig-length sequences than from reads. The second peak could also come from an organelle.
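To decide which contigs to BLAST first, it can help to tabulate GC content per contig and split the assembly at the valley between the two peaks. The following is a minimal sketch, not a complete decontamination workflow; the function names, the `min_length` default, and the idea of a hand-picked threshold are assumptions for illustration.

```python
def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_gc_table(handle, min_length=1000):
    """Return [(name, length, gc_percent)] for contigs of at least min_length bp."""
    rows = []
    for name, seq in read_fasta(handle):
        seq = seq.upper()
        called = sum(seq.count(base) for base in "ACGT")
        if len(seq) < min_length or called == 0:
            continue
        gc = 100.0 * (seq.count("G") + seq.count("C")) / called
        rows.append((name, len(seq), round(gc, 1)))
    return rows


def split_by_gc(rows, threshold):
    """Split rows into (below, at_or_above) a GC threshold chosen by eye."""
    below = [row for row in rows if row[2] < threshold]
    above = [row for row in rows if row[2] >= threshold]
    return below, above
```

BLASTing a handful of the longest contigs from each group is usually enough to tell what each GC peak corresponds to. An organelle genome may show up as a small number of contigs with a distinct GC content and much higher read coverage than the rest of the assembly.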
