Hi guys, I'm very new to bioinformatics and am trying to collect some public data sets to analyze for my undergraduate thesis. My supervisor only wants V3 data sequenced on an Illumina MiSeq and told me that, if I find a lot of V3-V4 data, I can extract the V3 sequence from the V3-V4 sequences. This makes sense to me in theory, because I can use a trimming tool to cut at the forward and reverse V3 primer sites and discard the V4 part.
I use SMS Primer Map (https://www.bioinformatics.org/sms2/primer_map.html) to check whether primers bind to the sequences. In most of the V3-V4 data sets I found, only the 537r (V3 reverse) and 515f (V4 forward) primers bind, both at around 130 bp on R1, and no primers are found on R2. Does this mean that only the first 130 bp of read 1 come from the V3 region and the conserved region beside it, and that the rest of read 1 and all of read 2 come from the V4 region?
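In case it helps to see the check outside the web tool, here is a minimal Python sketch of the same idea: it looks for a primer, including IUPAC degenerate bases, in a read in both orientations and reports where it sits. It only finds exact matches (no mismatches allowed), the names `find_primer` and `toy_read` are made up for illustration, the read is a toy sequence rather than real data, and the 515f sequence is the commonly cited one, which may differ from what a given data set actually used.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate bases
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence, keeping IUPAC codes valid."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer(read, primer):
    """Return sorted (start, end, strand) tuples for exact IUPAC matches.

    strand is "+" if the primer matches as given and "-" if its
    reverse complement matches. Positions are 0-based, end-exclusive.
    """
    hits = []
    read = read.upper()
    for strand, p in (("+", primer.upper()), ("-", reverse_complement(primer))):
        pattern = "".join(IUPAC[base] for base in p)
        # Lookahead so overlapping matches are not skipped
        for m in re.finditer(f"(?=({pattern}))", read):
            hits.append((m.start(), m.start() + len(p), strand))
    return sorted(hits)


# Assumed 515f sequence (commonly cited version); swap in your own primers
primer_515f = "GTGCCAGCMGCCGCGGTAA"

# Toy read: 10 bases, then one instance of the primer, then 10 more bases
toy_read = "ACGTACGTAC" + "GTGCCAGCAGCCGCGGTAA" + "TACGGAGGGT"

hits = find_primer(toy_read, primer_515f)
print(hits)

# Keep only what lies upstream of the first primer hit
upstream = toy_read[:hits[0][0]] if hits else toy_read
print(upstream)
```

For real data a dedicated trimming tool such as cutadapt, which tolerates mismatches, is probably the better choice. This is only meant to show what "primer found at position N" means.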
Thanks in advance,
Aelita
I don't understand the motivation, but there is some risk in comparing V3 with truncated V3-V4 data, because sequencing errors increase toward the end of the reads. I think V4 is usually used; I have never seen a V3-only region.
Hi,
I was wondering whether you found a solution for this. If so, could you please share it? I am in the same situation and want to remove V3 from V3/V4.
Thanks in advance,
Sara