Question

Mapping to a template sequence


3 months ago

QX ▴ 80

Hi all,

I have a template sequence that I want to map all my FASTQ reads against. The template sequence is around 300 bp long, with two regions of approximately 20 bp each that contain variable bases; the rest is fixed.

I have used BWA-MEM before, but it is more suitable for mapping to a reference genome with global alignment. Can anyone suggest a suitable method for this?

sequencing • 775 views



with two regions of approximately 20 bp each that contain variable bases

Where are these located in the reference?

bwa-mem2 aligns reads to a small reference without problems. It may depend on what exactly you are looking for. Just go ahead and try it.

Otherwise, doing an MSA will work too. The problematic part may be "all my reads", in case you have millions. You could use a clustering program like clumpify.sh from the BBMap suite to reduce the number significantly.
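As a rough illustration of why reducing the read set helps before an MSA, here is a minimal Python sketch that collapses exact-duplicate sequences and keeps their counts. It is not how clumpify.sh works internally; it is a plain stand-in for the idea, and the function name `collapse_reads` and the toy reads are made up for the example.

```python
from collections import Counter

def collapse_reads(reads):
    """Collapse exact-duplicate sequences (case-insensitive).

    Returns (sequence, count) pairs, most frequent first.
    Hypothetical helper for illustration only.
    """
    counts = Counter(seq.upper() for seq in reads)
    return counts.most_common()

# Toy input: three copies of one sequence (one in lower case) and one singleton.
reads = ["ACGT", "ACGT", "acgt", "TTGA"]
unique = collapse_reads(reads)

for seq, n in unique:
    print(seq, n)
```

With amplicon data from a single template, many reads are likely to be identical, so aligning only the unique sequences and carrying the counts along can shrink the problem considerably.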

3 months ago by GenoMax 154k


The template looks like this: -----[fixed bases ~100bp]-------[20bp varying/diverse]------[fixed, remainder]---[20bp varying/diverse]-------[fixed bases ~100bp]-----

Should I write each 20 bp variable region as Ns (NNNNNx5) in the template?
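One way to picture an N-masked template is the small Python sketch below, which joins the fixed segments with runs of N and writes the result as a FASTA record. The segment sequences are short placeholders rather than the real design, and the helper names are invented for the example. How an aligner treats N in a reference differs between tools, so this only shows how such a template could be built, not whether masking is the right choice.

```python
def build_masked_template(left_fixed, middle_fixed, right_fixed, var_len=20):
    """Join fixed segments with runs of N standing in for the variable regions."""
    mask = "N" * var_len
    return left_fixed + mask + middle_fixed + mask + right_fixed

def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"

# Placeholder fixed segments (assumed, much shorter than the real ~100 bp flanks).
template = build_masked_template("ACGTACGTAC", "GGCCGGCC", "TTGACTTGAC")
record = to_fasta("template", template)
print(record)
```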

3 months ago by QX ▴ 80


It is not clear what you are trying to do here. Are you looking to quantify the "diverse" tags in your data, or is it something else?

3 months ago by GenoMax 154k


So I have a template with that design, where some bases are fixed and others are diverse on purpose. However, the reads generated from this template are not always 'good': some align perfectly with the template in the fixed regions, but others are shifted by 1-2 bases or have technical mismatches in the fixed regions. I want to use mapping to re-align these reads, or to detect the insertions/deletions/technical errors in them.
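To make the goal concrete, here is a minimal Python sketch that compares one fixed segment of the template with the corresponding part of a read and counts matches, mismatches, insertions and deletions. It assumes the read segment has already been cut out, and it uses a plain global alignment with unit costs, which is a simplification compared with the affine gap scoring real aligners use. The function name `align_ops` and the toy sequences are hypothetical.

```python
def align_ops(ref, read):
    """Global alignment with unit costs; count each edit type via traceback."""
    n, m = len(ref), len(read)
    # dp[i][j] = edit distance between ref[:i] and read[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != read[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)

    ops = {"match": 0, "mismatch": 0, "insertion": 0, "deletion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        diag_ok = (
            i > 0 and j > 0
            and dp[i][j] == dp[i - 1][j - 1] + (ref[i - 1] != read[j - 1])
        )
        if diag_ok:
            ops["match" if ref[i - 1] == read[j - 1] else "mismatch"] += 1
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops["deletion"] += 1   # template base missing from the read
            i -= 1
        else:
            ops["insertion"] += 1  # extra base in the read
            j -= 1
    return ops

# Toy fixed segment and three toy reads (made up for illustration).
fixed = "ACGTACGT"
perfect = align_ops(fixed, "ACGTACGT")
mismatch = align_ops(fixed, "ACGAACGT")   # one substituted base
deletion = align_ops(fixed, "ACGACGT")    # one base missing, causing a shift
print(perfect, mismatch, deletion)
```

A read whose fixed flank shows a deletion or insertion here is the kind of "shifted" read described above, while a mismatch-only result points to a substitution.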

3 months ago by QX ▴ 80


Without more details about your problem, I'd suggest looking at other amplicon-analysis workflows such as dada2 or qiime2. These workflows originate from 16S/ITS amplicon analysis.

3 months ago by michael.ante ★ 4.0k