Question: Aligning multiple short reads with multiple long reference reads
MAPK (1.0k), United States, wrote 11 days ago:

Hi, I was wondering if there is a way to align short reads to multiple long reads and see long stretches of aligned regions from the same genome. I have millions of short reads and I want to align them to thousands of long sequences from the same genome and see the aligned regions. Thanks

alignment • 114 views

That is basically what every NGS aligner does. So is there a question here?

written 11 days ago by genomax (33k)

Hehe, I just got confused. So basically I can use BWA?

written 11 days ago by MAPK (1.0k)

Yes, you can use BWA, but if the error rate of the long reads is high (are they long reads or contigs / scaffolds?), BWA will be very slow and many reads may remain unaligned.

Or maybe you want to align short reads AND long reads to the same reference genome?

Also, there are tools that use Illumina reads to error-correct long reads.

written 11 days ago by h.mon (8.7k)

If you're talking about reads around the 30-50 bp mark, I'd use bowtie and keep uniquely mapped reads only (--best -m 1; these are options of the original Bowtie, Bowtie 2 does not have them). If your reads are >70 bp in length, use BWA mem.

written 11 days ago by Kevin Blighe (1.3k)

I did use bowtie, but the problem is that it reports lots of sequences that are not exact matches (it allows too many mismatches). My sequences are sRNA reads and I want to align them to retrotransposon (LTR) regions. I have about a thousand LTR sequences, each a few hundred bases long. My short reads should match the LTR regions pretty well if there are any reads from those regions, but I am expecting very few matches from my experimental data. In any case, bowtie pulls out too many reads, even from regions that do not align perfectly.

modified 11 days ago • written 11 days ago by MAPK (1.0k)
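
One way to get rid of the imperfect hits described above is to filter the aligner's SAM output afterwards. The following is only a minimal sketch, assuming the aligner writes the standard `NM:i` (edit distance) tag and that an "exact match" means an unclipped, gap-free alignment with edit distance 0. The function name `exact_match_records` and the example records are made up for illustration. Bowtie's `-v 0` option (report alignments with zero mismatches) is another route worth checking.

```python
import re

# CIGAR made only of match operations (M or =): no clipping, no indels.
FULL_MATCH_CIGAR = re.compile(r"(?:\d+[M=])+")

def exact_match_records(sam_lines):
    """Yield SAM alignment lines that are mapped end to end with edit distance 0.

    Assumes the aligner emits the optional NM:i tag (hypothetical helper,
    not part of any aligner).
    """
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        if not FULL_MATCH_CIGAR.fullmatch(fields[5]):
            continue  # soft clips, insertions or deletions present
        if "NM:i:0" in fields[11:]:
            yield line

# Made-up example records: only r1 is a perfect, full-length hit.
example_sam = [
    "@SQ\tSN:LTR1\tLN:300",
    "r1\t0\tLTR1\t10\t255\t21M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
    "r2\t0\tLTR1\t50\t255\t21M\t*\t0\t0\tACGT\tIIII\tNM:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tLTR1\t80\t60\t18M3S\t*\t0\t0\tACGT\tIIII\tNM:i:0",
]
kept_names = [rec.split("\t")[0] for rec in exact_match_records(example_sam)]
```

In practice the lines would come from an open SAM file rather than a list, and the surviving records could be written straight back out.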

If you have the reference genome, I would suggest aligning the reads to the genome first and then re-mapping to the LTRs the reads that didn't map to the genome. That should reduce the noise you're dealing with.

written 11 days ago by Asaf (4.4k)
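
The two-pass idea suggested above needs the reads that failed to map in the first pass. A rough sketch, assuming the first alignment was written as SAM: reads with the unmapped flag (0x4) are pulled out and re-emitted as FASTQ for the second alignment against the LTR sequences. The function name `unmapped_as_fastq` and the sample records are hypothetical, and the placeholder quality string used when SAM stores `*` is an assumption of this sketch. Some aligners can write unaligned reads directly (Bowtie has `--un`), which may make this step unnecessary.

```python
def unmapped_as_fastq(sam_lines):
    """Yield FASTQ records for reads flagged as unmapped (0x4) in SAM input."""
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if not flag & 0x4:
            continue  # read mapped in the first pass, so drop it
        if qual == "*":
            # Assumption: no qualities stored, so use a placeholder string.
            qual = "I" * len(seq)
        yield f"@{name}\n{seq}\n+\n{qual}\n"

# Made-up first-pass output: r1 mapped, r3 and r5 did not.
first_pass = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r5\t4\t*\t0\t0\t*\t*\t0\t0\tACG\t*",
]
second_pass_input = list(unmapped_as_fastq(first_pass))
```

The resulting records can be written to a file and given to the aligner together with an index built from the LTR sequences.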