Question: How to align non-overlapping FASTQ files
3.2 years ago by
Rongen10 wrote:

Hi, we sequenced a 250 bp read-length library at 2 x 100 cycles on a MiSeq by mistake, so the R1 and R2 FASTQ files each contain 100 bp reads and the mates do not overlap. How should we align these FASTQ files to a reference genome? Can we align R1 as a single-end file to generate a BAM for R1, then reverse-complement R2 and do the same, and later merge the two BAM files for downstream processing? Any suggestions? Thanks in advance.

next-gen alignment • 1.6k views
3.2 years ago by
WouterDeCoster44k wrote:

There is absolutely no need to have overlapping reads for alignment. Most (if not all) aligners take read pairs (two files) as input.


To expand on that, you generally need to tell the aligner the locations of the read files and reference (often you have to index the reference in one step, and then align the two read files together in a second step), and it will take care of everything else. Do not align them independently, and do not reverse-complement read 2, and definitely do not concatenate the files or fuse the reads together (without a read-merging program designed for that purpose); any of those will confuse the aligner and yield inferior output.

written 3.2 years ago by Brian Bushnell17k
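
For illustration, the two-step workflow described above (index the reference once, then align both read files together in one paired-end run) could be sketched in Python as below. This is only a sketch under assumptions: the thread does not name an aligner, so BWA-MEM is assumed here, and the file names `reference.fa`, `sample_R1.fastq`, and `sample_R2.fastq` as well as the helper functions are hypothetical.

```python
import shlex
import subprocess


def build_index_cmd(reference):
    """Return the command that indexes the reference FASTA (assumes BWA)."""
    return ["bwa", "index", str(reference)]


def build_align_cmd(reference, r1, r2, threads=4):
    """Return the command that aligns R1 and R2 together as read pairs.

    Both files are passed unmodified to one call: R2 is not
    reverse-complemented and the files are not concatenated.
    """
    return ["bwa", "mem", "-t", str(threads), str(reference), str(r1), str(r2)]


def align_pairs(reference, r1, r2, out_sam, threads=4):
    """Index the reference, then run a single paired-end alignment to SAM."""
    subprocess.run(build_index_cmd(reference), check=True)
    with open(out_sam, "w") as sam:
        subprocess.run(
            build_align_cmd(reference, r1, r2, threads),
            stdout=sam,
            check=True,
        )


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # All file names below are hypothetical placeholders.
    print(shlex.join(build_index_cmd("reference.fa")))
    print(shlex.join(
        build_align_cmd("reference.fa", "sample_R1.fastq", "sample_R2.fastq")
    ))
```

Calling `align_pairs(...)` would actually run the aligner and requires `bwa` on the PATH; the dry run only prints the commands so they can be checked first. Other aligners follow the same pattern with their own command names and options.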