Question: bowtie2 paired end insertion length
5 months ago, jesselee516 (United States) wrote:

Hi,

I am using bowtie2 to align reads to a reference genome. Instead of using real data, I simulated paired-end reads (100bp) with an insertion length of 400bp between mate1 and mate2. Does anyone have any idea how to set the bowtie2 parameters in this case? Because I simulated the reads, I know the read length is 100bp with a 400bp insertion length between mates. See my alignment summary below: although it shows a 100% overall alignment rate, what is wrong with '1670714 (97.50%) aligned concordantly 0 times'?

Bowtie2 command:

bowtie2 -p 6 -x ./GCF_000157355.2_ASM15735v2_genomic.fna -1 input.read1.fastq -2 input.read2.fastq -S res.sam


1713638 reads; of these:
  1713638 (100.00%) were paired; of these:
    1670714 (97.50%) aligned concordantly 0 times
    39863 (2.33%) aligned concordantly exactly 1 time
    3061 (0.18%) aligned concordantly >1 times
    ----
    1670714 pairs aligned concordantly 0 times; of these:
      1638899 (98.10%) aligned discordantly 1 time
    ----
    31815 pairs aligned 0 times concordantly or discordantly; of these:
      63630 mates make up the pairs; of these:
        0 (0.00%) aligned 0 times
        14763 (23.20%) aligned exactly 1 time
        48867 (76.80%) aligned >1 times
100.00% overall alignment rate
modified 5 months ago by d-cameron • written 5 months ago by jesselee516
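
For anyone reading the summary above, it can help to add up the concordant lines and compare them with the discordant line. The sketch below only re-does that arithmetic on the numbers quoted in the question; the helper name `count` is made up for illustration and is not part of bowtie2.

```python
# The relevant lines of the bowtie2 summary, copied from the question.
SUMMARY = """\
1713638 reads; of these:
  1713638 (100.00%) were paired; of these:
    1670714 (97.50%) aligned concordantly 0 times
    39863 (2.33%) aligned concordantly exactly 1 time
    3061 (0.18%) aligned concordantly >1 times
    ----
    1670714 pairs aligned concordantly 0 times; of these:
      1638899 (98.10%) aligned discordantly 1 time
"""


def count(label, text):
    """Return the leading integer of the first line that contains `label`."""
    for line in text.splitlines():
        if label in line:
            return int(line.split()[0])
    raise ValueError(f"no line containing {label!r}")


pairs = count("were paired", SUMMARY)
concordant = (count("aligned concordantly exactly 1 time", SUMMARY)
              + count("aligned concordantly >1 times", SUMMARY))
discordant = count("aligned discordantly 1 time", SUMMARY)

concordant_pct = 100 * concordant / pairs
discordant_pct = 100 * discordant / pairs
print(f"concordant pairs: {concordant} ({concordant_pct:.2f}%)")
print(f"discordant pairs: {discordant} ({discordant_pct:.2f}%)")
```

Only about 2.5% of pairs are concordant, while roughly 95.6% of all pairs have both mates aligned uniquely but are flagged discordant. That pattern is consistent with pairs failing a pairing constraint rather than failing to align.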
5 months ago, d-cameron (Australia) wrote:

"See one of my alignment information as below, although it has 100% overall alignment, what is wrong with '1670714 (97.50%) aligned concordantly 0 times'?"

Nothing is wrong per se. It's just a combination of bowtie2 placing at least one of the reads incorrectly and/or the fragment size being unusually small or large (assuming you simulated fragments with a median of 400bp rather than all exactly 400bp) in 2.5% of your data set.

written 5 months ago by d-cameron

Thanks for the reply. Do you have any suggestions for me in this case? Will this cause a problem? I used DWGSIM to simulate reads with an insertion length of 400bp, and the standard deviation of the distance between pairs is 50bp, which is the default value. Because the reads are simulated, I expected them to match perfectly or nearly perfectly, giving a high proportion of 'aligned concordantly exactly 1 time'.

written 5 months ago by jesselee516

Unless you've given a custom -X parameter to bowtie2, fragments longer than 500bp are going to be considered discordant.

written 5 months ago by d-cameron
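
To make the -X point concrete, here is a minimal sketch of the length test alone. It assumes bowtie2's documented defaults (-I 0, -X 500) and that the limits apply to the whole fragment, both mates included. The function name `in_length_window` and the 600bp example fragment are hypothetical, and real concordance also depends on mate orientation and overlap options, which this ignores.

```python
# Simplified concordance check: length window only.
# Real bowtie2 concordance also depends on mate orientation (--fr/--rf/--ff)
# and on containment/overlap options, which are ignored here.

BOWTIE2_DEFAULT_MIN = 0    # bowtie2 default for -I/--minins
BOWTIE2_DEFAULT_MAX = 500  # bowtie2 default for -X/--maxins


def in_length_window(fragment_len,
                     min_ins=BOWTIE2_DEFAULT_MIN,
                     max_ins=BOWTIE2_DEFAULT_MAX):
    """True if the whole fragment (both mates plus the gap) fits in [min_ins, max_ins]."""
    return min_ins <= fragment_len <= max_ins


# Hypothetical fragment: two 100 bp mates separated by a 400 bp gap.
fragment = 100 + 400 + 100

print(in_length_window(fragment))                # default window
print(in_length_window(fragment, max_ins=1000))  # as with -X 1000
print(in_length_window(fragment, 500, 700))      # as with -I 500 -X 700
```

Under these assumptions a 600bp fragment falls outside the default window but inside either of the wider ones.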

Thanks a lot. I added two parameters (-I 500 -X 700), because I simulated 100bp paired-end reads with a 400bp insertion length. My fragments should be 600bp (400bp + 2 × 100bp), is that correct? The standard deviation is 50bp, so 500bp to 700bp already covers the mean ± 2 standard deviations. Now I get '1618473 (94.45%) aligned concordantly exactly 1 time'. It seems that the problem has been solved?

written 5 months ago by jesselee516
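
The arithmetic in this comment can be checked with a short sketch. It takes the poster's reading at face value (400bp is the gap between mates, so the fragment mean is 600bp) and additionally assumes fragment lengths are normally distributed. Both are assumptions, not something the thread confirms.

```python
import math

read_len = 100   # bp, from the question
inner_gap = 400  # bp, the poster's "insertion length" read as the gap between mates
sd = 50          # bp, standard deviation quoted by the poster

fragment_mean = inner_gap + 2 * read_len  # whole fragment, both mates included
low = fragment_mean - 2 * sd              # lower bound, the -I value used
high = fragment_mean + 2 * sd             # upper bound, the -X value used


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


inside = normal_cdf(high, fragment_mean, sd) - normal_cdf(low, fragment_mean, sd)
print(f"window: {low}-{high} bp, expected fraction inside: {inside:.4f}")
```

A ± 2 standard deviation window covers about 95% of a normal distribution, so roughly one pair in twenty would still be expected to fall outside -I 500 -X 700 for length reasons alone.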

I just use -X 1000 so it works on practically all input libraries. Aren't you getting fewer concordantly aligned reads now? Check the documentation for your simulator (and run Picard CollectInsertSizeMetrics) to see what fragment lengths you've actually simulated.

The community can't agree on whether to include read lengths in the length fields (the SAM spec even redefined its TLEN field, but tools didn't update to use the 'correct' one), so you need to check the definition used by every single tool you use, because they might differ.

written 5 months ago by d-cameron
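
As a quick cross-check alongside Picard, the template lengths can also be read straight from the SAM file. The sketch below is stdlib-only and reports whatever definition of TLEN the aligner wrote, which is exactly the caveat raised above. The helper names and the sample records are made up for illustration and are not the poster's data.

```python
import statistics


def template_lengths(sam_lines):
    """Collect TLEN (SAM column 9) once per pair, using the positive-TLEN mate."""
    lengths = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blanks and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])
        if tlen > 0:  # the leftmost mate carries the positive value
            lengths.append(tlen)
    return lengths


def record(name, flag, pos, mate_pos, tlen):
    """Build a minimal, made-up SAM record with the 11 mandatory fields."""
    return "\t".join([name, str(flag), "chr1", str(pos), "42", "100M",
                      "=", str(mate_pos), str(tlen), "*", "*"])


# Hypothetical records, for illustration only.
sample = [
    "@HD\tVN:1.6\tSO:unsorted",
    record("r1", 99, 100, 600, 600), record("r1", 147, 600, 100, -600),
    record("r2", 99, 1000, 1450, 550), record("r2", 147, 1450, 1000, -550),
    record("r3", 99, 2000, 2550, 650), record("r3", 147, 2550, 2000, -650),
]

lengths = template_lengths(sample)
print("pairs:", len(lengths))
print("median:", statistics.median(lengths))
print("mean:", statistics.mean(lengths))
print("stdev:", statistics.stdev(lengths))
```

For a real run, the same function can be fed an open file handle for the SAM output (res.sam in the original command) instead of the sample list. Pairs whose TLEN is 0, for example when a mate is unmapped, are skipped.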

Thank you so much. The result is now reasonable. I changed the parameters to -I 200 and -X 1000, and the result is now 100% concordant (98.61% exactly 1 time and 1.39% more than 1 time). I will accept your solution for this post. Thanks again!

written 5 months ago by jesselee516