Question: Unable to index a BAM file after sorting
9 days ago by
sunnykevin9710
sunnykevin9710 wrote:

Hi, after sorting my BAM file I cannot index it; samtools reports an error. How can I fix this?

samtools index -b -@ 4 A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam

[E::hts_idx_push] unsorted positions on sequence #1: 1 followed by 0
samtools index: failed to create index for "A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam"

thanks!

next-gen alignment • 145 views
ADD COMMENTlink modified 9 days ago by Bastien Hervé2.8k • written 9 days ago by sunnykevin9710

Hello sunnykevin97,

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

ADD REPLYlink written 9 days ago by Bastien Hervé2.8k

Can you show the sort command?

ADD REPLYlink written 9 days ago by ATpoint12k

samtools sort -@ 4 A-DS1546_1__R1_dedup_2_ReadGroups.bam -o A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam

ADD REPLYlink modified 9 days ago by ATpoint12k • written 9 days ago by sunnykevin9710

Does your A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam actually contain data (i.e. it is not empty)?

ADD REPLYlink written 9 days ago by Bastien Hervé2.8k

It generated a sorted BAM file of ~70 GB.

ADD REPLYlink modified 9 days ago by Bastien Hervé2.8k • written 9 days ago by sunnykevin9710

I shortened your comment because you replied with exactly the same thing to ATpoint, and he took the time to reformat your comment.

This is a SAM file, not a BAM file.

ADD REPLYlink modified 9 days ago • written 9 days ago by Bastien Hervé2.8k

Come on, show some effort. Did you try anything, did you look at the data? Please show some lines of the file.

ADD REPLYlink written 9 days ago by ATpoint12k
samtools view A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam | head

C0V63ACXX120821:6:2101:1050:8813    73  chr1    0   0   101M    =   0   0   NTTCCCGTGGGGGTGTGGCTAGGCGAGACGTCATGAGCTACACTTGGAGTGTGTGCTTGATGCCAGCTCTCTTTGATCAGCGATGATTTAGAAGGCGTATT   #1=BDFFDFHHHH:@FDHGIIIIAGBA/;ECHHEHAEDEFDBDEEEC;>(;;@?A@AC<<3>A::A?ACCCCCCCACCACCB5A@>:BD>ACCCBBB9>@>   X0:i:3  X1:i:5  MD:Z:0  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:3  XN:i:100    XO:i:0  XT:A:N  RG:Z:A-DS1546_1__R1
C0V4CACXX120821:1:1107:15183:40205  89  chr1    0   0   101M    =   0   0   ATTTCCCGTGGGGGTTGGGCTAGGCGAGACGTCATGAGCTACACTTGGAGTGTGTGCTTGATGCCAGCTCTCTTTGATCAGCGATGATTTAGAAGGCGTAT   ###########################################AEGHCFB@FAD?B@BB9B3:)?:31C?919@@GF?8<<GHED>IIDFBHHDAABA;;?   X0:i:3  X1:i:0  XA:Z:chrX,-51621000,101M,5;gi|117937807|gb|DQ188829.2|,+142,101M,5; MD:Z:0  XG:i:0  AM:i:0  NM:i:0  SM:i:0  XM:i:5  XN:i:100    XO:i:0  XT:A:N  RG:Z:A-DS1546_1__R1
C0V4CACXX120821:1:2202:17464:95285  163 chr1    3000001 29  15S20M1I46M19S  =   3000233 333 ATAATCTGTGTCACCTTTCCGAGGCTACCACTTCCAAGGACTACAGGAGTCGTCTCAGGTGAAAGACCATACGAAAGAGGCCCGGTTCGTACGGTAAAACA   ;@@ADD?DF4ADDBE<GGCHG<C1E=<FHFDC?DBBB3;BD;?/9B9DGHCAFFDDDDA7@.??E>DE@C@3;(6'9A?######################   MD:Z:21C33A10   XG:i:1  AM:i:29 NM:i:3  SM:i:29 XM:i:2  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V4CACXX120821:1:2316:15160:19446  99  chr1    3000001 29  3S20M1I76M1S    =   3000256 356 ACCTTTCCGAGGCTACCACTTCCAACGACTACAGGAGTCGTCTCAGGTGAAAGACCATAAGAAAGAGGCCTGGTGTTTATGTATAAAAAGGTAATTATTAA   @@@DFFFFH>FADGIJJGGGIHHEHGIEHGGJIJIDIBFGFHEHIGG8@GHCIIHHHHHC?>CCDC>?ABDDD;?9?CC<>>@CCACCCBD>C::>@@###   MD:Z:96 XG:i:1  AM:i:29 NM:i:1  SM:i:29 XM:i:0  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:6:2206:5185:69456   163 chr1    3000001 29  62S20M1I18M =   3000173 273 CTTCAACTATTTATACGATGTACCAATGAACACCTTCATCAAGCTCGATAATCTGTGTCACCTTTCCGAGGCTACCACTTCCAACGACTACAGGAGTCGTC   CCCFFFFFHHHHHJJJJIJJHHIJJJIIJJJJJJJJJJJJIJJJIJJJJFHIIJJIJJJJJJIJJJJIJJHFFFEECEEEDDCDDDDDDDDDDDDD?CBB?   MD:Z:38 XG:i:1  AM:i:29 NM:i:1  SM:i:29 XM:i:0  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:7:1106:4563:28548   163 chr1    3000001 29  31S20M1I14M35S  =   3000035 135 ACCTTCATCAAGCTCGATAATCTGTGTCACCTTTCCGAGGCTACCACTTCCAACGACTACAGGAGTCGTCTCAGGTGAAAGACCATAAGAAAGAGGCCTGG   CCCFFFFFGHHHHJJJJJJJJJJGGGIJJJJJJJJJJJIJJGIIJGIIIJGHIGIIJJHFHHFFFEFDDDDDEDD@CDACDDDCDDCDDCDDCDDDDDCDB   MD:Z:34 XG:i:1  AM:i:29 NM:i:1  SM:i:29 XM:i:0  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:7:2206:5222:58210   163 chr1    3000001 29  62S20M1I18M =   3000225 325 CTTCAACTATTTATACGATGTACCAATGAACACCTTCATCAAGCTCGATAATCTGTGTCACATTTCCGAGGCTACCACTTCCAAGGACTACAGGAGTCGTC   @@<+B?DDHHGAHIIIBGC@CCFHJHGFGHGGIJJJIGIAHFHBGIIG@FD<FH@<FBFFF).8CDF=BEED??CECEECCCAC>=?A:?CCA<A######   MD:Z:21C16  XG:i:1  AM:i:29 NM:i:2  SM:i:29 XM:i:1  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:7:2208:17232:6944   99  chr1    3000001 29  26S20M1I54M =   3000209 309 CATCAAGCTCGATAATCTGTGTCACCTTTCCGAGGCTACCACTTCCAAGGACTACAGGAGTCGTCTCAGGTGAAAGACCATAAGAAAGAGTCCTGGTGTTT   CCCFFFFFHHHHHJJJJJFHGHJJJJJJJJIHFHGIJJJJJJJJJJIGHIIIIEHHJIGICEHBFFFFEE>ACEC>CDDDDDDDDDDDDC@CDDDD3<<@<   MD:Z:21C41G10   XG:i:1  AM:i:29 NM:i:3  SM:i:29 XM:i:2  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:7:2210:18879:53607  99  chr1    3000001 29  59S20M1I21M =   3000217 317 CAACTATTTATACGATGTACCAATGAACACCTTCATCAAGCTCGATAATCTGTGTCACCTTTCCGAGGCTACCACTTCCAACGACTACAGGAGTCGTCTCA   @@@DDDDDHA?DFD7CB3AFAHFB3A<+A;;@GG@<EDDEHICE)0B@F<BBBDHCDBAFEGB=@AA=E?D);;36;>C:>CBBB/':A>3(23?<22??:   MD:Z:41 XG:i:1  AM:i:29 NM:i:1  SM:i:29 XM:i:0  XO:i:1  XT:A:M  RG:Z:A-DS1546_1__R1
C0V63ACXX120821:7:2309:11600:84812  163 chr1    3000004 60  17M1I83M    =   3000299 396 CCGAGGCTACCACTTCCAACGACTACAGGAGTCGTCTCAGGTGAAAGACCATAAGAAAGAGGCCTGGTGTTTATGTATAAAAAGGTAATTATTAATAAGTT   @CCFFFFDHHGFHJJJJIIIJJJIJJJJJ@FGIHHJJJJJJFHEGIJIJIEGHIJJJHFEHBFFFDD;A=?CCDDDFDDDDDDDD3>CDDEDDCCDDCAD:   X0:i:1  X1:i:0  MD:Z:93G6   XG:i:1  AM:i:37 NM:i:2  SM:i:37 XM:i:1  XO:i:1  XT:A:U  RG:Z:A-DS1546_1__R1
ADD REPLYlink modified 9 days ago by ATpoint12k • written 9 days ago by sunnykevin9710

Folks, is it normal for those first two entries not to have the unmapped flag set (the bit that -f 4 selects), yet to have a mapping coordinate of 0?

ADD REPLYlink written 9 days ago by swbarnes24.7k
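
To make the flag question concrete, here is a minimal sketch that decodes the FLAG values of the first two records shown above (73 and 89). The bit meanings are taken from the SAM specification; the helper name decode_flag is made up for illustration and is not part of samtools.

```shell
# decode_flag is a hypothetical helper: it prints the names of the SAM FLAG
# bits that are set in the integer passed as its first argument.
decode_flag() {
  flag=$1
  names=""
  for pair in 1:paired 4:read_unmapped 8:mate_unmapped 16:reverse_strand 64:first_in_pair 128:second_in_pair; do
    bit=${pair%%:*}    # numeric bit value before the colon
    name=${pair#*:}    # label after the colon
    if [ $((flag & bit)) -ne 0 ]; then
      names="$names $name"
    fi
  done
  echo "${names# }"
}

decode_flag 73   # prints: paired mate_unmapped first_in_pair
decode_flag 89   # prints: paired mate_unmapped reverse_strand first_in_pair
```

Neither value has the read_unmapped bit (4) set, while both have mate_unmapped (8) set, which is what the two comments around this point are discussing.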

It seems the mate is unmapped for the first two entries. But if the read itself is mapped, it should have a position.

Can we see the alignment command line, please?

ADD REPLYlink written 8 days ago by Bastien Hervé2.8k
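
One way to see how many records look like this is to list every record whose unmapped bit is clear but whose POS is 0. This is only a sketch: the function name mapped_without_position is hypothetical, and it assumes tab-delimited SAM text on stdin (for example from samtools view).

```shell
# mapped_without_position is a hypothetical helper: it reads SAM records on
# stdin and prints QNAME, FLAG, RNAME and POS for every record whose FLAG does
# not have the "read unmapped" bit (4) set but whose POS is 0.
mapped_without_position() {
  awk -F'\t' '
    $1 ~ /^@/ { next }                     # skip header lines, if any
    int($2 / 4) % 2 == 0 && $4 == 0 {      # bit 4 clear and POS equal to 0
      print $1 "\t" $2 "\t" $3 "\t" $4
    }
  '
}

# Intended usage (needs samtools and the BAM file from this thread):
#   samtools view A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam | mapped_without_position | head
```

The integer division stands in for a bitwise AND, since POSIX awk has no bitwise operators.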

Hello, this is the command line they used:

@SQ SN:gi|117937807|gb|DQ188829.2|  LN:17010
@RG ID:A-DS1546_1__R1   PL:Illumina PU:01   LB:Library  SM:DS1546
@PG ID:bwa  PN:bwa  VN:0.7.8-r455   CL:/groups/reich/sw/bwa-0.7.8/bwa sampe /groups/reich/reference-genomes/loxAfr3_mod_mt/bwa-0.7.8/loxAfr3com.v2.fa aln.sai1 aln.sai2 R1.trim.fastq.gz R2.trim.fastq.gz
ADD REPLYlink written 8 days ago by sunnykevin9710

Could you check whether your BAM file is truncated or corrupt?

How to systematically check if a bam file is truncated

ADD REPLYlink written 8 days ago by Bastien Hervé2.8k
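
As a rough illustration of a truncation check: an intact BAM file ends with a fixed 28-byte empty BGZF block (the EOF marker described in the SAM/BAM specification). The sketch below compares the last 28 bytes of a file with that marker. The function name has_bgzf_eof is made up here. As far as I know, samtools quickcheck does a similar test and is the simpler option if samtools is available.

```shell
# has_bgzf_eof is a hypothetical helper: it succeeds (exit status 0) if the
# file given as its first argument ends with the 28-byte BGZF EOF marker.
has_bgzf_eof() {
  expected="1f8b08040000000000ff0600424302001b0003000000000000000000"
  # Dump the last 28 bytes as hex and strip the spaces and newlines od adds.
  actual=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d ' \n')
  [ "$actual" = "$expected" ]
}

# Intended usage (file name taken from this thread):
#   has_bgzf_eof A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam && echo "EOF marker present"
# With samtools installed, the equivalent would be:
#   samtools quickcheck -v A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam
```

A present EOF marker only shows that the file was closed properly; it says nothing about whether the records inside are in order.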

As finswimmer said in this post: How to specify the sort based on name in samtools sort?

You can only index a BAM file when it is coordinate-sorted.

ADD REPLYlink written 9 days ago by Bastien Hervé2.8k
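
To find where the order actually breaks, something like the following could be used. It is a simplified sketch: it only checks that POS never decreases within consecutive records on the same reference and does not verify the order of the references themselves. The function name first_unsorted is hypothetical.

```shell
# first_unsorted is a hypothetical helper: it reads tab-delimited SAM records
# on stdin and reports the first record whose POS is smaller than the POS of
# the previous record on the same reference. It exits with status 1 if such a
# record is found and 0 otherwise.
first_unsorted() {
  awk -F'\t' '
    $1 ~ /^@/ { next }                       # skip header lines, if any
    $3 == prev_ref && $4 + 0 < prev_pos {    # same reference, position went backwards
      print "line " NR ": " $3 " " prev_pos " followed by " $4
      exit 1
    }
    { prev_ref = $3; prev_pos = $4 + 0 }
  '
}

# Intended usage (needs samtools and the BAM file from this thread):
#   samtools view A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam | first_unsorted
```

If this prints nothing for the whole file, the positions are non-decreasing within each reference block, and the indexing failure would have to come from something else.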

I tried to index the sorted bam file

ADD REPLYlink written 9 days ago by sunnykevin9710

How was the unsorted BAM file generated? Is it the output of an aligner, or was it converted from a SAM file? Check that your file has a header:

samtools view -H sortedBamFile.bam
ADD REPLYlink modified 9 days ago • written 9 days ago by shawn.w.foley140

BWA sampe was used for the alignment. I downloaded 7 sorted BAM files from an NCBI BioProject, and this is the only one that gives me problems. I also tried to call variants from this BAM file, and the problem persists. How can I fix it?

Bam file before sorting

samtools view -H A-DS1546_1__R1_dedup_2_ReadGroups.bam | head

@HD VN:1.0  SO:coordinate
@SQ SN:chr1 LN:214701375
@SQ SN:chr10    LN:103745917
@SQ SN:chr11    LN:68864707
@SQ SN:chr12    LN:83603808
@SQ SN:chr13    LN:103893473
@SQ SN:chr14    LN:70853943
@SQ SN:chr15    LN:88640724
@SQ SN:chr16    LN:45784276
@SQ SN:chr17    LN:69133140

Bam file after sorting

samtools view -H A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam | head

@HD VN:1.0  SO:coordinate
@SQ SN:chr1 LN:214701375
@SQ SN:chr10    LN:103745917
@SQ SN:chr11    LN:68864707
@SQ SN:chr12    LN:83603808
@SQ SN:chr13    LN:103893473
@SQ SN:chr14    LN:70853943
@SQ SN:chr15    LN:88640724
@SQ SN:chr16    LN:45784276
@SQ SN:chr17    LN:69133140
ADD REPLYlink modified 8 days ago • written 8 days ago by sunnykevin9710
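
Both headers above declare SO:coordinate on the @HD line. That tag only records what the file claims about itself; it does not prove that the records are in order. A small sketch for pulling the declared sort order out of a header, with a hypothetical helper name header_sort_order:

```shell
# header_sort_order is a hypothetical helper: it reads SAM header text on
# stdin and prints the value of the SO tag from the @HD line, if present.
header_sort_order() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SO:/) { print substr($i, 4); exit }
      }
    }
  '
}

# Intended usage (needs samtools and the BAM file from this thread):
#   samtools view -H A-DS1546_1__R1_dedup_2_ReadGroupssorted.bam | header_sort_order
```

Comparing this declared value with a check of the actual record order shows whether the header and the data agree.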