Question: Confusing result of samtools sort
younglin113 wrote (11 months ago):

Recently I was working on an RNA-seq project that used spike-in control sequences, so I mapped the reads to both the human genome and the spike-in sequences. I got the BAM file successfully, but when I used the command samtools sort -o aa_sorted.bam aa.bam to sort it by chromosome name, the chromosomes came out in a confusing order. Here is the header of the sorted BAM file:

[Image: header of the sorted BAM file, with some chromosomes marked by a red frame]

As you can see, the chromosomes marked with a red frame are placed before chromosome 1. This result is really confusing. Can anyone help me here? Thanks in advance.

Devon Ryan wrote (11 months ago):

samtools sort doesn't change the order of the chromosomes; it only sorts the alignments according to that order. Many tools require the order of the chromosomes to match between the FASTA and BAM files, so if you start changing that order you're likely to run into problems. Note that you CAN do this, but you'll need to do something like:

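# make a new SAM header with the @SQ lines in the order you'd like
# (samtools view -H original_sorting.bam prints the current header)
# name it "header"; the <( ) process substitution below needs bash
cat header <(samtools view original_sorting.bam) | samtools sort -o desired_sorting.bam -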
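
One way to build the "header" file mentioned above, as a minimal sketch rather than a definitive recipe: dump the current header with samtools view -H, then rewrite its @SQ lines in the order listed in a plain text file. The function name reorder_sq and the file names old_header.sam and order.txt are hypothetical; contigs that are not listed in order.txt are kept and appended in their original order, so no @SQ line is dropped.

```shell
# reorder_sq HEADER ORDER
# HEADER: SAM header text, e.g. from: samtools view -H original_sorting.bam > old_header.sam
# ORDER:  text file with one contig name per line, in the desired order (hypothetical file)
# Prints @HD first, then the @SQ lines in the requested order (unlisted contigs
# follow in their original order), then all remaining header lines (@RG, @PG, ...).
reorder_sq() {
    awk -F '\t' '
        FILENAME == ARGV[1] { want[++n] = $1; next }   # first file: desired order
        $1 == "@HD" { print; next }
        $1 == "@SQ" {
            name = ""
            for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) name = substr($i, 4)
            sq[name] = $0          # remember the full @SQ line by contig name
            orig[++k] = name       # remember the original order
            next
        }
        { rest[++m] = $0 }
        END {
            for (i = 1; i <= n; i++)
                if ((want[i] in sq) && !(want[i] in done)) { print sq[want[i]]; done[want[i]] = 1 }
            for (i = 1; i <= k; i++)
                if (!(orig[i] in done)) { print sq[orig[i]]; done[orig[i]] = 1 }
            for (i = 1; i <= m; i++) print rest[i]
        }
    ' "$2" "$1"
}
```

With that function defined, reorder_sq old_header.sam order.txt > header would produce the file that the cat command above expects.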

Oh, so it is. Thanks a lot. So I just need to change the header and it will sort the way I want, is that what you mean?

(reply written 11 months ago by younglin113)

Yes, but don't reheader the BAM file (this will completely screw things up). Use something like what I outlined, where a SAM file is produced and then piped into samtools sort.

(reply written 11 months ago by Devon Ryan)
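
The warning about reheadering can be illustrated with a small sketch. It rests on the fact that each BAM record stores its reference as a numeric index into the header's @SQ list rather than as a name, so swapping in a header with a different @SQ order (presumably what reheadering means here) changes which contig every index points to. The function name name_at_index and the header file names are made up for illustration.

```shell
# name_at_index HEADER INDEX
# Prints the contig name found at 0-based position INDEX among the @SQ lines
# of HEADER (a SAM header text file). This mimics how a numeric reference
# index in a BAM record is translated back into a contig name.
name_at_index() {
    awk -F '\t' -v idx="$2" '
        BEGIN { k = 0 }
        $1 == "@SQ" {
            if (k == idx + 0) {
                for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
                exit
            }
            k++
        }
    ' "$1"
}
```

If old_header.sam lists a spike-in first and new_header.sam lists chr1 first, then name_at_index old_header.sam 0 and name_at_index new_header.sam 0 print different names. An alignment stored with index 0 would therefore be reported on a different contig after the header swap, although the read itself has not moved. Going through SAM text avoids this, because SAM records carry contig names instead of indices.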
finswimmer wrote (11 months ago):

Hello,

samtools sort sorts the reads by position within each contig. It doesn't sort the contigs themselves, because this is usually not necessary.

The order of the contig names you see is given by the order of the contigs in the reference file you specified during alignment.

fin swimmer
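
To check the point above on your own files, the contig order in the BAM header can be compared with the order in the reference FASTA. This is a minimal sketch: the function names sq_names and fasta_names are hypothetical helpers, and reference.fa stands in for whatever reference file was used during alignment.

```shell
# sq_names: reads SAM header text on stdin and prints the @SQ contig names,
# one per line, in header order.
# Intended use: samtools view -H aa_sorted.bam | sq_names
sq_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# fasta_names: reads FASTA on stdin and prints the sequence names (the first
# word after ">"), one per line, in file order.
# Intended use: fasta_names < reference.fa   (reference.fa is a placeholder name)
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }'
}
```

If both commands print the same list in the same order, the header order simply mirrors the reference file, which is the behaviour described above.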


Thanks a lot. That really cleared up my doubts.

(reply written 11 months ago by younglin113)