Question: Sort BAM files by reference, and then by read name within each reference?
15 days ago, vitor.rca wrote:

Is there an efficient way to sort a BAM file by chromosome, and then by read names within each chromosome?

I could split the BAM by chromosome, sort each resulting BAM by read names, and finally merge the list of BAMs. But I wonder if there is a single and straightforward command to do that.
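
To make the intended ordering concrete, here is a minimal Python sketch of the split, sort, and merge idea on toy in-memory records. The record tuples, the `sq_order` list, and both function names are hypothetical and only for illustration; nothing here reads a real BAM file. It also uses plain string comparison for read names, which may not match the name order a particular tool expects.

```python
from collections import defaultdict

# Hypothetical toy records: (reference name, position, read name).
reads = [
    ("chr1", 100, "readC"),
    ("chr1", 150, "readA"),
    ("chr2", 20, "readB"),
    ("chr1", 300, "readB"),
    ("chr2", 75, "readA"),
]
# Reference order as it would appear in the @SQ header lines (assumed here).
sq_order = ["chr1", "chr2"]


def split_sort_merge(records, references):
    """Split by reference, sort each bucket by read name, concatenate in header order."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[0]].append(rec)
    merged = []
    for ref in references:
        merged.extend(sorted(buckets[ref], key=lambda r: r[2]))
    return merged


def composite_key_sort(records, references):
    """Equivalent single sort on the pair (reference rank, read name)."""
    rank = {ref: i for i, ref in enumerate(references)}
    return sorted(records, key=lambda r: (rank[r[0]], r[2]))


result = split_sort_merge(reads, sq_order)
for rec in result:
    print(*rec, sep="\t")
```

Both functions give the same order on this input, which suggests that the three-step procedure amounts to one sort with a two-part key.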

Tags: rna-seq, alignment

I'm not even sure you can sort a BAM by reference. Or do you mean by position?

written 15 days ago by lieven.sterck

The default behavior of samtools sort does that: "When the -n option is not present, reads are sorted by reference (according to the order of the @SQ header records), then by position in the reference". I am asking how to sort by reference, then by read name.

written 15 days ago by vitor.rca

OK, yes, that's what I meant by 'by position'. I don't think what you're asking is possible with any of the existing tools. Splitting the BAM, as you mentioned, is likely the easiest way to get there.

written 15 days ago by lieven.sterck (modified 15 days ago)

OK, thank you. Splitting a BAM with bamtools split takes a long time, more than I anticipated for a BAM already sorted by position. So I was hoping for an alternative, but it seems there is little reason for anyone to implement such a sorting scheme, since no tool requires it except the UBU sam-xlate tool, which I'm trying to use (https://github.com/mozack/ubu/wiki).

written 15 days ago by vitor.rca
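
Since the comment above mentions that the BAM is already sorted by position, the records of each reference should already be contiguous, so in principle no splitting to disk is needed. The Python sketch below illustrates that idea on toy tuples; the record layout and the function name `resort_by_name_within_reference` are hypothetical, and no real BAM is read. It holds one reference's records in memory at a time, which may be too much for large files, and it compares read names as plain strings.

```python
from itertools import groupby
from operator import itemgetter


def resort_by_name_within_reference(position_sorted_records):
    """Yield records reference by reference, each block re-sorted by read name.

    Assumes the input is already sorted by reference and then position,
    so all records of one reference arrive as one contiguous run.
    """
    for _ref, block in groupby(position_sorted_records, key=itemgetter(0)):
        # Only the current reference's records are materialized here.
        yield from sorted(block, key=itemgetter(2))


# Hypothetical toy records: (reference name, position, read name).
position_sorted = [
    ("chr1", 100, "readC"),
    ("chr1", 150, "readA"),
    ("chr1", 300, "readB"),
    ("chr2", 20, "readB"),
    ("chr2", 75, "readA"),
]

for rec in resort_by_name_within_reference(position_sorted):
    print(*rec, sep="\t")
```

The reference order of the input is preserved because `groupby` only merges adjacent records with the same key; if the input were not sorted by reference first, the same reference could appear in several separate blocks.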
15 days ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

I wrote http://lindenb.github.io/jvarkit/SortSamRefName.html


Damn, I should have known you would have code at hand for that, Pierre Lindenbaum. :-)

thx.

written 15 days ago by lieven.sterck