Question: coverage bed error but produces output file
0
gravatar for bioguy24
4.0 years ago by
bioguy24190
Chicago
bioguy24190 wrote:

The below command produces a file (674 MB).

coverageBed -d -sorted -a /home/cmccabe/Desktop/NGS/bed/sorted_unix_5column_xgen_targets.bed -b /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_newheader.bam > /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_output.txt

Output

chr1    955542    955763    +    AGRN:exon.1    1    0
chr1    955542    955763    +    AGRN:exon.1    2    0
chr1    955542    955763    +    AGRN:exon.1    3    0
chr1    955542    955763    +    AGRN:exon.1    4    1

My question is even-though an output file results I get the below error:

cmccabe@DTV-A5211QLM:~/Desktop/NGS$ coverageBed -d -sorted -a /home/cmccabe/Desktop/NGS/bed/sorted_unix_5column_xgen_targets.bed -b /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam > /home/cmccabe/Desktop/NGS/pool_I_090215/IonXpress_008_150902_counts.txt
ERROR: chromomsome sort ordering for file /media/cmccabe/C2F8EFBFF8EFAFB9/pool_I_090215/IonXpress_008_150902.bam is inconsistent with other files. Record was:
chr10	68734	68755	X28LU:04418:11727	6	+

I checked the bam and it is coordinate sorted:

cmccabe@DTV-A5211QLM:~/Desktop/NGS/pool_I_090215$ samtools view -H IonXpress_008_150902_newheader.bam | grep SO
@HD    VN:1.4    GO:none    SO:coordinate

I just want to make sure that I can use the data output of if I am missing something.  I can not find that record in the bam file, is that a sequencing error, a bug, or something unique to Ion Torrent bam files.  Thank you :).

Sorted bed file

chr1    955542    955763    +    AGRN:exon.1
chr1    957570    957852    +    AGRN:exon.2
chr1    976034    976270    +    AGRN:exon.2;AGRN:exon.3;AGRN:exon.4

I also verified the sort order in the bam

cmccabe@DTV-A5211QLM:~/Desktop/NGS/pool_I_090215$ samtools view -H IonXpress_008_150902_newheader.bam | grep SQ | cut -f 2 | awk '{ sub(/^SN:/, ""); print;}'
chr1
chr2
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr20
chr21
chr22
chrX
chrY
chrM

 

sort ngs bedtools • 2.3k views
ADD COMMENTlink modified 4.0 years ago by James Ashmore2.7k • written 4.0 years ago by bioguy24190
2

Is it a lexical vs numerical sort thing: 1 10 .. 2 20 ..?

ADD REPLYlink written 4.0 years ago by thackl2.7k
1

You've double-checked the order of your bed file too, especially the chromosome ordering, and the naming convention? The fact that the "mis-ordering" occurs relatively early in chromosome 10 makes me think of the formal possibility that one of your files (maybe the bed file?) is sorted chr1 chr10 chr11 .. chr2 chr20 .. instead of chr1 chr2 .. chr9 chr10..

I know you know awk, but one could "awk '{print $1}' aBed.bed | sort -u" to quickly check

I'm guessing your output that you showed is not comprehensive, in which case you might check the tail-end of the output file to see if it also ends on chr1, or if it covered the entire genome.

ADD REPLYlink written 4.0 years ago by Joseph Pearson450

Looks like I just need to sort -k1,1 -k2,2n

awk '{print $1}' sorted_unix_5column_xgen_targets.bed | sort -u
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrX
chrY

 

ADD REPLYlink written 4.0 years ago by bioguy24190
4
gravatar for thackl
4.0 years ago by
thackl2.7k
MIT
thackl2.7k wrote:

According to the bedtools manual -sorted needs to be done with sort -k1,1 -k2,2n, which would do a lexical sort and put chr10,11.. before chr2. You BAM file however, is sorted in a "human" order...

ADD COMMENTlink modified 4.0 years ago • written 4.0 years ago by thackl2.7k

-sorted just lets bedtools know that the input files are already sorted by some consistent rule. But you're right, that the bed file (sorted by the recommended -k1,1)  would give a lexical sort.

Here's a post describing how to use Unix sort to sort in "human" order:
How To Sort Bed Format File

ADD REPLYlink written 4.0 years ago by Joseph Pearson450
1
gravatar for James Ashmore
4.0 years ago by
James Ashmore2.7k
UK/Edinburgh/MRC Centre for Regenerative Medicine
James Ashmore2.7k wrote:

As the error says it look as though either your BAM or BED file are not sorted correctly. First inspect the sort order of the chromosomes in both the sorted BED and BAM files by running the following commands:

cut -f1 your_file.bed | uniq
samtools view your_file.bam | cut -f3 | uniq

If the order of the chromosomes is different then you will have to provide a genome file to Bedtools with the order your chromosomes are in the BED file.

ADD COMMENTlink modified 4.0 years ago • written 4.0 years ago by James Ashmore2.7k

Thank you everybody :).

ADD REPLYlink written 4.0 years ago by bioguy24190

I mis-read the post and my bed was sorted correctly but my bam is in "human" sorting so I need to re-sort the bam in "lexical"?  Thank you :).

The output file that results is (with the chromosome ordering):

cmccabe@DTV-A5211QLM:~/Desktop/NGS/bed$ awk '{print $1}' IonXpress_008_150902_counts.txt | sort -u
chr1
chr10
chr11
chr12
chr13
chr14
chr15
chr16
chr17
chr18
chr19
chr2
chr20
chr21
chr22
chr3
chr4
chr5
chr6
chr7
chr8
chr9

 

ADD REPLYlink modified 4.0 years ago • written 4.0 years ago by bioguy24190

Yep, a simple samtools sort your.bam your.sorted should suffice.
 

ADD REPLYlink modified 4.0 years ago • written 4.0 years ago by thackl2.7k
1

Beware samtools sort does not sort your alignments by chromosome, it sorts them by position in each chromosome. For instance, if your FASTA index which you aligned your reads to starts with chr12, chr13, chr15 e.t.c then your BAM file with start with chr12, chr13, chr15 e.t.c. Since it is much harder to sort the BAM file then the BED file, I suggest you sort your BED file by the order of the chromosomes in the FASTA index file.

ADD REPLYlink written 4.0 years ago by James Ashmore2.7k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1383 users visited in the last hour