Error in converting bed file to interval using Picard
4.4 years ago
Assa Yeroslaviz ★ 1.8k

We have a WES data set that was generated with the Agilent mouse exome capture library kit. I wanted to download the target file and got, similar to this post, a folder with several bed files (_AllTracks.bed, _Covered.bed, _Padded.bed, _Regions.bed) and a file named Targets.txt. I am not really sure what each of them is for, but my main problem is a different one.

When I try to run the command

gatk BedToIntervalList \
-I input/S0276129_Covered.bed \
-O input/S0276129_Covered.intervals \
--SEQUENCE_DICTIONARY ../reference/mm10/mm10.dict

I get the following error:

picard.PicardException: Start on sequence 'chr1' was past the end: 195471971 < 196469947
        at picard.util.BedToIntervalList.doWork(BedToIntervalList.java:143)
        at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
        at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgramExecutor.instanceMain(PicardCommandLineProgramExecutor.java:25)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
        at org.broadinstitute.hellbender.Main.main(Main.java:292)

Based on the message, the bed file contains coordinates on chr1 that lie beyond the chr1 length given in the dict file.

This is true: when I look at chromosome 1 in the bed file, I see:

grep "chr1\s" input/S0276129_Covered.bed | tail
chr1    196986946       196987186       entg|Cr2,ens|ENSMUST00000082321,ref|NM_007...
chr1    196989335       196989485       entg|Cr2,ens|ENSMUST00000082321,ref|NM_007...

but the dict file shows

less ../reference/mm10/Sequence/WholeGenomeFasta/genome.dict 
@HD     VN:1.0  SO:unsorted
...
@SQ     SN:chr1 LN:195471971    UR:file:/illumina/scratch/iGenomes/Mus_musculus/UCSC/mm10/Sequence/WholeGenomeFasta/genome.fa   M5:c4ec915e7348d42648eefc1534b71c99
...

When I search for the gene Cr2, its coordinates are given as Chromosome 1: 195,136,811-195,176,716.
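
The size of the mismatch between the bed intervals labelled Cr2 and the gene coordinates can be put into numbers. This is only a sketch: `gap_to_region` is a hypothetical helper, and it assumes the gene coordinates are 1-based and inclusive (as genome browsers usually report them), while BED intervals are 0-based and half-open.

```python
def gap_to_region(bed_start, bed_end, region_start, region_end):
    """Number of bases between a BED interval (0-based, half-open) and a
    region given as 1-based inclusive coordinates.

    Returns 0 when the two overlap or are directly adjacent.
    """
    # Convert the 1-based inclusive region to 0-based half-open.
    r_start, r_end = region_start - 1, region_end
    if bed_end <= r_start:
        return r_start - bed_end  # interval lies before the region
    if bed_start >= r_end:
        return bed_start - r_end  # interval lies after the region
    return 0


if __name__ == "__main__":
    cr2 = (195136811, 195176716)  # gene coordinates quoted above
    for start, end in [(196986946, 196987186), (196989335, 196989485)]:
        print(start, end, gap_to_region(start, end, *cr2))
```

Both intervals end up roughly 1.8 Mb past the end of the gene. A large, similar offset for intervals of the same gene would be consistent with the bed file and the dict using different coordinate systems, but that is an assumption to verify, not something this calculation proves.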

Is there something wrong with the bed file from Agilent? Any ideas what is happening?

thanks

picard BedToIntervalList agilent exome WES • 1.7k views