CRAM file to FASTQ conversion
21 months ago
eric.londin ▴ 50

Hi all, I received some CRAM files from the 1000 Genomes data. I am trying to convert them back to FASTQ files, but can't figure out how to do this. I've tried using

samtools fastq -1 out.R1.fastq -2 out.R2.fastq input.cram


but when I run this, I get the following error:

Failed to populate reference for id 0
Unable to fetch reference #0 9999..134549
Failure to decode slice
[M::bam2fq_mainloop] processed 0 reads

I guess I could convert these to BAM files first and then convert those to FASTQ, but that seems like a lot of unnecessary steps. I would think there is a straightforward way to go directly from CRAM to FASTQ, but I can't seem to find a good solution.

Thanks for any help.

cram fastq conversion • 2.9k views
21 months ago
GenoMax 115k

CRAM files are alignment files, like BAM files. They store a compressed version of the alignment, and that compression relies on the reference the sequence data were aligned to.

To do the conversion, you need the reference the reads were aligned to. The references used are noted on this page.
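
To see which reference a CRAM expects, it can help to look at the @SQ lines in its header. Each one names a sequence (SN) and its length (LN), and often carries an MD5 checksum (M5) of the sequence. The helper below is a minimal sketch, not part of samtools: `sq_summary` is a hypothetical name, and it only parses header text passed on stdin.

```shell
# Hypothetical helper: summarise the @SQ lines of a SAM/CRAM header.
# Reads header text on stdin and prints name, length, and MD5 (tab-separated).
# A tag that is absent from a line is printed as ".".
sq_summary() {
  awk -F'\t' '
    $1 == "@SQ" {
      sn = "."; ln = "."; m5 = "."
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) sn = substr($i, 4)
        else if ($i ~ /^LN:/) ln = substr($i, 4)
        else if ($i ~ /^M5:/) m5 = substr($i, 4)
      }
      print sn "\t" ln "\t" m5
    }'
}

# Typical use (assumes samtools is installed and input.cram exists):
#   samtools view -H input.cram | sq_summary | head
```

If the names, lengths, or checksums printed here do not match the FASTA passed with `--reference`, that mismatch is a likely suspect when decoding fails.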


Thank you. I got that reference and am still getting an error.

samtools fastq --reference GRCh38.fa -1 out.R1.fastq -2 out.R2.fastq input.cram


Failed to populate reference for id 0
Unable to fetch reference #0 9999..134549
Failure to decode slice
[M::bam2fq_mainloop] processed 0 reads

So I'm not sure what is happening here.


Sounds like the chromosome names don't match, or they're in a different order. Compare the header of the CRAM file to the chromosome names in the FASTA.
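
The comparison suggested above can be sketched roughly as follows. The function names (`header_names`, `fasta_names`, `missing_from_fasta`) are made up for illustration, and the header is assumed to have been dumped to a text file first, for example with `samtools view -H input.cram > header.txt`.

```shell
# Sequence names from the @SQ lines of a header dump (stdin), in header order.
header_names() {
  awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
  }'
}

# Sequence names from a FASTA file: the text after ">" up to the first whitespace.
fasta_names() {
  awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Print header names that have no exact match among the FASTA names.
# usage: missing_from_fasta header.txt reference.fa
missing_from_fasta() {
  tmp=$(mktemp)
  fasta_names "$2" > "$tmp"
  # grep exits non-zero when nothing is missing; that is not an error here.
  header_names < "$1" | grep -Fxv -f "$tmp" || true
  rm -f "$tmp"
}
```

Empty output means every name in the header also appears in the FASTA. To check ordering as well, the output of `header_names` and `fasta_names` can be saved to two files and compared with `diff`.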


That was the problem. I just had to sort the CRAM file, and then it worked fine. Thanks for the help.


Would you please let us know how you did that? I am getting the same error.