Question: Mate mismatch errors in 1000genomes CRAM files
7 months ago, Mehulsharma.253 wrote:

I downloaded some CRAM files for variant calling from the 1000 genomes FTP server. I also downloaded the reference genome and MD5 cache as per the instructions in this README doc.

However, running Picard's ValidateSamFile reported errors (listed below) on both the CRAM and the BAM I subsequently converted it to. After running FixMateInformation, revalidation reported zero errors.

Is anyone else encountering such issues with the 1000 Genomes GRCh38 CRAM files? What could be the source of these errors?

Errors:

Mate negative strand flag does not match read negative strand of mate

Mate alignment does not match alignment start of mate

Mate CIGAR string does not match CIGAR string of mate

...
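To make the three messages concrete, here is a rough sketch of the kind of cross-check they describe: each read stores a copy of its mate's strand, start position, and (optionally) CIGAR, and the errors mean those copies disagree with the mate's own record. This is not Picard's actual implementation; the `Read` class and `mate_mismatches` function are hypothetical names made up for illustration, and only the SAM flag bits (0x10, 0x20) and the MC tag come from the SAM specification.

```python
from dataclasses import dataclass
from typing import List, Optional

# SAM flag bits, as defined in the SAM specification.
READ_REVERSE = 0x10  # this read is aligned to the reverse strand
MATE_REVERSE = 0x20  # this read's mate is aligned to the reverse strand


@dataclass
class Read:
    """Hypothetical, minimal stand-in for one SAM record."""
    name: str
    flag: int
    pos: int                          # 1-based alignment start (POS)
    cigar: str                        # CIGAR of this read
    pnext: int                        # mate start stored on this read (PNEXT)
    mate_cigar: Optional[str] = None  # MC tag, if present


def mate_mismatches(read: Read, mate: Read) -> List[str]:
    """List the mate fields stored on `read` that disagree with `mate`."""
    problems = []
    if bool(read.flag & MATE_REVERSE) != bool(mate.flag & READ_REVERSE):
        problems.append("mate negative strand flag")
    if read.pnext != mate.pos:
        problems.append("mate alignment start")
    # The MC tag is optional, so only compare it when it is present.
    if read.mate_cigar is not None and read.mate_cigar != mate.cigar:
        problems.append("mate CIGAR")
    return problems


# Made-up pair: read 1 carries a stale copy of its mate's CIGAR.
r1 = Read("q1", flag=99, pos=100, cigar="50M", pnext=300, mate_cigar="50M")
r2 = Read("q1", flag=147, pos=300, cigar="48M2S", pnext=100, mate_cigar="50M")
print(mate_mismatches(r1, r2))
print(mate_mismatches(r2, r1))
```

Seen this way, it makes sense that FixMateInformation clears the errors: it rewrites the stored mate fields from the mate's own record.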

ADD COMMENTlink modified 6 months ago • written 7 months ago by Mehulsharma.25310

Can you show the exact commands you used? For example, FixMateInformation has an ADD_MATE_CIGAR=true option; was it used?
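For comparison, the invocations I would expect look roughly like the ones this small sketch prints. The file names (`sample.cram`, `sample.fixed.bam`, `GRCh38.fa`, `picard.jar`) are placeholders, not the poster's actual paths, and the helper functions are hypothetical; the Picard arguments use the classic `KEY=value` syntax (`I`, `O`, `R`, `MODE`, `ADD_MATE_CIGAR`). The script only builds and prints the command lines; it does not run them.

```python
import shlex
from typing import List, Optional


def validate_cmd(path: str, reference: Optional[str] = None,
                 picard_jar: str = "picard.jar") -> List[str]:
    """Build a Picard ValidateSamFile command line (summary mode)."""
    cmd = ["java", "-jar", picard_jar, "ValidateSamFile",
           f"I={path}", "MODE=SUMMARY"]
    if reference:
        # CRAM decoding needs the reference the file was compressed against.
        cmd.append(f"R={reference}")
    return cmd


def fix_mate_cmd(src: str, dst: str,
                 picard_jar: str = "picard.jar") -> List[str]:
    """Build a Picard FixMateInformation command line that also writes MC tags."""
    return ["java", "-jar", picard_jar, "FixMateInformation",
            f"I={src}", f"O={dst}", "ADD_MATE_CIGAR=true"]


# Placeholder file names, for illustration only.
steps = [
    validate_cmd("sample.cram", reference="GRCh38.fa"),
    fix_mate_cmd("sample.bam", "sample.fixed.bam"),
    validate_cmd("sample.fixed.bam"),
]
for step in steps:
    print(shlex.join(step))
```

If your commands differ from these in the reference or the ADD_MATE_CIGAR setting, that would be the first thing to check.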

For what it's worth, even the Broad Institute doesn't seem totally confident that their tool works well with CRAM files. Can you also make sure you're using the latest version of Picard? Maybe it's better now. I've had bad luck feeding the output of samtools (which now reads and writes CRAM itself) into Picard. Anecdotally, I found them pretty much incompatible in one specific project and never figured out why.

You could also try an external tool for BAM/CRAM validation, e.g. BamUtil: https://genome.sph.umich.edu/wiki/BamUtil

Modified 6 months ago • written 6 months ago by manuel.belmadani