How to add 'placeholder' quality scores to FASTA files (for converting FASTA with no scores to SAM, BAM and VCF)
6.0 years ago

I'm working on converting several historical reference genome FASTA files into SAM, BAM and VCF files for a project. (Alignment will use hg19.) The primary issue is that many of these FASTA files have no quality scores. One solution appears to be to add 'placeholder' scores to these FASTA files so the conversions can be carried out.

Is there an existing script that can do this, that is, add all the scores a FASTA file needs so that it can be processed into SAM and beyond? If not, which scores are required for these conversions, and how can they be added (or made unnecessary)? If you have a better solution for FASTA2BAM and FASTA2VCF when the FASTA files include no scores, I would welcome that too. Thanks - Irene

FASTA Quality Metrics FASTA Metrics FASTA Scores • 2.3k views
6.0 years ago

Many alignment programs, such as BBMap, can align FASTA files directly, with no need for fake quality values. But if you do want to give them quality values, you can use the reformat script included with BBMap:

reformat.sh in=reads.fasta out=reads.fastq qfake=30
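
If you would rather not depend on an external tool, the same idea can be sketched in a few lines of Python. This is only an illustration of what a constant placeholder quality looks like in FASTQ, not a reimplementation of reformat.sh; the function names and the Phred+33 encoding are my own assumptions.

```python
import io


def write_fastq_record(out, name, seq, qual_char):
    """Write one FASTQ record with the same quality character at every base."""
    out.write(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")


def fasta_to_fastq(fasta_handle, fastq_handle, fake_q=30):
    """Convert FASTA text to FASTQ, giving every base the quality fake_q.

    Assumes Phred+33 (Sanger) encoding, so Q30 is written as '?'.
    Multi-line sequences are joined into a single line per record.
    """
    qual_char = chr(fake_q + 33)
    name, chunks = None, []
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                write_fastq_record(fastq_handle, name, "".join(chunks), qual_char)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    # Flush the final record.
    if name is not None:
        write_fastq_record(fastq_handle, name, "".join(chunks), qual_char)


if __name__ == "__main__":
    # Tiny in-memory example; real input would be opened with open(...).
    src = io.StringIO(">r1\nACGT\nAC\n>r2\nGG\n")
    dst = io.StringIO()
    fasta_to_fastq(src, dst)
    print(dst.getvalue(), end="")
```

The whole sequence of each record is held in memory before it is written, which is fine for reads but may be heavy for chromosome-sized records.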
