Fasta to FastQ with known qualities
4.9 years ago
ADDNOTHIING ▴ 10

Hi everyone,

I am working on ancient DNA (aDNA) and trying to simulate aDNA reads. At some point I have reads in a FastQ file, with qualities, and I run them through Gargammel (a tool that simulates aDNA damage). However, this tool takes a Fasta file as input and outputs another Fasta file with the updated reads.

My question: I want my FastQ back. So far I have been using BBMap to convert the Fasta back to FastQ, but this assigns dummy qualities to every nucleotide of every read. Is there an easy way to recover the qualities from my original FastQ, attach them to the newly created Fasta sequences, and write a new FastQ file with those sequences and qualities?

I could write my own script for this, but I am pretty sure there is an easier/quicker way to do it.

Thanks a lot,

ADDNOTHIING

Fastq Fasta sequence • 1.3k views

Are you going to change the Q-score at positions where a nucleotide was updated, or do you not care about that? @Bastien's solution should work if you don't.


For now I don't really want to change the quality scores. That might (or might not) be the next step, though! Thank you, I will try his solution.


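With seqkit and join (join expects both inputs to be sorted on the join field, hence the sort calls; for a single read they change nothing):

$ join -t $'\t' -1 1 -2 1 -o 2.1,1.2,2.3 <(seqkit fx2tab test.fa | sort -t $'\t' -k1,1) <(seqkit fx2tab test.fq | sort -t $'\t' -k1,1) | seqkit tab2fx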

output:

@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGaaaaaaaaaaaa
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC

input:

$ cat test.fa
>SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGaaaaaaaaaaaa 

$ cat test.fq
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC

Hello ADDNOTHIING

Is this what you are looking for? If your Fasta is single-line (one sequence line per record) and the reads are in the same order as in the FastQ, this should work; otherwise a Python solution may be more suitable.

sed 's/^>/@/' reads.fa | paste - - <(seq -w 1 $(grep -c '^>' reads.fa) | xargs printf '+\n%.s') <(awk 'NR % 4 == 0' reads.fq) | sed 's/\t/\n/g' > new.fq
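
For the multi-line Fasta case mentioned above, here is a rough Python sketch. It is only an illustration, not a tested tool: it assumes the read IDs (the first word of each header) are unchanged between reads.fq and the Fasta written by the damage simulator, that the FastQ uses plain 4-line records, and that read lengths are unchanged. The function names (read_fasta, read_fastq_qualities, restore_qualities) are made up for this example. Reads are matched by ID, so the two files do not need to be in the same order.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_id(header: str) -> str:
    """Return the first whitespace-delimited word of a header (the read ID)."""
    parts = header.split()
    return parts[0] if parts else ""


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; a sequence may span several lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq_qualities(handle: TextIO) -> Dict[str, str]:
    """Map read ID -> quality string. Assumes 4-line FastQ records."""
    quals = {}
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        handle.readline()  # sequence line (not needed here)
        handle.readline()  # '+' separator line
        quals[read_id(header[1:])] = handle.readline().rstrip("\r\n")
    return quals


def restore_qualities(fasta: TextIO, fastq: TextIO, out: TextIO) -> int:
    """Write a FastQ with sequences from `fasta` and qualities from `fastq`.

    Returns the number of records written. Raises if a read is missing from
    the FastQ or if its length no longer matches the quality string.
    """
    quals = read_fastq_qualities(fastq)
    written = 0
    for header, seq in read_fasta(fasta):
        rid = read_id(header)
        if rid not in quals:
            raise KeyError(f"read {rid!r} not found in the FastQ file")
        if len(quals[rid]) != len(seq):
            raise ValueError(f"length mismatch for read {rid!r}")
        out.write(f"@{header}\n{seq}\n+\n{quals[rid]}\n")
        written += 1
    return written


# Small in-memory demo using the example read from this thread.
demo_fa = io.StringIO(
    ">SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36\n"
    "GGGTGATGGCCGCTGCCGATGGCGaaaaaaaaaaaa\n"
)
demo_fq = io.StringIO(
    "@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36\n"
    "GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC\n"
    "+\n"
    "IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC\n"
)
demo_out = io.StringIO()
restore_qualities(demo_fa, demo_fq, demo_out)
print(demo_out.getvalue(), end="")
```

On real files you would pass open file handles instead, e.g. restore_qualities(open("reads.fa"), open("reads.fq"), open("new.fq", "w")). All qualities are held in memory, so for very large FastQ files the streaming one-liners above may be the better choice.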