What Galaxy tools add Ns to variable length FASTQ sequences to get uniform length? (FASTA if needed)
2.7 years ago
lnrrnl ▴ 20

Hello!

I am trying to align reads from several FASTQ files. I need the sequences to all be the same length, 250 bp, but I do not want to remove sequences that are shorter than 250 bp. Sequence lengths range from 66 to 250 bp. I would like the shorter sequences (for example, those that are 66 bp long) to be padded or extended to 250 bp. I have seen this done by inserting trailing gaps; however, I would like Ns to be inserted instead.

Please let me know if there are any tools on the usegalaxy.org server that would do this. (The EU server works as well!)

Thank you so much for your time.

length galaxy fastq ngs • 815 views

This is your second question in a row about altering query sequences, which is not a typical solution. I have seen aligners that require uniform lengths, but they are pretty archaic. Which modern aligner is forcing you to do this?

2.7 years ago

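Use awk with the following program, assuming the input is not gzipped and every record spans exactly four lines. Sequence lines are padded with N and quality lines with # until they are 250 characters long (input.fastq and padded.fastq are placeholder file names):

awk '{if(NR%4==1 || NR%4==3) {print;} else {C=(NR%4==2?"N":"#");N=length($0);printf("%s",$0);for(i=N;i<250;i++) printf("%s",C); printf("\n");} }' input.fastq > padded.fastq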
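For anyone who would rather not use awk, or whose input is gzipped, the same padding can be sketched in Python. This is only an illustration of the idea in the awk program above, not a Galaxy tool. The names `open_text` and `pad_fastq` are made up for this example, and it assumes strict four-line FASTQ records and Phred+33 qualities (where `#` is a low-quality character).

```python
import gzip

TARGET_LEN = 250  # target read length taken from the question


def open_text(path):
    """Open a FASTQ file as text; treat names ending in .gz as gzipped.

    Hypothetical helper: compression is detected from the extension only.
    """
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def pad_fastq(lines, target=TARGET_LEN, base_pad="N", qual_pad="#"):
    """Yield FASTQ lines with sequence and quality right-padded to `target`.

    Assumes each record is exactly four lines: header, sequence, '+', quality.
    Lines already at or above `target` are passed through unchanged.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        pos = i % 4
        if pos == 1:        # sequence line: pad with N
            yield line.ljust(target, base_pad)
        elif pos == 3:      # quality line: pad with a low-quality character
            yield line.ljust(target, qual_pad)
        else:               # header and '+' separator lines stay as they are
            yield line


if __name__ == "__main__":
    # Tiny in-memory demonstration with a target of 8 instead of 250.
    record = ["@read1", "ACGT", "+", "IIII"]
    for out in pad_fastq(record, target=8):
        print(out)
```

To process a real file, the generator can be fed from `open_text("reads.fastq.gz")` (a placeholder path) and each yielded line written to the output with a trailing newline. Note that the padded quality string must stay the same length as the padded sequence, which is why both lines are extended together.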