Biopython and FastqGeneralIterator
11.7 years ago

I am trying FastqGeneralIterator, which is pretty fast, faster than the SeqIO.parse I was using. The only problem is that, for an original record such as

@M12MX:9:47
AGTCTATAC
+
0::99:DDD

it removes the @ from the first line and does not return any information for the third line. I could just put them back in, but I would be guessing. Does anyone have a solution for this?

Thanks


Thanks very much


You should add comments like this by clicking the comment hyperlink under brentp's answer.

11.7 years ago
brentp 24k

You can read the code to see what it is doing: https://github.com/biopython/biopython/blob/master/Bio/SeqIO/QualityIO.py#L785

It only removes the "@" after checking that's the first character of the title line.

For the second title line (the "+" line, which is the third line of the record), it checks that the line starts with "+". If more characters follow, it checks that they match the title line. So if your file does have information on the third line, it has to be redundant, or FastqGeneralIterator would raise an error.
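To make the two checks concrete, here is a simplified pure-Python sketch of the behaviour described above. It is not Biopython's actual code; the function name `check_title_lines` and the error messages are made up for illustration.

```python
def check_title_lines(title_line, plus_line):
    """Hypothetical helper mirroring the checks described above.

    Returns the title without its leading "@", or raises ValueError.
    """
    # The first line of a record must start with "@"; only then is it stripped.
    if not title_line.startswith("@"):
        raise ValueError("Title line should start with '@'")
    title = title_line[1:].rstrip()

    # The third line must start with "+".
    if not plus_line.startswith("+"):
        raise ValueError("Third line should start with '+'")
    second_title = plus_line[1:].rstrip()

    # Anything after the "+" is only accepted if it repeats the title.
    if second_title and second_title != title:
        raise ValueError("Title and '+' line do not match")
    return title
```

With the record from the question, `check_title_lines("@M12MX:9:47\n", "+\n")` returns the title without the "@", and a "+" line carrying different text raises an error instead of being silently dropped.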

You don't have to guess about the first line, just add the "@" back in.
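A minimal sketch of writing the records back out, assuming you want a bare "+" on the third line. `FastqGeneralIterator` yields `(title, seq, qual)` tuples of strings; the helper names `format_fastq` and `copy_fastq` and the file paths are placeholders, not part of Biopython.

```python
def format_fastq(title, seq, qual):
    """Rebuild a four-line FASTQ record from a (title, seq, qual) tuple."""
    # Add the "@" back and leave the "+" line bare: any text there
    # could only have repeated the title, so nothing is lost.
    return f"@{title}\n{seq}\n+\n{qual}\n"


def copy_fastq(in_path, out_path):
    """Read a FASTQ file with FastqGeneralIterator and write it back out."""
    # Imported here so format_fastq can be used without Biopython installed.
    from Bio.SeqIO.QualityIO import FastqGeneralIterator

    with open(in_path) as in_handle, open(out_path, "w") as out_handle:
        for title, seq, qual in FastqGeneralIterator(in_handle):
            out_handle.write(format_fastq(title, seq, qual))
```

If the original file repeated the title after the "+", you could write `f"+{title}"` on the third line instead; the sequence and quality lines are unaffected either way.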
