7.3 years ago
oshin707
I have FASTA files of reads with many lines starting with '>'. I want to keep the first of those lines and remove the others. Mainly, I want to count the GC content of the FASTA files. I used the code below, but it removes all the lines starting with '>', whereas I want to keep the first one.
less out.85l1M.fa | grep -v ">" > 85l_without_header.txt ## doesn't work for out.85l1M.fa
This is what I used to count the GC content:
from Bio import SeqIO  # Biopython
SeqIO.read("Podisma_mito.fasta", "fasta")  # worked for this file
recordGC = SeqIO.read("Podisma_mito.fasta", "fasta")  # name the record whatever you want
recordGC.seq.count("A")
recordGC.seq.count("T")
recordGC.seq.count("G")
recordGC.seq.count("C")
recordGC.seq.count("GC")
recordGC.seq.count("AT")
Okay, but what does this have to do with removing headers?
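If the goal really is to keep the first header and drop the rest, a minimal Python sketch could look like the following. The function name keep_first_header and the example reads are hypothetical, made up for illustration. Note that this effectively merges all reads under one header, which is probably not needed just to compute GC content.

```python
def keep_first_header(lines):
    """Yield lines, dropping every '>' header except the first one seen."""
    seen_header = False
    for line in lines:
        if line.startswith(">"):
            if seen_header:
                continue  # skip every header after the first
            seen_header = True
        yield line


# Hypothetical two-read example, not taken from out.85l1M.fa
example = [">read1\n", "ACGT\n", ">read2\n", "GGCC\n"]
filtered = list(keep_first_header(example))
print("".join(filtered), end="")
```

For a real file, the same generator could be fed an open file handle and its output written to a new file.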
Also, if your FASTA file has multiple records, you should use SeqIO.parse() rather than SeqIO.read(), which expects exactly one record.
Finally, your code is not getting you the GC content. But you probably already know that.
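To spell that out: seq.count("GC") counts occurrences of the two-letter string "GC", not the number of G plus C bases. GC content is usually taken as (G + C) divided by the sequence length. Below is a rough sketch of a per-record calculation using only the standard library, so that it runs without Biopython. The helper names read_fasta and gc_fraction and the demo reads are assumptions for illustration; with Biopython installed, gc_fraction should work just as well on str(record.seq) for each record from SeqIO.parse().

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_fraction(seq):
    """Return (G + C) / length, case-insensitive; 0.0 for an empty sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical in-memory FASTA; replace with open("out.85l1M.fa") for a real file
demo = io.StringIO(">read1\nGGCC\n>read2\nATGC\nAT\n")
results = {name: gc_fraction(seq) for name, seq in read_fasta(demo)}
for name, value in results.items():
    print(name, round(value, 3))
```

This keeps the headers in place, so there is no need to strip them before counting. Ambiguous bases such as N are counted in the length here; whether to exclude them is a choice you would have to make for your data.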