Question: [Python] reads from fastq file
thebioinfo0 wrote:

I am looking for the fastest Python code to extract only the reads from a fastq file and store them in a new file.

fastq python • 3.5k views
modified 2.6 years ago by Buffo1.8k • written 2.6 years ago by thebioinfo0

to extract only reads from fastq file and store it in new file

What do you want to do?

written 2.6 years ago by WouterDeCoster44k

As a Python beginner, I just need to learn how to do it.

written 2.6 years ago by thebioinfo0

You just want to copy all reads to a new file?

written 2.6 years ago by WouterDeCoster44k

Yes, I want to copy all reads to a new file.

written 2.6 years ago by thebioinfo0

Do you mean fastq to fasta?

written 2.6 years ago by Buffo1.8k

Yes, fastq to fasta.

written 2.6 years ago by thebioinfo0

Well, you had better be more precise next time when asking questions.

written 2.6 years ago by WouterDeCoster44k
said342790 wrote:
from Bio import SeqIO

SeqIO.convert('myfile.fastq','fastq','myoutput.fasta','fasta')
written 2.6 years ago by said342790
Alex Reynolds31k wrote:
$ python -c "import subprocess; subprocess.check_call(\"awk '{ if (NR%4==1) { print \\\">\\\" substr(\$0, 2); } else if (NR%4==2) { print \$0; } }' source.fq > destination.fa\", shell=True)"

Or:

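#!/usr/bin/env python
import subprocess
# Run awk through the shell: line 1 of every 4 becomes the FASTA header
# (leading "@" replaced by ">"), line 2 of every 4 is the sequence.
subprocess.check_call("awk '{ if (NR%4==1) { print \">\" substr($0, 2); } else if (NR%4==2) { print $0; } }' source.fq > destination.fa", shell=True)
modified 2.6 years ago • written 2.6 years ago by Alex Reynolds31k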

While technically correct and probably the most efficient, it doesn't really teach OP something about Python :-p

written 2.6 years ago by WouterDeCoster44k

Doing things on the command line is going to be the fastest route, so the subprocess library is useful to know, if the asker is forced to use Python.

written 2.6 years ago by Alex Reynolds31k
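
For anyone following the subprocess route, here is a minimal sketch of the same idea without `shell=True`, so the awk program needs no shell escaping. It assumes `awk` is on the PATH and that every FASTQ record is exactly four lines; the names `AWK_PROGRAM` and `fastq_to_fasta_awk` are made up for illustration.

```python
import subprocess

# awk program: line 1 of every 4 is the header (drop the leading "@",
# prepend ">"), line 2 of every 4 is the sequence, printed unchanged.
AWK_PROGRAM = 'NR % 4 == 1 { print ">" substr($0, 2) } NR % 4 == 2 { print }'


def fastq_to_fasta_awk(in_path, out_path):
    """Convert a 4-line-per-record FASTQ file to FASTA by calling awk."""
    with open(out_path, "w") as outfile:
        # Passing a list of arguments avoids the shell entirely; stdout is
        # redirected to the output file, and check=True raises
        # CalledProcessError if awk exits with a non-zero status.
        subprocess.run(["awk", AWK_PROGRAM, in_path], stdout=outfile, check=True)
```

Usage would look like `fastq_to_fasta_awk("source.fq", "destination.fa")`.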
WouterDeCoster44k wrote:

This is a general code snippet for copying a file, which is what you are asking for, but it is quite useless in my opinion.

with open("myfile.fastq") as infile, open("myoutput.fastq", 'w') as output:
    for line in infile:
        output.write(line)
written 2.6 years ago by WouterDeCoster44k

I guess @thebioinfo wants only the reads, so every 4th line.

written 2.6 years ago by grant.hovhannisyan2.0k
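
If "only the reads" means just the sequence lines, the copy loop above can be narrowed to one line out of every four. A minimal sketch, assuming each FASTQ record is exactly four lines with the sequence on the second one; `extract_sequences` and `copy_sequences` are hypothetical names, not part of any library.

```python
from itertools import islice


def extract_sequences(fastq_lines):
    """Return an iterator over the sequence line of each 4-line record."""
    # Start at index 1 (the second line) and take every 4th line after it.
    return islice(fastq_lines, 1, None, 4)


def copy_sequences(in_path, out_path):
    """Write only the sequence lines of in_path to out_path."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        # Lines keep their trailing newline, so writelines needs no extras.
        outfile.writelines(extract_sequences(infile))
```

Because `islice` works on any iterable of lines, the same function can be fed an open file or a list of strings.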
Buffo1.8k wrote:

As you are looking for fastq to fasta, I think this duplicates a question already asked here.

If you want to learn some python you may try:

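import sys

# Path to the input FASTQ file, given as the first command-line argument.
filename = sys.argv[1]

with open(filename, "r") as infile:
    # Each FASTQ record spans four lines: header, sequence, "+", qualities.
    for line_ct, line in enumerate(infile):
        if line_ct % 4 == 0:
            # Header line: replace the leading "@" with ">".
            print(">" + line[1:], end="")
        elif line_ct % 4 == 1:
            # Sequence line: print it unchanged.
            print(line, end="")
written 2.6 years ago by Buffo1.8k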
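
The same line-counting idea can be wrapped in a function that writes straight to a file and complains when the input does not look like FASTQ. This is only a sketch, assuming four lines per record and no blank lines; `fastq_to_fasta` and `convert_file` are illustrative names.

```python
def fastq_to_fasta(fastq_lines):
    """Yield FASTA lines from an iterable of FASTQ lines (4 per record)."""
    for number, line in enumerate(fastq_lines):
        position = number % 4
        if position == 0:
            # Header line: must start with "@", which becomes ">".
            if not line.startswith("@"):
                raise ValueError(
                    f"line {number + 1}: expected '@' header, got {line!r}"
                )
            yield ">" + line[1:]
        elif position == 1:
            # Sequence line: passed through unchanged.
            yield line
        # Positions 2 ("+") and 3 (qualities) are skipped.


def convert_file(in_path, out_path):
    """Convert the FASTQ file at in_path into a FASTA file at out_path."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        outfile.writelines(fastq_to_fasta(infile))
```

Calling `convert_file("myfile.fastq", "myoutput.fasta")` would then do the conversion in one step; the header check only catches files that drift out of the four-line pattern, it is not a full FASTQ validator.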