Question: Exploring FASTQ files
ezraamustafa3 wrote, 14 months ago:

How can I extract only the sequence identifiers from a FASTQ file, without the sequences or the quality scores, using Linux?

ngs fastq • 370 views
h.mon (30k, Brazil) wrote, 14 months ago:

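Another option to print only the read names is to print every fourth line, starting from the first. This works because each FASTQ record spans four lines (header, sequence, "+", qualities), provided the sequences are not wrapped across several lines: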

zcat file.fastq.gz | awk 'NR%4==1'
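To illustrate the every-fourth-line approach end to end, here is a minimal sketch. It assumes strictly four-line records; the demo file, its read names, and the helper names fastq_headers and fastq_ids are hypothetical, made up for this example. It uses gzip -dc rather than zcat because zcat on some systems (macOS, for instance) expects .Z files.

```shell
#!/bin/sh
# Build a tiny two-record gzipped FASTQ in a temp dir (made-up demo data).
# Note the second quality string starts with "@", which is why matching
# on a leading "@" alone is not a safe way to find header lines.
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.fastq.gz"
printf '%s\n' \
  '@read1 1:N:0:1' 'ACGT' '+' 'IIII' \
  '@read2 1:N:0:1' 'TTGA' '+' '@III' \
  | gzip -c > "$demo"

# Print full header lines: every 4th line, starting at line 1.
fastq_headers() {
  gzip -dc "$1" | awk 'NR % 4 == 1'
}

# Print bare identifiers: first whitespace-delimited field of each
# header line, with the leading "@" removed.
fastq_ids() {
  gzip -dc "$1" | awk 'NR % 4 == 1 { sub(/^@/, "", $1); print $1 }'
}

fastq_headers "$demo"
fastq_ids "$demo"
```

Redirect the output of either helper to a file (for example fastq_ids file.fastq.gz > ids.txt) if the names are needed for downstream filtering.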
swbarnes2 (7.9k, United States) wrote, 14 months ago:

Use zcat myfile.fastq.gz | head to see the first 10 lines; you should be able to spot a couple of read names. The first few fields of each name should be the instrument ID and run ID, and if the FASTQ files come straight from the instrument, those should be constant in every read. Something like zgrep M012933 myfile.fastq.gz (with your own instrument ID in place of M012933) should then get you all the read names.
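As a rough sketch of the instrument-ID approach, the snippet below reads the ID from the first header and then matches only lines that begin with "@" followed by that ID. It assumes Illumina-style colon-separated read names that all share one instrument ID; the demo reads are made up, and gzip -dc piped into grep stands in for zgrep. Anchoring the pattern at the start of the line is a small addition to the suggestion above, meant to make accidental matches on sequence or quality lines less likely.

```shell
#!/bin/sh
# Build a tiny gzipped FASTQ with Illumina-style names (made-up demo data).
tmpdir=$(mktemp -d)
demo="$tmpdir/demo.fastq.gz"
printf '%s\n' \
  '@M012933:7:000000000-A1B2C:1:1101:1000:2000 1:N:0:1' 'ACGT' '+' 'IIII' \
  '@M012933:7:000000000-A1B2C:1:1101:1001:2001 1:N:0:1' 'TTGA' '+' '@III' \
  | gzip -c > "$demo"

# Instrument ID: the text between the leading "@" and the first ":"
# on the first line of the file.
instrument=$(gzip -dc "$demo" | head -n 1 | cut -d: -f1 | sed 's/^@//')
echo "instrument ID: $instrument"

# Keep only lines that start with "@<instrument>:", i.e. the read names.
gzip -dc "$demo" | grep "^@${instrument}:"
```

If the file mixes reads from several instruments or runs, this pattern will miss some names, and the every-fourth-line approach is the safer choice.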


ezraamustafa3 replied, 14 months ago:

It worked for me, thank you very much!