Question: Number of reads with fastq file name
Bioinfonext wrote:

I want the number of reads in each fastq file printed together with the file name. The script below only gives the number of reads:

File names are like this:

Soil-6_S22_L001.m150-p1.join.fq

Soil-7.m150-p1.join.fq

Soil-8_S32_L001.m150-p1.join.fq

I would like to get the number of reads next to the sample name, like this:

Soil-6              384994

Soil-7              205889

How should I modify the script below?

#!/bin/bash

for i in `ls *.fq`; do echo $(cat ${i} | wc -l)/4|bc; done

Kind Regards

modified 10 months ago by lakhujanivijay • written 10 months ago by Bioinfonext

input fastq:

$ cat test.fq 
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

With awk, counting the first line of every four-line record, for one file:

$ awk -v OFS="\t" 'NR % 4 == 1 {count++} END{print FILENAME,count}' test.fq 
test.fq 3

from multiple files:

$ parallel awk -v "OFS='\t' 'NR % 4 == 1 {count++} END{print FILENAME,count}'" {} ::: *.fq
test10.fq   3
test1.fq    3
test2.fq    3
test3.fq    3
test4.fq    3
test5.fq    3
test6.fq    3
test7.fq    3
test8.fq    3
test9.fq    3


$ find . -type f -name "*.fq" -exec awk -v OFS="\t" 'NR % 4 == 1 {count++} END{print FILENAME,count}' {} \; 
./test5.fq  3
./test4.fq  3
./test2.fq  3
./test8.fq  3
./test7.fq  3
./test3.fq  3
./test9.fq  3
./test6.fq  3
./test.fq   3
./test10.fq 3
./test1.fq  3
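
The parallel and find variants above start one awk process per file. As a rough sketch, a single awk call can also report every file by watching for FNR to reset to 1. The function name count_reads_per_file is made up for illustration, and the code assumes plain four-line records (no wrapped sequence lines); empty files are silently skipped.

```shell
#!/bin/bash
# Hypothetical helper: print "<file name><TAB><read count>" for each FASTQ file
# passed as an argument, using one awk process for all files.
# Assumption: every read occupies exactly four lines.
count_reads_per_file() {
    awk -v OFS="\t" '
        FNR == 1 && NR > 1 { print prev, n / 4; n = 0 }  # a new file began: report the previous one
        { prev = FILENAME; n++ }                          # count every line of the current file
        END { if (NR > 0) print prev, n / 4 }             # report the last file
    ' "$@"
}
# Usage: count_reads_per_file *.fq
```

Because it counts lines instead of searching for a character, a quality string that happens to contain "@" does not change the result.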
written 10 months ago by cpad0112

lakhujanivijay wrote:
#!/bin/bash

for i in *.fq; do file_name=$(basename -s .fq "$i"); printf '%s\t%d\n' "$file_name" "$(( $(wc -l < "$i") / 4 ))"; done

Explanation

file_name=$(basename -s .fq "$i")

basename = command that strips the directory and, optionally, a suffix from a file name

-s = SUFFIX, remove a trailing SUFFIX

file_name=$(basename -s .fq "$i") = remove the suffix .fq from the given file name and store the result in the variable called file_name

$(( $(wc -l < "$i") / 4 )) = count the lines in the file and divide by 4, because each read occupies four lines
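
basename -s .fq removes only the extension, so Soil-6_S22_L001.m150-p1.join.fq becomes Soil-6_S22_L001.m150-p1.join. To get the short names shown in the question (Soil-6, Soil-7), one option is bash parameter expansion, cutting at the first "_" or ".". The sketch below assumes the sample name never contains either character; sample_name and report_reads are hypothetical helper names, not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: reduce a FASTQ file name to its short sample name by
# dropping everything from the first "_" or "." onward.
sample_name() {
    local base
    base=$(basename "$1")            # strip any leading directory
    printf '%s\n' "${base%%[_.]*}"   # remove the longest suffix starting at "_" or "."
}

# Hypothetical helper: print "<sample><TAB><reads>" for each file,
# assuming four lines per read.
report_reads() {
    local f lines
    for f in "$@"; do
        lines=$(wc -l < "$f")
        printf '%s\t%d\n' "$(sample_name "$f")" "$(( lines / 4 ))"
    done
}
# Usage: report_reads *.fq
```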

written 10 months ago by lakhujanivijay

Pierre Lindenbaum wrote:
for i in *.fq; do echo -n "$i " && paste - - - - < "$i" | wc -l; done
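
For readers unfamiliar with the idiom: paste - - - - reads standard input four lines at a time and joins each group into one tab-separated line, so wc -l then counts records rather than lines. A minimal sketch of the same idea wrapped in a function (count_records is a made-up name, and four-line records are assumed):

```shell
#!/bin/bash
# Hypothetical helper: count FASTQ records by folding every four input lines
# into one line with paste, then counting the folded lines.
# Assumption: unwrapped, four-line records.
count_records() {
    paste - - - - < "$1" | wc -l
}
# Usage: for f in *.fq; do printf '%s\t%d\n' "$f" "$(count_records "$f")"; done
```

Since nothing is pattern-matched, quality lines that start with or contain "@" are not counted as extra reads.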
written 10 months ago by Pierre Lindenbaum

JC wrote:
for i in *.fq; do echo "$(echo "$i" | perl -pe 's/[_.].*//')  $(( $(wc -l < "$i") / 4 ))"; done
written 10 months ago by JC