Question: Number of reads with FASTQ file name
25 days ago, Bioinfonext (Korea) wrote:

I want to print the number of reads in each FASTQ file together with the file name. The script below prints only the read counts, without the names.

The file names look like this:

Soil-6_S22_L001.m150-p1.join.fq

Soil-7.m150-p1.join.fq

Soil-8_S32_L001.m150-p1.join.fq

I would like output like this, with the sample name followed by the number of reads:

Soil-6              384994

Soil-7              205889

How should I modify the script below?

#!/bin/bash

for i in `ls *.fq`; do echo $(cat ${i} | wc -l)/4|bc; done

Kind Regards


Input FASTQ:

$ cat test.fq 
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

With awk, for a single file, counting four lines per read:

$ awk -v OFS="\t" 'END{print FILENAME, NR/4}' test.fq
test.fq 3
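
One caveat about counting reads by matching '@': in Phred+33 encoding '@' is also a valid quality character (score 31), so a pattern such as /@/ can count quality lines as well as headers. The sketch below builds a small made-up FASTQ (the temporary file and read names are illustrative only) and compares the two ways of counting:

```shell
#!/bin/bash
# Build a tiny FASTQ with two reads whose quality strings contain '@'.
demo=$(mktemp)
printf '@read1\nACGT\n+\nII@I\n@read2\nACGT\n+\n@III\n' > "$demo"

# Pattern match: counts every line containing '@', headers and quality lines alike.
by_pattern=$(awk '/@/ {count++} END{print count+0}' "$demo")

# Line arithmetic: one record is exactly four lines.
by_lines=$(awk 'END{print NR/4}' "$demo")

echo "pattern match: $by_pattern"   # 4 (overcounts)
echo "lines / 4:     $by_lines"     # 2 (the real number of reads)

rm -f "$demo"
```

Dividing the line count by four assumes the usual unwrapped four-line FASTQ layout.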

From multiple files:

$ parallel "awk -v OFS='\t' 'END{print FILENAME, NR/4}' {}" ::: *.fq
test10.fq   3
test1.fq    3
test2.fq    3
test3.fq    3
test4.fq    3
test5.fq    3
test6.fq    3
test7.fq    3
test8.fq    3
test9.fq    3


$ find . -type f -name "*.fq" -exec awk -v OFS="\t" 'END{print FILENAME, NR/4}' {} \;
./test5.fq  3
./test4.fq  3
./test2.fq  3
./test8.fq  3
./test7.fq  3
./test3.fq  3
./test9.fq  3
./test6.fq  3
./test.fq   3
./test10.fq 3
./test1.fq  3
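
The commands above start one awk process per file. As a possible alternative, a single awk call can walk through all files and print a count each time a new file begins. This is only a sketch: count_all is a made-up helper name, it assumes four lines per read, and empty files are silently skipped.

```shell
#!/bin/bash
# count_all: hypothetical helper printing "<file><TAB><reads>" for each argument.
count_all() {
    awk -v OFS='\t' '
        FNR == 1 && name != "" { print name, n / 4 }   # a new file started: report the previous one
        { name = FILENAME; n = FNR }                    # remember the current file and its line count
        END { if (name != "") print name, n / 4 }       # report the last file
    ' "$@"
}

# Usage (assumes the *.fq files are in the current directory):
# count_all *.fq
```

FNR restarts at 1 for every input file, which is what lets one awk program notice the file boundaries.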
Comment written 25 days ago by cpad0112, modified 24 days ago
25 days ago, Vijay Lakhujani (India) wrote:
#!/bin/bash

for i in *.fq; do file_name=$(basename -s .fq "$i"); printf '%s\t%s\n' "$file_name" "$(echo "$(wc -l < "$i")/4" | bc)"; done

Explanation

basename = Linux command that strips the directory and, optionally, a suffix from a file name

-s = SUFFIX, remove a trailing SUFFIX

file_name=$(basename -s .fq "$i") = remove the suffix .fq from the given file name and store the result in the variable file_name

$(echo "$(wc -l < "$i")/4" | bc) = the number of lines divided by 4 (four lines per read), computed by bc inside a command substitution so that printf receives the result
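
To get the short sample names from the question (Soil-6, Soil-7) rather than the file name minus .fq, shell parameter expansion is one option. The sketch below assumes that the sample name ends at the first '_' or '.' in the file name and that each read spans four lines; sample_name, count_reads and report_dir are made-up helper names.

```shell
#!/bin/bash
# Sketch: print "<sample><TAB><reads>" for every *.fq file in a directory.

sample_name() {
    # Strip any leading directory, then everything from the first '_' or '.'.
    local base=${1##*/}
    echo "${base%%[_.]*}"
}

count_reads() {
    # Number of lines divided by 4 (assumes four-line FASTQ records).
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

report_dir() {
    local f
    for f in "$1"/*.fq; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        printf '%s\t%s\n' "$(sample_name "$f")" "$(count_reads "$f")"
    done
}

# Report on the directory given as first argument, or the current one.
report_dir "${1:-.}"
```

With the file names from the question this would print lines such as Soil-6 followed by a tab and the count. If two files share the same prefix before the first '_' or '.', they will show the same sample name.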

25 days ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:
for i in  *.fq ; do echo -n "$i " &&  cat $i | paste - - - - | wc -l ; done
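
For anyone wondering why this works: paste - - - - reads standard input four lines at a time and joins them with tabs, so each read becomes a single line and wc -l counts reads directly. A small illustration with two made-up reads:

```shell
#!/bin/bash
# Two reads = eight lines of input (made-up data for illustration).
fastq=$(printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nII@I\n')

# paste with four '-' operands takes four lines from stdin per output line.
records=$(printf '%s\n' "$fastq" | paste - - - -)
printf '%s\n' "$records"

# One output line per read, so counting lines counts reads.
n_reads=$(printf '%s\n' "$records" | wc -l)
echo "reads: $((n_reads))"   # reads: 2
```

Because the count is based on line positions, '@' characters inside quality strings do not affect it.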
25 days ago, JC (Mexico) wrote:
for i in *.fq; do echo "$(echo "$i" | perl -pe 's/[_.].*//')  $(( $(wc -l < "$i") / 4 ))"; done