Question: Number of reads with fastq file name
18 months ago by
Bioinfonext250
Korea
Bioinfonext250 wrote:

I want the number of reads in each fastq file printed together with the file name. The script below only gives the number of reads:

The file names look like this:

Soil-6_S22_L001.m150-p1.join.fq

Soil-7.m150-p1.join.fq

Soil-8_S32_L001.m150-p1.join.fq

I would like to get the number of reads next to the sample name, like this:

Soil-6              384994

Soil-7              205889

How should I modify the script below?

#!/bin/bash

for i in `ls *.fq`; do echo $(cat ${i} | wc -l)/4|bc; done

Kind Regards

bash linux • 915 views
modified 18 months ago by lakhujanivijay5.2k • written 18 months ago by Bioinfonext250

input fastq:

$ cat test.fq 
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

with awk to print from one file:

$ awk -v OFS="\t" 'NR % 4 == 1 {count++} END{print FILENAME,count}' test.fq 
test.fq 3
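
A caveat when counting FASTQ records by matching `@`: in the common Phred+33 encoding, `@` is also a valid quality character, so a pattern such as `/@/` can match quality lines and overcount. The sketch below compares pattern matching with counting every fourth line on a made-up two-read file. The function names `count_by_pattern` and `count_by_position` are illustrative only, and the positional count assumes strict four-line records.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Counts lines containing '@' (can overcount: '@' is a valid quality character).
count_by_pattern() {
    grep -c '@' "$1"
}

# Counts header lines by position, assuming strict 4-line FASTQ records.
count_by_position() {
    awk 'NR % 4 == 1 { n++ } END { print n + 0 }' "$1"
}

# Made-up demo file: the quality string of the second read starts with '@'.
demo=$(mktemp)
printf '%s\n' '@read1' 'ACGT' '+' 'IIII' '@read2' 'ACGT' '+' '@III' > "$demo"

printf 'pattern:  %s\n' "$(count_by_pattern "$demo")"
printf 'position: %s\n' "$(count_by_position "$demo")"
rm -f "$demo"
```

On this demo file the pattern-based count reports 3 while the positional count reports 2, which is the true number of reads.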

from multiple files:

$ parallel awk -v "OFS='\t' 'NR % 4 == 1 {count++} END{print FILENAME,count}'" {} ::: *.fq
test10.fq   3
test1.fq    3
test2.fq    3
test3.fq    3
test4.fq    3
test5.fq    3
test6.fq    3
test7.fq    3
test8.fq    3
test9.fq    3


$ find . -type f -name "*.fq" -exec awk -v OFS="\t" 'NR % 4 == 1 {count++} END{print FILENAME,count}' {} \; 
./test5.fq  3
./test4.fq  3
./test2.fq  3
./test8.fq  3
./test7.fq  3
./test3.fq  3
./test9.fq  3
./test6.fq  3
./test.fq   3
./test10.fq 3
./test1.fq  3
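
As a possible alternative to starting one awk process per file, a single awk call can report per-file counts by resetting a counter whenever `FNR` returns to 1. This is only a sketch: it assumes uncompressed FASTQ with four lines per read, the function name `count_reads_per_file` is made up for illustration, and empty files are skipped because awk never reads a record from them.

```shell
#!/bin/bash
# Hypothetical helper: prints "<file><TAB><reads>" for each FASTQ file given.
count_reads_per_file() {
    awk -v OFS='\t' '
        FNR == 1 && NR > 1 { print fname, n / 4 }   # previous file is finished
        FNR == 1           { fname = FILENAME; n = 0 }
                           { n++ }
        END                { if (NR > 0) print fname, n / 4 }
    ' "$@"
}

# Usage (assumes *.fq files exist in the current directory):
# count_reads_per_file *.fq
```

Compared with `parallel` or `find -exec ... \;`, this keeps the output in the same order as the arguments, at the cost of running on a single core.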
modified 18 months ago • written 18 months ago by cpad011214k
18 months ago by
lakhujanivijay5.2k
India
lakhujanivijay5.2k wrote:
#!/bin/bash

for i in *.fq; do file_name=$(basename -s .fq "$i"); printf '%s\t%s\n' "$file_name" "$(( $(wc -l < "$i") / 4 ))"; done

Explanation

file_name=$(basename -s .fq "$i")

basename = Linux command that strips the directory and, optionally, a suffix from a file name

-s = SUFFIX, remove a trailing SUFFIX

file_name=$(basename -s .fq "$i") = remove the suffix .fq from the given file name and store the result in the variable called file_name

$(( $(wc -l < "$i") / 4 )) = count the lines in the file and divide by 4, since each read takes four lines
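
To get the shorter sample names shown in the question (Soil-6, Soil-7), the file name can be cut at the first `_` or `.` with bash parameter expansion instead of `basename`. This is a minimal sketch assuming every sample name ends at the first `_` or `.`, as in the example file names, and that the files are uncompressed FASTQ with four lines per read. `sample_counts` is a made-up function name.

```shell
#!/bin/bash
# Hypothetical helper: prints "<sample><TAB><reads>" for each FASTQ file given.
sample_counts() {
    local f name lines
    for f in "$@"; do
        name=${f##*/}          # drop any leading directory
        name=${name%%[_.]*}    # keep the text before the first "_" or "."
        lines=$(wc -l < "$f")  # total lines; 4 lines per read
        printf '%s\t%d\n' "$name" "$(( lines / 4 ))"
    done
}

# Usage (assumes *.fq files exist in the current directory):
# sample_counts *.fq
```

If two files share the same prefix before the first `_` or `.`, they would print under the same sample name, so the cut point may need adjusting for other naming schemes.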

modified 18 months ago • written 18 months ago by lakhujanivijay5.2k
18 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum130k wrote:
for i in *.fq; do printf '%s ' "$i"; paste - - - - < "$i" | wc -l; done
written 18 months ago by Pierre Lindenbaum130k
18 months ago by
JC11k
Mexico
JC11k wrote:
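# count lines / 4 rather than matching '@', which can also occur in quality strings
for i in *.fq; do echo "$(echo "$i" | perl -pe 's/[_.].*//')  $(( $(wc -l < "$i") / 4 ))"; done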
modified 18 months ago • written 18 months ago by JC11k