Question: Iteration to count the number of raw reads
0
14 months ago by
wanaga31660
Montpellier
wanaga31660 wrote:

Hi everyone,

I wrote a short script to count the number of raw reads in each fastq.gz file in my folder. The results are saved to a text file (.txt).

echo "We count the number of reads for each file (fastq.gz)."
echo "file\traw_reads\ttrimmed_reads" > count_read_evolution_trim_test.txt
for f1 in *paired.fq.gz
  do
     echo $f1
     RAW_READS=`gunzip -c $f1 | echo $((`wc -l`/4))`
     echo -e "$f1\t$RAW_READS" >> count_read_evolution_trim_test.txt
  done

But I got this error message:

./test_count_raw_reads.sh: command substitution: line 9: syntax error near unexpected token `)'
./test_count_raw_reads.sh: command substitution: line 9: `/4))'
./test_count_raw_reads.sh: command substitution: line 9: unexpected EOF while looking for matching `)'
./test_count_raw_reads.sh: command substitution: line 10: syntax error: unexpected end of file
./test_count_raw_reads.sh: line 9: -l: command not found

How can I fix this problem?

Thank you.

ADD COMMENTlink modified 14 months ago by 2nelly220 • written 14 months ago by wanaga31660

When I run this command on its own:

gunzip -c $f1 | echo $((`wc -l`/4))

It works: I get the number of raw reads. I just want to put the number of raw reads for each file in a table, like in this example:

file | raw_read_number

my_file_1 | 130000

my_file_2 | 160000

my_file_3 | 200000

I don't understand where I made a mistake.

ADD REPLYlink written 14 months ago by wanaga31660
gunzip -c $f1 | echo 

You're piping a stream into echo, which is wrong: echo prints its arguments and does not read standard input.

ADD REPLYlink written 14 months ago by Pierre Lindenbaum130k
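
A quick way to see the point about echo, as a minimal sketch with made-up input lines. In the one-liner above, the count most likely comes from `wc -l`, which is started by the command substitution and inherits the pipe as its standard input; `echo` itself never reads the stream. Letting `wc -l` read the pipe directly gives the same number without relying on that side effect:

```shell
# echo ignores standard input: this prints an empty line, not "hello".
printf 'hello\n' | echo

# wc -l reads the stream itself; the division happens afterwards.
# The four input lines here stand in for one FASTQ record.
n=$(printf 'a\nb\nc\nd\n' | wc -l)
echo $(( n / 4 ))    # prints 1
```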

Your use of ` is messing up this line:

`gunzip -c $f1 | echo $((`wc -l`/4))`
ADD REPLYlink modified 14 months ago • written 14 months ago by Martombo2.6k

If you look at my initial code (at the top of this post), I used the backquotes exactly as you wrote. And I can't save the number of raw reads in my text table (count_read_evolution_trim_test.txt).

The problem in my code is that I have two pairs of backquotes in this line:

RAW_READS=`gunzip -c $f1 | echo $((`wc -l`/4))`

And it doesn't count the number of raw reads.

ADD REPLYlink modified 14 months ago • written 14 months ago by wanaga31660
2
14 months ago by
Martombo2.6k
Seville, ES
Martombo2.6k wrote:

Yep, exactly. The second ` stops the expression at:

`gunzip -c $f1 | echo $((`

If you want to keep this line you can maybe use awk to divide by 4:

RAW_READS=`gunzip -c $f1 | wc -l | awk '{print($1/4)}'`

edit: this is probably more efficient:

RAW_READS=`gunzip -c $f1 | awk 'END{print(NR/4)}'`
ADD COMMENTlink modified 14 months ago • written 14 months ago by Martombo2.6k

Thank you Martombo. Your solution works. I will be more careful with the backquotes.

ADD REPLYlink written 14 months ago by wanaga31660
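
Another way around the backquote clash discussed above is the `$(...)` form of command substitution, which can be nested where backquotes cannot (unless escaped). Here is a minimal sketch of the original loop rewritten that way. It assumes gzip-compressed FASTQ files with exactly four lines per record; the function name `count_reads` is made up for illustration, and `printf` replaces `echo -e` so that the tabs are written the same way in any shell:

```shell
# count_reads is a hypothetical helper name, not part of any tool.
# Assumes every FASTQ record occupies exactly four lines.
count_reads() {
    # $(...) nests inside $(( ... )) without any quoting tricks.
    echo $(( $(gunzip -c "$1" | wc -l) / 4 ))
}

out=count_read_evolution_trim_test.txt
printf 'file\traw_reads\n' > "$out"
for f1 in *paired.fq.gz; do
    [ -e "$f1" ] || continue   # skip the literal pattern when nothing matches
    printf '%s\t%s\n' "$f1" "$(count_reads "$f1")" >> "$out"
done
```

Quoting `"$f1"` also keeps the loop working for file names that contain spaces.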
1
14 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum130k wrote:

you want:

find . -name "*q.gz" | while read F; do echo -n "$F " && echo $(( $(gunzip -c $F | wc -l) / 4 )) ; done
ADD COMMENTlink written 14 months ago by Pierre Lindenbaum130k
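
If some file names contain spaces, the `find ... | while read` pattern above can split or mangle them. A hedged variant, assuming a POSIX `find` and `sh`, lets `find` hand the paths straight to a small inline script and prints a tab-separated table; the name `count_all` is hypothetical:

```shell
# count_all is a hypothetical name. It prints "path<TAB>reads" for every
# *q.gz file under the given directory (default: current directory).
# Assumes every FASTQ record occupies exactly four lines.
count_all() {
    find "${1:-.}" -name '*q.gz' -exec sh -c '
        for f do
            printf "%s\t%s\n" "$f" "$(( $(gunzip -c "$f" | wc -l) / 4 ))"
        done' sh {} +
}
```

Calling `count_all . > counts.txt` would then write one line per file, where `counts.txt` is just an example output name.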

0
14 months ago by
2nelly220
Geneva,Switzerland
2nelly220 wrote:

Hi wanaga3166

What about this:

for f in *fastq.gz; do echo -n "$f | "  && echo $(($(zcat $f | echo $((`wc -l`/4))))); done > output && sed  -i '1i file | raw_read_number' output
ADD COMMENTlink written 14 months ago by 2nelly220