Shell scripts and grep, awk command
17 months ago

Dear all,

I have a script like the following, but it always reports grep: .pep: No such file or directory when I run it directly. I am also confused by the awk and paste command lines in this script; could anyone help clarify what they mean?

ls *BGI* | while read pipeline
do
grep '>' ${line}.pep | sed 's/>//g' > ${line}.modify.seqid
awk '{if($3 =="gene"){a=index($0,"~~");b=index($0,";");print substr($0,a+2,b-a-2)}}' > ${line}.origin.seqid
paste -d '\t' ${line}.origin.seqid ${line}.modify.seqid > ${line}_coress.seqid
done

Also, here are the related files and what they look like:

  1. Prunella_fulvescens_BGI.pep
  2. Prunella_himalayana_BGI.pep

I tried the grep command on its own like this:

grep '>' Prunella_fulvescens_BGI.pep.fa | sed 's/>//g' > Prunella_fulvescens_BGI.modify.seqid

It creates a file Prunella_fulvescens_BGI.modify.seqid like this:

PRUFUL_R14685  [mRNA]  locus=scaffold1534:862:1958:- [translate_table: standard]
PRUFUL_R05501  [mRNA]  locus=scaffold10:6068:6277:- [translate_table: standard]
PRUFUL_R10205  [mRNA]  locus=scaffold1726:12292:22891:+ [translate_table: standard]
PRUFUL_R07295  [mRNA]  locus=scaffold2643:33509:34789:+ [translate_table: standard]
PRUFUL_R07296  [mRNA]  locus=scaffold2643:7708:8886:+ [translate_table: standard]
PRUFUL_R10726  [mRNA]  locus=scaffold1079:172399:200680:- [translate_table: standard]
PRUFUL_R13095  [mRNA]  locus=scaffold1079:16827:27665:+ [translate_table: standard]
PRUFUL_R13096  [mRNA]  locus=scaffold1079:111918:114560:+ [translate_table: standard]
PRUFUL_R14411  [mRNA]  locus=scaffold1079:153277:155466:+ [translate_table: standard]
PRUFUL_R07297  [mRNA]  locus=scaffold2912:21167:49513:+ [translate_table: standard]

I would appreciate it if you could help combine the scripts and explain a little bit...:)

scripts shell grep awk command • 653 views

It seems like your ${line} is empty. Try echoing ${line} inside the loop first.
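A minimal sketch of that check, assuming the loop header from the question (`while read pipeline`): `read` fills the variable it is given, here `pipeline`, so `${line}` is never assigned. The file name is taken from the question and is fed in with `printf` only so that the example runs without the real files.

```shell
unset line   # make sure no stale value is inherited from the environment

# Feed one file name into the same kind of loop header as in the question
out=$(printf '%s\n' Prunella_fulvescens_BGI.pep.fa |
    while read -r pipeline
    do
        # Print both variables: only the one named after `read` is set
        echo "line=[${line}] pipeline=[${pipeline}]"
    done)
echo "$out"
```

If `line` shows up empty between the brackets, `${line}.pep` expands to just `.pep`, which would match the reported error message.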


Yes, exactly!! When I drop the loop and run grep '>' Prunella_fulvescens_BGI.pep.fa | sed 's/>//g' > Prunella_fulvescens_BGI.modify.seqid directly, this step works well. But I don't know how to echo ${line} to make it work in the loop....I am very new to scripting....:(


grep '>' ${line}.pep

This line in your script looks for files ending in .pep, whereas yours seem to end in .pep.fa. That is the first issue. There are likely others.

At a minimum modify that bit to

grep '>' ${line}.pep.fa

and see if that helps.
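A related detail: `ls *BGI*` prints complete file names, suffix included, so appending `.pep.fa` to a name that already carries it would double the suffix. A small sketch of stripping the suffix with shell parameter expansion, assuming the file name from the question (the variable names `f` and `base` are only illustrative):

```shell
f=Prunella_fulvescens_BGI.pep.fa

# ${f%.pep.fa} removes the shortest trailing match of ".pep.fa"
base=${f%.pep.fa}

echo "$base"                   # name without the suffix
echo "${base}.modify.seqid"    # output name built from the stripped base
```

The stripped base can then be reused to build every input and output file name in the loop.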

Note: Please post sample snippets as text instead of screenshots. People can't use the data for examples when it is posted as images.


Thank you so much for the guidance. I tried changing ".pep" to ".pep.fa" but still get the same error....:(
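The error probably persists because the loop reads each name into `pipeline` while the body uses `${line}`, so changing the suffix alone cannot help. The awk line in the question also has no input file, so it would read standard input, which inside this loop is the `ls` output. Below is a hedged sketch of how the three steps might fit together. The annotation file name (`<base>.gff`) and its `ID=...~~...;` attribute layout are assumptions, since the thread does not show that file, and the demo data is made up so the example can run anywhere.

```shell
# Hypothetical demo data in a scratch directory; real files would already exist.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '>PRUFUL_R14685  [mRNA]  locus=scaffold1534:862:1958:-\nMKV\n' \
    > Prunella_fulvescens_BGI.pep.fa
# Assumed annotation file name and attribute layout (not given in the thread).
printf 'scaffold1534\tBGI\tgene\t862\t1958\t.\t-\t.\tID=PRUFUL_R14685~~orig_gene_0001;\n' \
    > Prunella_fulvescens_BGI.gff

for pep in *_BGI.pep.fa
do
    base=${pep%.pep.fa}    # strip the suffix once, reuse the base name below

    # Keep the FASTA header lines and drop the ">" character
    grep '>' "$pep" | sed 's/>//g' > "${base}.modify.seqid"

    # For rows whose 3rd column is "gene": a = position of the first "~~",
    # b = position of the first ";", then print the text between them
    awk '$3 == "gene" { a = index($0, "~~"); b = index($0, ";");
                        print substr($0, a + 2, b - a - 2) }' \
        "${base}.gff" > "${base}.origin.seqid"

    # Put the two ID lists side by side, line by line, separated by a tab
    paste -d '\t' "${base}.origin.seqid" "${base}.modify.seqid" \
        > "${base}_coress.seqid"
done

cat Prunella_fulvescens_BGI_coress.seqid
```

Using `for pep in *_BGI.pep.fa` instead of `ls ... | while read` keeps the loop's standard input free, and `paste` only pairs the IDs correctly if both lists are in the same order and of the same length.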
