Question: (Closed) Keep special character in a .txt file
16 months ago, mostafarafiepour wrote:

Hi all,

I have a .txt file with three columns. Based on the Annotation column, I want to keep only the rows labelled synonymous or missense. I would be grateful if you could help me solve this problem.

CHROM_POS       ANN     Annotation
CM009840.1_932          A|intergenic_region|MODIFIER|CHR_START-LOC112587351|CHR_START-gene0|||  nongenic
CM009840.1_1096         T|intergenic_region|MODIFIER|CHR_START-LOC112587351|CHR_START-gene0||   nongenic
CM009840.1_4421500      A|missense_variant|MODERATE|LOC102415844|gene14  |1/1|c.298C>T|p.Ar||   missense
CM009840.1_4421553      A|missense_variant|MODERATE|LOC102415844|gene14|transcript|rna37|p ||    missense
CM009840.1_4421600      G|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|protei ||   synonymous
CM009840.1_4421630      C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro||   synonymous

Hello mostafarafiepour!

We believe that this post does not fit this site. You have repeatedly shown no effort in solving basic Unix questions yourself. For this reason we have closed your question.

If you disagree, please tell us why in a reply below; we'll be happy to talk about it.

Cheers!

Reply written 16 months ago by WouterDeCoster

with awk and sed:

$ sed -nr '/CHROM|mis|\tsyn/p' test.txt 

CHROM_POS   ANN Annotation
CM009840.1_4421500  A|missense_variant|MODERATE|LOC102415844|gene14 |1/1|c.298C>T|p.Ar||    missense
CM009840.1_4421553  A|missense_variant|MODERATE|LOC102415844|gene14|transcript|rna37|p  ||  missense
CM009840.1_4421600  G|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|protei    ||  synonymous
CM009840.1_4421630  C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro|| synonymous


$ awk 'NR==1 {print}; /\tsynonymous|missense/ {print $0}' test.txt 
CHROM_POS   ANN Annotation
CM009840.1_4421500  A|missense_variant|MODERATE|LOC102415844|gene14 |1/1|c.298C>T|p.Ar||    missense
CM009840.1_4421553  A|missense_variant|MODERATE|LOC102415844|gene14|transcript|rna37|p  ||  missense
CM009840.1_4421600  G|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|protei    ||  synonymous
CM009840.1_4421630  C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro|| synonymous

input:

$ cat test.txt 

CHROM_POS   ANN Annotation
CM009840.1_932  A|intergenic_region|MODIFIER|CHR_START-LOC112587351|CHR_START-gene0|||  nongenic
CM009840.1_1096 T|intergenic_region|MODIFIER|CHR_START-LOC112587351|CHR_START-gene0||   nongenic
CM009840.1_4421500  A|missense_variant|MODERATE|LOC102415844|gene14 |1/1|c.298C>T|p.Ar||    missense
CM009840.1_4421553  A|missense_variant|MODERATE|LOC102415844|gene14|transcript|rna37|p  ||  missense
CM009840.1_4421600  G|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|protei    ||  synonymous
CM009840.1_4421630  C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro|| synonymous
CM009840.1_4421630      C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro||     non-synonymous
CM009840.1_4421630      C|synonymous_variant|LOW|LOC102415844|gene14|transcript|rna37|pro||     nonsynonymous
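
The regex-based commands above search the whole line. A stricter variant compares only the third column, so labels such as non-synonymous are excluded by construction and the header is kept explicitly. The following is a minimal sketch, assuming the file is tab-separated and the label sits alone in column 3; the file names sample.tsv and filtered.tsv and the shortened ANN values are made up for illustration.

```shell
# Build a small tab-separated sample (hypothetical file name and shortened ANN values)
printf 'CHROM_POS\tANN\tAnnotation\n' > sample.tsv
printf 'CM009840.1_932\tA|intergenic_region|MODIFIER\tnongenic\n' >> sample.tsv
printf 'CM009840.1_4421500\tA|missense_variant|MODERATE\tmissense\n' >> sample.tsv
printf 'CM009840.1_4421600\tG|synonymous_variant|LOW\tsynonymous\n' >> sample.tsv
printf 'CM009840.1_4421630\tC|synonymous_variant|LOW\tnon-synonymous\n' >> sample.tsv

# Keep the header (NR == 1) plus rows whose third field is exactly one of the two labels.
# With no action given, awk prints every line for which the condition is true.
awk -F'\t' 'NR == 1 || $3 == "synonymous" || $3 == "missense"' sample.tsv > filtered.tsv
cat filtered.tsv
```

Because the comparison is an exact string match on one field, text inside the ANN column (for example synonymous_variant) cannot cause a false hit. If the real file has trailing spaces or Windows line endings, the exact match would fail and the fields would need trimming first.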
Reply written 16 months ago by cpad0112
16 months ago, ahmad mousavi (Royan Institute, Tehran, Iran) wrote:

Hi

grep 'synonymous' file.txt > new_file.txt
grep "PATTERN1\|PATTERN2" file.txt > new_file.txt
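
Note that a plain search for synonymous also matches non-synonymous and any ANN value containing synonymous_variant, and it drops the header line. One possible way to address both with grep alone is sketched below, assuming a tab-separated file whose label is the last thing on each line (no trailing whitespace); the file names sample.tsv and kept.tsv and the shortened ANN values are hypothetical.

```shell
# Build a small tab-separated sample (hypothetical file name and shortened ANN values)
printf 'CHROM_POS\tANN\tAnnotation\n' > sample.tsv
printf 'CM009840.1_932\tA|intergenic_region|MODIFIER\tnongenic\n' >> sample.tsv
printf 'CM009840.1_4421500\tA|missense_variant|MODERATE\tmissense\n' >> sample.tsv
printf 'CM009840.1_4421600\tG|synonymous_variant|LOW\tsynonymous\n' >> sample.tsv
printf 'CM009840.1_4421630\tC|synonymous_variant|LOW\tnon-synonymous\n' >> sample.tsv

# POSIX grep -E has no \t escape, so store a literal tab in a variable
tab=$(printf '\t')

# Keep the header (line starting with CHROM_POS) and rows that end with
# <tab>synonymous or <tab>missense; the leading tab rules out non-synonymous
grep -E "^CHROM_POS|${tab}(synonymous|missense)\$" sample.tsv > kept.tsv
cat kept.tsv
```

Anchoring the label between a tab and the end of the line ties the match to the last column, which is the property the whole-line patterns lack.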

Thank you for the good answer.

But those who closed the post to keep the problem from being solved should be sorry.

Reply written 16 months ago by mostafarafiepour

We didn’t close your post to prevent people from offering answers; we closed it because you consistently ask low-effort questions without attempting to learn how to solve the problem yourself.

You will be better off in the long run if you spend extra time learning these very basic commands now, even if it takes you a little longer to solve the problem.

Reply written 16 months ago by Joe

What is the solution if we do not want to lose the header?

By header I mean: CHROM_POS ANN Annotation

Reply written 16 months ago by mostafarafiepour

Why don't you read about grep and the other tools you can use for this task, and try to figure this out for yourself? It will not be difficult.

Reply written 16 months ago by Joe

I agree with @jej.healey: you should try some coding yourself. Many people, at least 100K, have tried this before you, and questions this simple certainly have answers already.

Reply written 16 months ago by ahmad mousavi
The thread is closed. No new answers may be added.