Question: problem with filtering "Sequence unavailable"
2.6 years ago, ashkan wrote:

I have a file like this small example:

>ENSG00000004142|ENST00000003607|POLDIP2|||2118
Sequence unavailable
>ENSG00000003056|ENST00000000412|M6PR|9099001;9102084|9099001;9102551|2756
CCAGGTTGTTTGCCTCTGGTCGGAAAGGGAAACTACCCCTGCTTCCACTCTGACAGCAGA

However, the file contains many "Sequence unavailable" entries. I want to get rid of those transcripts, so that the result looks like this:

>ENSG00000003056|ENST00000000412|M6PR|9099001;9102084|9099001;9102551|2756
CCAGGTTGTTTGCCTCTGGTCGGAAAGGGAAACTACCCCTGCTTCCACTCTGACAGCAGA

I tried to filter out those records in bash with:

grep -v "$(grep -B 1 "Sequence unavailable" file.txt)" file.txt

but it gave this error:

Argument list too long

How can I filter them out in bash or Python?

sequence • 832 views
modified 2.6 years ago by RamRS • written 2.6 years ago by ashkan

How about this (it should work as long as the first record is "Sequence unavailable"; you can be creative otherwise):

grep -A 2 "Sequence" your.fa | grep -v "\-\-" | sed -n '/Sequence/!p' > new.fa

modified 2.6 years ago • written 2.6 years ago by genomax
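
Since the question also asks about Python, here is a minimal sketch of a record-based filter. It is not from anyone in this thread; the function name filter_fasta and the script name filter_fasta.py are made up for illustration. It assumes every record starts with a ">" header line and that unavailable records contain the text "Sequence unavailable" somewhere in their body. It reads the file one line at a time, so it avoids the "Argument list too long" error, which most likely arises because the inner grep output is passed to the outer grep as one huge command-line argument. It also does not depend on the order of good and unavailable records.

```python
import sys


def filter_fasta(lines, marker="Sequence unavailable"):
    """Yield the lines of every FASTA record whose body lacks `marker`.

    `lines` is any iterable of strings (for example an open file).
    Trailing newlines are stripped and blank lines are skipped.
    """

    def keep(record):
        # record[0] is the header; only the body lines are checked.
        return not any(marker in body_line for body_line in record[1:])

    record = []  # lines of the record currently being collected
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">") and record:
            # A new header closes the previous record.
            if keep(record):
                yield from record
            record = []
        if line.strip():
            record.append(line)
    # Handle the final record, which has no header after it.
    if record and keep(record):
        yield from record


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python filter_fasta.py file.txt > new.fa
    with open(sys.argv[1]) as handle:
        for kept_line in filter_fasta(handle):
            print(kept_line)
```

Collecting one record at a time keeps memory use small even for large files, and multi-line sequences are kept intact because a record is only written out once the next header (or the end of the file) is reached.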

It would be nice to provide feedback to the proposed solution of genomax2. In addition, you have more questions which you left "open/unsolved" after people tried to help you. That's not respectful.

I pledged to help you on your previous thread, but my questions remain unanswered, although it's clear that you have been active multiple times on biostars since my comment. You shouldn't take our help for granted.

modified 2.6 years ago • written 2.6 years ago by WouterDeCoster

Dear ashkan, please respond to questions/give follow up comments on your past posts. Abandoning a question after you ask it borders on troll-like behavior. Unless you follow up on your past questions, your future questions may not be taken seriously or your posts may be treated even more sternly.

modified 2.6 years ago • written 2.6 years ago by RamRS