RNA Seq
2.8 years ago
far.zi ▴ 10

I have a FASTA file and I need to count the number of exactly matching sequences in it. I tried

cat my.fasta | tr -d "\n" | grep "MySequence"

but it's not working for me.

RNA fasta • 1.1k views
How is this question related to RNAseq?

How is it not working for you? Does grep fail to find a pattern that you can see is there? Do you need to put each FASTA entry on a single line (see "Multiline Fasta To Single Line Fasta")? If you have bowtie available, you could build an alignment index from your FASTA file, supply "MySequence" on the command line, and count the results.
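
For what it's worth, here is a minimal sketch of the one-line-per-entry idea. Note that `tr -d "\n"` removes every newline, so headers and sequences end up on one enormous line and grep prints that whole line instead of a count. The temporary toy file, the `linearize` helper, and the `query` value are hypothetical stand-ins for `my.fasta` and "MySequence"; it assumes a plain-text FASTA with Unix line endings.

```shell
# Toy input standing in for my.fasta (made-up data, for illustration only).
toy_fasta=$(mktemp)
cat > "$toy_fasta" <<'EOF'
>seq1
ACGTAC
GTTTGA
>seq2
TTTTACGT
>seq3
CCCCCC
EOF

query="ACGT"   # hypothetical stand-in for "MySequence"

# Keep each header on its own line and join the wrapped sequence lines
# of every record into a single line.
linearize() {
  awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
       { seq = seq $0 }
       END { if (seq != "") print seq }' "$1"
}

# Drop the headers, then count the records whose sequence contains the query.
# -F treats the query as a fixed string, -c prints the number of matching lines.
records_with_match=$(linearize "$toy_fasta" | grep -v '^>' | grep -c -F "$query")
echo "$records_with_match"

rm -f "$toy_fasta"
```

This counts records that contain the query at least once, including hits that span a line break in the wrapped file. It does not count several hits inside the same record separately.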

2.8 years ago

I'd do

grep -c ">" myfile.fa

That should just count the number of lines with ">", so line breaks won't matter.

Going off the grep in their command and the phrase "exact sequence" in their post, I think the OP is trying to count the number of exact matches to their query sequence.

Yes, you're right. I need to count how many times an exact short sequence occurs in my data. I could open the file in TextWrangler and search for it, but the file is huge and my Mac struggles with it.
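
If the goal is the total number of occurrences rather than the number of records, something along these lines might work from the terminal without opening the file in an editor. This is only a sketch: the temporary toy file and the `query` value are hypothetical stand-ins for the real data, it searches the forward strand only, and `grep -o` reports non-overlapping matches, so overlapping hits of a self-overlapping query would be undercounted.

```shell
# Toy input standing in for the real FASTA file (made-up data).
toy_fasta=$(mktemp)
cat > "$toy_fasta" <<'EOF'
>seq1
ACGTAC
GTTTGA
>seq2
TTTTACGT
>seq3
CCCCCC
EOF

query="ACGT"   # hypothetical stand-in for the short sequence of interest

# Drop the header lines and print each record's sequence on one line, so a hit
# can span a wrapped line but never the boundary between two records.
# grep -o prints every match on its own line; wc -l counts those lines.
total_hits=$(awk '/^>/ { if (NR > 1) print ""; next }
                  { printf "%s", $0 }
                  END { print "" }' "$toy_fasta" \
             | grep -o -F "$query" | wc -l | tr -d '[:space:]')
echo "$total_hits"

rm -f "$toy_fasta"
```

Both awk and grep read the input as a stream, so the file size should matter much less than it does in a text editor.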

Please edit your question and rephrase it so this is clear to everyone reading it. This question has also been addressed multiple times on the forum; please use the search bar. If you want a quick lead, try exploring seqkit grep.
