Question: how to create a gff3 file of gap regions in an assembly?
m.eitel wrote (10 months ago):

Hi.

I would like to generate a gff3 file of gap regions ('N') in an assembly. Is there a fast way/script to do that?

Thanks,
Michael

lakhujanivijay (India) wrote (10 months ago):

Ultrafast solution

Toy FASTA file:

$ cat fasta.fa 
>1
TGTACGTNNATT
>2
TTTAANNTTTNN
>3
NNTT
TTNN

Solution using seqkit:

seqkit locate -p N+ fasta.fa --gtf -P

Output (GTF, 1-based inclusive coordinates):

1   SeqKit  location    8   9   0   +   .   gene_id "N+"; 
2   SeqKit  location    6   7   0   +   .   gene_id "N+"; 
2   SeqKit  location    11  12  0   +   .   gene_id "N+"; 
3   SeqKit  location    1   2   0   +   .   gene_id "N+"; 
3   SeqKit  location    7   8   0   +   .   gene_id "N+";
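
Note that the output above is GTF, while the question asked for GFF3. If a GFF3 file is needed directly, a small script can scan the FASTA for runs of N and write the records itself. The following is only a minimal sketch, not part of seqkit: the function names, the source label "gap_finder", the feature type "gap", and the ID scheme are all assumptions chosen for illustration, and each sequence is held in memory in full, which may be a concern for very large scaffolds. Also, depending on the seqkit version, regular-expression patterns such as N+ may need the -r/--use-regexp flag; `seqkit locate --help` shows what your version expects.

```python
import re
import sys
from typing import Iterator, TextIO, Tuple

# One or more consecutive N (upper or lower case) counts as a gap.
GAP_RE = re.compile(r"[Nn]+")


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (sequence id, full sequence) pairs from a FASTA handle.

    Multi-line sequences are joined, so a gap that spans a line break
    is still reported as a single run.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gaps_to_gff3(fasta: TextIO, out: TextIO,
                 source: str = "gap_finder",      # hypothetical source label
                 feature_type: str = "gap") -> None:  # assumed feature type
    """Write one GFF3 line per run of N, with 1-based inclusive coordinates."""
    out.write("##gff-version 3\n")
    for seqid, seq in read_fasta(fasta):
        for i, match in enumerate(GAP_RE.finditer(seq), start=1):
            start = match.start() + 1   # regex offsets are 0-based
            end = match.end()           # exclusive 0-based end == inclusive 1-based end
            attrs = f"ID={seqid}_gap{i};length={end - start + 1}"
            fields = [seqid, source, feature_type, str(start), str(end),
                      ".", ".", ".", attrs]
            out.write("\t".join(fields) + "\n")


# Runs only when a FASTA path is given: python gaps_to_gff3.py assembly.fa > gaps.gff3
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as fasta_handle:
        gaps_to_gff3(fasta_handle, sys.stdout)
```

On the toy file above this should report the same five intervals as seqkit (8-9, 6-7, 11-12, 1-2, 7-8), written as tab-separated GFF3 records under a `##gff-version 3` header.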

Dear Vijay. Thanks for the fast reply! I will give it a try. Michael

Reply written 10 months ago by m.eitel

Dear m.eitel,

Sure, give it a try! If the answer is helpful, you can upvote it; if it resolved your question, you can mark it as accepted.

Reply written 10 months ago by lakhujanivijay