How can I extract annotation for genes from a GTF file that are more than 200 bp apart from neighboring genes?
6.5 years ago
biplab ▴ 110

I am new to the field of computational biology. This question might have been answered elsewhere, but I could not find it by searching. How can I extract annotation for genes from a GTF file that are more than 200 bp apart from neighboring genes? I was looking into bedtools for this, but most functions in bedtools compare two files, and I would like to compare genes within my annotation file. It would be very helpful if someone could suggest how I can do this. For example:

Input file:

I   ensembl gene    335 649 .   +   .   gene_id "YAL069W"; gene_source "ensembl"; gene_biotype "protein_coding";
I   ensembl gene    538 792 .   +   .   gene_id "YAL068W-A"; gene_source "ensembl"; gene_biotype "protein_coding";
I   ensembl gene    1807    2169    .   -   .   gene_id "YAL068C"; gene_name "PAU8"; gene_source "ensembl"; gene_biotype "protein_coding";
I   ensembl gene    2480    2707    .   +   .   gene_id "YAL067W-A"; gene_source "ensembl"; gene_biotype "protein_coding";

Output:

I   ensembl gene    1807    2169    .   -   .   gene_id "YAL068C"; gene_name "PAU8"; gene_source "ensembl"; gene_biotype "protein_coding";

Thank you so much.

genome next-gen RNA-Seq • 2.2k views
6.5 years ago

Via BEDOPS gtf2bed and closest-features:

$ gtf2bed < genes.gtf > genes.bed
$ closest-features --no-ref --dist genes.bed genes.bed | awk -v threshold=200 -v FS='|' '{ if (($4>threshold)&&($4!="NA")) { print $3; }}' | uniq > genes_more_than_200_nt_apart.bed
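
For readers without BEDOPS installed, the same idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the method used above: the function name `isolated_genes` and the parameters `min_gap` and `keep_edge_genes` are hypothetical, the input is assumed to be tab-delimited GTF with `gene` rows, and the gap is counted as the number of bases lying strictly between two genes (`start - previous_end - 1`). It checks the gap on both sides of each gene. By default, a gene with no neighbor on one side is skipped, which reproduces the single-gene output in the question's example.

```python
from collections import defaultdict


def isolated_genes(gtf_lines, min_gap=200, keep_edge_genes=False):
    """Return GTF 'gene' lines whose nearest neighbors on both sides are
    more than `min_gap` bases away (hypothetical helper, see assumptions)."""
    by_chrom = defaultdict(list)
    for raw in gtf_lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # only gene-level records are compared
        by_chrom[fields[0]].append((int(fields[3]), int(fields[4]), line))

    def far_enough(gap):
        # A missing neighbor (gap is None) is kept only if requested.
        if gap is None:
            return keep_edge_genes
        return gap > min_gap

    kept = []
    for genes in by_chrom.values():
        genes.sort(key=lambda g: (g[0], g[1]))  # sort by start, then end
        prev_max_end = None  # furthest end among genes starting earlier
        for i, (start, end, line) in enumerate(genes):
            left_gap = None if prev_max_end is None else start - prev_max_end - 1
            # The next gene in start order has the nearest start to the right.
            right_gap = None if i + 1 == len(genes) else genes[i + 1][0] - end - 1
            if far_enough(left_gap) and far_enough(right_gap):
                kept.append(line)
            prev_max_end = end if prev_max_end is None else max(prev_max_end, end)
    return kept
```

To try it on a real annotation, pass an open file handle, for example `isolated_genes(open("genes.gtf"))`, and print the returned lines. Overlapping genes produce a negative gap and are therefore never reported. Strand is ignored, as in the question's example.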

Thank you so much. Very good solution.

6.5 years ago

I was looking into bedtools for this but most functions in bedtools compare two files but I would like to compare genes within my annotation file.

How about using the same file twice?
