Question: GFF file multiple features for 1 gene region, how to collapse into 1?
12 months ago by
YOUSEUFS10 wrote:

Hello, Noob here

My GFF3 file (converted to BED) contains multiple lines that describe the same gene region but with different feature IDs (see below):

NC_002978.6     3027    3115    gene2   .       +       RefSeq  gene    .       ID=gene2;Dbxref=GeneID:29555340;Name=WD_RS00025;gbkey=Gene;gene_biotype=tRNA;locus_tag=WD_RS00025;old_locus_tag=tRNA-Leu-1

NC_002978.6     3027    3115    id1     .       +       tRNAscan-SE     exon    .       ID=id1;Parent=rna0;Dbxref=GeneID:29555340;anticodon=(pos:3062..3064);gbkey=tRNA;inference=COORDINATES: profile:tRNAscan-SE:1.23;pr

NC_002978.6     3027    3115    rna0    .       +       tRNAscan-SE     tRNA    .       ID=rna0;Parent=gene2;Dbxref=GeneID:29555340;anticodon=(pos:3062..3064);gbkey=tRNA;inference=COORDINATES: profile:tRNAscan-SE:1.23;

How would I collapse these to give me a single gene region associated with a single feature?

Context: The result would then be fed into "bedtools closest" so I can match transcription start sites to their closest annotated gene.

P.S. Apologies in advance for any incorrect formatting.

rna-seq gff bedtools • 522 views

Hi Noob,

Your file resembles BED, but the column layout is non-standard: the feature type (gene, exon, tRNA) sits in column 8. Anyway, start with this to keep only the gene features:

awk '$8 == "gene"' your_file.bed > your_file.genes.bed

Now you should have only gene lines. Different genes may still overlap each other, but each gene appears only once.
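
For what it's worth, here is a slightly more defensive sketch of the same filter. It assumes the file is tab-separated with the feature type in column 8, as in the lines posted above. The function names filter_genes and count_duplicate_regions are made up for illustration. The first helper also sorts by chromosome and start, since bedtools closest expects sorted input; the second counts coordinate/strand combinations that still occur more than once.

```shell
# Hypothetical helper: keep rows whose 8th tab-separated column is "gene",
# then sort by chromosome and numeric start position.
filter_genes() {
    awk -F'\t' '$8 == "gene"' "$1" | sort -k1,1 -k2,2n
}

# Hypothetical helper: count chrom/start/end/strand combinations
# (columns 1-3 and 6) that appear on more than one line.
count_duplicate_regions() {
    cut -f1-3,6 "$1" | sort | uniq -d | wc -l
}
```

Something like `filter_genes your_file.bed > your_file.genes.bed` followed by `count_duplicate_regions your_file.genes.bed` should then report 0 if every region is down to a single line.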

written 12 months ago by goodez460

Thank you very much!

written 12 months ago by YOUSEUFS10
12 months ago by
Carambakaracho1.8k wrote:

Filter for the gene features, either on column 3 of your GFF or on the feature-type column of your BED file. However, a tRNA feature like this one is probably the exception rather than the rule. You can even do this in Excel.
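
If you would rather filter the original GFF3 on column 3 and write BED directly, a minimal sketch could look like the following. It assumes a standard nine-column, tab-separated GFF3 with an ID= attribute in column 9. The function name gff_genes_to_bed is hypothetical, and the output is plain BED6 (chrom, start, end, name, score, strand) rather than the ten-column layout shown in the question. GFF3 starts are 1-based and BED starts are 0-based, hence the subtraction.

```shell
# Hypothetical helper: read GFF3 from the given files (or stdin)
# and print one BED6 line per "gene" feature.
gff_genes_to_bed() {
    awk -F'\t' -v OFS='\t' '
        /^#/ { next }                       # skip comment and directive lines
        $3 == "gene" {
            id = "."                        # fallback name if no ID attribute
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^ID=/) {
                    id = substr(attrs[i], 4)
                }
            }
            # Convert the 1-based GFF3 start to a 0-based BED start.
            print $1, $4 - 1, $5, id, ".", $7
        }
    ' "$@"
}
```

Usage would be along the lines of `gff_genes_to_bed your_file.gff3 | sort -k1,1 -k2,2n > your_file.genes.bed`, where the file names are placeholders.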
