Question: Remove overlapping features from a gtf file?
RoryC30 wrote:

Hi, this seems like quite a straightforward question, so apologies if it has been asked before (I couldn't find anything similar). I have a .gtf file containing CDS coordinates for a chromosome, and I plan to extract codons containing fourfold degenerate (4d) sites. I would therefore like to remove any CDS features that overlap one another (on the same or the opposite strand), so there is no ambiguity about what is a 4d site and what isn't. I've been trying to do this with bedtools but am not having much luck: to compare the file to itself, intersect would need an option that ignores the 100% overlap of each feature with itself. Thanks

cds bedtools overlap gtf • 2.2k views
modified 4.0 years ago • written 4.1 years ago by RoryC30

Doesn't bedtools intersect with the -v option and -r 1 work?

written 4.1 years ago by dally180

Hi, thanks for your answer. Do you mean -v -f 1 -r? That gives me zero output, because when the file is compared to itself every CDS is overlapped 100% (by itself).

modified 4.1 years ago • written 4.1 years ago by RoryC30

RoryC30 wrote:

So, after coming back to this some time later, I think I've found a relatively simple way of doing it. First, use bedtools merge with -c 1 -o count to merge overlapping elements and add a fourth column showing how many original elements contributed to each merged element. Then use awk to keep only the rows with a count of 1 in the fourth column, which removes anything that originally overlapped and was merged. For example, with a bed file (the same could be done with a gtf):

bedtools merge -i file.bed -c 1 -o count | awk '$4 == 1' > newfile.bed
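
For checking the result on a small test set, the same merge-then-count idea can be sketched in plain Python. This is only an illustration under a few assumptions, not what bedtools does internally: the function name drop_overlapping and the example coordinates are made up, intervals are taken to be BED-style (0-based, half-open), and two features only count as overlapping if they share at least one base. As far as I know, bedtools merge with default settings also merges book-ended features and expects input sorted by chromosome and start, so results can differ for features that merely touch.

```python
from itertools import groupby
from operator import itemgetter


def drop_overlapping(intervals):
    """Return the intervals that overlap no other interval.

    intervals: iterable of (chrom, start, end) tuples in BED-style
    0-based, half-open coordinates. Strand is ignored on purpose,
    so overlaps on either strand count.
    """
    kept = []
    ordered = sorted(intervals, key=itemgetter(0, 1, 2))
    for _, group in groupby(ordered, key=itemgetter(0)):
        cluster = []        # intervals merged into the current block
        cluster_end = None  # right-most end seen in the current block
        for iv in group:
            if cluster and iv[1] < cluster_end:
                # Shares at least one base with the block: extend it.
                cluster.append(iv)
                cluster_end = max(cluster_end, iv[2])
            else:
                # A new block starts; keep the previous one only if
                # exactly one interval contributed to it.
                if len(cluster) == 1:
                    kept.append(cluster[0])
                cluster = [iv]
                cluster_end = iv[2]
        # Handle the last block on this chromosome.
        if len(cluster) == 1:
            kept.append(cluster[0])
    return kept


# Hypothetical example data, not taken from a real annotation.
example = [
    ("chr1", 100, 200),
    ("chr1", 150, 250),  # overlaps the first feature
    ("chr1", 300, 400),  # stands alone
    ("chr1", 400, 500),  # book-ended with the previous one, no shared base
    ("chr2", 100, 200),  # same coordinates but a different chromosome
]
print(drop_overlapping(example))
```

Here the two chr1 features spanning 100-250 are both dropped, while the remaining three are kept. Nested or exactly duplicated features (for example the same CDS listed for two transcripts) end up in a block with a count above 1 and are removed as well.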

written 4.0 years ago by RoryC30

geek_y10k wrote:

You can use the script dexseq_prepare_annotation.py, provided with the DEXSeq package, to collapse overlapping exons. See Figure 1 of this paper: http://genome.cshlp.org/content/22/10/2008.full

If this is not what you wanted, you may need to tweak the script a bit.

modified 11 months ago • written 4.1 years ago by geek_y10k