Estimate distances to closest gene from VCF and GFF files
4.2 years ago
elcortegano ▴ 200

Hi, I am currently running some analyses that require the distance from certain variant (and invariant) positions in VCF files to the closest genomic feature reported in a separate GFF file.

I am currently doing so by processing these two files as spreadsheets and calculating these distances with some R code, but as you can imagine this is probably suboptimal, messy, and prone to errors.

I assume there is a package in R Bioconductor or Python that can generate a dataset of genomic positions taken from a VCF, together with their distances to different genomic features in GFF files, but so far I haven't found anything.

Do you have any recommendation on how to handle this?

next-gen vcf gff • 883 views

Check out the bedtools collection of tools and its subcommand bedtools closest.

https://github.com/arq5x/bedtools2
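
For anyone who wants to see what a closest-feature lookup amounts to, or to cross-check results without external tools, here is a minimal pure-Python sketch. It is not how bedtools is implemented. It assumes both files use 1-based inclusive coordinates, that the GFF is tab-separated with the feature type in column 3, and that a position inside a feature has distance 0. The function names and the `feature_type="gene"` default are my own choices for illustration.

```python
import bisect
from collections import defaultdict


def load_gff_features(lines, feature_type="gene"):
    """Collect sorted start and end coordinates per chromosome from GFF lines."""
    starts, ends = defaultdict(list), defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, headers and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[2] != feature_type:
            continue  # keep only the requested feature type
        starts[fields[0]].append(int(fields[3]))
        ends[fields[0]].append(int(fields[4]))
    for chrom in starts:
        starts[chrom].sort()
        ends[chrom].sort()
    return starts, ends


def vcf_positions(lines):
    """Yield (chrom, pos) from VCF lines; only the first two columns are read."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        yield chrom, int(pos)


def distance_to_closest(chrom, pos, starts, ends):
    """Distance in bp to the nearest feature; 0 if inside one; None if no features."""
    s, e = starts.get(chrom), ends.get(chrom)
    if not s:
        return None
    n_started = bisect.bisect_right(s, pos)  # features with start <= pos
    n_ended = bisect.bisect_left(e, pos)     # features with end < pos
    if n_started > n_ended:
        return 0  # at least one feature spans pos
    candidates = []
    if n_started < len(s):
        candidates.append(s[n_started] - pos)    # nearest feature to the right
    if n_ended > 0:
        candidates.append(pos - e[n_ended - 1])  # nearest feature to the left
    return min(candidates)


if __name__ == "__main__":
    # Toy data made up for this example
    gff = [
        "chr1\t.\tgene\t100\t200\t.\t+\t.\tID=g1",
        "chr1\t.\tgene\t500\t600\t.\t-\t.\tID=g2",
    ]
    vcf = ["##fileformat=VCFv4.2", "chr1\t300\t.\tA\t<NON_REF>\t.\t.\t."]
    starts, ends = load_gff_features(gff)
    for chrom, pos in vcf_positions(vcf):
        print(chrom, pos, distance_to_closest(chrom, pos, starts, ends))
```

Because only CHROM and POS are read from the VCF, the content of the ALT column does not matter here. In practice you would pass open file handles instead of the toy lists.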


Perhaps I am being unlucky, but this failed for me at some <NON_REF> ALT positions. After replacing them with sed -i -e 's/<NON_REF>/./', bedtools crashes with a "core dumped" error.
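
One possible workaround, offered as an untested assumption and not a confirmed fix for this crash: reduce the VCF to a plain three-column BED file before handing it to bedtools, so the REF and ALT columns are never parsed at all. The helper name below is hypothetical. It converts the 1-based VCF POS into a 0-based, half-open BED interval covering that single base; records that represent a block (for example with an END value in INFO) would need extra handling.

```python
def vcf_to_bed_lines(vcf_lines):
    """Yield 'chrom<TAB>start<TAB>end' BED lines, one per VCF record."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF headers
        chrom, pos = line.split("\t", 2)[:2]
        end = int(pos)   # VCF POS is 1-based
        start = end - 1  # BED start is 0-based, end is exclusive
        yield f"{chrom}\t{start}\t{end}"


if __name__ == "__main__":
    # Toy records made up for this example
    vcf = [
        "##fileformat=VCFv4.2",
        "chr1\t300\t.\tA\t<NON_REF>\t.\t.\t.",
        "chr1\t412\t.\tG\tT,<NON_REF>\t.\t.\t.",
    ]
    for bed_line in vcf_to_bed_lines(vcf):
        print(bed_line)
```

The resulting BED file still needs to be sorted by chromosome and start in the same order as the GFF before being used with bedtools closest.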


Without code and data examples it is impossible to debug.
