find co-ordinates of overlapping fasta sequences
3.8 years ago
sg197 ▴ 40

Hi, I have 2 fasta files (one of novel precursor microRNAs and one of novel mature microRNAs) and a bed file of the precursors. What I want to do is find where the 2 fasta files overlap and get the coordinates, relative to the precursor bed. I can't find any tool that would let me do this. The closest is bedtools getfasta, but I want the opposite: rather than getting the fasta sequence of overlapping regions, I want to get their coordinates.

I considered converting the bed to gtf, indexing the precursor gtf, and then mapping to it, but I realised the resulting coordinates would not match the actual positions in the genome. Any suggestions would be really helpful!

bedtools fasta microRNAs • 938 views

What I want to do is find where the 2 fasta files overlap

In terms of sequence, correct? Sequence files themselves only have local co-ordinates. So align the two fasta files first (blat may be good for this), then see where the overlapping part is with respect to the genome, and then cross-reference to the GTF. Does that describe what you need?
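For a single precursor/mature pair, the step from local to genomic co-ordinates can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for a real aligner: it uses an exact substring match instead of blat, and the function name `mature_to_genome` is made up for the example. It assumes the bed line is tab-separated with 0-based, half-open co-ordinates, that the strand (if present) is in column 6, and that the precursor fasta sequence is given in the orientation of that strand.

```python
def mature_to_genome(bed_line, precursor_seq, mature_seq):
    """Return (chrom, start, end, strand) for the first exact match, or None.

    Hypothetical helper: assumes a tab-separated BED line (0-based, half-open)
    and a precursor sequence written in the orientation of its strand.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Strand is column 6 in BED6; default to "+" when the column is absent.
    strand = fields[5] if len(fields) > 5 else "+"

    # Normalise case and the RNA/DNA alphabet so U and T compare equal.
    pre = precursor_seq.upper().replace("U", "T")
    mat = mature_seq.upper().replace("U", "T")

    offset = pre.find(mat)  # local, 0-based position within the precursor
    if offset < 0:
        return None

    if strand == "-":
        # On the minus strand the sequence starts at the BED end and runs
        # towards lower genomic co-ordinates.
        g_end = end - offset
        g_start = g_end - len(mat)
    else:
        g_start = start + offset
        g_end = g_start + len(mat)
    return chrom, g_start, g_end, strand


if __name__ == "__main__":
    bed = "chr1\t100\t120\tpre1\t0\t+"
    print(mature_to_genome(bed, "ACGTACGTTTGGCCAAGGTT", "UUGGCC"))
```

Because the search is restricted to the precursor's own sequence, the result is tied to that bed interval rather than to wherever else the short sequence might occur in the genome.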


If I understand what you're suggesting, I already have the 'overlapping part': that's my fasta file of the mature microRNAs. It's locating the overlap with respect to the genome that I'm having trouble with. As these are short reads, I think they may align to multiple places and not just within the precursor region (for which I have both the sequence and the bed file).


You can use ungapped alignments with your mature RNA (e.g. bowtie v.1) but, as you say, you will need to allow for these reads to multi-map. There is nothing you can do about that: a read that multi-maps has an equal chance of having come from any of those locations.
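To make the multi-mapping point concrete, here is a rough Python sketch that reports every exact, ungapped occurrence of a mature sequence across a set of precursor sequences. It is not bowtie and does not reproduce its behaviour; it only searches the given orientation with zero mismatches. The helper names `parse_fasta` and `all_exact_hits` are invented for the illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}, upper-cased with U -> T."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper().replace("U", "T")
    return records


def all_exact_hits(mature_seq, precursors):
    """Return every (precursor_name, 0-based offset) with an exact match."""
    mat = mature_seq.upper().replace("U", "T")
    hits = []
    for name, seq in precursors.items():
        pos = seq.find(mat)
        while pos != -1:
            hits.append((name, pos))
            # Step by one so overlapping occurrences are also reported.
            pos = seq.find(mat, pos + 1)
    return hits


if __name__ == "__main__":
    precursors = parse_fasta(">pre1\nACGTTTGGCCAA\n>pre2 example\nggttggccuu\n")
    for hit in all_exact_hits("UUGGCC", precursors):
        print(hit)
```

A mature sequence that yields more than one hit here is multi-mapping in the sense described above, and the sequence alone cannot say which location it came from.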
