Question: BedTools getfasta -split
Alice wrote (14 days ago):

Hello,

I'm trying to extract some sequences from a multifasta file (a genome) using the following command:

bedtools getfasta -fi T_aestivum_genomeA.fa -bed urartuAestivum_blocks_sort.bed12 -split -name -fo blocks_aestivumA.fa

The program reported no errors, but in the output multifasta file some records contain only the header. I checked the bed12 file and found no anomaly in the rows corresponding to the missing sequences. I also manually checked the genome coordinates of some of the missing sequences and there was nothing strange (Ns or similar). I get a sequence for every record if I don't use the -split option, but I don't want the entire sequence, so I think the problem is in the blocks.

Here is what my bed12 file looks like:

7A  25225503    25225944    TCONS_00077526_aestivumA    *   *   *   *   *   1   441,    25225503,
7A  35229975    35230420    TCONS_00076940_aestivumA    *   *   *   *   *   1   445,    35229975,
7A  35501306    35501751    TCONS_00170589_aestivumA    *   *   *   *   *   2   139,306,    35501306,35501445,
7A  131421239   131421684   TCONS_00107436_aestivumA    *   *   *   *   *   2   281,88, 131421239,131421596,
7A  10711045    10711495    TCONS_00150021_aestivumA    *   *   *   *   *   1   450,    10711045,
7A  167627488   167627939   TCONS_00024036_aestivumA    *   *   *   *   *   1   451,    167627488,
7A  48932559    48933013    TCONS_00136773_aestivumA    *   *   *   *   *   1   454,    48932559,

The fourth line corresponds to one of the sequences I didn't get.

Anyone experienced a similar problem? Thank you!

Alice

microfuge wrote (14 days ago):

Hi, maybe I am wrong, but the values in the last column (block starts) are supposed to be relative to chrom start. I have not checked your sample data properly, but the values seem to be very large.


Yes, it looks like the original poster is using absolute coordinates (the block start is equal to chrom start) - none of the lines are correct.

(reply by Istvan Albert, written 14 days ago)
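To make the point above concrete, here is a minimal sketch of a sanity check for the block columns of a BED12 line. It assumes the usual BED12 conventions (block starts are offsets from chrom start, the first block start is 0, and the last block ends exactly at chrom end). The function name `check_bed12_blocks` is hypothetical and is not part of bedtools.

```python
def check_bed12_blocks(line):
    """Return a list of problems found in the block fields of one BED12 line."""
    fields = line.split()
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    block_count = int(fields[9])
    # Trailing commas are allowed, so drop empty items after splitting.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]

    if block_count < 1 or len(sizes) != block_count or len(starts) != block_count:
        return ["blockCount does not match blockSizes/blockStarts"]

    problems = []
    if starts[0] != 0:
        problems.append("first block start is not 0 (absolute coordinates?)")
    if starts[-1] + sizes[-1] != chrom_end - chrom_start:
        problems.append("last block does not end at chrom end")
    return problems


# Fourth line from the question, as posted (absolute block starts).
posted = "7A 131421239 131421684 TCONS_00107436_aestivumA * * * * * 2 281,88, 131421239,131421596,"
# Same line with block starts expressed relative to chrom start.
fixed = "7A 131421239 131421684 TCONS_00107436_aestivumA * * * * * 2 281,88, 0,357,"

print(check_bed12_blocks(posted))  # reports both problems
print(check_bed12_blocks(fixed))   # prints []
```

For the fourth line, 131421596 - 131421239 = 357 and 357 + 88 = 445, which matches chrom end minus chrom start (131421684 - 131421239 = 445).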

Thanks a lot to both of you. I changed the block start column so that the values are relative to chrom start (the first block always starts at 0) and it worked! I also realized that for the other lines, where BedTools did extract a sequence, that sequence was actually wrong (because, as you said, the block starts were not relative to chrom start), so I don't understand how it managed to extract anything. Anyway, I will change the last column of every line as you suggested. Thanks again!

(reply by Alice, written 11 days ago)
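The fix described in this reply can be sketched as a small conversion step. This assumes the input looks like the sample in the question: 12 whitespace-separated columns, with absolute genomic positions in the last column. The function name `absolute_to_relative` is hypothetical; it only rewrites the last column and leaves everything else untouched.

```python
def absolute_to_relative(line):
    """Rewrite column 12 of a BED12 line so block starts are offsets from chrom start."""
    fields = line.split()
    chrom_start = int(fields[1])
    # Trailing commas are allowed, so drop empty items after splitting.
    absolute_starts = [int(x) for x in fields[11].split(",") if x]
    relative_starts = [s - chrom_start for s in absolute_starts]
    fields[11] = ",".join(str(s) for s in relative_starts) + ","
    return "\t".join(fields)


# Third line from the question (two blocks, absolute block starts).
line = "7A 35501306 35501751 TCONS_00170589_aestivumA * * * * * 2 139,306, 35501306,35501445,"
print(absolute_to_relative(line))
```

For this line the last column becomes `0,139,`, and 139 + 306 = 445 equals chrom end minus chrom start, so the two blocks tile the feature exactly. Applying the function to every line of the bed12 file, before running bedtools getfasta with -split, would give the corrected input.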