I'm trying to extract some sequences from a multifasta file (a genome) using the following command:
bedtools getfasta -fi T_aestivum_genomeA.fa -bed urartuAestivum_blocks_sort.bed12 -split -name -fo blocks_aestivumA.fa
The program reports no error, but for some records the output multi-FASTA file contains only the header and no sequence. I checked the BED12 file and found no anomaly in the rows corresponding to the missing sequences. I also manually checked the genome coordinates of some of the missing sequences and found nothing unusual (no runs of Ns, for example). I get the correct output if I omit the -split option, but I don't want the entire sequence, so I think the problem lies in the blocks.
Here is what my BED12 file looks like:
7A 25225503 25225944 TCONS_00077526_aestivumA * * * * * 1 441, 25225503,
7A 35229975 35230420 TCONS_00076940_aestivumA * * * * * 1 445, 35229975,
7A 35501306 35501751 TCONS_00170589_aestivumA * * * * * 2 139,306, 35501306,35501445,
7A 131421239 131421684 TCONS_00107436_aestivumA * * * * * 2 281,88, 131421239,131421596,
7A 10711045 10711495 TCONS_00150021_aestivumA * * * * * 1 450, 10711045,
7A 167627488 167627939 TCONS_00024036_aestivumA * * * * * 1 451, 167627488,
7A 48932559 48933013 TCONS_00136773_aestivumA * * * * * 1 454, 48932559,
The fourth line corresponds to one of the sequences I didn't get.
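Since the blocks are the suspect, here is a minimal sketch of a sanity check on the block fields of a single record. It assumes the UCSC BED convention that blockStarts are offsets relative to chromStart (so the first one is 0) and that every block must end within chromEnd - chromStart. The function name check_bed12_blocks is hypothetical and is not part of bedtools.

```python
def check_bed12_blocks(line):
    """Return a list of problems found in the block fields of one BED12 record."""
    fields = line.split()
    if len(fields) != 12:
        return [f"expected 12 fields, found {len(fields)}"]

    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    block_count = int(fields[9])
    # A trailing comma is allowed in the list fields, so drop empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]

    problems = []
    if len(sizes) != block_count or len(starts) != block_count:
        problems.append("blockCount does not match blockSizes/blockStarts")

    # Assumption: blockStarts are relative to chromStart, so each block
    # must fit inside the feature length.
    span = chrom_end - chrom_start
    for size, start in zip(sizes, starts):
        if start + size > span:
            problems.append(
                f"block at offset {start} (size {size}) ends beyond feature length {span}"
            )
    if starts and starts[0] != 0:
        problems.append("first blockStart is not 0")
    return problems


record = ("7A 131421239 131421684 TCONS_00107436_aestivumA "
          "* * * * * 2 281,88, 131421239,131421596,")
for problem in check_bed12_blocks(record):
    print(problem)
```

Run on the fourth record above, this reports that both blocks end beyond the feature length and that the first blockStart is not 0, because the blockStarts in that row are genome coordinates rather than offsets. Whether that is what causes the empty records with -split is not something this check can confirm.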
Has anyone experienced a similar problem? Thank you!