4.9 years ago
kmyers2
I want to remove four regions from a bam file. I have two files, the .bam file and the .bed file. I am using the following command:
bedtools2-2.27.0/bedtools intersect -abam input.bam -b filter.bed -v > output_filtered.bam
My filter.bed file is:
NC_000913.3 2747963 2748753
NC_000913.3 3470145 3471329
NC_000913.3 3802039 3804137
NC_000913.3 3807064 3808098
And my BAM file is a standard BAM file:
NB501872:398:HGHWFBGXB:1:11101:26117:1029 1:N:0:CAGATC - NC_000913.3 3803471 TTTTTCGAAATCGGAGCCATCACCCAATACATGGAGTTT EEEEAEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEA 0
NB501872:398:HGHWFBGXB:1:11101:3328:1031 1:N:0:CAGATC + NC_000913.3 2748273 GCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT EEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEA 0
NB501872:398:HGHWFBGXB:1:11101:23381:1032 1:N:0:CAGATC - NC_000913.3 4176778 TGCGTGGTATCAAACGTGAAGAAATCGAACGTGGTCAGG EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 1
NB501872:398:HGHWFBGXB:1:11101:8594:1034 1:N:0:CAGATC - NC_000913.3 2107391 ACTGGATAGTAGGGTTCGTCGCCAACTTTCCACTCTAAT EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0
NB501872:398:HGHWFBGXB:1:11101:16388:1035 1:N:0:CAGATC + NC_000913.3 1756723 CTCGAGTTCAACAATGACAACCGTAAACTGCGCATTACC EEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE 0
NB501872:398:HGHWFBGXB:1:11101:7169:1035 1:N:0:CAGATC + NC_000913.3 1166876 ATTGAGGCTGATCTGATGGTATGGGCAGCCGGGATCAAA EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0
NB501872:398:HGHWFBGXB:1:11101:4800:1036 1:N:0:CAGATC + NC_000913.3 2748273 GCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT EEEEEEEEE/EEEEEEEEEEEEEEEEEEEE<EE<EEEEE 0
NB501872:398:HGHWFBGXB:1:11101:16607:1036 1:N:0:CAGATC + NC_000913.3 1111389 GGATTTGTGGGTTTCCTTTATGCAGCTTCTGCCTTATAT EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0
NB501872:398:HGHWFBGXB:1:11101:16422:1039 1:N:0:CAGATC - NC_000913.3 2815223 CCACCTGCGGTAACGGGAACTGTGCGGTTTCGCAGCCGA EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0
NB501872:398:HGHWFBGXB:1:11101:19492:1039 1:N:0:CAGATC - NC_000913.3 2415498 TCGCCGCGTTAAATCCGTCACTTTCTGCGCACGCAGCAT <//</A/EEEE/AAEA/EE//EE//EEEEEE/EE/AEE/ 0
When I run the command above, I get the following error:
Error: Type checker found wrong number of fields while tokenizing data line.
Perhaps you have extra TAB at the end of your line? Check with "cat -t"
I've run cat -t on the filter.bed:
NC_000913.3^I2747963^I2748753
NC_000913.3^I3470145^I3471329
NC_000913.3^I3802039^I3804137
NC_000913.3^I3807064^I3808098
I've run cat -t on the input.bam:
NB501872:398:HGHWFBGXB:1:11101:26117:1029 1:N:0:CAGATC^I-^INC_000913.3^I3803471^ITTTTTCGAAATCGGAGCCATCACCCAATACATGGAGTTT^IEEEEAEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEA^I0^I
NB501872:398:HGHWFBGXB:1:11101:3328:1031 1:N:0:CAGATC^I+^INC_000913.3^I2748273^IGCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEA^I0^I
NB501872:398:HGHWFBGXB:1:11101:23381:1032 1:N:0:CAGATC^I-^INC_000913.3^I4176778^ITGCGTGGTATCAAACGTGAAGAAATCGAACGTGGTCAGG^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I1^I
NB501872:398:HGHWFBGXB:1:11101:8594:1034 1:N:0:CAGATC^I-^INC_000913.3^I2107391^IACTGGATAGTAGGGTTCGTCGCCAACTTTCCACTCTAAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16388:1035 1:N:0:CAGATC^I+^INC_000913.3^I1756723^ICTCGAGTTCAACAATGACAACCGTAAACTGCGCATTACC^IEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:7169:1035 1:N:0:CAGATC^I+^INC_000913.3^I1166876^IATTGAGGCTGATCTGATGGTATGGGCAGCCGGGATCAAA^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:4800:1036 1:N:0:CAGATC^I+^INC_000913.3^I2748273^IGCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT^IEEEEEEEEE/EEEEEEEEEEEEEEEEEEEE<EE<EEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16607:1036 1:N:0:CAGATC^I+^INC_000913.3^I1111389^IGGATTTGTGGGTTTCCTTTATGCAGCTTCTGCCTTATAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16422:1039 1:N:0:CAGATC^I-^INC_000913.3^I2815223^ICCACCTGCGGTAACGGGAACTGTGCGGTTTCGCAGCCGA^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:19492:1039 1:N:0:CAGATC^I-^INC_000913.3^I2415498^ITCGCCGCGTTAAATCCGTCACTTTCTGCGCACGCAGCAT^I<//</A/EEEE/AAEA/EE//EE//EEEEEE/EE/AEE/^I0^I
I've tried removing the end-of-line tabs from the bam file, but that didn't work. I also tried adding tabs to the end of the lines in the filter.bed file, and that didn't work either.
I'm at a loss as to what to try or how to fix it. Any help and advice would be greatly appreciated.
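One check that may be worth running, since cat -t printed readable, tab-separated text for input.bam: a genuine BAM file is BGZF-compressed (gzip-compatible), so it should begin with the gzip magic bytes 1f 8b and would normally look like binary noise under cat. The sketch below is only a diagnostic under that assumption. is_gzip_compressed is a hypothetical helper name, not part of bedtools or samtools, and a passing result only shows the file is gzip-compressed, not that it is a valid BAM.

```shell
# Hypothetical helper (name chosen for illustration only).
# Returns 0 if the file starts with the gzip magic bytes 1f 8b,
# which every BGZF-compressed BAM file does; returns non-zero otherwise.
is_gzip_compressed() {
    # Read the first two bytes and print them as hex with no offsets or spaces.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

For example, running is_gzip_compressed input.bam && echo "compressed" || echo "plain text" would print "plain text" for a tab-delimited text file that merely carries a .bam extension. In that case the error may be coming from the -abam input rather than from filter.bed.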