Closed: Type checker error with bedtools intersect
4.9 years ago
kmyers2 ▴ 80

I want to remove the reads that overlap four regions from a BAM file. I have two files, the .bam file and the .bed file. I am using the following command:

bedtools2-2.27.0/bedtools intersect -abam input.bam -b filter.bed -v > output_filtered.bam

My filter.bed file is:

NC_000913.3     2747963 2748753
NC_000913.3     3470145 3471329
NC_000913.3     3802039 3804137
NC_000913.3     3807064 3808098

and my BAM file is a standard BAM file:

NB501872:398:HGHWFBGXB:1:11101:26117:1029
1:N:0:CAGATC    -   NC_000913.3 3803471 TTTTTCGAAATCGGAGCCATCACCCAATACATGGAGTTT EEEEAEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEA 0   
NB501872:398:HGHWFBGXB:1:11101:3328:1031
1:N:0:CAGATC    +   NC_000913.3 2748273 GCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT EEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEA 0   
NB501872:398:HGHWFBGXB:1:11101:23381:1032
1:N:0:CAGATC    -   NC_000913.3 4176778 TGCGTGGTATCAAACGTGAAGAAATCGAACGTGGTCAGG EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 1   
NB501872:398:HGHWFBGXB:1:11101:8594:1034
1:N:0:CAGATC    -   NC_000913.3 2107391 ACTGGATAGTAGGGTTCGTCGCCAACTTTCCACTCTAAT EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:16388:1035
1:N:0:CAGATC    +   NC_000913.3 1756723 CTCGAGTTCAACAATGACAACCGTAAACTGCGCATTACC EEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:7169:1035
1:N:0:CAGATC    +   NC_000913.3 1166876 ATTGAGGCTGATCTGATGGTATGGGCAGCCGGGATCAAA EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:4800:1036
1:N:0:CAGATC    +   NC_000913.3 2748273 GCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT EEEEEEEEE/EEEEEEEEEEEEEEEEEEEE<EE<EEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:16607:1036
1:N:0:CAGATC    +   NC_000913.3 1111389 GGATTTGTGGGTTTCCTTTATGCAGCTTCTGCCTTATAT EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:16422:1039
1:N:0:CAGATC    -   NC_000913.3 2815223 CCACCTGCGGTAACGGGAACTGTGCGGTTTCGCAGCCGA EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE 0   
NB501872:398:HGHWFBGXB:1:11101:19492:1039
1:N:0:CAGATC    -   NC_000913.3 2415498 TCGCCGCGTTAAATCCGTCACTTTCTGCGCACGCAGCAT <//</A/EEEE/AAEA/EE//EE//EEEEEE/EE/AEE/ 0

When I run the command above, I get the following error:

Error: Type checker found wrong number of fields while tokenizing data line.
Perhaps you have extra TAB at the end of your line? Check with "cat -t"

I've run cat -t on the filter.bed:

NC_000913.3^I2747963^I2748753
NC_000913.3^I3470145^I3471329
NC_000913.3^I3802039^I3804137
NC_000913.3^I3807064^I3808098

I've run cat -t on the input.bam:

NB501872:398:HGHWFBGXB:1:11101:26117:1029 1:N:0:CAGATC^I-^INC_000913.3^I3803471^ITTTTTCGAAATCGGAGCCATCACCCAATACATGGAGTTT^IEEEEAEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEA^I0^I
NB501872:398:HGHWFBGXB:1:11101:3328:1031 1:N:0:CAGATC^I+^INC_000913.3^I2748273^IGCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEA^I0^I
NB501872:398:HGHWFBGXB:1:11101:23381:1032 1:N:0:CAGATC^I-^INC_000913.3^I4176778^ITGCGTGGTATCAAACGTGAAGAAATCGAACGTGGTCAGG^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I1^I
NB501872:398:HGHWFBGXB:1:11101:8594:1034 1:N:0:CAGATC^I-^INC_000913.3^I2107391^IACTGGATAGTAGGGTTCGTCGCCAACTTTCCACTCTAAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16388:1035 1:N:0:CAGATC^I+^INC_000913.3^I1756723^ICTCGAGTTCAACAATGACAACCGTAAACTGCGCATTACC^IEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:7169:1035 1:N:0:CAGATC^I+^INC_000913.3^I1166876^IATTGAGGCTGATCTGATGGTATGGGCAGCCGGGATCAAA^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:4800:1036 1:N:0:CAGATC^I+^INC_000913.3^I2748273^IGCGCTGGCAACCTTCATGCCCAATGAATACATCACCCAT^IEEEEEEEEE/EEEEEEEEEEEEEEEEEEEE<EE<EEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16607:1036 1:N:0:CAGATC^I+^INC_000913.3^I1111389^IGGATTTGTGGGTTTCCTTTATGCAGCTTCTGCCTTATAT^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:16422:1039 1:N:0:CAGATC^I-^INC_000913.3^I2815223^ICCACCTGCGGTAACGGGAACTGTGCGGTTTCGCAGCCGA^IEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^I0^I
NB501872:398:HGHWFBGXB:1:11101:19492:1039 1:N:0:CAGATC^I-^INC_000913.3^I2415498^ITCGCCGCGTTAAATCCGTCACTTTCTGCGCACGCAGCAT^I<//</A/EEEE/AAEA/EE//EE//EEEEEE/EE/AEE/^I0^I
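Since `cat -t` prints readable text for input.bam, one thing that may be worth ruling out is whether the file is really binary BAM. A BAM file is BGZF (gzip-compatible) compressed, so it starts with the gzip magic bytes and its decompressed stream begins with `BAM\1`. The following is only a minimal sketch of that check; the function name `looks_like_bam` is hypothetical and not part of bedtools or samtools.

```python
import gzip


def looks_like_bam(path):
    """Return True if `path` is gzip/BGZF data whose payload starts with the BAM magic."""
    # Step 1: a BGZF file begins with the two gzip magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        if handle.read(2) != b"\x1f\x8b":
            return False
    # Step 2: the decompressed stream of a BAM file begins with b"BAM\x01".
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Corrupt or truncated gzip data: treat it as "not BAM".
        return False
```

Calling `looks_like_bam("input.bam")` on the file from this post would show whether it is binary BAM. If it returns `False`, the file is probably a tab-delimited text export rather than BAM, and it may be that `-abam` is not the right option for it.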

I've tried removing the end-of-line tabs from the BAM file, but that didn't work. I also tried adding tabs to the end of the lines in the filter.bed file, and that didn't work either.
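Because the error complains about the number of fields, it may help to count them directly instead of reading the `cat -t` output by eye. The sketch below is a hypothetical helper (`field_report` is not a bedtools function) that tallies tab-separated field counts per line and lists the lines that end in a tab. The sample lines are copied from the files shown above.

```python
from collections import Counter


def field_report(lines):
    """Return (Counter of tab-separated field counts, 1-based numbers of lines ending in a tab)."""
    counts = Counter()
    trailing_tab_lines = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\r\n")  # drop the line terminator only, keep any tabs
        if not record:
            continue  # skip blank lines
        counts[len(record.split("\t"))] += 1
        if record.endswith("\t"):
            trailing_tab_lines.append(number)
    return counts, trailing_tab_lines


# Two lines from filter.bed and one read line from input.bam, as shown by cat -t.
bed_lines = [
    "NC_000913.3\t2747963\t2748753\n",
    "NC_000913.3\t3470145\t3471329\n",
]
read_lines = [
    "NB501872:398:HGHWFBGXB:1:11101:26117:1029 1:N:0:CAGATC\t-\tNC_000913.3\t3803471\t"
    "TTTTTCGAAATCGGAGCCATCACCCAATACATGGAGTTT\tEEEEAEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEA\t0\t\n",
]

print(field_report(bed_lines))
print(field_report(read_lines))
```

To run it on a whole file, pass an open file handle, for example `field_report(open("filter.bed"))`. A trailing tab counts as one extra, empty field, which is why a line that ends in a tab reports one more field than it visibly has.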

I'm at a loss as to what to try or how to fix it. Any help and advice would be greatly appreciated.

bedtools rnaseq intersect error type checker • 357 views
This thread is not open. No new answers may be added