Question: Bedtools intersect QTL animalgenome
20 months ago by
Wietje180
Germany
Wietje180 wrote:

I am trying to find QTLs in the bovine genome (downloaded as a BED file from the CattleQTLdb here: http://www.animalgenome.org/cgi-bin/QTLdb/BT/download?file=bedUMD_3.1) that intersect with candidate lncRNAs I've filtered out with FEELnc.

I have unzipped the file and tried the bedtools intersect command:

$ bedtools intersect -a QTL-file -b candidate-lnc-file > intersect_qtl_lnc

but I get this error message: ***** ERROR: too many digits/characters for integer conversion in string . Exiting...

Has anyone encountered this problem before? Is there an additional conversion step I need to make? I appreciate your efforts! thx


Could you post a small example of your -a and -b files please?

Comment written 20 months ago by James Ashmore
20 months ago by
Wietje180
Germany
Wietje180 wrote:

It's alright, I found the problem: removing lines with missing coordinates and duplicate lines does the trick (see http://seqanswers.com/forums/showthread.php?t=69484). It works now.
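For anyone who wants to script that clean-up, here is a minimal Python sketch. It assumes the BED file is tab-separated with chromosome, start, and end in the first three columns. The function name `clean_bed_lines` and the exact filtering rules (skip `#` comment lines, skip lines whose start or end is missing or non-numeric, skip exact duplicate lines) are my own illustration and are not taken from the linked thread.

```python
def clean_bed_lines(lines):
    """Yield BED lines with integer start/end coordinates, skipping duplicates.

    Hypothetical helper: assumes tab-separated columns (chrom, start, end, ...).
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A usable record needs at least chrom, start, end
        if len(fields) < 3:
            continue
        start, end = fields[1].strip(), fields[2].strip()
        # Drop lines with missing or non-numeric coordinates
        if not (start.isdigit() and end.isdigit()):
            continue
        # Drop exact duplicates of a line that was already kept
        if line in seen:
            continue
        seen.add(line)
        yield line


# Small demo: the first record is from the post (tabs assumed),
# the last line is made up to show a record with missing coordinates.
example = [
    "# comment line\n",
    "X\t65152465\t65152505\tBody weight (mature) QTL (65327)\n",
    "X\t65152465\t65152505\tBody weight (mature) QTL (65327)\n",
    "X\t\t\tQTL with missing coordinates\n",
]
cleaned = list(clean_bed_lines(example))
print(cleaned)
```

To apply it to a real file, pass an open file handle instead of the `example` list and write the yielded lines to a new file before running bedtools intersect on it.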


Just an observation from the bedtools source:

if (len < 1 || len > 10) {
    fprintf(stderr, "***** ERROR: too many digits/characters for integer conversion in string %s. Exiting...\n", str);
    exit(1);
}

The coordinates seem to be under 10 digits. I could get the error after giving a fake BED file with a chromEnd of 10 digits, but the limit does not seem relevant for everyday use.
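To make the quoted check concrete, here is a small Python sketch that mirrors the `len < 1 || len > 10` condition and lists the lines of a BED file that would trip it. Only the condition itself comes from the snippet above. The helper names `trips_length_check` and `find_suspect_lines`, and the assumption that the check is applied to the start and end columns of a tab-separated file, are mine. Taken at face value, the condition also fires for an empty field, which would fit the blank string in the error message reported in the question.

```python
def trips_length_check(field):
    """Mirror the quoted condition: reject fields with < 1 or > 10 characters."""
    return len(field) < 1 or len(field) > 10


def find_suspect_lines(lines):
    """Return (line_number, field) pairs for start/end fields that trip the check.

    Assumption: tab-separated BED with start and end in columns 2 and 3.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        # Comment and blank lines are ignored here
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Pad short lines so missing columns show up as empty fields
        fields += [""] * (3 - len(fields))
        for field in fields[1:3]:
            if trips_length_check(field):
                suspects.append((number, field))
    return suspects


demo = [
    "X\t65152465\t65152505\n",   # coordinates from the post: pass the check
    "X\t\t65152505\n",           # made-up line with an empty start field
    "X\t1\t12345678901\n",       # made-up line with an 11-digit end field
]
print(find_suspect_lines(demo))
```

Running something like this over the downloaded file before calling bedtools should point to the offending line numbers, so they can be inspected rather than guessed at.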

Comment written 20 months ago by microfuge
20 months ago by
Wietje180
Germany
Wietje180 wrote:

The QTL file (BED format) starts with several comment lines, each starting with #, and the "normal content" looks like this:

X 65152465 65152505 Body weight (mature) QTL (65327) + 65152465 65152505 . . . .

X 65221953 65221993 Body weight (mature) QTL (65328) + 65221953 65221993 . . . .

X 64847299 64847339 Body weight (mature) QTL (65329) + 64847299 64847339 . . . .

I tried reducing the file to the first three columns but that doesn't help either.

The lnc-file (gtf format) has the following structure:

17 StringTie exon 66683795 66684113 1000 - . gene_id "MSTRG.4748"; transcript_id "MSTRG.4748.1"; exon_number "6";

3 StringTie exon 12569270 12569448 1000 + . gene_id "MSTRG.11128"; transcript_id "MSTRG.11128.16"; exon_number "1";

3 StringTie exon 12576517 12576543 1000 + . gene_id "MSTRG.11128"; transcript_id "MSTRG.11128.16"; exon_number "2";

I have worked with the lnc-file before, so the problem seems to be with the downloaded file.
