Question: Bedtools intersect QTL animalgenome
Wietje (Germany) wrote 2.1 years ago:

I am trying to find QTLs in the bovine genome (downloaded as a BED file from the CattleQTLdb here: http://www.animalgenome.org/cgi-bin/QTLdb/BT/download?file=bedUMD_3.1) that intersect with candidate lncRNAs I've filtered out with FEELnc.

I have unzipped the file and tried the bedtools intersect command:

$ bedtools intersect -a QTL-file -b candidate-lnc-file > intersect_qtl_lnc

but I get this error message: * ERROR: too many digits/characters for integer conversion in string . Exiting...

Has anyone encountered this problem before? Is there an additional conversion step I need to make? I appreciate your efforts. Thanks!


Could you post a small example of your -a and -b files please?

written 2.1 years ago by James Ashmore
Wietje (Germany) wrote 2.1 years ago:

It's alright, I found the problem: removing lines with missing coordinates and duplicate lines does the trick (see http://seqanswers.com/forums/showthread.php?t=69484). It works now.
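For anyone landing here with the same error, that cleanup could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: the file is tab-separated, "duplicates" means exact duplicate lines, and `clean_bed` is a hypothetical helper name, not something bedtools provides.

```python
def is_coord(value):
    """True if the field is a non-empty run of ASCII digits."""
    return value.isascii() and value.isdigit()


def clean_bed(lines):
    """Yield BED lines with integer start/end fields, skipping duplicates."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A usable record needs chrom, start and end; start/end must be digits.
        if len(fields) < 3 or not (is_coord(fields[1]) and is_coord(fields[2])):
            continue
        # Drop exact duplicate lines (assumed meaning of "duplicates").
        if line in seen:
            continue
        seen.add(line)
        yield line
```

Feeding the lines of the downloaded QTL file through `clean_bed` and writing the result to a new file should give something that can be passed to `bedtools intersect -a`.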


Just an observation from the bedtools source

if (len < 1 || len > 10) {
    fprintf(stderr, "***** ERROR: too many digits/characters for integer conversion in string %s. Exiting...\n", str);
    exit(1);
}

The coordinates seem to be under 10 digits. I could get the error after giving a fake BED file with a chromEnd of 10 digits. But the limit does not seem relevant for everyday use.
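To locate the offending lines in a file, the length test from the snippet can be mirrored in a few lines of Python. This is a rough diagnostic sketch, not bedtools' own logic: it assumes tab-separated input, assumes that `#` lines are ignored, applies the test only to columns 2 and 3, and `find_bad_coordinates` is a made-up name.

```python
MAX_DIGITS = 10  # mirrors the `len > 10` test in the quoted snippet


def find_bad_coordinates(lines):
    """Return (line_number, value) pairs for start/end fields failing the length test."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Assumption: blank lines and '#' comment lines are not parsed as records.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Columns 2 and 3 are chromStart and chromEnd; a missing column counts as empty.
        for value in (fields[1:3] + ["", ""])[:2]:
            if len(value) < 1 or len(value) > MAX_DIGITS:
                bad.append((number, value))
    return bad
```

An empty value in the result points at a line with a missing coordinate, which would also explain why the error message prints nothing between "string" and the full stop.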

written 2.1 years ago by microfuge
Wietje (Germany) wrote 2.1 years ago:

The QTL file (BED format) starts with several comment lines, each starting with #, and the "normal content" looks like this:

X 65152465 65152505 Body weight (mature) QTL (65327) + 65152465 65152505 . . . .

X 65221953 65221993 Body weight (mature) QTL (65328) + 65221953 65221993 . . . .

X 64847299 64847339 Body weight (mature) QTL (65329) + 64847299 64847339 . . . .

I tried reducing the file to the first three columns but that doesn't help either.

The lnc-file (gtf format) has the following structure:

17 StringTie exon 66683795 66684113 1000 - . gene_id "MSTRG.4748"; transcript_id "MSTRG.4748.1"; exon_number "6";

3 StringTie exon 12569270 12569448 1000 + . gene_id "MSTRG.11128"; transcript_id "MSTRG.11128.16"; exon_number "1";

3 StringTie exon 12576517 12576543 1000 + . gene_id "MSTRG.11128"; transcript_id "MSTRG.11128.16"; exon_number "2";

I have worked with the lnc-file before, so the problem seems to be with the downloaded file.
