Question: (Error) bedtools fastafrombed and STAR aligner
3.1 years ago, umn_bist wrote:

Essentially, I made a pseudo reference genome from a BED file containing specific regions of interest (e.g. TP53 on chr17).

Now I have to generate a genome.fq with splice junctions inserted using STAR before mapping my samples. Easy enough.

But when I try mapping, I get an error stating that there are no valid exon lines in the GTF file. A common cause is a difference in chromosome naming between the reference FASTA and the GTF.
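
One way to test that suspicion is to compare the sequence names recorded in chrName.txt with the first column of the GTF. The following is only a minimal sketch: the function names and the inline sample data (including the "ucsc" source field and the gene_id value) are assumptions made for illustration and are not part of STAR or bedtools.

```python
def gtf_seqnames(gtf_lines):
    """Collect the seqname (column 1) of every non-comment, non-empty GTF line."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def shared_names(chr_names, gtf_lines):
    """Return the names present in both the genome index and the GTF."""
    index_names = {n.strip() for n in chr_names if n.strip()}
    return index_names & gtf_seqnames(gtf_lines)


# Hypothetical inputs: two names from the pseudo-genome and one exon line on chromosome 17.
chr_names = ["17:7565096-7579937", "17:7569403-7579937"]
gtf = ['17\tucsc\texon\t7565097\t7565332\t.\t-\t.\tgene_id "TP53"; transcript_id "uc002gig.1";']

overlap = shared_names(chr_names, gtf)
print(overlap)  # an empty set means no GTF line can match any sequence in the index
```

With real files, the two lists would come from reading chrName.txt and the GTF line by line instead of the inline samples.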

Upon closer inspection, the chrName.txt files that STAR generates for my pseudo reference genome and for the whole reference genome are different:

pseudogenome

17:7565096-7579937
17:7569403-7579937
17:7571719-7578811
17:7571719-7590868
17:7571719-7590868
17:7577498-7590868
17:7577498-7590868
17:7571719-7576926
17:7571719-7578811
17:7571719-7578811
17:7571719-7590868
17:7571719-7590868
17:7577498-7578554
17:7579311-7590868
17:7577850-7590868
17:7571719-7579937
17:7571719-7590868
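
Several of the names in this list are identical, because different transcripts share the same start and end. Whether repeated sequence names cause trouble downstream is not established here, so it may be worth counting them. A minimal sketch, with the names copied from the list above:

```python
from collections import Counter

# Sequence names exactly as listed in the pseudo-genome chrName.txt above.
names = [
    "17:7565096-7579937", "17:7569403-7579937", "17:7571719-7578811",
    "17:7571719-7590868", "17:7571719-7590868", "17:7577498-7590868",
    "17:7577498-7590868", "17:7571719-7576926", "17:7571719-7578811",
    "17:7571719-7578811", "17:7571719-7590868", "17:7571719-7590868",
    "17:7577498-7578554", "17:7579311-7590868", "17:7577850-7590868",
    "17:7571719-7579937", "17:7571719-7590868",
]

counts = Counter(names)
# Keep only the names that occur more than once.
duplicates = {name: n for name, n in counts.items() if n > 1}

for name, n in sorted(duplicates.items()):
    print(name, n)
```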

whole genome

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
X
Y
MT

I'm assuming bedtools fastafrombed is the culprit. Could it be that using BED12 (instead of BED6) is the cause of all this?
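
For context, the default fastafrombed (bedtools getfasta) header is built from the first three BED columns as chrom:start-end, so the extra BED12 columns should not change the names unless an option such as -name is used. The function below only mimics that default naming for illustration; it is a hypothetical helper, not bedtools code.

```python
def default_header(bed_line):
    """Mimic the default chrom:start-end header built from BED columns 1-3."""
    chrom, start, end = bed_line.split()[:3]
    return f"{chrom}:{start}-{end}"


# The first record of the BED file, once as BED12 and once truncated to BED6.
bed12 = ("17 7565096 7579937 uc002gig.1 0 - 7565256 7579912 0 7 "
         "236,110,113,184,279,22,99, 0,12402,13080,13274,14215,14603,14742,")
bed6 = "17 7565096 7579937 uc002gig.1 0 -"

print(default_header(bed12))
print(default_header(bed6))
```

Both calls produce the same name, which matches the first entry of the pseudo-genome chrName.txt.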

BED file:

17  7565096 7579937 uc002gig.1  0   -   7565256 7579912 0   7   236,110,113,184,279,22,99,  0,12402,13080,13274,14215,14603,14742,
17  7569403 7579937 uc002gih.3  0   -   7569523 7579912 0   9   159,74,137,110,113,184,279,22,99,   0,7449,7615,8095,8773,8967,9908,10296,10435,
17  7571719 7578811 uc002gii.2  0   -   7572926 7578452 0   7   1289,107,74,137,110,113,441,    0,2207,5133,5299,5779,6457,6651,
17  7571719 7590868 uc002gij.3  0   -   7572926 7579569 0   11  1289,107,74,137,110,113,184,279,22,102,174, 0,2207,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7571719 7590868 uc002gim.3  0   -   7572926 7579912 0   11  1289,107,74,137,110,113,184,279,22,99,174,  0,2207,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7577498 7590868 uc002gin.3  0   -   7577500 7579912 0   6   110,113,184,22,102,174, 0,678,872,2201,2340,13196,
17  7577498 7590868 uc002gio.3  0   -   7577500 7578533 0   4   110,113,184,174,    0,678,872,13196,
17  7571719 7576926 uc010cne.1  0   -   7571719 7571719 0   2   1289,74,    0,5133,
17  7571719 7578811 uc010cnf.2  0   -   7576536 7578452 0   8   1289,107,60,74,137,110,113,441, 0,2207,4805,5133,5299,5779,6457,6651,
17  7571719 7578811 uc010cng.2  0   -   7576624 7578452 0   8   1289,107,133,74,137,110,113,441,    0,2207,4805,5133,5299,5779,6457,6651,
17  7571719 7590868 uc010cnh.2  0   -   7576536 7579912 0   12  1289,107,60,74,137,110,113,184,279,22,102,174,  0,2207,4805,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7571719 7590868 uc010cni.2  0   -   7576624 7579912 0   12  1289,107,133,74,137,110,113,184,279,22,102,174, 0,2207,4805,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7577498 7578554 uc010cnj.1  0   -   7577498 7577498 0   2   110,184,    0,872,
17  7579311 7590868 uc010cnk.2  0   -   7579311 7580659 0   5   279,22,102,103,174, 0,388,527,1331,11383,
17  7577850 7590868 uc010vug.3  0   -   7578137 7579569 0   5   439,184,279,241,174,    0,520,1461,1849,12844,
17  7571719 7579937 uc031qyp.1  0   -   7571719 7571719 0   10  1289,107,74,137,110,118,184,279,22,99,  0,2207,5133,5299,5779,6452,6651,7592,7980,8119,
17  7571719 7590868 uc031qyq.1  0   -   7572926 7579569 0   10  1289,107,74,137,110,113,184,279,241,174,    0,2207,5133,5299,5779,6457,6651,7592,7980,18975,

I did not completely understand what you are trying to do, but when you use fastafrombed, the sequences are renamed to the chr:start-end format, which differs from the names in your original FASTA files.

written 3.1 years ago by geek_y
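
One possible way to reconcile the two naming schemes, offered as a sketch under assumptions rather than a tested recipe, is to rewrite each GTF line so that its seqname is the chrom:start-end name of the extracted region and its coordinates are relative to that region. GTF coordinates are 1-based and inclusive while the BED start is 0-based, so subtracting the BED start gives the position within the extracted sequence. The helper name and the sample exon line (including its "ucsc" source and gene_id) are hypothetical.

```python
def shift_gtf_line(gtf_line, chrom, bed_start, bed_end):
    """Re-express one GTF line in the coordinates of the region chrom:bed_start-bed_end.

    Returns None when the feature is on another chromosome or is not
    fully contained in the region.
    """
    fields = gtf_line.rstrip("\n").split("\t")
    start, end = int(fields[3]), int(fields[4])
    # GTF is 1-based inclusive; BED start is 0-based and BED end is exclusive.
    if fields[0] != chrom or start <= bed_start or end > bed_end:
        return None
    fields[0] = f"{chrom}:{bed_start}-{bed_end}"
    fields[3] = str(start - bed_start)
    fields[4] = str(end - bed_start)
    return "\t".join(fields)


# Hypothetical exon line corresponding to the first block of uc002gig.1.
exon = '17\tucsc\texon\t7565097\t7565332\t.\t-\t.\tgene_id "TP53"; transcript_id "uc002gig.1";'
shifted = shift_gtf_line(exon, "17", 7565096, 7579937)
print(shifted)
```

This assumes the sequences were extracted from the forward strand (no -s option), so the strand column is left unchanged.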