(Error) bedtools fastafrombed and STAR aligner
8.1 years ago
umn_bist ▴ 390

Essentially, I made a pseudo reference genome using a BED file with specific regions of interest (e.g. TP53 on chr17).

Now I have to generate a genome.fq with splice junctions inserted using STAR before mapping my samples. Easy enough.

But when I try mapping, I get an error stating that there are no valid exon lines in the GTF file. A common cause is a chromosome naming difference between the genome FASTA and the GTF.

Upon closer inspection, the chrName.txt files that STAR generates for my pseudo reference genome and for the whole reference genome are different:

pseudogenome

17:7565096-7579937
17:7569403-7579937
17:7571719-7578811
17:7571719-7590868
17:7571719-7590868
17:7577498-7590868
17:7577498-7590868
17:7571719-7576926
17:7571719-7578811
17:7571719-7578811
17:7571719-7590868
17:7571719-7590868
17:7577498-7578554
17:7579311-7590868
17:7577850-7590868
17:7571719-7579937
17:7571719-7590868

whole genome

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
X
Y
MT

I'm assuming bedtools fastafrombed is the culprit. Could it be that using BED12 (instead of BED6) is the cause of all this?

BED file:

17  7565096 7579937 uc002gig.1  0   -   7565256 7579912 0   7   236,110,113,184,279,22,99,  0,12402,13080,13274,14215,14603,14742,
17  7569403 7579937 uc002gih.3  0   -   7569523 7579912 0   9   159,74,137,110,113,184,279,22,99,   0,7449,7615,8095,8773,8967,9908,10296,10435,
17  7571719 7578811 uc002gii.2  0   -   7572926 7578452 0   7   1289,107,74,137,110,113,441,    0,2207,5133,5299,5779,6457,6651,
17  7571719 7590868 uc002gij.3  0   -   7572926 7579569 0   11  1289,107,74,137,110,113,184,279,22,102,174, 0,2207,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7571719 7590868 uc002gim.3  0   -   7572926 7579912 0   11  1289,107,74,137,110,113,184,279,22,99,174,  0,2207,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7577498 7590868 uc002gin.3  0   -   7577500 7579912 0   6   110,113,184,22,102,174, 0,678,872,2201,2340,13196,
17  7577498 7590868 uc002gio.3  0   -   7577500 7578533 0   4   110,113,184,174,    0,678,872,13196,
17  7571719 7576926 uc010cne.1  0   -   7571719 7571719 0   2   1289,74,    0,5133,
17  7571719 7578811 uc010cnf.2  0   -   7576536 7578452 0   8   1289,107,60,74,137,110,113,441, 0,2207,4805,5133,5299,5779,6457,6651,
17  7571719 7578811 uc010cng.2  0   -   7576624 7578452 0   8   1289,107,133,74,137,110,113,441,    0,2207,4805,5133,5299,5779,6457,6651,
17  7571719 7590868 uc010cnh.2  0   -   7576536 7579912 0   12  1289,107,60,74,137,110,113,184,279,22,102,174,  0,2207,4805,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7571719 7590868 uc010cni.2  0   -   7576624 7579912 0   12  1289,107,133,74,137,110,113,184,279,22,102,174, 0,2207,4805,5133,5299,5779,6457,6651,7592,7980,8119,18975,
17  7577498 7578554 uc010cnj.1  0   -   7577498 7577498 0   2   110,184,    0,872,
17  7579311 7590868 uc010cnk.2  0   -   7579311 7580659 0   5   279,22,102,103,174, 0,388,527,1331,11383,
17  7577850 7590868 uc010vug.3  0   -   7578137 7579569 0   5   439,184,279,241,174,    0,520,1461,1849,12844,
17  7571719 7579937 uc031qyp.1  0   -   7571719 7571719 0   10  1289,107,74,137,110,118,184,279,22,99,  0,2207,5133,5299,5779,6452,6651,7592,7980,8119,
17  7571719 7590868 uc031qyq.1  0   -   7572926 7579569 0   10  1289,107,74,137,110,113,184,279,241,174,    0,2207,5133,5299,5779,6457,6651,7592,7980,18975,
RNA-Seq STAR bedtools • 2.0k views

I do not completely understand what you are trying to do, but when you use fastafrombed, the sequences are named in the chr:start-end format, which differs from the names in your original FASTA files.
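The mismatch can be reproduced without running either tool. Below is a minimal Python sketch, assuming the chr:start-end naming described above; the helper names and the single GTF exon line are made up for illustration and are not taken from the original post.

```python
def getfasta_default_name(bed_line: str) -> str:
    """Build the chrom:start-end header from the first three BED columns
    (assumed to match the default fastafrombed naming)."""
    chrom, start, end = bed_line.split()[:3]
    return f"{chrom}:{start}-{end}"


def gtf_seqnames(gtf_lines):
    """Collect column 1 (seqname) of every non-empty, non-comment GTF line."""
    return {
        line.split("\t")[0]
        for line in gtf_lines
        if line.strip() and not line.startswith("#")
    }


# First BED record from the question (first six columns only).
bed = "17  7565096 7579937 uc002gig.1  0   -"

# Hypothetical GTF exon line, for illustration only.
gtf = ['17\thypothetical\texon\t7565097\t7565332\t.\t-\t.\tgene_id "TP53";']

fasta_names = {getfasta_default_name(bed)}
missing = gtf_seqnames(gtf) - fasta_names

print("FASTA sequence names:", fasta_names)
print("GTF seqnames absent from the FASTA:", missing)
```

Here the GTF seqname `17` never appears among the FASTA sequence names, which is consistent with the "no valid exon lines" error. Renaming the headers alone would probably not be enough, because the GTF coordinates refer to positions on the full chromosome 17 rather than to offsets within each extracted region.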
