Question: GTF file for HIV strain pNL4-3
caggtaagtat wrote:

Hi,

I have RNA-seq data from HIV-infected cells, which I now want to map to a combined human-HIV genome. To build that genome, I need a GTF file for my HIV strain. I didn't find strain-specific annotation files for HIV. Do you know where one could find something like that, or a better way to estimate HIV transcript abundance in RNA-seq data?

OK, I thought I could take the annotations I have in Geneious and edit the exported GFF text file by hand to turn it into a GTF file, but I am very unsure whether my annotations are sufficient for that.

My GFF file looks like this:

pNL4-3  Geneious    region  1   9709    .   +   0   Is_circular=true
pNL4-3  Geneious    insertion   1186    1186    .   +   .   Name=p17/p24
pNL4-3  Geneious    polyA_signal    9602    9607    .   +   .   Name=POLY_A
pNL4-3  Geneious    LTR 9076    9709    .   +   .   Name=3'_LTR
pNL4-3  Geneious    LTR 1   634 .   +   .   Name=5'_LTR
pNL4-3  Geneious    invisible_Parent    8888    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346934513538.20
pNL4-3  Geneious    invisible_Parent    5304    8887    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346934513382.19
pNL4-3  Geneious    misc_feature    5005    5034    .   .   .   Name=Fragment3
pNL4-3  Geneious    misc_feature    5743    5744    .   +   .   Name=JNCTN_NY5/LAV
pNL4-3  Geneious    repeat_region   454 551 .   +   .   Name=R
pNL4-3  Geneious    repeat_region   9529    9626    .   +   .   Name=R
pNL4-3  Geneious    repeat_region   552 634 .   +   .   Name=U5
pNL4-3  Geneious    intron  744 5776    .   +   .   Name=TAT/REV/NEF_I
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=TAT_II
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=TAT/REV/NEF_II
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=REV_II
pNL4-3  Geneious    CDS 2085    5096    .   +   .   Name=POL
pNL4-3  Geneious    CDS 5969    8643    .   +   .   Name=REV
pNL4-3  Geneious    CDS 5830    8414    .   +   .   Name=TAT
pNL4-3  Geneious    CDS 6221    8785    .   +   .   Name=ENV
pNL4-3  Geneious    CDS 790 2292    .   +   .   Name=GAG
pNL4-3  Geneious    CDS 8787    9407    .   +   .   Name=NEF
pNL4-3  Geneious    CDS 5041    5619    .   +   .   Name=VIF
pNL4-3  Geneious    CDS 5559    5849    .   +   .   Name=VPR
pNL4-3  Geneious    CDS 6061    6306    .   +   .   Name=VPU
pNL4-3  Geneious    splicing signal 5059    5060    .   +   .   Name=SD2b
pNL4-3  Geneious    splicing signal 4963    4964    .   +   .   Name=SD2
pNL4-3  Geneious    splicing signal 5974    5975    .   +   .   Name=SA5
pNL4-3  Geneious    splicing signal 6720    6721    .   +   .   Name=(SD5)
pNL4-3  Geneious    splicing signal 744 745 .   +   .   Name=SD1
pNL4-3  Geneious    splicing signal 6045    6046    .   +   .   Name=SD4
pNL4-3  Geneious    splicing signal 5388    5389    .   +   .   Name=SA2
pNL4-3  Geneious    splicing signal 8367    8368    .   +   .   Name=SA7
pNL4-3  Geneious    splicing signal 5464    5465    .   +   .   Name=SD3
pNL4-3  Geneious    splicing signal 5775    5776    .   +   .   Name=SA3
pNL4-3  Geneious    splicing signal 5952    5953    .   +   .   Name=SA4a
pNL4-3  Geneious    splicing signal 5934    5935    .   +   .   Name=SA4c
pNL4-3  Geneious    splicing signal 5958    5959    .   +   .   Name=SA4b
pNL4-3  Geneious    splicing signal 4911    4912    .   +   .   Name=SA1
pNL4-3  Geneious    splicing signal 6602    6603    .   +   .   Name=(SA6)
pNL4-3  Geneious    invisible_Parent    5786    7812    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938163915.21
pNL4-3  Geneious    invisible_Parent    7813    15494   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938164320.22
pNL4-3  Geneious    invisible_Parent    5304    7812    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938256243.23
pNL4-3  Geneious    invisible_Parent    7813    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938256306.24
pNL4-3  Geneious    invisible_Parent    639 5785    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938340538.25
pNL4-3  Geneious    invisible_Parent    5786    10347   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938340632.26
pNL4-3  Geneious    invisible_Parent    639 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938465400.27
pNL4-3  Geneious    invisible_Parent    5304    10347   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938465494.28
pNL4-3  Geneious    invisible_Parent    712 5785    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938540694.29
pNL4-3  Geneious    invisible_Parent    5786    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938540787.30
pNL4-3  Geneious    invisible_Parent    712 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938613678.31
pNL4-3  Geneious    invisible_Parent    5304    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938613756.32
pNL4-3  Geneious    invisible_Parent    5786    8465    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941525286.33
pNL4-3  Geneious    invisible_Parent    8466    15494   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941525411.34
pNL4-3  Geneious    invisible_Parent    5304    8465    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941600376.35
pNL4-3  Geneious    invisible_Parent    8466    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941600470.36
pNL4-3  Geneious    invisible_Parent    712 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1347027336363.0
pNL4-3  Geneious    invisible_Parent    5304    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1347027336628.1

Do I have to look up the exon borders and insert them manually? Should I delete the first line, and do I have to delete the splicing signal entries?
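On the exon borders: in the listing above, the TAT and REV CDS rows run from start to stop codon across the intron at 6045-8368, so as single rows they also cover the intron. Below is a minimal sketch of how the two coding segments could be derived from the CDS and intron coordinates. It assumes 1-based inclusive coordinates as in GFF, and it assumes that the intron named TAT_II / REV_II really is the one that belongs to those two genes. The helper name `split_around_intron` is made up for illustration.

```python
def split_around_intron(start, end, intron_start, intron_end):
    """Return the (start, end) segments of a feature left after removing one intron.

    All coordinates are 1-based and inclusive, as in GFF/GTF.
    """
    if intron_end < start or intron_start > end:
        # The intron does not overlap the feature; keep it as one segment.
        return [(start, end)]
    segments = []
    if intron_start > start:
        # Part of the feature upstream of the intron.
        segments.append((start, intron_start - 1))
    if intron_end < end:
        # Part of the feature downstream of the intron.
        segments.append((intron_end + 1, end))
    return segments


# Coordinates copied from the GFF rows above (assumed pairing of intron and genes).
intron = (6045, 8368)
for name, (start, end) in {"TAT": (5830, 8414), "REV": (5969, 8643)}.items():
    print(name, split_around_intron(start, end, *intron))
```

With these numbers TAT would become 5830-6044 plus 8369-8414, and REV 5969-6044 plus 8369-8643. Whether those are the segments wanted in the final GTF still has to be checked against the splice sites in the annotation.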

Do you know of another way to get, e.g., an example HIV GTF file for comparison? Or even the one I need?

modified 17 months ago • written 18 months ago by caggtaagtat

If you have the GenBank file, you could try a genbank2gtf-type program to generate one. Here is one repo.

written 18 months ago by genomax

Thank you! I have the annotation in Geneious and can download the GFF file from there. I then just have to convert it, which I guess can be done by hand, since the file is not that large.

modified 18 months ago • written 18 months ago by caggtaagtat
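
The hand conversion discussed in this thread could also be scripted. Below is a minimal sketch, assuming the Geneious export is tab-separated with nine columns and that the feature name sits in a `Name=...` attribute, as in the listing in the question. The function name `gff_cds_to_gtf` and the choice to reuse the feature name as both `gene_id` and `transcript_id` are assumptions made for illustration, not something Geneious or any particular tool prescribes. Only CDS rows are kept, so the region, splicing signal and invisible_Parent rows are dropped.

```python
def gff_cds_to_gtf(gff_lines):
    """Convert the CDS rows of a simple GFF export into GTF rows (hypothetical helper)."""
    gtf_lines = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # keep only well-formed CDS rows
        # Assumption: attributes are key=value pairs separated by ';', e.g. "Name=GAG".
        attrs = dict(pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair)
        name = attrs.get("Name")
        if name is None:
            continue
        # GTF attribute column: quoted values, each terminated by a semicolon.
        cols[8] = f'gene_id "{name}"; transcript_id "{name}";'
        gtf_lines.append("\t".join(cols))
    return gtf_lines


example = [
    "pNL4-3\tGeneious\tregion\t1\t9709\t.\t+\t0\tIs_circular=true",
    "pNL4-3\tGeneious\tCDS\t790\t2292\t.\t+\t.\tName=GAG",
]
for row in gff_cds_to_gtf(example):
    print(row)
```

The frame column is copied unchanged (here "."), and spliced CDS rows such as TAT and REV are not split. Many read counters look for `exon` rows by default, so it may be necessary to write matching `exon` rows as well or to tell the counter which feature type to use.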