HIV NL4-3 transcriptome fasta
5.5 years ago
xiaoleiusc ▴ 140

Hi, All,

Is there a place where I can download an HIV NL4-3 transcriptome FASTA file to use as a reference for mapping my NGS RNA reads? I know how to use the HIV NL4-3 genome as a reference, but I am particularly interested in reads generated from spliced HIV RNA, so I think it would be better to map my reads to an NL4-3 transcriptome that contains all the spliced sequences. My plan is to first align my reads to the HIV genome (to capture unspliced reads), then extract the unmapped reads and align them to the HIV transcriptome to find the spliced reads.

Thanks in advance,

Xiao

rna-seq CLIP-seq • 1.6k views
4.7 years ago
Prisca :) • 0

Hi Xiao,

Were you able to make progress on this? If so, could you share what you did/found?


Hi Prisca, unfortunately I have made no progress on this. I could not find a database with an NL4-3 spliced transcriptome.

Best, Xiao


Hi,

I wanted to update here that full transcriptome annotations for thousands of HIV genomes (including NL4-3) are now available at https://ccb.jhu.edu/HIV_Atlas/ (see my full response below).

1 hour ago
Ales ▴ 50

You can get a full transcriptome annotation for the NL4-3 genome: https://ccb.jhu.edu/HIV_Atlas/11676/AF324493.2 . Note the accession ID, though. If yours is different, you can either 1) search the HIV Atlas for your accession, 2) use AF324493.2 instead (a download from the HIV Atlas provides both the genome FASTA and the transcriptome annotation), or 3) use the Vira method to transfer the HXB2 annotation (or the AF324493.2 annotation) from the HIV Atlas onto your target genome assembly.

We have recently completed reference-grade annotations for several thousand HIV-1 genomes, including the HXB2 (K03455.1), 89.6 and NL4-3 reference genomes: https://ccb.jhu.edu/HIV_Atlas/. Each annotation features the full set of unspliced (US), partially spliced (PS) and fully spliced (FS) messages, along with protein assignments and the major donor and acceptor sites. The annotations are provided in GTF and GFF formats; you can browse them directly in the web interface via the integrated JBrowse2 and use them with any transcriptomic utilities you would use for human or other genomes (assembly, quantification, gene/transcript expression, etc.).
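If you specifically need a transcriptome FASTA rather than a GTF, one option is to stitch each transcript's exons together from the genome FASTA. Below is a minimal Python sketch of that idea, assuming standard GTF `exon` rows that carry a `transcript_id` attribute. The function names (`read_fasta`, `spliced_transcripts`) are purely illustrative and are not part of the HIV Atlas or Vira; established tools such as gffread can do the same job.

```python
import re
from collections import defaultdict


def read_fasta(lines):
    """Parse FASTA lines into {sequence_name: sequence}."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the name.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {k: "".join(v) for k, v in seqs.items()}


def revcomp(seq):
    """Reverse-complement a DNA sequence (A/C/G/T only, case preserved)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def spliced_transcripts(genome, gtf_lines):
    """Concatenate exon sequences per transcript_id from GTF 'exon' rows."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != "exon":
            continue
        m = re.search(r'transcript_id "([^"]+)"', f[8])
        if m:
            # GTF coordinates are 1-based and inclusive.
            exons[m.group(1)].append((f[0], int(f[3]), int(f[4]), f[6]))
    out = {}
    for tid, ex in exons.items():
        ex.sort(key=lambda e: e[1])  # order exons by genomic start
        chrom, strand = ex[0][0], ex[0][3]
        seq = "".join(genome[chrom][s - 1:e] for _, s, e, _ in ex)
        out[tid] = revcomp(seq) if strand == "-" else seq
    return out


# Hypothetical usage (file names are placeholders, not real downloads):
# with open("genome.fasta") as fa, open("annotation.gtf") as gtf:
#     transcripts = spliced_transcripts(read_fasta(fa), gtf)
```

The resulting dictionary can be written out as FASTA and indexed like any other reference, which would fit the "align to genome first, then align the leftovers to the transcriptome" plan from the original question.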

This project started from my personal need to improve spliced alignment around splice sites with HISAT2/STAR and minimap, and to compute transcript and junction expression effectively. Eventually, this took me down the rabbit hole of creating several methods for annotation transfer in HIV and annotating thousands of LANL complete genome assemblies. You can also read more about the resource in our preprint.
