Question: error in the location of chromosomes (bam files) - ncRNA ref (tophat alignment)... compatibility error with htseq-count
call0 wrote:

Hello

I aligned my reads with tophat against the ncRNA.fasta sequences from Ensembl. However, in my output the transcript ID sits in the chromosome field. So when I tried to run htseq-count, no sequences were found.

Why does this happen? Is there a way to change this information to the chromosome, or to make the output compatible with htseq-count?

Please, can someone help me?

HWI-7001432L:176:C8DRKANXX:6:2102:9302:3530 pUr1s ENSSSCT00000001325.2 5 1 76M * 0 0 GCCACCGCGGTTCGCGGTTCTAAACTCTCCATCCATTCGCCTCGACTCCGCTTCTCTCCAGACTCCGAGGCTGAGG BGGGGGFGGGGGGGDGDGGEGDGGGGGGGGGEGGGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGCCCCC AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU NH:i:3 CC:Z:ENSSSCT00000001326.2 CP:i:5 HI:i:0
HWI-7001432L:176:C8DRKANXX:2:1214:8451:68280 353 ENSSSCT00000000705.2 606 1 76M = 788 258 CTAAGTTTTCTTTCTTCTGGTTGGGATTATCATGGGATCCATCTGGCCAAAGGTGGCTGTGGGAAGATGGCACGCT BCCBCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGCGDFGGGGGGGGGGGGGGFGGEGGGGGGGG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:76 YT:Z:UU NH:i:4 CC:Z:ENSSSCT00000033566.1 CP:i:641 HI:i:0
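A note on the records above: in SAM output the third column is the reference sequence name (RNAME). Because the reads were aligned to ncRNA.fasta, each transcript is its own reference sequence, so that column holds a transcript ID rather than a chromosome. htseq-count compares this column with the first column of the GTF, which is the likely reason nothing gets counted. A minimal Python sketch, assuming tab-separated SAM text and using the two records abbreviated to their first six columns, that tallies alignments per reference name directly:

```python
from collections import Counter

# The two records from the post, abbreviated to the first six SAM columns.
SAM_LINES = [
    "HWI-7001432L:176:C8DRKANXX:6:2102:9302:3530\tpUr1s\tENSSSCT00000001325.2\t5\t1\t76M",
    "HWI-7001432L:176:C8DRKANXX:2:1214:8451:68280\t353\tENSSSCT00000000705.2\t606\t1\t76M",
]


def count_by_reference(lines):
    """Tally alignment records per reference name (SAM column 3, RNAME)."""
    counts = Counter()
    for line in lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        rname = fields[2]
        if rname != "*":  # "*" means the read has no reference assigned
            counts[rname] += 1
    return counts


counts = count_by_reference(SAM_LINES)
for name, n in sorted(counts.items()):
    print(name, n)
```

This counts alignment records, so multi-mapped reads (NH greater than 1, as in both records) are counted once per reported alignment. It is only a rough tally and not a replacement for htseq-count's overlap logic.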

rna-seq ht-seq alignment • 934 views
modified 2.3 years ago by Satyajeet Khare • written 2.3 years ago by call

Might be that I missed something, but why don't you just align the sequences (using a splice-aware aligner such as STAR) to the genome instead of to this ncRNA.fasta?

written 2.3 years ago by WouterDeCoster

Hello! I did the alignment with tophat using the ncRNA.fasta.

written 2.3 years ago by call

Hi,

WouterDeCoster is suggesting that you try the whole-genome fasta for pig. That's what we do for a typical tophat command. Transcript information is provided via a GTF file. Best

modified 2.3 years ago • written 2.3 years ago by Satyajeet Khare

Hello call!

It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?p=200202#post200202

This is typically not recommended as it runs the risk of annoying people in both communities.

written 2.3 years ago by WouterDeCoster
Satyajeet Khare1.3k wrote:

You need to use the genome fasta file and an ncRNA gtf file. You can use the ncRNA fasta file provided it has chromosomal locations in the headers. You can get genome fasta and ncRNA gtf files from Gencode.
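To see why the genome fasta matters here, it can help to compare the sequence names on both sides: htseq-count can only assign a read when the reference name in the alignment also appears in column 1 of the GTF. A rough Python sketch of that check, assuming plain-text SAM and GTF lines; the GTF line and the read name below are made up for illustration, and only the reference name is taken from the question:

```python
def gtf_seqnames(gtf_lines):
    """Collect sequence names from column 1 of GTF lines."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t")[0])
    return names


def sam_reference_names(sam_lines):
    """Collect reference names from column 3 (RNAME) of SAM lines."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        rname = line.split("\t")[2]
        if rname != "*":
            names.add(rname)
    return names


# Hypothetical GTF line on chromosome "1" (IDs invented for illustration).
GTF = ['1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "GENE_A"; transcript_id "TX_A";']
# Hypothetical read; the reference name is the one shown in the question.
SAM = ["read1\t0\tENSSSCT00000001325.2\t5\t1\t76M\t*\t0\t0\t*\t*"]

shared = gtf_seqnames(GTF) & sam_reference_names(SAM)
print(shared)  # empty: the two files have no sequence names in common
```

If the intersection is empty, as it is when the alignment used transcript IDs as reference names and the GTF uses chromosome names, no read can be assigned to any feature.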

written 2.3 years ago by Satyajeet Khare

Thank you for the attention! But my ncRNA fasta does have the chromosomal locations in the headers:

>ENST00000347977 ncrna:miRNA chromosome:NCBI35:1:217347790:217347874:-1 gene:ENSG00000195671 gene_biotype:ncRNA transcript_biotype:ncRNA

So I don't know why this happened.

I work with pig sequences, so there is no information for them in Gencode. That is why I used the Ensembl information.

modified 2.3 years ago • written 2.3 years ago by call

Okay, so the fasta format looks slightly different. There are three options:

  1. Do you have access to an ncRNA GTF file? If yes, try that with tophat.
  2. If not, you can make one from the fasta file. You may need to build the transcriptome to run tophat anyway. My guess is that, if you do that, you might generate a GTF file when building the transcriptome. Use this GTF file with the genome in fasta format from the same build for the tophat command.
  3. Run tophat with the genome fasta file and the total RNA gtf file, and then shortlist the ncRNAs listed in your fasta file.
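For option 2, here is a rough Python sketch of turning one Ensembl-style fasta header into a GTF line. It assumes the header layout shown earlier in the thread (a six-part location token plus a `gene:` token); the function name and the `ensembl_fasta` source label are made up for illustration. Note that the header only gives the overall transcript span, so each transcript is written as a single exon-like block, which would misrepresent spliced transcripts.

```python
def header_to_gtf(header, source="ensembl_fasta"):
    """Convert one Ensembl-style fasta header into a single GTF line.

    Assumes a location token of the form
    type:assembly:seqname:start:end:strand and a token "gene:<id>".
    """
    tokens = header.lstrip(">").split()
    transcript_id = tokens[0]
    gene_id = None
    location = None
    for tok in tokens[1:]:
        parts = tok.split(":")
        if parts[0] == "gene" and len(parts) == 2:
            gene_id = parts[1]
        elif len(parts) == 6:
            location = parts
    if gene_id is None or location is None:
        raise ValueError("header lacks a gene or location token: " + header)
    _kind, _assembly, seqname, start, end, strand = location
    strand_symbol = "+" if strand == "1" else "-"
    attributes = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    # GTF columns: seqname, source, feature, start, end, score, strand, frame, attributes
    return "\t".join(
        [seqname, source, "exon", start, end, ".", strand_symbol, ".", attributes]
    )


HEADER = (
    ">ENST00000347977 ncrna:miRNA chromosome:NCBI35:1:217347790:217347874:-1 "
    "gene:ENSG00000195671 gene_biotype:ncRNA transcript_biotype:ncRNA"
)
gtf_line = header_to_gtf(HEADER)
print(gtf_line)
```

The chromosome name written to the GTF ("1" here) has to match the sequence names in the genome fasta used for the alignment, otherwise the same mismatch shows up again.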

Best,

modified 2.3 years ago • written 2.3 years ago by Satyajeet Khare

The most straightforward would be option 3!
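The shortlisting step of option 3 could look roughly like the Python sketch below. It assumes the count table is two tab-separated columns (feature ID, count) with summary rows whose names start with `__`, and that counting was done per gene ID. The counts, the `GENE_B` row and the `ACGT` sequence line are invented for illustration; only the fasta header comes from the thread.

```python
def ncrna_gene_ids(fasta_lines):
    """Collect gene IDs from the 'gene:' token of each fasta header."""
    ids = set()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence lines carry no IDs
        for tok in line.split():
            if tok.startswith("gene:"):
                # Drop a trailing version suffix such as ".2", if present.
                ids.add(tok[len("gene:"):].split(".")[0])
    return ids


def filter_counts(count_lines, keep_ids):
    """Keep only count rows whose feature ID is in keep_ids."""
    kept = {}
    for line in count_lines:
        feature, value = line.rstrip("\n").split("\t")
        if feature.startswith("__"):
            continue  # skip summary rows such as __no_feature
        if feature in keep_ids:
            kept[feature] = int(value)
    return kept


FASTA = [
    ">ENST00000347977 ncrna:miRNA chromosome:NCBI35:1:217347790:217347874:-1 "
    "gene:ENSG00000195671 gene_biotype:ncRNA transcript_biotype:ncRNA",
    "ACGT",  # placeholder sequence line
]
# Hypothetical count table rows.
COUNTS = ["ENSG00000195671\t12", "GENE_B\t7", "__no_feature\t3"]

shortlisted = filter_counts(COUNTS, ncrna_gene_ids(FASTA))
print(shortlisted)
```

Whether the version suffix should be stripped depends on how the IDs are written in the GTF that was used for counting, so that line may need adjusting.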

written 2.3 years ago by WouterDeCoster