Hi, while analyzing RNA-seq data I ran into some questions about using samtools and htseq-count. Can anyone help?
- I want to use htseq-count to get counts for all known genes, so I ran:
htseq-count brain_fetus1.sam ~/knowngene_hg19.gff
Error occured in line 3 of file /cchome/che/knowngene_hg19.gff.
Error: Feature uc001aaa.3. does not contain a 'gene_id' attribute    
  [Exception type: SystemExit, raised in count.py:55]
The GFF file looks like this:
chr1    hg19_knownGene    gene    11874    14409    0.000000    +    .    ID=uc001aaa.3;Name=
chr1    hg19_knownGene    mRNA    11874    14409    0.000000    +    .    ID=uc001aaa.3;Name=;Parent=uc001aaa.3
chr1    hg19_knownGene    exon    11874    12227    0.000000    +    .    ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
chr1    hg19_knownGene    exon    12613    12721    0.000000    +    .    ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
chr1    hg19_knownGene    exon    13221    14409    0.000000    +    .    ID=uc001aaa.3.;Name=;Parent=uc001aaa.3
Is there an appropriate tool to convert between GFF and GTF?
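The error comes from the exon lines: by default htseq-count counts features of type exon and groups them by a gene_id attribute, which this file does not have. One possible workaround, sketched below, is to rewrite the exon lines in GTF style and reuse the Parent value as gene_id. This assumes the real file is tab-separated with nine columns; the function names are made up for illustration. Note that knownGene IDs such as uc001aaa.3 are transcript IDs, so the counts would be per transcript rather than per gene.

```python
def parse_gff_attributes(field):
    """Parse 'key=value;key=value' into a dict; empty values are kept as ''."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def gff_exon_to_gtf(line):
    """Return a GTF-style exon line, or None for anything that is not an exon."""
    if not line.strip() or line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9 or cols[2] != "exon":
        return None
    parent = parse_gff_attributes(cols[8]).get("Parent")
    if not parent:
        return None
    # Assumption: the Parent ID is used as both gene_id and transcript_id.
    cols[8] = 'gene_id "{0}"; transcript_id "{0}";'.format(parent)
    return "\t".join(cols)


def convert_gff_to_gtf(in_path, out_path):
    """Write only the converted exon lines from in_path to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            converted = gff_exon_to_gtf(line)
            if converted is not None:
                dst.write(converted + "\n")
```

After something like convert_gff_to_gtf("knowngene_hg19.gff", "converted.gtf") (the output name is hypothetical), the original htseq-count command could be rerun against converted.gtf. Alternatively, htseq-count's -i/--idattr option can name a different grouping attribute, so passing -i Parent with the unmodified GFF may be enough.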
- My other question is about samtools:
To get the counts in specified regions, I ran:
samtools mpileup -l test.bed brain.bam > test.txt
The test.bed file:
chr1    11873   14409   uc001aaa.3      0       +       11873   11873   0       3       354,109,1189,   0,739,1347,
chr1    11873   14409   uc010nxr.1      0       +       11873   11873   0       3       354,52,1189,    0,772,1347,
chr1    11873   14409   uc010nxq.1      0       +       12189   13639   0       3       354,127,1007,   0,721,1529,
It seems the -l option doesn't work: the resulting test.txt still contains counts from the whole genome.
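Without knowing why -l is being ignored here, one workaround is to filter the pileup output after the fact. The sketch below assumes the usual mpileup text layout (tab-separated, chromosome in column 1, 1-based position in column 2) and standard BED coordinates (0-based start, end exclusive); the function names are hypothetical. It scans every interval for each position, which is fine for a small BED file but slow for a large one.

```python
def read_bed_intervals(lines):
    """Collect (start, end) pairs per chromosome from BED lines."""
    intervals = {}
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # accepts tabs or runs of spaces
        if len(fields) < 3:
            continue
        intervals.setdefault(fields[0], []).append((int(fields[1]), int(fields[2])))
    return intervals


def filter_pileup(pileup_lines, intervals):
    """Yield only the pileup lines whose position lies inside a BED interval."""
    for line in pileup_lines:
        fields = line.split("\t")
        if len(fields) < 4:
            continue
        chrom, pos = fields[0], int(fields[1])
        # BED start is 0-based and the end is exclusive, so a 1-based
        # position p is covered when start < p <= end.
        if any(start < pos <= end for start, end in intervals.get(chrom, ())):
            yield line


def filter_pileup_file(bed_path, pileup_path, out_path):
    """Write the positions of pileup_path that fall inside bed_path to out_path."""
    with open(bed_path) as bed:
        intervals = read_bed_intervals(bed)
    with open(pileup_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_pileup(src, intervals))
```

For example, filter_pileup_file("test.bed", "test.txt", "test.filtered.txt") would keep only the positions on chr1 from 11874 to 14409 for the BED lines shown above (the output file name is made up).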
Thanks,
Che
Thanks. I think I got it.