Question: htseq-count not working with miRbase gff?
2
gravatar for manekineko
3.8 years ago by
manekineko130
Bulgaria
manekineko130 wrote:

I have tried to count mapped reads on the human miRNA regions from the miRBAse human GFF

the output was just:

__no_feature    70216353
__ambiguous     0
__too_low_aQual 0
__not_aligned   1383340
__alignment_not_unique  0

So there should be something I do wrong?

I have run it - htseq-count SAMfile Gfffile (SAMfile is from bowtie with option -S)

The GFF is from MirBase:

chr1    .    miRNA_primary_transcript    17369    17436    .    -    .    ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1
chr1    .    miRNA    17409    17431    .    -    .    ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705
chr1    .    miRNA    17369    17391    .    -    .    ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705
chr1    .    miRNA_primary_transcript    30366    30503    .    +    .    ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2
chr1    .    miRNA    30438    30458    .    +    .    ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363
chr1    .    miRNA_primary_transcript    187891    187958    .    -    .    ID=MI0026420;Alias=MI0026420;Name=hsa-mir-6859-2
chr1    .    miRNA    187931    187953    .    -    .    ID=MIMAT0027618_1;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0026420
chr1    .    miRNA    187891    187913    .    -    .    ID=MIMAT0027619_1;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0026420
chr1    .    miRNA_primary_transcript    632325    632413    .    -    .    ID=MI0022558;Alias=MI0022558;Name=hsa-mir-6723
chr1    .    miRNA    632382    632403    .    -    .    ID=MIMAT0025855;Alias=MIMAT0025855;Name=hsa-miR-6723-5p;Derives_from=MI0022558
gff htseq-count mirbase • 2.1k views
ADD COMMENTlink modified 3.7 years ago by swimwriteliv0 • written 3.8 years ago by manekineko130
2
gravatar for CraigM
3.8 years ago by
CraigM80
Ireland
CraigM80 wrote:

Did you use the -t flag to change the feature type to miRNA_primary_transcript or miRNA? As far as HTSeq would be concerned (without using this flag) you have given it no features to look for.

 

The default looks for exon in the third column

 

You should also use the -i flag to use 'ID' as the id attribute (default looks in the last column for 'gene_id')

Ensure you read the manuals before using these tools.

http://www-huber.embl.de/users/anders/HTSeq/doc/count.html

 

ADD COMMENTlink modified 3.8 years ago • written 3.8 years ago by CraigM80
0
gravatar for manekineko
3.8 years ago by
manekineko130
Bulgaria
manekineko130 wrote:

oh I forgot to use -t, is it also nessesary to use -i ID ?

And is SAM file should be sorted?

I will read the manual.....

ADD COMMENTlink written 3.8 years ago by manekineko130

Sorry, I just edited my answer to say this as you submitted this. 

Yes you should use -i ID.

 

Your Sam file is likely already sorted (many aligners do this for you) or HTSeq would have thrown an error.

ADD REPLYlink written 3.8 years ago by CraigM80
0
gravatar for manekineko
3.8 years ago by
manekineko130
Bulgaria
manekineko130 wrote:

Thanks!

since my SAM is quite big (15G) I wait quite a lot, befor I see results. Is there a way to see if it is counting properly with this flags in the middle of the run?

ADD COMMENTlink written 3.8 years ago by manekineko130

You can use samtools to subsection a random percentage of your input file, for example 5% would use 

 

samtools view -s 0.05 infile.sam > sampledfile.sam

 

Sample Sam File

ADD REPLYlink written 3.8 years ago by CraigM80

Is there a way to count reads but with option if they are longer than XXnt?

ADD REPLYlink written 3.7 years ago by manekineko130

Not as far as I am aware.

You could control this during the preprocessing  or alignment stages.

It is also possible to extract only reads of a specified length from a sam file Bam : Extract Only Alignment With A Specific Length

 

Also, please mark my answer if accepted if it has solved your original issue with read counting.

ADD REPLYlink written 3.7 years ago by CraigM80

I found the solution - BBmap have tool to reformat SAM

reformat.sh in=file.sam out=file.sam minlength=50 maxlength=200

ADD REPLYlink written 3.7 years ago by manekineko130
0
gravatar for swimwriteliv
3.7 years ago by
Canada
swimwriteliv0 wrote:

Hi manekineko,

What program exactly did you download to reformat your sam file? Also, what do the options minlength and maxlength do?

Thanks!

ADD COMMENTlink written 3.7 years ago by swimwriteliv0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 657 users visited in the last hour