featureCounts: does the SAM file need to be sorted?
4.4 years ago
noeD

Hello!

I am using featureCounts for read summarization, with the following command:

featureCounts -T 5 -t exon -g gene_name -a genomehg38.gtf -o counts.txt file.sam


I am wondering whether I need to sort my SAM file by name before running featureCounts. I have read the featureCounts documentation, but I did not understand it.

Best

It would help you understand how featureCounts works if you tried both options. Run it with and without a sorted SAM file and let us know whether it makes any difference.
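One way to check whether the two runs differ is to compare the count columns of the two output tables. The sketch below is only an illustration: `compare_counts` is a hypothetical helper, and it assumes the usual featureCounts table layout (one `#` comment line, one header line, then tab-separated rows with the gene ID in column 1 and the counts of a single input file in column 7).

```shell
# Hypothetical helper: compare per-gene counts of two featureCounts tables.
# Assumption: one '#' comment line, one header line, then tab-separated rows
# with the gene ID in column 1 and the counts of a single sample in column 7.
compare_counts() {
    # Drop the comment and header lines, keep gene ID and count, sort rows
    # so the comparison does not depend on row order.
    a=$(grep -v '^#' "$1" | tail -n +2 | cut -f 1,7 | sort)
    b=$(grep -v '^#' "$2" | tail -n +2 | cut -f 1,7 | sort)
    if [ "$a" = "$b" ]; then
        echo "identical"
    else
        echo "different"
    fi
}
```

For example, `compare_counts counts_unsorted.txt counts_sorted.txt`, where both file names are placeholders for the `-o` outputs of the two runs. The header line is skipped on purpose, because it contains the input file name and would differ between runs even when the counts agree.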

From here:

Due to internal requirements of featureCounts (ie., since it only counts feature hits if both read ends align to the same feature), these coordinate-sorted BAM input files are detected by the featureCounts binary and name-sorted prior to processing. Name-sorting occurs by featureCounts itself (as part of the binary), on a single core and in the current working directory.

Thank you! I have tried now and the results were the same :) Thank you again!

Thanks @noeD

4.4 years ago
h.mon

No, you do not need to sort by name specifically, but featureCounts does need the SAM/BAM file to be sorted, either by position or by name. From the manual, page 29:

Automatically detect input format (SAM or BAM).

Automatically sort paired-end reads. Users can provide either location-sorted or name-sorted bams files to featureCounts. Read sorting is implemented on the fly and it only incurs minimal time cost.

I apologize in advance for the stupid question, but what do you mean by "No, you do not need to sort by name, although it needs either position or name sorted sam / bam"? I ran the tool without providing other files and it works, but I would like to be sure that this is the right procedure. Thank you for your kind help!

How did you get your SAM file? Did you sort it, or do you know whether the mapper sorts it on the fly? And what do you mean by

I tried the tool without providing other files

What other files? You only need the gtf annotation and the sam / bam mapping file.

It is the result of the alignment by HISAT2. I did not re-sort it after that, and I do not know whether the mapper sorts it on the fly.
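If it is unclear how a SAM file is sorted, the header can be inspected. The SAM specification lets the `@HD` line declare the order in an `SO:` tag (`unknown`, `unsorted`, `queryname` or `coordinate`). A minimal sketch, assuming a plain-text SAM file with its header intact; `sam_sort_order` is a hypothetical helper name, and the tag only reports what the producing tool declared, not a verified order.

```shell
# Hypothetical helper: print the sort order declared in a SAM header.
# Reads the SO: tag of the @HD line and prints "unknown" when the line
# or the tag is missing. Stops reading at the first non-header line.
sam_sort_order() {
    awk -F '\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
            }
            exit
        }
        !/^@/ { exit }
        END { if (!found) print "unknown" }
    ' "$1"
}
```

For example, `sam_sort_order file.sam`. For a BAM file, the header would first have to be extracted as text, for instance with `samtools view -H`.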

I don't think it does. You may want to try samtools sort on the HISAT2 output.
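As a rough sketch of the samtools sort suggestion, assuming samtools 1.x is installed: `sort_alignments` is a hypothetical wrapper, and the file names in the usage line are placeholders. `samtools sort` sorts by coordinate by default and by read name with `-n`.

```shell
# Hypothetical wrapper around samtools sort (assumes samtools 1.x on PATH).
# Usage: sort_alignments name|coordinate input.sam output.bam
sort_alignments() {
    case "$1" in
        # -n sorts by read name instead of by coordinate
        name)       samtools sort -n -o "$3" "$2" ;;
        # default samtools sort order is by coordinate
        coordinate) samtools sort -o "$3" "$2" ;;
        *)          echo "usage: sort_alignments name|coordinate in.sam out.bam" >&2
                    return 1 ;;
    esac
}
```

For example, `sort_alignments name file.sam file.namesorted.bam`, after which the sorted BAM can be passed to featureCounts in place of `file.sam`.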

I have tried running featureCounts both with and without sorting the reads, and I obtained the same result.