I'm still getting to know STAR and wanted to ask: when mapping human reads, do you use only chromosomes 1 to 22, X, Y and MT in your FASTA reference file, or also all the other sequences with longer names?
It depends on what your aim is. If you care about those (I assume you are referring to random/haplotype type entries) then you will have to. This may be of interest.
Thanks, I'm interested in alternative splicing in human cells.
Include the various unplaced scaffolds but not the haplotype alleles. The latter will completely screw things up. Including the former will slightly decrease false-positive alignments.
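If your FASTA file mixes everything together, something like the following could strip the haplotype entries before you build the STAR index. This is only a minimal sketch: the function name `drop_haplotypes` and the name patterns are my own assumptions (UCSC-style `_hap`, `_alt`, `_fix` suffixes and Ensembl-style `CHR_H`, `HSCHR`, `HG` prefixes), so check them against the headers in your actual release.

```python
import re

# Assumption: haplotype/patch sequences can be recognised by name alone.
# The patterns below are illustrative, not an official list.
HAPLOTYPE_NAME = re.compile(r"(_hap\d*$|_alt$|_fix$|^CHR_H|^HSCHR|^HG\d)")


def drop_haplotypes(fasta_lines):
    """Yield FASTA lines, skipping records whose name looks like a haplotype."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            # The sequence name is the first whitespace-delimited token.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            keep = HAPLOTYPE_NAME.search(name) is None
        if keep:
            yield line
```

You would feed it an open file handle and write the yielded lines to a new file. If you download from Ensembl, the file labelled `primary_assembly` is described as already excluding haplotypes and patches, in which case no filtering should be needed.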
Alright, thank you. So since I'm looking at alternative splicing, including the non-chromosomal FASTA file in my genome could be worthwhile, then.
Do you know where the unplaced scaffolds come from?
The random contigs fall into two groups: those with a known chromosome of origin and those with no known chromosome of origin. In the latter case I presume these actually have multiple copies, though I've never checked. In the former case it's likely that these are regions that just fail to integrate into the assembly well.
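With UCSC-style names you can usually tell the two groups apart from the name itself: as far as I know, entries like `chr1_gl000191_random` carry their chromosome of origin, while `chrUn_...` entries have none. A rough sketch (the helper name `scaffold_origin` is made up for illustration, and Ensembl names such as `GL000192.1` don't encode this, so it won't work there):

```python
def scaffold_origin(name):
    """Classify a UCSC-style sequence name.

    Returns the chromosome of origin for '..._random' contigs,
    'unknown' for 'chrUn_' contigs, and None for anything else.
    """
    if name.startswith("chrUn_"):
        return "unknown"
    if name.startswith("chr") and name.endswith("_random"):
        # The chromosome is the part before the first underscore.
        return name.split("_")[0]
    return None
```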
OK, thanks. From now on I will include the FASTA files (from Ensembl) for chromosomes 1-22, X and Y, the mitochondrial DNA, and the non-chromosomal DNA when mapping with STAR.