Hello all! I mapped my reads against the human reference genome with STAR. The alignment runs smoothly, but ReadsPerGene.out.tab contains both gene IDs AND gene names, on alternating lines.
See for example:
N_unmapped 5713151 5713151 5713151
N_multimapping 1390397 1390397 1390397
N_noFeature 13845611 18850689 18832660
N_ambiguous 764065 219094 220789
ENSG00000223972.5 0 0 0
DDX11L1 4 2 2
ENSG00000227232.5 0 0 0
WASH7P 0 0 0
ENSG00000278267.1 0 0 0
MIR6859-1 0 0 0
ENSG00000243485.5 0 0 0
MIR1302-2HG 0 0 0
ENSG00000284332.1 0 0 0
MIR1302-2 1 1 0
ENSG00000237613.2 0 0 0
FAM138A 0 1 0
ENSG00000268020.3 1 1 0
OR4G4P 0 0 0
ENSG00000240361.2 0 0 0
OR4G11P 3 1 2
This is obviously causing errors in further processing. What could be the cause of this behavior?
The STAR command I am running is as follows:
STAR --runThreadN 24 --genomeDir /Human/PRI/v35.primary_assembly/ --readFilesIn /physiology/2020_analysis/concatenated_fastq_files/ind_55_RISK_R1.fastq.gz /physiology/2020_analysis/concatenated_fastq_files/ind_55_RISK_R2.fastq.gz --readFilesCommand zcat --outFileNamePrefix /physiology/2020_analysis/Human_v35_star_ind_55_RISK_R2_ --quantMode GeneCounts
Any leads are appreciated! Thanks.
Shouldn't your command line say where the GTF is?
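Before digging into the annotation, it may help to confirm how the two kinds of labels are distributed in the counts file. Below is a minimal Python sketch, not a fix: it tallies the first-column labels of a ReadsPerGene.out.tab-style table into summary rows (N_*), Ensembl-style gene IDs, and everything else. The function name `classify_rows` and the ID pattern are illustrative assumptions, and the sample rows are copied from the output posted above.

```python
import re
from collections import Counter

# Assumption: human Ensembl gene IDs look like ENSG + 11 digits,
# optionally followed by a version suffix such as ".5".
ENSEMBL_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def classify_rows(lines):
    """Tally the first-column labels of a ReadsPerGene.out.tab-style table."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        label = fields[0]
        if label.startswith("N_"):
            counts["summary"] += 1      # N_unmapped, N_multimapping, ...
        elif ENSEMBL_GENE_ID.match(label):
            counts["ensembl_id"] += 1   # e.g. ENSG00000223972.5
        else:
            counts["other"] += 1        # e.g. gene names such as DDX11L1
    return counts


# A few rows taken from the output in the question.
sample = [
    "N_unmapped\t5713151\t5713151\t5713151",
    "ENSG00000223972.5\t0\t0\t0",
    "DDX11L1\t4\t2\t2",
    "ENSG00000227232.5\t0\t0\t0",
    "WASH7P\t0\t0\t0",
]

print(classify_rows(sample))
```

To run it on the real file, pass an open file handle instead of the sample list, for example `classify_rows(open(path))` with `path` pointing at your own ReadsPerGene.out.tab. Roughly equal "ensembl_id" and "other" counts would suggest that every gene is being emitted twice, which points back at the annotation used for counting rather than at the alignment itself.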