Question

Deleted:kallisto genomebam not showing reads on igv

0

Entering edit mode

2.6 years ago

Stella • 0

Hello! I am trying to produce bam files to load to igv after kallisto quant with --genobam option.

After producing and loading the pseudoalignment bam to the igv, it is empty.

This is my initial command:

kallisto quant -i Homo_sapiens.GRCh38.cdna.all.release-100.idx -o pseudo -t 10 --genomebam -g Homo_sapiens.GRCh38.100.gtf -c hg38.chrom.sizes R1.fastq.gz.trim_1.fq.gz R1.fastq.gz.trim_2.fq.gz

But I think the problem is the pseudobam file produced doesn't have chromosome information and everything is aligned to "*":

VH00268:2:AAAH2JYHV:1:1101:51046:1057   131 *   0   0   50M *   0   0   GTCTTCCCTGGACATCACTGCCTCTCCAGGGCATTCTCAGGCCCGGGGGT  CCCCCCCCCCC;CCCCCCCCCCCCCC;CCCCCCCCCCCCCCCC;CCCCCC  ZW:f:0
VH00268:2:AAAH2JYHV:1:1101:56689:1057   67  *   0   0   51M *   0   0   TTAAAAAAGTATTTTCTTAACTTTTTACCTTCATTTTTCAAAACAATTAGG CC;CCC-CC-CCCCCCCCCCCC;CCCCCCCCCC;CCCCC;CC;CCCCCCCC ZW:f:0
VH00268:2:AAAH2JYHV:1:1101:56689:1057   131 *   0   0   51M *   0   0   GGAGGTCGATGCTGCATTGAGCCGAGATTGTGCCACTGCACTCCAGCGTGG CCCCCCCCCCCCCC;;CCCCC;CCCCCCCCCCCCCCCCCC-CCCCCCCCCC ZW:f:0

Initial fasta for indexing and gtf file were downloaded both from ensemble release 100

Any idea on what could be wrong in the initial pseudoreads produced?

This is the start of my gtf file:

#!genome-build GRCh38.p13
#!genome-version GRCh38
#!genome-date 2013-12
#!genome-build-accession NCBI:GCA_000001405.28
#!genebuild-last-updated 2019-06
1   havana  gene    11869   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";
1   havana  transcript  11869   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; tag "basic"; transcript_support_level "1";
1   havana  exon    11869   12227   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "1"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00002234944"; exon_version "1"; tag "basic"; transcript_support_level "1";
1   havana  exon    12613   12721   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "2"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00003582793"; exon_version "1"; tag "basic"; transcript_support_level "1";
1   havana  exon    13221   14409   .   +   .   gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000456328"; transcript_version "2"; exon_number "3"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-202"; transcript_source "havana"; transcript_biotype "processed_transcript"; exon_id "ENSE00002312635"; exon_version "1"; tag "basic"; transcript_support_level "1";

and the chromosome size file is tab delimited containing "chr" for chromosome names

I also have quite low mapping percentage:

samtools flagstat pseudoalignments.bam
126464878 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
33881421 + 0 mapped (26.79% : N/A)
126464878 + 0 paired in sequencing
63232439 + 0 read1
63232439 + 0 read2
27267150 + 0 properly paired (21.56% : N/A)
27267150 + 0 with itself and mate mapped
6614271 + 0 singletons (5.23% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Many thanks!

igv kallisto genomebam rnaseq • 542 views

ADD COMMENT • link 2.6 years ago by Stella • 0