How to extract all sequences mapped to a transcript from Kallisto output
2.6 years ago
hzc363 ▴ 10

I ran Kallisto with the --pseudobam option. How do I extract all the short reads that are mapped to a single transcript (e.g. ENST00000367969.8)? I have no previous SAM/BAM experience and tried the following without success. I would really appreciate some help.

I first tried:

samtools view -S pseudoalignments.bam ENST00000367969.8

but got the following error:

[E::idx_find_and_load] Could not retrieve index file for 'pseudoalignments.bam'
[main_samview] random alignment retrieval only works for indexed BAM or CRAM files.

I tried to index the file using:

samtools index pseudoalignments.bam

but got the following error:

[E::hts_idx_push] NO_COOR reads not in a single block at the end 79729 -1
[E::sam_index] Read 'A00742:184:HLTF2DSXY:4:1101:4417:1000' with ref_name='ENST00000556431.1', ref_length=4142, flags=83, pos=2098 cannot be indexed
samtools index: failed to create index for "pseudoalignments.bam"
bam kallisto sam samtools • 1.2k views

Cross-posted on bioinfo SE: https://bioinformatics.stackexchange.com/q/17685/650

Don't post your question on multiple sites. That's inconsiderate, like asking several people to help you with the same thing just to make sure you get help faster.

2.6 years ago
Shred ★ 1.4k

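Before indexing, you need to sort your alignments by coordinate. Then index the sorted file and run your samtools view query against it (the output file name here is just an example):

samtools sort -o pseudoalignments.sorted.bam pseudoalignments.bam
samtools index pseudoalignments.sorted.bam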
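
If sorting and indexing are not convenient, an index-free alternative is to stream the records through samtools view and keep only those whose reference name (SAM column 3) matches the transcript. This is only a rough sketch: the helper name filter_by_transcript and the output file name are made up for illustration and are not part of Kallisto or samtools.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read headerless SAM records on stdin and print only
# those whose reference name (tab-separated column 3) equals the transcript
# ID given as the first argument. The comparison is an exact string match.
filter_by_transcript() {
    local transcript="$1"
    awk -F '\t' -v t="$transcript" '$3 == t'
}

# Example usage (assumes samtools is installed and the BAM is in the
# current directory; reads_on_transcript.sam is an arbitrary output name):
#
#   samtools view pseudoalignments.bam \
#       | filter_by_transcript ENST00000367969.8 > reads_on_transcript.sam
```

This scans the whole file on every query, so for repeated lookups the sort-and-index route is likely to be faster.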