Unmapped Reads in Kallisto
4.6 years ago

Hi, I'm using kallisto and bustools for single-cell RNA-seq, in a similar way to what is done in the Pall Melsted et al. paper here: https://www.biorxiv.org/content/10.1101/673285v1

I'm trying to get an idea of what information is in the reads that don't align to anything, and to get counts of the sequences that don't align but appear frequently in the FASTQ. I've used CITE-seq-Count for a similar purpose before; it creates an unmapped.csv file in its output, which is exactly what I'm looking for. However, I've been given a requirement not to use CITE-seq-Count for this project.

Is there a way to get the reads that don't map to anything from the output of the kallisto bus command? Alternatively, if I could get the IDs of the reads that do map, I could subset them out of the original FASTQ and use FastQC to get a list of overrepresented sequences and their abundances in the remaining reads. Any advice would be greatly appreciated!

Kallisto RNA-Seq alignment bustools • 2.1k views

Did you find a solution?


Create a new thread if you need help with this.


dsull, you can add a solution to this thread. It looks like this poster has an issue similar to the one described in the original post. There is no need to create a new post, since it would be a duplicate and would leave this question unanswered.


Sounds good! I just added a solution here.

12 months ago
dsull ★ 6.9k

Running kallisto bus with the -n option records, in the output BUS file, the read number of each read that was successfully mapped. You can print the BUS file to standard output with bustools text -pf /path/to/output/file.bus; the last column contains the zero-indexed read number of every mapped read. Any read number missing from that column belongs to an unmapped read, so you can then go into your FASTQ file and pick out the sequences at those positions.
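To illustrate that last step, here is a minimal Python sketch, not a tested pipeline. It assumes the text dump comes from the BUS file written directly by kallisto bus -n, that the read number is the last whitespace-separated column (as described above), and that read number N refers to the N-th record (zero-indexed) of a standard four-line-per-record FASTQ. The function names mapped_read_numbers and count_unmapped_sequences are made up for this example.

```python
from collections import Counter


def mapped_read_numbers(bus_text_lines):
    """Collect read numbers from the last column of `bustools text -pf` output."""
    mapped = set()
    for line in bus_text_lines:
        fields = line.split()
        if fields:  # skip blank lines
            mapped.add(int(fields[-1]))
    return mapped


def count_unmapped_sequences(fastq_lines, mapped):
    """Count sequences of FASTQ records whose zero-indexed position is not in `mapped`.

    Assumes exactly four lines per FASTQ record (no wrapped sequences).
    """
    counts = Counter()
    for line_index, line in enumerate(fastq_lines):
        if line_index % 4 != 1:  # the sequence is the second line of each record
            continue
        read_number = line_index // 4
        if read_number not in mapped:
            counts[line.strip()] += 1
    return counts


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real inputs (hypothetical data).
    bus_text = [
        "AAAC\tGGGT\t0\t1\t0",
        "AAAC\tGGGA\t3\t1\t2",
    ]
    fastq = [
        "@r0", "ACGT", "+", "IIII",
        "@r1", "TTTT", "+", "IIII",
        "@r2", "GGGG", "+", "IIII",
        "@r3", "TTTT", "+", "IIII",
    ]
    mapped = mapped_read_numbers(bus_text)
    print(count_unmapped_sequences(fastq, mapped).most_common(5))
    # prints [('TTTT', 2)]
```

With real data you would pass open file handles instead of lists, for example the saved output of bustools text -pf and gzip.open(path, "rt") for a compressed FASTQ. For technologies with several FASTQ files per run, it is worth checking on a few mapped reads that the read numbers line up with record positions in the file you are inspecting.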
