Dear all,
I am trying to analyze RNA-seq results from a method called SLAM-seq. Long story short, the recommended library prep is QuantSeq, whereas we used KAPA poly-A. We figured out after sequencing that one difference between the two is the strand used for first-strand synthesis, so the downstream analysis software reads our files in the opposite orientation from what it expects (it counts T>C conversions, but we saw elevated A>G instead, which is how we identified the problem). I converted the reads to their reverse complement using seqtk and got the following.
$ head original_file.fastq
@HISEQ:326:HVL2VBCX2:2:1101:1771:1973 1:N:0:ATTGGCTTC
NAAAAAAGAAAACCAAAGTGGTCCACAAAACATTCTCCTTTCCTTCTGAAGGTTTTACGATGCATTGTTATCATTA
+
#<<DDHHHIEHIIIIGHIIHHHHIFHHIIIHEHHHHIGIHICHHHCHHIIIIIIIHHHEHGHEFHHHHEHHHIFHE
@HISEQ:326:HVL2VBCX2:2:1101:2172:1980 1:N:0:ATTGGCTTC
NAGACACATCAGGGTGGGGCCCAGCCGGCTGCCAGGCACCAGGTCCTCCACCACGAGCGCCGGAAACAGGTCGATG
+
#<<DDHHIIIIIIIIIHIIIIIHIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIH
@HISEQ:326:HVL2VBCX2:2:1101:2255:1977 1:N:0:ATTGGCTTC
NTCCTGCTCCATCTCCCACTTCCGCTCCCTCTCTTTTCCTCTGGTTCTCCAAGTCCAGGTCAGGCAAAGGGGCCAG
$ head reverse.fastq
@HISEQ:326:HVL2VBCX2:2:1101:1771:1973 1:N:0:ATTGGCTTC
TAATGATAACAATGCATCGTAAAACCTTCAGAAGGAAAGGAGAATGTTTTGTGGACCACTTTGGTTTTCTTTTTTN
+
EHFIHHHEHHHHFEHGHEHHHIIIIIIIHHCHHHCIHIGIHHHHEHIIIHHFIHHHHIIHGIIIIHEIHHHDD<<#
@HISEQ:326:HVL2VBCX2:2:1101:2172:1980 1:N:0:ATTGGCTTC
CATCGACCTGTTTCCGGCGCTCGTGGTGGAGGACCTGGTGCCTGGCAGCCGGCTGGGCCCCACCCTGATGTGTCTN
+
HIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIIIIIIIIIIIIIIIHIIIIIHIIIIIIIIIHHDD<<#
@HISEQ:326:HVL2VBCX2:2:1101:2255:1977 1:N:0:ATTGGCTTC
CTGGCCCCTTTGCCTGACCTGGACTTGGAGAACCAGAGGAAAAGAGAGGGAGCGGAAGTGGGAGATGGAGCAGGAN
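For reference, here is a minimal Python sketch of what the reverse-complement step should do to a single record. It is not how seqtk is implemented, and the function names (`reverse_complement`, `revcomp_record`, `mismatches`) are made up for illustration. The sequence is reversed and complemented, the quality string is only reversed, and a T>C difference on the original strand appears as A>G on the other strand.

```python
# Complement table for the four bases; N (and any other symbol) is left unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def revcomp_record(header, seq, plus, qual):
    """Reverse-complement one FASTQ record.

    The header and '+' line are kept; qualities are reversed (not complemented)
    so that each score stays attached to its base.
    """
    return header, reverse_complement(seq), plus, qual[::-1]


def mismatches(a, b):
    """List (base_in_a, base_in_b) pairs at positions where two equal-length strings differ."""
    return [(x, y) for x, y in zip(a, b) if x != y]


# First read shown in the head of original_file.fastq
record = (
    "@HISEQ:326:HVL2VBCX2:2:1101:1771:1973 1:N:0:ATTGGCTTC",
    "NAAAAAAGAAAACCAAAGTGGTCCACAAAACATTCTCCTTTCCTTCTGAAGGTTTTACGATGCATTGTTATCATTA",
    "+",
    "#<<DDHHHIEHIIIIGHIIHHHHIFHHIIIHEHHHHIGIHICHHHCHHIIIIIIIHHHEHGHEFHHHHEHHHIFHE",
)
rc = revcomp_record(*record)
print("\n".join(rc))

# Toy example (made-up sequences): a T>C change on one strand
# is seen as A>G once both reads are reverse-complemented.
unconverted = "ATTGC"
converted = "ATCGC"
print(mismatches(unconverted, converted))
print(mismatches(reverse_complement(unconverted), reverse_complement(converted)))
```

The printed record can be compared by eye with the first record of reverse.fastq.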
It looks fine, and running wc -l on both files suggests that nothing was truncated. Unfortunately, the downstream Slamdunk software fails without a useful error message. I'm trying to figure out whether there is an obvious reason for this. Is there some other property of the file that the software checks which could cause the reverse-complemented file to be rejected as input? I have contacted the Slamdunk developers but haven't heard back yet, so I'm just trying to troubleshoot in the meantime. Unfortunately we did not do paired-end sequencing, so using an R2 file is not an option.
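A line count alone would not catch a malformed record, so a slightly stricter check may be worth running on both files. The following is only a sketch of generic FASTQ sanity checks. It does not reflect whatever validation Slamdunk performs internally, and `count_fastq_records` is a hypothetical helper name.

```python
def count_fastq_records(lines):
    """Validate 4-line FASTQ records from an iterable of lines and return the record count.

    Raises ValueError on the first structural problem found.
    """
    n = 0
    block = []
    for line in lines:
        # Strip Unix or Windows line endings before checking lengths.
        block.append(line.rstrip("\r\n"))
        if len(block) == 4:
            header, seq, plus, qual = block
            n += 1
            if not header.startswith("@"):
                raise ValueError(f"record {n}: header does not start with '@'")
            if not plus.startswith("+"):
                raise ValueError(f"record {n}: separator line does not start with '+'")
            if len(seq) != len(qual):
                raise ValueError(f"record {n}: sequence and quality lengths differ")
            block = []
    if block:
        raise ValueError(f"file ends with an incomplete record ({len(block)} of 4 lines)")
    return n


# Made-up toy input: one well-formed record, then the same input with a cut-off second record.
good = ["@r1\n", "ACGT\n", "+\n", "IIII\n"]
truncated = good + ["@r2\n", "ACGT\n"]

print(count_fastq_records(good))
try:
    count_fastq_records(truncated)
except ValueError as err:
    print("problem:", err)
```

On real data the function can be given an open file handle, for example `with open("reverse.fastq") as fh: count_fastq_records(fh)`, and the counts for the two files should be equal.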
Thanks in advance.
Dear miyagi, welcome to Biostars. I admit, I fail at this formatting, too. Please use the code button (101010) on the FASTQ part to make it more readable. And please add the error from Slamdunk, in case people recognize it.
Without that information we don't have anything to go on to figure out why you are not getting any output. Do you have to use Slamdunk for the analysis, or can you use any other standard RNA-seq software?
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Thanks for the formatting tip. The issue is that I'm using their Bluebee Analysis Pipeline (https://www.lexogen.com/lexogen-and-bluebee-launch-slamdunk-data-analysis-pipeline/), so the error in this picture is unfortunately not informative. I realize it is probably useless to anyone on this forum as well. In the meantime, I have just started trying the GitHub version while I wait to hear what the company has to say about the Bluebee version.