2.5 years ago
Wakala
Hello everyone, I have two questions about barcodes in FASTQ files:
- Where is the barcode in the FASTQ file?
- How can I get the barcode length?
Thank you very much.
For example:
@SRR15999465.1 GCGGATCGATGATACGCCGTAG:K00168:267:HCYCLBBXY:7:1101:21684:1156 length=51
NGGATACTAGGAGGAGTATTGATAACTGCCATTCATGGAACACCTGTGAAT
+SRR15999465.1 GCGGATCGATGATACGCCGTAG:K00168:267:HCYCLBBXY:7:1101:21684:1156 length=51
#AAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJA<FJJ
Can you elaborate a bit? What kind of single cell platform do you use? What was the sequencing method?
scATAC, sequenced on Illumina NextSeq 500 and Illumina HiSeq 4000.
The data is from GSE184462. I want to run the cellranger-atac pipeline, which needs the standard Cell Ranger input files, but the upload from the authors only splits into two FASTQ files with fasterq-dump, so I wonder whether the barcode is in these two files.
The SRR number led me here: SRR15999465. At the bottom of their paragraph, it gives the read structure as
The SRR*.1 leads me to believe this is Read 1. There is some info in the header: the sequence GCGGATCGATGATACGCCGTAG is exactly as long as Index1 + Index2 (22 nt). I think it's a fair assumption to say Index1 is GCGGATCGAT and Index2 is GATACGCCGTAG. Without knowing more about how the libraries are prepped, that's about as far as I can go.
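If it helps, here is a minimal Python sketch of how one could pull that header sequence out and measure its length. It assumes the barcode is the first colon-separated field after the read ID, as in the example record, and that every record is exactly four lines. The function names and the 10 nt split point for Index1 are my own assumptions based on the guess above, not something confirmed by the authors.

```python
import re
from collections import Counter

# Assumption: the barcode is the first colon-separated field of the
# description that follows the read ID, as in the example record.
HEADER_RE = re.compile(r"^@(\S+)\s+([ACGTN]+):")


def barcode_from_header(header):
    """Return the barcode embedded in a FASTQ header line, or None if absent."""
    match = HEADER_RE.match(header)
    return match.group(2) if match else None


def split_barcode(barcode, index1_len=10):
    """Split a combined barcode into (Index1, Index2).

    The default split point of 10 nt is only the guess made in this thread.
    """
    return barcode[:index1_len], barcode[index1_len:]


def barcode_lengths(lines):
    """Tally barcode lengths over an iterable of FASTQ lines.

    Assumes unwrapped four-line records, so headers sit on lines 0, 4, 8, ...
    """
    counts = Counter()
    for i, line in enumerate(lines):
        if i % 4 == 0:
            barcode = barcode_from_header(line)
            if barcode is not None:
                counts[len(barcode)] += 1
    return counts


# The record posted in the question.
record = [
    "@SRR15999465.1 GCGGATCGATGATACGCCGTAG:K00168:267:HCYCLBBXY:7:1101:21684:1156 length=51",
    "NGGATACTAGGAGGAGTATTGATAACTGCCATTCATGGAACACCTGTGAAT",
    "+SRR15999465.1 GCGGATCGATGATACGCCGTAG:K00168:267:HCYCLBBXY:7:1101:21684:1156 length=51",
    "#AAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJA<FJJ",
]

barcode = barcode_from_header(record[0])
index1, index2 = split_barcode(barcode)
print(barcode, len(barcode))  # GCGGATCGATGATACGCCGTAG 22
print(index1, index2)         # GCGGATCGAT GATACGCCGTAG
```

On a real file you could pass an open file handle (for example from gzip.open(path, "rt")) to barcode_lengths to check that every read carries a barcode of the same length before trying to rebuild cellranger-atac style input.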