Sample-level demultiplex raw FASTQ file
10 weeks ago
Bhavya • 0

I am analyzing some public Drop-seq data that has not been demultiplexed.

When I download the data, I get two FASTQ files - the 'R1' file has barcode sequences, which are in-line. The 'R2' file has the actual sequencing data. I don't have the original Illumina BaseCalls directory. I only have these two files and a list of the barcodes. The barcodes, which are 6 bp in length, aren't necessarily at a specific location in the R1 reads; they are often in the middle of the read.

The barcodes are like this, where each barcode corresponds to a sample:

AAAACT
AAAGTT
AAATTG
AAGATT
AATACA

I'm providing the first few lines of each file as an example:

R1:

@HISEQ:284:C9JKFANXX:1:1101:1202:1999 1:N:0:
NTATTGCACTAAGGTA
+
#3=ABGGGEGGGCGFG

@HISEQ:284:C9JKFANXX:1:1101:1274:1979 1:N:0:
NAAACTTACGTGCTTT
+
#=AABFGCGGGGEGGG
@HISEQ:284:C9JKFANXX:1:1101:1406:1981 1:N:0:
NGCGGGACAGTGTGCC

R2:

@HISEQ:284:C9JKFANXX:1:1101:1202:1999:3:N:0:
ATCCAGGAGAATGGCTCTTTGGTTGAAATCCGAAATTTCTTGGGTGAAA
+
3>3<>;>;F@BFE1CFG11;F1EB>:1=FGG/>>/:EC1C1100880:0B
@HISEQ:284:C9JKFANXX:1:1101:1274:1979 3:N:0:
TCCTTCTTGGGTATGGAATCCTGTGGCATCCATGAAACTACATTCACTTC
+
BBBBBGDEGG0F11F1;=DG1FGGGBFDGGGCFE@DFGGFGGGG>C0=:
@HISEQ:284:C9JKFANXX:1:1101:1406:1981 3:N:0:

I am stumped on how to proceed, and any help would be greatly appreciated!

drop-seq demultiplex rna-seq single-cell • 779 views

Perhaps tools designed specifically for Drop-seq would be the way to go: https://github.com/broadinstitute/Drop-seq

There are also suggestions in the thread "Tools for demultiplexing a large fastq file based on random in-line barcodes".


How would you like the data demultiplexed? Should each barcode get its own individual FASTQ file?

I have solutions for this (such as my own software, splitcode) and can show you how to use them if that's what you're looking for.

10 weeks ago

I don't know how you do this if you don't know where in the read the barcode is, and there's no constant sequence you can exclude.
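
To make the difficulty concrete, here is a minimal Python sketch of a position-agnostic search: it looks for each known barcode anywhere in the R1 read and assigns the paired R2 record only when exactly one barcode matches. This is an illustration under stated assumptions, not a validated pipeline. The sample names, function names, and output file naming are hypothetical, matching is exact (no mismatches allowed), and both files are assumed to be uncompressed, 4-line-per-record FASTQ with mates in the same order.

```python
from collections import defaultdict
from itertools import islice

# Barcodes from the question; the sample names are hypothetical placeholders.
BARCODES = {
    "AAAACT": "sample1",
    "AAAGTT": "sample2",
    "AAATTG": "sample3",
    "AAGATT": "sample4",
    "AATACA": "sample5",
}


def find_barcodes(read_seq, barcodes=BARCODES):
    """Return the set of barcodes that occur anywhere in read_seq (exact match)."""
    return {bc for bc in barcodes if bc in read_seq}


def assign_sample(read_seq, barcodes=BARCODES):
    """Return a sample name only if exactly one barcode matches, else None."""
    hits = find_barcodes(read_seq, barcodes)
    if len(hits) == 1:
        return barcodes[next(iter(hits))]
    return None  # no hit, or an ambiguous read with several hits


def fastq_records(handle):
    """Yield (header, seq, plus, qual) tuples, assuming 4 lines per record."""
    while True:
        rec = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(rec) < 4:
            return
        yield tuple(rec)


def demultiplex(r1_path, r2_path, out_prefix):
    """Write each R2 record to <out_prefix>_<sample>.fastq based on its R1 mate.

    Assumes R1 and R2 list the same reads in the same order.
    Returns a dict of record counts per output bin.
    """
    outputs = {}
    counts = defaultdict(int)
    with open(r1_path) as r1, open(r2_path) as r2:
        for rec1, rec2 in zip(fastq_records(r1), fastq_records(r2)):
            sample = assign_sample(rec1[1]) or "unassigned"
            if sample not in outputs:
                outputs[sample] = open(f"{out_prefix}_{sample}.fastq", "w")
            outputs[sample].write("\n".join(rec2) + "\n")
            counts[sample] += 1
    for fh in outputs.values():
        fh.close()
    return dict(counts)
```

With exact matching, neither of the two complete R1 reads shown in the question is assigned: NAAACTTACGTGCTTT would contain AAAACT only if the leading N were an A. A 6 bp barcode can also turn up by chance inside a 16 bp read, so the size of the "unassigned" bin and the number of multi-hit reads are worth checking before trusting the split.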
