Demultiplexing Tool
4 months ago
adarsh_munna ▴ 50

Hi,

I have a merged FASTQ file containing reads from multiple samples with Thermo barcodes, and I want to demultiplex it into per-sample FASTQ files based on those barcodes. However, the barcodes are not located at the very start of the 5' end of the read; they appear after some leading bases in each read, as shown below:

ATATTTGTCGCCTTTGGTTAGCAGTTGCGTGTTGCTAAGGTTAAACGTAACTTCATTGTCCCTGAACAGCACCTCCATCTCATCCCTGCGTGTCTCCGACTCAGTTCCGAAGTCACGATGGTGTGTGGAGGATAGAGACTGAAAAAGGAGCTGAAACATACAGGGTATTCATTCTACTTCCACTTCCCCAGTGTGTCAGGGTTAACACTGGATATGCGTACACATGTCCACACATGCAGGCACACGAATACATACATACATACTTCCCATACATATGCACCCACACACCATCACCGACTGCCCATAGAGAGGAAAGCGGAGGCGTAGTGGAGGTGCTGTTCAGGGAACAAACCAAGTTACGTTTAACCTTAGCAATACGTT

GTGTTATGTAACCTTACTTCAACAACTCAGTTTTTACGTATTGCTAAGGTTAAACGTAACTTGGTTTGTTCCTGAACAGCACCTCCATCTCATCCCTGCGTGTCTCCGACTCAGTCAGGCCGAACGATGGTGTGTGGGCAGGGATGCCCCATACAACTTTACTCAGATCTTGGAACTTTAGATGCTGTCCAACTCCCAAAGAAATAACAATTACTCACTGCAGGGGACACCAGAAGGGAGACACTTTTATTATTAGAGGAAATTCCCTGGTGGAAAGAGCAGCTAAGGCCACAACTAAGGAAACTCAACAAACTCAAAGATAAAACCAGTAAGTAATTCAAATGCCCATCAATGATAGACTGGATACAGAAAATGTGGCACATATACACCACGGAATACCATGCAGCCATAAAAAAGGATGAGTTCATGTACTTTGCACGGACATGGATGAAGCTGGAAACCATCATTCTTTGGCAAAACAACACAGGAACAGAAAACCGAACACTGCACGTTCTCATTCATAAATGGGAGTTGAACAATGAGAACACATGGACACAGGGAGGGGCACACCACACACCATCGTCCATTGCAAGCTGAGTGGGAGCACGAGGTGCTGTTCAGGGAACAAACCAAGTTACGTTTAACCTTAGCAATACGTGG

The above is just an example.

I would like to know whether there are any tools designed for this purpose. I have tried Flexbar, but the results were not satisfactory. In addition, the barcodes in the reads are likely to contain errors, so the tool needs to tolerate those as well. The reads are single-end, not paired-end.

Please suggest any good tool or a solution.

Thanks

NGS Fastq Demultiplexing Thermo • 471 views

Do you need to keep the sequence before the barcode? (It looks like adaptor sequence to me.)

If not, then use a tool like cutadapt to trim off the 5' adaptor sequences and then a standard demultiplexing tool (such as fastx) to demultiplex them.
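Before trimming, it may help to check whether a constant adaptor really sits directly in front of the barcode. The sketch below is only an exploratory aid, not a demultiplexer: it tallies the bases that follow an exact match to an anchor sequence. The function name `peek_barcodes`, the file name `reads.fastq`, and the barcode length of 10 are assumptions for illustration; the anchor `CTCCGACTCAG` is simply a stretch that appears in both example reads just before the variable bases.

```shell
# Tally the bases that follow a constant anchor in each read.
# $1 = uncompressed FASTQ file (4 lines per record)
# $2 = anchor sequence expected just before the barcode
# $3 = number of bases to report after the anchor (assumed barcode length)
peek_barcodes() {
    awk -v anchor="$2" -v n="$3" '
        NR % 4 == 2 {                       # sequence line of each record
            pos = index($0, anchor)         # first exact match, 0 if absent
            if (pos > 0)
                print substr($0, pos + length(anchor), n)
            else
                print "NO_ANCHOR"           # anchor missing or contains errors
        }
    ' "$1" | sort | uniq -c | sort -rn
}

# Hypothetical usage; runs only if the example file exists.
if [ -f reads.fastq ]; then
    peek_barcodes reads.fastq CTCCGACTCAG 10
fi
```

Reads reported as `NO_ANCHOR` either lack the adaptor or carry errors inside it, so the count gives a rough idea of how much error tolerance the real demultiplexing step will need.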


cutadapt can demultiplex during the trim as well. It seems to work well for this case.

Would these be 3' adapters?

4 months ago
rfran010 ★ 1.3k

How comfortable are you with command-line tools, and can you be more specific about how the results were not satisfactory?

I ask because flexbar should be very flexible and if it's not serving your purpose, I'm not sure other tools can do better for you. Did you try anything different after reading their manual?

I imagine changing a few flexbar options can get you exactly what you need; however, another popular option is cutadapt, which might be easier to run the way you want.

Unless I misunderstood, the demultiplexing described in this paper is close to what you want, although you will need to adjust the options here as well. For example, it sounds like indels should be allowed in your barcodes, so removing --no-indels would be one change. (ref: https://www.sciencedirect.com/science/article/pii/S2666166722004592, section 5.1 "Demultiplexing and adapter trimming of sequencing reads")

# Example barcodes.fa file for demultiplexing

    >I1
    GATATCGTCA
    >I2
    GATAGCTACA
    >I3
    GATGCATACA
    >I4
    GATTCTAGCA

# Demultiplex each sample fastq file using barcodes.fa

    for i in *.fastq.gz
    do
        fn=$(basename "$i" .fastq.gz)
        cutadapt --no-indels -q 30,30 --trimmed-only -j 10 -a file:barcodes.fa -m 10 -o "${fn}_{name}_trim.fastq.gz" "$i" 1> "${fn}_log.txt"
    done
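
As a rough sketch of how the paper's loop might be adjusted for the reads in the question: the barcode there sits towards the 5' end behind some adaptor sequence, so `-g` (5' adapter) is used instead of `-a`; cutadapt then removes the barcode together with everything in front of it. `--no-indels` is dropped so insertions and deletions in the barcode are tolerated. The function name `demux_internal_barcodes`, the input name `merged.fastq.gz`, the error rate of 0.2, and the minimum overlap of 10 are all assumptions to tune, and `barcodes.fa` would need to hold the actual Thermo barcode sequences rather than the paper's I1 to I4.

```shell
# Sketch only: demultiplex reads whose barcode is preceded by other sequence.
# $1 = input FASTQ (may be gzipped), $2 = FASTA file of barcodes
demux_internal_barcodes() {
    # -e 0.2   allow up to 20% errors in the barcode match (assumed value)
    # -O 10    require at least 10 matching bases, here the full barcode length
    # -g       unanchored 5' adapter: barcode may occur anywhere in the read
    # {name}   is replaced by the FASTA header of the matching barcode
    cutadapt -e 0.2 -O 10 \
        -g "file:$2" \
        --discard-untrimmed -m 10 \
        -o 'demux_{name}.fastq.gz' \
        "$1" > demux_log.txt
}

# Runs only when cutadapt is installed and the hypothetical input exists.
if command -v cutadapt >/dev/null 2>&1 && [ -f merged.fastq.gz ]; then
    demux_internal_barcodes merged.fastq.gz barcodes.fa
fi
```

The log file reports how many reads matched each barcode, which is a quick way to see whether the chosen error rate is too strict or too loose.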