Question: Dual demultiplexing on Illumina sequences
2.2 years ago by U (70), Boston, MA

Is it possible to add additional sample IDs to the reads in a fastq file while demultiplexing?

I have pooled sequences from 95 wells for each plate, and the primer sequence determines the well ID. Currently, after demultiplexing, I have a script that takes the input fastq file, reads the first 30 base pairs of each read while looking for "NN" and then for "Y", and converts the primer sequences to degenerate bases to get the right primer set. This primer set then helps assign the well ID. However, to simplify the workflow, I would like this to happen right at the demultiplexing stage. Any insight would be most helpful.

For example, this read in the fastq file

@M04012:86:000000000-BCB57:1:1101:17394:1866 CACGGTTGACTCAGCCCTTGACCAGGCACCTCGAATTCCACAGGGC

converts to

>C04 12:86:000000000-BCB57:1:1101:17394:1866 CACGGTTGACTCAGCCCTTGACCAGGCACCTCGAATTCCACAGGGC

Here C04 is my well ID. I have a primer set sequence file with the fields name, type, chain, index and sequence. The entry for C04 looks like this:

Col_VK_C04,Col,VK,C04,NNTCTGTCATGAYATTGTG,,,,,

demultiplexing ig ngs illumina • 769 views
modified 2.2 years ago by h.mon (29k) • written 2.2 years ago by U (70)
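
The primer lookup described in the question could be sketched roughly as below. This is only a minimal illustration, not the OP's script: it assumes the primer file is comma-separated with the well ID in the fourth field and the primer in the fifth (as in the Col_VK_C04 line), that degenerate bases follow the standard IUPAC codes, and that the primer sits within the first 30 bases of the read. The function names and the `@WELL:readname` header convention are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def primer_to_regex(primer):
    """Turn a degenerate primer such as NNTCTGTCATGAYATTGTG into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))

def load_primers(lines):
    """Parse 'name,type,chain,index,sequence,...' lines into (well, regex) pairs."""
    primers = []
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 5 or not fields[4]:
            continue  # skip blank or incomplete rows
        primers.append((fields[3], primer_to_regex(fields[4])))
    return primers

def assign_well(seq, primers, window=30):
    """Return the well ID of the first primer found in the first `window` bases, else None."""
    head = seq[:window].upper()
    for well, pattern in primers:
        if pattern.search(head):
            return well
    return None

def tag_header(header, well):
    """Prepend the well ID to a FASTQ header line (hypothetical naming convention)."""
    return "@" + well + ":" + header.lstrip("@")
```

For instance, a made-up read beginning with `ACTCTGTCATGACATTGTG` would be assigned to C04 by the primer line above, and `tag_header` would then turn `@M04012:86:...` into `@C04:M04012:86:...`. Whether a downstream pipeline accepts that header format is something to check separately.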

I think you should QC your fastq reads, and then merge them. Convert the merged file to fasta, and then look for left primers based on well position, after which you can trim and translate to your V-region sequences.

written 2.2 years ago by st.ph.n (2.5k)
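
The fastq-to-fasta conversion step mentioned in the comment above might look something like this in plain Python. It is a sketch only, assuming unwrapped four-line FASTQ records (header, sequence, `+`, quality); the function names are illustrative, and QC and read merging would still be done beforehand with dedicated tools.

```python
from itertools import islice

def read_fastq(lines):
    """Yield (name, sequence, quality) tuples from four-line FASTQ records."""
    lines = iter(lines)  # works for open file handles and plain lists alike
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:
            return  # end of input (a trailing partial record is ignored)
        header, seq, _plus, qual = (line.rstrip("\r\n") for line in record)
        yield header[1:], seq, qual  # drop the leading '@'

def fastq_to_fasta(lines):
    """Yield one FASTA-formatted string per FASTQ record, discarding qualities."""
    for name, seq, _qual in read_fastq(lines):
        yield ">" + name + "\n" + seq + "\n"
```

Typical use would be to open the merged fastq file, pass the handle to `fastq_to_fasta`, and write each yielded string to an output file.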

I edited your post because it seemed to me you want to convert from fastq (@M04012:86:000000000-BCB57:1:1101:17394:1866) to fasta (>C04 12:86:000000000-BCB57:1:1101:17394:1866), is that right?

written 2.2 years ago by h.mon (29k)

I have a feeling OP wants to add the sample name to the fastq header (the original post is worded badly, so it is hard to be sure). Sounds like something needed for a QIIME-like pipeline.

written 2.2 years ago by genomax (78k)

The well location can be added regardless. However, I think it would be easier to demultiplex after merging read pairs, but I don't know the OP's downstream process or goal. Judging by the information above, this sounds similar to something I've done in the past, for which I wrote a Python demultiplexing script to assign sequences to wells.

modified 2.2 years ago • written 2.2 years ago by st.ph.n (2.5k)