Amplicon Sequencing samples identification


6.9 years ago

misterie ▴ 110

I have 20 amplicons (~13,000 bp in total) generated for 350 individuals and distributed across three 96-well plates for a variant calling analysis. So I have 3 directories, each containing 96 x 2 fastq files (paired-end). In total: 3 x 96 x 2 fastq files.

How can I identify my samples? I have done a quality control analysis (FastQC), but I think I should run the alignment for every individual separately.

Can you help me describe my data set? I have never worked with a dataset organized in well plates.

The libraries were prepared with Nextera XT before sequencing.

amplicon sequencing data fastq • 1.7k views



Do you have no information about which index pair belongs to which sample (sample = index key pairs)?

6.9 years ago by GenoMax 154k


I have a file SampleSheet.csv in the Plate1 directory containing the Sample ID, Sample Name, Sample Plate, Sample Well, i7 index ID, index, i5 index ID, and index. But there are only 96 rows, not 350...
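A sample sheet like this can be turned into an index-to-sample lookup with a few lines of Python. The sketch below is only an illustration: it assumes the usual Illumina layout, where the table sits under a `[Data]` section, and it assumes column names such as `Sample_ID`, `index` and `index2`, which may differ in the actual file. The function name `read_sample_sheet_data` and all values in the example sheet are hypothetical.

```python
import csv
import io


def read_sample_sheet_data(text):
    """Return the rows of the [Data] section as a list of dicts.

    Assumption: the table starts on the line after a "[Data]" marker.
    If no marker is found, the whole text is treated as the table.
    """
    lines = text.splitlines()
    start = 0
    for i, line in enumerate(lines):
        # Real sheets often pad section markers with commas: "[Data],,,,"
        if line.strip().strip(",").lower() == "[data]":
            start = i + 1
            break
    data_lines = []
    for line in lines[start:]:
        stripped = line.strip()
        if stripped.startswith("["):
            break  # the next section begins here
        if stripped.strip(","):
            data_lines.append(line)  # skip empty or comma-only lines
    return list(csv.DictReader(io.StringIO("\n".join(data_lines))))


# Illustrative content only; the column names are assumed.
example_sheet = """[Header]
Investigator Name,hypothetical
[Data]
Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,I5_Index_ID,index2
S1,ind001,Plate1,A01,N701,TAAGGCGA,S502,CTCTCTAT
S2,ind002,Plate1,A02,N702,CGTACTAG,S502,CTCTCTAT
"""

rows = read_sample_sheet_data(example_sheet)
# Key each sample by its (i7, i5) index pair.
index_to_sample = {(r["index"], r["index2"]): r["Sample_ID"] for r in rows}
```

With one sheet per plate, the same function could be run on each file and the resulting dictionaries compared, which would show whether the 96 rows describe only Plate1.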

6.9 years ago by misterie ▴ 110


You say that you have a sample sheet in the Plate1 directory, but shouldn't you then also have a sample sheet in the Plate2 and Plate3 directories?

6.9 years ago by gb ★ 2.2k


I have only one sample sheet... maybe the fastq files are demultiplexed and I have to separate something, but I do not know how...

6.9 years ago by misterie ▴ 110


If possible, you should ask the lab. You can demultiplex with cutadapt or sabre... Usually, demultiplexing is done during basecalling, based on the Illumina index tags. But how many files do you have? In your question you say:

3 x 96 x 2 fastq files

So that means you have all the samples, right?
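One way to check this is to group the file names by sample and see whether every sample has both reads. The sketch below is a rough illustration, not a definitive check: it assumes the files follow the common bcl2fastq naming pattern `<sample>_S<number>_L<lane>_R<1|2>_001.fastq.gz`, which may not match the names in these directories. The helper names `group_pairs` and `incomplete_samples` and the example file names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: <sample>_S<number>[_L<lane>]_R<1|2>_001.fastq[.gz]
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+(?:_L\d{3})?_R(?P<read>[12])_001\.fastq(?:\.gz)?$"
)


def group_pairs(filenames):
    """Map each sample name to the set of read numbers ('1', '2') found."""
    pairs = defaultdict(set)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match:  # files that do not fit the pattern are ignored
            pairs[match.group("sample")].add(match.group("read"))
    return dict(pairs)


def incomplete_samples(pairs):
    """Return the sorted names of samples lacking R1 or R2."""
    return sorted(s for s, reads in pairs.items() if reads != {"1", "2"})


# Hypothetical file names; in practice they could be collected with
# [p.name for p in pathlib.Path("Plate1").glob("*.fastq*")]
example_files = [
    "ind001_S1_L001_R1_001.fastq.gz",
    "ind001_S1_L001_R2_001.fastq.gz",
    "ind002_S2_L001_R1_001.fastq.gz",
    "notes.txt",
]

example_pairs = group_pairs(example_files)
missing_mates = incomplete_samples(example_pairs)
```

If each plate directory yields 96 complete pairs, the files are most likely already demultiplexed into one pair per well, and the remaining task is to match the names to individuals via the sample sheets.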

6.9 years ago by gb ★ 2.2k
