I'm very new to the study of variants in tumour vs normal samples. In this case, I'm working with WES data from esophageal adenocarcinoma, using three tumour samples and three normal samples from the same patient. I'm using Galaxy (the aim of the activity is to run the analysis without the command line), and the pipeline so far has been:
- QC of raw reads in fastq format (
- Trimming of raw reads and QC of trimmed reads (Trimmomatic and the QC tools mentioned above)
- Alignment of trimmed reads (BWA-MEM against hg19; under "Set read groups information" I used set_picard, and for the read group sample name I used Normal or Tumour)
- Sort BAM files (by coordinate)
- Filter sorted BAM files (to remove aligned reads with low mapping quality, MAPQ)
- Marking of duplicates (
- Realignment (
- Recalibration of bam files (
- Final filter (to remove aligned and recalibrated reads with mapping quality, MAPQ, greater than 254)
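To make the two mapping-quality filters in the list above concrete, here is a minimal Python sketch of the same logic applied to plain SAM text lines. It is only an illustration, not what the Galaxy tools run internally. The function name `keep_read`, the lower threshold of 20, and the example records are assumptions for illustration. In the SAM format the fifth column is MAPQ, and a value of 255 means the mapping quality is not available.

```python
def keep_read(sam_line, min_mapq=20, max_mapq=254):
    """Return True if a SAM alignment line passes both MAPQ filters.

    min_mapq=20 is a hypothetical threshold; max_mapq=254 mirrors the
    final filter in the list (MAPQ 255 means "not available").
    """
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])  # column 5 of a SAM record is MAPQ
    return min_mapq <= mapq <= max_mapq


# Hypothetical alignment records (SEQ and QUAL left as "*").
records = [
    "read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t0\tchr1\t200\t5\t50M\t*\t0\t0\t*\t*",
    "read3\t0\tchr1\t300\t255\t50M\t*\t0\t0\t*\t*",
]

# Keep only the names of reads that pass both filters.
kept = [r.split("\t")[0] for r in records if keep_read(r)]
print(kept)
```

With these example records, only `read1` passes: `read2` falls below the lower threshold and `read3` has the "unavailable" value 255.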
When I got to VarScan Somatic to call the variants in the tumour samples, I had a question about how to pair the BAM files from step 9. Do I need to pair tumour_sample_1 with normal_sample_1, and so on, to run the analysis? Or should I use the Galaxy option to select multiple samples from each group? So far I have done the latter and got three VCF files from each tumour sample.
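To make the two options in the question concrete, the pairings can be sketched in Python. This is only an illustration of the two readings. The sample names beyond tumour_sample_1 and normal_sample_1 are assumed from the "and so on", and whether Galaxy's multiple-dataset selection really behaves like the second option is not confirmed here.

```python
from itertools import product

# Sample names follow the naming used in the question; numbers 2 and 3 are assumed.
tumours = ["tumour_sample_1", "tumour_sample_2", "tumour_sample_3"]
normals = ["normal_sample_1", "normal_sample_2", "normal_sample_3"]

# Option A: one-to-one pairing (normal_1 with tumour_1, and so on).
matched = list(zip(normals, tumours))

# Option B: one possible reading of "select multiple samples from each group",
# where every tumour is compared against every normal.
all_vs_all = list(product(normals, tumours))

print(len(matched), "matched pairs:", matched)
print(len(all_vs_all), "all-vs-all pairs")
```

Option A gives one somatic call set per tumour sample, while option B gives one per tumour/normal combination, so the number of output files differs between the two.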
Thanks in advance!