Hi, I am very new to this area and am taking a bioinformatics class. For an independent project assignment, I need to run a GWAS. I am working in the bash terminal. I downloaded all the FASTQ files I need, trimmed them, and converted them to SAM/BAM, then to VCF, then to PLINK bed/bim/fam files. However, when I tried to run the GWAS in PLINK, I realized I don't have any phenotype data. There should be two phenotypes.
Basically, there are two groups (phenotypes) of FASTQ files, each containing 29 samples. Let's call them group 1 and group 2. For each group, I converted every FASTQ to SAM and then BAM, then merged the 29 BAMs into one BAM. Then I combined the two merged BAMs (one per group) into a single vcf.gz. The PLINK files made from that VCF contain no phenotype data.
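To show what I mean by missing phenotype data: as far as I understand, PLINK reads the phenotype from the sixth column of the .fam file, and a .fam converted from a VCF has -9 (missing) there. Below is a minimal sketch of what I think is needed, assuming the .fam has one row per individual sample. The sample IDs (`g1_s01` etc.) and the file prefix `mydata` are made up for illustration and are not my real names.

```shell
# Toy .fam with hypothetical sample IDs.
# Columns: FID IID father mother sex phenotype.
# -9 in column 6 means "phenotype missing".
cat > toy.fam <<'EOF'
g1_s01 g1_s01 0 0 0 -9
g1_s02 g1_s02 0 0 0 -9
g2_s01 g2_s01 0 0 0 -9
g2_s02 g2_s02 0 0 0 -9
EOF

# Build a phenotype file (FID IID PHENO):
# IDs starting with g1_ are coded 1, all others are coded 2.
awk '{ print $1, $2, ($2 ~ /^g1_/ ? 1 : 2) }' toy.fam > pheno.txt
cat pheno.txt

# With real data, the file would be passed to PLINK 1.9 roughly like this (not run here):
# plink --bfile mydata --pheno pheno.txt --assoc --allow-no-sex --out gwas
```

The 1/2 coding follows PLINK's default case/control convention (1 = control, 2 = case), and `--allow-no-sex` is there because the sex column is 0 in a VCF-derived .fam.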
I would really appreciate any help, such as which step I might have gotten wrong, or what I should do to incorporate the phenotype data. Ultimately this is only an assignment, so I don't have to get every detail perfect (like the QC steps), and I am afraid I won't be able to follow very complicated code. I just want to get to the end and produce a Manhattan plot or something similar. If there is another pipeline that would do this, that's also fine.