Hello everyone. I am a student working on my final project for next-generation sequencing, using the Galaxy server. After generating FastQC reports to check the quality of my reads, I noticed that many positions toward the end of the per-base quality graph fell in the red (bad quality) region, so I ran Trimmomatic with the SLIDINGWINDOW and CROP operations. When I ran FastQC again on the trimmed reads, most positions were in the green region, but the yellow boxes were gone and everything was drawn only as black whiskers. That made me worry that I had done something wrong or trimmed away too much data.
I then continued with mapping using BWA-MEM, merged the BAM files, and generated mpileup files so that I could call SNPs. My pileup file also looked odd to me, and I am not sure whether it is correct or whether something went wrong. Part of the pileup file is shown at the end of this post.
I would really appreciate it if someone could tell me whether I am on the right track or whether I did something wrong. Thank you :)
1 2 3 4 5 6
chr1 9999 N 1 ^!C ;
chr1 10000 N 1 A /
chr1 10001 t 6 .^!.^!.^!.^!.^!. ?AAAA/
chr1 10002 a 18 ......^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!. AAAAAAAA6A/A6A/AAA
chr1 10003 a 21 ..................^!.^!.^!. AAAAAAAAAAAAAAAAAAAA/
chr1 10004 c 30 .....................^!.^!.^!.^!.^!.^!.^!.^!.^!. EAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
chr1 10005 c 31 ..............................^!. EAAAAAAAAAAAAAAAAAAAAAAAAAAAAA>
chr1 10006 c 39 ...............................^!.^!.^!.^!.^!.^!.^!.^!. EEEEEEAAAAAAAAAAA/AAAAAAAAAAAA>AAAAAAAA
chr1 10007 t 42 .......................................^!.^!.^!. AEEEEEEAEAEEEEEEAEAAAAAAAAAAAAAAAAAAAAA6A/
chr1 10008 a 47 ..........................................^!.^!.^!.^!.^!. AEEEAEEEEAEEEEEEEEEEEAAAAAAAAAAAAAAAAAAAAAAA6AA
chr1 10009 a 50 ................................................^!.^!. EAE/EEEAEEEEEEEAEEEEEAEEAAAEEEAAAAAAAAAAAA6AA6A///
chr1 10010 c 66 ..................................................^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!. EEEEAEEEEEEEEEEEAAEEEEAEEEEAEEEAAAAAAAAAAAAAAAA3AA/AAAA/A/AAAAAAAA
chr1 10011 c 69 ..................................................................^!.^!.^!. AEEEEEEEAAEEEAEEEEEEEEEEEEAEEEEEEEEEAEEAAAAAAAA>AAAAAAAAAAAAAAAAAA>>>
chr1 10012 c 81 .....................................................................^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!. AEEEAEAEEEEEEEEEAEEEEEAEAEAAEEEEEEEEEEEAE/AAAAAAAAAAAAAAAAAAAAAAAA>>>AAAAAAA6A6AA
chr1 10013 t 82 .................................................................................^!. AEAEEEEEAAEAEAEEEAEEEA/EAEEAEEAEEEEEEEE/EEAAEAEAAAAAAAAAA/AAAA/A/AAAAAAAAA/A/AAAAA
chr1 10014 a 96 ..................................................................................^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.^!. AEEEEEEEEEEEEEEEEEEEEE/EEEEAEEEEEEEEEEEEEEEAEAECEEAAAAAAAAAAAA/AA/AAAAAAAA/A66A6AAAA6666AAAA/6/A
chr1 10015 a 98 ................................................................................................^!.^!. AEEEEEEEEAEEEEEEEEEAEAEAEAAE6EAEAAEAE6EA6/AEEAECEE/EEAE/E/AAEEAE/AAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAA
chr1 10016 c 108 ..................................................................................................^!.^!.^!.^!.^!.^!.^!.^!.^!.^!.
Here is information about the samtools pileup data format: http://www.htslib.org/doc/samtools-mpileup.html
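To make the columns easier to read, here is a minimal Python sketch that splits one single-sample pileup line into its six fields and decodes the quality characters. It assumes the standard layout from the documentation linked above (chromosome, position, reference base, depth, read bases, base qualities) and Phred+33 encoding. The function names `parse_pileup_line` and `summarize` are made up for this illustration, and indel notation (`+`/`-` followed by a length) is not handled, since none appears in the lines shown.

```python
import re


def parse_pileup_line(line):
    """Split one single-sample pileup line into its six columns.

    Raises ValueError if the line does not have exactly six
    whitespace-separated fields (e.g. a truncated line).
    """
    chrom, pos, ref, depth, bases, quals = line.split()
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "depth": int(depth),
        "bases": bases,
        "quals": quals,
    }


def summarize(record):
    """Count read starts and reference matches, and decode qualities.

    Assumes Phred+33 encoding and no indels in the read-bases column.
    """
    bases = record["bases"]
    # '^' marks the start of a read; the next character encodes that
    # read's mapping quality as ASCII code minus 33.
    start_chars = re.findall(r"\^(.)", bases)
    mapqs = [ord(c) - 33 for c in start_chars]
    # Drop read-start markers (with their quality char) and '$' read-end
    # markers so that only the per-read base symbols remain.
    stripped = re.sub(r"\^.", "", bases).replace("$", "")
    # '.' is a match on the forward strand, ',' a match on the reverse strand.
    ref_matches = stripped.count(".") + stripped.count(",")
    base_quals = [ord(c) - 33 for c in record["quals"]]
    return {
        "depth": record["depth"],
        "read_starts": len(start_chars),
        "mapqs": mapqs,
        "ref_matches": ref_matches,
        "base_quals": base_quals,
    }


# One of the lines from the pileup output above.
example = "chr1 10001 t 6 .^!.^!.^!.^!.^!. ?AAAA/"
summary = summarize(parse_pileup_line(example))
print(summary)
```

For this line the sketch reports a depth of 6, five read starts, six reference matches, and base qualities of 30, 32, 32, 32, 32 and 14. Every read start in the excerpt is written as `^!`, and `!` decodes to a mapping quality of 0, so that may be worth checking before calling SNPs in this region.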