4.2 years ago
tamara912 • 0
I downloaded raw data (FASTQ) for human chromosome 21 from the European Nucleotide Archive. The library strategy was listed as WCS, and the data was described as covering chr21 only. I mapped the reads with Bowtie2 in Galaxy and got a BAM file containing alignments to all chromosomes instead of only chr21.
So, my question is: why do I have all the chromosomes in my BAM file?
I tried filtering that BAM for chr21 only, but the alignment is very poor, so I suppose this approach is not a good one.
Thank you in advance.
Please give some details. Which dataset is this? What is WCS? Do you mean WGS, as in whole genome sequencing? Please provide the link to the dataset.
The link to the dataset: https://www.ebi.ac.uk/ena/data/view/DRR000546
No, I mean WCS: "Random sequencing of a whole chromosome or other replicon isolated from a genome." (This is from the ENA training modules website.)
Can you post a link to the tutorial page?
If the data is only for chr21, then perhaps you should only be aligning against that in the first place.
Thank you very much! I didn't consider that using a whole reference genome is a problem...
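On the suggestion to align against chr21 only: a minimal sketch of how one might pull a single chromosome out of a multi-sequence FASTA reference, assuming the record is named `chr21` in the header (naming varies between assemblies, so check yours). The function name `extract_fasta_record` and the toy sequences are made up for illustration. The resulting file could then be indexed with `bowtie2-build`, or, in Galaxy, probably uploaded and selected as a reference genome from the history.

```python
def extract_fasta_record(fasta_text, wanted_name):
    """Return the FASTA record whose header name equals wanted_name.

    The header name is taken as the first whitespace-delimited token
    after '>'. Returns an empty string if no record matches.
    """
    keep = False
    kept_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            # Start keeping lines only while inside the wanted record
            keep = bool(tokens) and tokens[0] == wanted_name
        if keep:
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")


# Toy reference (hypothetical sequences, not real genome data)
toy_reference = (
    ">chr20 toy record\nACGT\n"
    ">chr21 toy record\nGGCC\nTTAA\n"
    ">chr22\nAAAA\n"
)

chr21_only = extract_fasta_record(toy_reference, "chr21")
print(chr21_only, end="")
```

For a real genome you would read the reference line by line from disk rather than holding it in a string, but the header-matching logic stays the same.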
Have you checked the percentage of aligned reads per chromosome?
Hm, not yet. I will try it if it can be done in Galaxy.
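For the per-chromosome check, one possible approach is to run `samtools idxstats` on the sorted, indexed BAM (Galaxy also has an idxstats tool) and turn the counts into percentages. The sketch below assumes the usual tab-separated idxstats columns (reference name, sequence length, mapped reads, unmapped reads). The function name `percent_mapped_per_reference` is hypothetical and the read counts are invented for illustration, not taken from DRR000546.

```python
def percent_mapped_per_reference(idxstats_text):
    """Return {reference_name: percent of all mapped reads}.

    Expects tab-separated lines: name, length, mapped, unmapped.
    The '*' line (reads with no reference) is skipped.
    """
    mapped_counts = {}
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":
            continue
        mapped_counts[name] = int(mapped)

    total = sum(mapped_counts.values())
    if total == 0:
        return {name: 0.0 for name in mapped_counts}
    return {name: 100.0 * n / total for name, n in mapped_counts.items()}


# Invented counts, for illustration only
example_idxstats = (
    "chr1\t248956422\t1200\t0\n"
    "chr21\t46709983\t8500\t0\n"
    "chr22\t50818468\t300\t0\n"
    "*\t0\t0\t450\n"
)

percentages = percent_mapped_per_reference(example_idxstats)
for name, pct in sorted(percentages.items(), key=lambda item: -item[1]):
    print(f"{name}\t{pct:.1f}%")
```

If most mapped reads land on chr21 and the rest are spread thinly across other chromosomes, the off-target alignments are more likely repeats or contamination than a sign that the dataset is mislabeled.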