I'm a beginner in bioinformatics and I want to extract chr22 from a FASTQ file. I don't know which tool I should use. Could anyone help me with that?
Hello and welcome to Biostars, Sakhaa,
FASTQ files don't contain any information about the origin of the reads. You first have to align the reads to the genome, which gives you a BAM file. From there you can extract the reads that map to a specific chromosome.
A FASTQ file contains reads from the whole genome (unless you sequenced only chr22). You therefore have to align the reads against the genome to classify them by genomic location; then you can extract the reads of interest.
1.- Read about bowtie2 to align your sequences.
2.- Read about SAM/BAM files (the output of the bowtie2 aligner) and samtools, and how to extract data from a specific location.
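The two steps above can be sketched as a short Python helper that only builds the command lines, so you can inspect them before running anything. This is a rough outline under several assumptions: the reads are single-end, a bowtie2 index already exists under the hypothetical prefix `genome_index`, the file names are placeholders, and the reference names the chromosome `chr22` (some references call it `22`).

```python
import shlex


def chr_extraction_commands(reads_fastq, index_prefix, chrom="chr22", out_prefix="sample"):
    """Return the commands for align -> sort -> index -> extract as argument lists.

    All file names are derived from out_prefix and are placeholders.
    """
    sam = f"{out_prefix}.sam"
    sorted_bam = f"{out_prefix}.sorted.bam"
    chrom_bam = f"{out_prefix}.{chrom}.bam"
    return [
        # Align single-end reads against a prebuilt bowtie2 index; output is SAM.
        ["bowtie2", "-x", index_prefix, "-U", reads_fastq, "-S", sam],
        # Coordinate-sort the alignments and write them as BAM.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Index the sorted BAM so that regions can be queried.
        ["samtools", "index", sorted_bam],
        # Keep only the alignments on the requested chromosome.
        ["samtools", "view", "-b", "-o", chrom_bam, sorted_bam, chrom],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in chr_extraction_commands("reads.fastq", "genome_index"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once bowtie2 and samtools are installed. If you need the chr22 reads back in FASTQ format, `samtools fastq` can convert the resulting BAM.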
Is it a FASTA or a FASTQ file? Please run head your.file and post the output. If FASTA:
>name of sequence
>name of next sequence
then use samtools faidx. Details are available via Google and the search function. This has been asked many times before, e.g. Filtering Fasta Sequences By Chromosomes Names From A Big Fasta File
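For the FASTA case, the samtools route would typically look like `samtools faidx genome.fa chr22 > chr22.fa` (file names are placeholders). If samtools is not at hand, the same idea can be sketched in plain Python. The function names below are made up for illustration, and the format check is only a first-character heuristic: FASTA headers start with `>` and FASTQ headers with `@`.

```python
import io


def sniff_format(handle):
    """Guess 'fasta' or 'fastq' from the first non-empty line of a text handle."""
    for line in handle:
        if line.strip():
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "unknown"
    return "unknown"


def extract_fasta_records(handle, wanted):
    """Yield (header, sequence) for records whose name is in `wanted`.

    The name is taken to be the first whitespace-separated word after '>'.
    """
    header, chunks, keep = None, [], False
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it was selected.
            if keep:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
            parts = header.split()
            keep = bool(parts) and parts[0] in wanted
        elif keep:
            chunks.append(line.strip())
    # Emit the last record if it was selected.
    if keep:
        yield header, "".join(chunks)


if __name__ == "__main__":
    demo = ">chr21 toy\nACGT\nAC\n>chr22 toy\nGGCC\nTT\n"
    print(sniff_format(io.StringIO(demo)))
    for name, seq in extract_fasta_records(io.StringIO(demo), {"chr22"}):
        print(f">{name}\n{seq}")
```

For a real file, open it with `open("your.file")` and pass the handle in place of the `io.StringIO` demo. For large genomes samtools faidx is the better choice, since it uses an index instead of scanning the whole file.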