Question: Fastq file size after splitting bam
19 months ago by
banerjeeshayantan140 wrote:

I have a BAM file of roughly 75 GB. I wanted to check the read quality, so I converted it to paired FASTQ files using the bedtools bamtofastq command:
bedtools bamtofastq -i t.bam -fq r1.fq -fq2 r2.fq
After checking the sizes of the resulting FASTQ files, I found that they are only 1.2 GB each. Isn't this unexpected? I was expecting much larger files. Where am I going wrong? Thanks!

sequencing software error • 793 views
ADD COMMENTlink modified 18 months ago by d-cameron2.1k • written 19 months ago by banerjeeshayantan140

You don't need to convert the file. The BAM file includes the quality info.

The most common way to check quality is with FastQC, which accepts BAM files.
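
As a rough illustration of running FastQC directly on a BAM file, here is a minimal sketch. The function name qc_bam and the output directory fastqc_out are made up for this example; it assumes fastqc is on your PATH, and t.bam is the file name from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run FastQC on a BAM file without converting it first.
# Assumes the `fastqc` executable is installed and on PATH.
qc_bam() {
    local bam="$1"
    local outdir="${2:-fastqc_out}"   # illustrative default directory name
    # FastQC expects the output directory to exist already.
    mkdir -p "$outdir"
    # -f bam states the input format explicitly instead of relying on the extension.
    fastqc -f bam -o "$outdir" "$bam"
}

# Usage (not executed here):
#   qc_bam t.bam
```

The report is written into the output directory, so no intermediate FASTQ files are needed just to look at base qualities.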

ADD REPLYlink written 19 months ago by igor8.8k

I want to realign my BAM files to the hg38 reference, which is why I am converting them to FASTQ.

ADD REPLYlink written 19 months ago by banerjeeshayantan140

Can you try reformat.sh in=your.bam out1=R1.fq.gz out2=R2.fq.gz (from the BBMap suite) to see if you get files of the right size? Your BAM file could contain a lot of secondary alignments etc., in which case the same reads are present more than once in the BAM file and the FASTQ files you get may look much smaller by comparison.
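
One way to check how many records are secondary or supplementary is to count them by SAM flag. This is only a sketch: the function name count_alignment_types is invented for illustration, and it assumes samtools is installed. Flag 256 marks secondary alignments and 2048 marks supplementary ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count BAM records by alignment type.
# Assumes `samtools` is on PATH. Each count is a separate pass over the file,
# so this can be slow on a large BAM.
count_alignment_types() {
    local bam="$1"
    # -c prints a record count; -f keeps records with the flag bits set,
    # -F drops records with any of the flag bits set.
    echo "total: $(samtools view -c "$bam")"
    echo "secondary: $(samtools view -c -f 256 "$bam")"
    echo "supplementary: $(samtools view -c -f 2048 "$bam")"
    # 2304 = 256 + 2048; what remains is one record per read end
    # (this still includes unmapped reads).
    echo "primary: $(samtools view -c -F 2304 "$bam")"
}

# Usage (not executed here):
#   count_alignment_types t.bam
```

samtools flagstat reports similar totals in a single pass, which may be preferable for a file of this size.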

ADD REPLYlink written 19 months ago by genomax74k

Thanks for your reply. After using samtools instead, I got fasta files of the proper size (~108 GB each). Anyone who encounters this problem should try samtools.
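
The exact samtools command used is not given in this thread. A common pattern, shown here only as a sketch, is to group mates with samtools collate and pipe the result into samtools fastq. The function name bam_to_paired_fastq and the file names are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a paired-end BAM to two FASTQ files with samtools.
# Assumes `samtools` (1.x) is on PATH; this is not necessarily the command
# the poster ran.
bam_to_paired_fastq() {
    local bam="$1" r1="$2" r2="$3"
    # collate brings the two mates of each pair next to each other without a
    # full name sort; -u writes uncompressed BAM, -O sends it to stdout.
    # fastq then writes read 1 and read 2 to separate files; reads flagged as
    # neither/both ends (-0) and singletons (-s) are discarded to /dev/null.
    # -n keeps read names as-is instead of appending /1 and /2.
    samtools collate -u -O "$bam" \
        | samtools fastq -1 "$r1" -2 "$r2" -0 /dev/null -s /dev/null -n -
}

# Usage (not executed here):
#   bam_to_paired_fastq t.bam r1.fq r2.fq
```

By default samtools fastq skips secondary and supplementary alignments, so each read should appear only once in the output.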

ADD REPLYlink written 19 months ago by banerjeeshayantan140
18 months ago by
d-cameron2.1k
Australia
d-cameron2.1k wrote:

I was expecting larger file sizes. Where am I going wrong? Thanks!

The bedtools documentation has this to say about the -fq2 option:

When using this option, it is required that the BAM file is sorted/grouped by the read name. This keeps the resulting records in the two output FASTQ files in the same order. One can sort the BAM file by query name with samtools sort -n aln.bam aln.qsort.

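It appears you did not follow the bedtools documentation and sort by query name first. In a coordinate-sorted BAM the two mates of a pair are usually not adjacent, so bedtools skips most of them, which would explain the small output files.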
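
The sort-then-convert workflow from the documentation can be sketched as follows. The function name bedtools_paired_fastq and the .qsort.bam suffix are illustrative assumptions. Note that recent samtools releases take the output file via -o rather than the older output-prefix form shown in the quoted documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: name-sort a BAM, then extract paired FASTQ with bedtools.
# Assumes `samtools` (1.x) and `bedtools` are on PATH.
bedtools_paired_fastq() {
    local bam="$1" r1="$2" r2="$3"
    # Illustrative naming: t.bam -> t.qsort.bam
    local sorted="${bam%.bam}.qsort.bam"
    # -n sorts by read (query) name so that mates end up adjacent.
    # bedtools only runs if the sort succeeded.
    samtools sort -n -o "$sorted" "$bam" &&
        bedtools bamtofastq -i "$sorted" -fq "$r1" -fq2 "$r2"
}

# Usage (not executed here):
#   bedtools_paired_fastq t.bam r1.fq r2.fq
```

Name-sorting a 75 GB BAM needs a comparable amount of temporary disk space, so it is worth checking free space before running it.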

ADD COMMENTlink written 18 months ago by d-cameron2.1k