4.1 years ago
Bioinfo
Hello everyone
I tried to use bowtie2 to align my reads to a reference genome, but my forward and reverse files contain different numbers of reads (R1.fastq = 39123, R2.fastq = 38456). When I run bowtie2 it shows this message:
Error, fewer reads in file specified with -2 than in file specified with -1
Can anyone tell me what to do, please?
Thank you
First of all, find out why the read numbers differ. Did you manipulate the files somehow? Is this paired-end data? You can try to repair the pairing with repair.sh from the BBMap suite.
Can you please tell me the command line I can use to get two files with the same number of reads, plus one file containing the reads that are not in file 1?
Thank you very much
Still, this should not even happen. Why are there different read numbers?
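A quick way to confirm the counts before repairing anything is to count the records in each file directly. The sketch below is not part of bowtie2 or BBMap; `count_fastq_records` is a hypothetical helper, and it assumes plain 4-line FASTQ records (no wrapped sequence lines), optionally gzipped.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    # Assumption: files ending in .gz are gzip-compressed, everything else is plain text.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A line count that is not a multiple of 4 hints at truncation or a non-FASTQ file.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


# Example use (file names taken from the question):
# print(count_fastq_records("R1.fastq"), count_fastq_records("R2.fastq"))
```

If the two numbers differ, or the function raises, the files were probably filtered, truncated, or converted somewhere upstream, which is worth tracking down before aligning.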
I tried it but it shows this error
Output of head R1.fastq and head R2.fastq? This does not look like FASTQ files, rather SAM files.
Again, and please answer, otherwise this is not productive:
In other words: How did you get these files and what did you do with them?
I'm sorry.
Yes, they are FASTQ files, I checked them. I got them by merging HiSeq data and MiSeq data of the same strain, but when I checked the number of reads in each file I found that it differs.
According to the above error message, the second file is not a FASTQ file. It has a SAM header.
Ahhhh, so do I need to delete these lines?
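For anyone unsure which format a file is in, the first line is usually enough to tell. Here is a rough sketch of that check; `guess_format` is a hypothetical helper, and the rules it uses (SAM header tags such as @HD or @SQ followed by a tab, or at least 11 tab-separated fields for headerless SAM) are a heuristic, not a full validator.

```python
def guess_format(path):
    """Guess 'fastq', 'sam', or 'unknown' from the first non-empty line of a text file."""
    sam_header_tags = ("@HD", "@SQ", "@RG", "@PG", "@CO")
    with open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # SAM header lines start with '@', a two-letter tag, then a tab.
            if line[:3] in sam_header_tags and line[3:4] == "\t":
                return "sam"
            # FASTQ records start with '@' followed by the read name.
            if line.startswith("@"):
                return "fastq"
            # Headerless SAM: alignment lines carry at least 11 tab-separated fields.
            if len(line.split("\t")) >= 11:
                return "sam"
            return "unknown"
    return "unknown"


# Example use (file name taken from the question):
# print(guess_format("R2.fastq"))
```

A file that comes back as "sam" needs to be converted to FASTQ rather than hand-edited; deleting the header lines alone would still leave alignment records, not FASTQ records.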
The files you have are in a completely different format (SAM), which is used to store alignments. This is NOT primary sequence data in FASTQ format. Edit: It is technically possible to store FASTQ reads in an unaligned SAM format file.
You can use a different tool from the BBMap suite if you want to get FASTQ format files from SAM files. You will do something like:
Your second file is apparently not in FASTQ but in SAM format. This is a different format, as genomax says. Therefore, please post exactly the code you used to generate these files (and I really mean exactly). How did you obtain these files? Is there anyone in your lab who can help you? You (no offense) lack some essential NGS basics, so it is difficult to help remotely.
Hello. I found that the number of reads is higher in R1 HiSeq than in R2 HiSeq. I used the command you told me and it works well! I also added the outs option.
repair.sh in=R1.fastq in2=R2.fastq out=R1_repair.fastq out2=R2_repair.fastq
After that I merged the outputs with the MiSeq data, ran bowtie2, and it works well.
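To illustrate what re-pairing means conceptually, here is a minimal Python sketch. It is not how repair.sh is implemented; `read_fastq` and `repair_pairs` are hypothetical names, and the sketch assumes well-formed 4-line FASTQ records, unique read names, mates that share a name up to a trailing /1 or /2, and an R2 file small enough to hold in memory.

```python
def read_fastq(path):
    """Yield (read_id, record_text) for each 4-line FASTQ record."""
    with open(path, "rt") as handle:
        while True:
            lines = [handle.readline() for _ in range(4)]
            if not lines[0]:
                break  # end of file
            # Read ID: drop '@', any description after whitespace, and a trailing /1 or /2.
            fields = lines[0][1:].split()
            read_id = fields[0] if fields else ""
            if read_id.endswith(("/1", "/2")):
                read_id = read_id[:-2]
            record = "\n".join(line.rstrip("\n") for line in lines) + "\n"
            yield read_id, record


def repair_pairs(r1_path, r2_path, out1_path, out2_path, singles_path):
    """Write reads with a mate to out1/out2 (in R1 order) and unpaired reads to singles."""
    r2_records = dict(read_fastq(r2_path))  # assumption: R2 fits in memory
    n_pairs = n_singles = 0
    with open(out1_path, "w") as out1, open(out2_path, "w") as out2, \
            open(singles_path, "w") as singles:
        for read_id, record in read_fastq(r1_path):
            mate = r2_records.pop(read_id, None)
            if mate is None:
                singles.write(record)
                n_singles += 1
            else:
                out1.write(record)
                out2.write(mate)
                n_pairs += 1
        # Whatever is left in R2 never found a mate in R1.
        for record in r2_records.values():
            singles.write(record)
            n_singles += 1
    return n_pairs, n_singles
```

After this kind of re-pairing, both output files hold the same number of reads in the same order, which is what bowtie2 expects for -1 and -2.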
Thank you very much for your help, I learned something new. And yes, I've been working for more than 8 hours and felt super tired.