Hi everyone,

I am having an issue with reads_first.py in HybPiper.

Here is the command I used:

python HybPiper/reads_first.py -b all_ref.fas -r P001_WA06_R* --prefix P001_WA06_allref --bwa

Here is the output, including the error message:

HybPiper was called with these arguments:
HybPiper/reads_first.py -b all_ref.fas -r P001_WA06_R1.fastq.gz P001_WA06_R2.fastq.gz --prefix P001_WA06_allref --bwa

Making nucleotide bwa index in current directory.
[CMD]: bwa index all_ref.fas
[bwa_index] Pack FASTA... 0.02 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.19 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.09 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index all_ref.fas
[main] Real time: 0.624 sec; CPU: 0.323 sec
[CMD]: time bwa mem -t 36 all_ref.fas /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R1.fastq.gz /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R2.fastq.gz  | samtools view -h -b -S - >  P001_WA06_allref.bam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 2347952 sequences (347231024 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 226240, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (144, 179, 218)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 366)
[M::mem_pestat] mean and std.dev: (180.99, 60.10)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 440)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2347952 reads in 490.429 CPU sec, 13.899 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 36 all_ref.fas /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R1.fastq.gz /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R2.fastq.gz
[main] Real time: 41.718 sec; CPU: 496.587 sec
494.59user 2.04system 0:41.77elapsed 1188%CPU (0avgtext+0avgdata 2455128maxresident)k
0inputs+0outputs (0major+613686minor)pagefaults 0swaps
[CMD] time python /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/HybPiper/distribute_reads_to_targets_bwa.py P001_WA06_allref.bam /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R1.fastq.gz /mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/P001_WA06_R2.fastq.gz

Unique reads with hits: 848848
Traceback (most recent call last):
  File "/mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/HybPiper/distribute_reads_to_targets_bwa.py", line 111, in <module>
    if __name__ == "__main__":main()
  File "/mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/HybPiper/distribute_reads_to_targets_bwa.py", line 107, in main
  File "/mnt/fe32bdda-5cce-4c1c-a233-f07260845af0/ruiqi_data/ExonCapture_2021Apr/test2_exon/HybPiper/distribute_reads_to_targets_bwa.py", line 83, in distribute_reads
    for ID1_long, Seq1, Qual1 in iterator1:
  File "/home/ruiqi/miniconda3/envs/exon/lib/python3.9/site-packages/Bio/SeqIO/QualityIO.py", line 920, in FastqGeneralIterator
    line = next(handle)
  File "/home/ruiqi/miniconda3/envs/exon/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
Command exited with non-zero status 1
7.76user 3.90system 0:06.83elapsed 170%CPU (0avgtext+0avgdata 1159500maxresident)k
0inputs+0outputs (0major+301953minor)pagefaults 0swaps
ERROR: Something went wrong with distributing reads to gene directories.

Does anyone know what is wrong? Thanks in advance!
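
A note on the error itself: gzip files begin with the two bytes 0x1f 0x8b, so "can't decode byte 0x8b in position 1" is consistent with a .fastq.gz file being opened as plain text instead of being decompressed first. The snippet below is only a minimal sketch of that behaviour, not HybPiper's own code; the file name example.fastq.gz and the single FASTQ record are made up for illustration.

```python
import gzip
import os
import tempfile

# Hypothetical one-record FASTQ, used only for this illustration.
record = "@read1\nACGT\n+\nIIII\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.fastq.gz")  # made-up file name

    # Write the record as a gzip-compressed FASTQ file.
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        handle.write(record)

    # Every gzip file starts with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)

    # Opening the compressed file as plain UTF-8 text fails on the
    # second byte (0x8b), matching the message in the traceback.
    plain_text_error = None
    try:
        with open(path, "r", encoding="utf-8") as handle:
            handle.readline()
    except UnicodeDecodeError as err:
        plain_text_error = err

    # Opening it through gzip in text mode decompresses on the fly.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        lines = handle.read().splitlines()

print(magic)
print(plain_text_error)
print(lines)
```

If that is the cause here, passing decompressed FASTQ files to reads_first.py may avoid the error. Whether this HybPiper version is meant to accept gzipped reads together with --bwa is not confirmed here.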


