Extract information from txt
4.6 years ago

Dear All!

I have a text file with this kind of information:

query: NB501791:62:HLKV3AFXY:4:21507:10794:16613 1:N:0:AAGCAGA
match:gi|1548994288|gb|CP034492.1| Eukaryotic synthetic construct chromosome 14 >gi|1549096114|gb|CP034517
match:gi|1351431640|ref|XM_024231364.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
match:gi|1351431635|ref|XM_024231363.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
match:gi|1351431627|ref|XM_024231362.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
match:gi|1351431621|ref|XM_024231361.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
query: NB501791:62:HLKV3AFXY:4:21603:8250:2021 1:N:0:AGGCAGT
match:gi|301129233|ref|NG_023414.1| Homo sapiens tetratricopeptide repeat domain 37 (TTC37), RefSeqGene (L
match:gi|21263230|gb|AC090071.3| Homo sapiens chromosome 5 clone CTD-2538A21, complete sequence
match:gi|262527239|ref|NG_015800.1| Homo sapiens 5'-nucleotidase, cytosolic IIIA (NT5C3A), RefSeqGene on c
match:gi|11465072|gb|AC083863.2|AC083863 Homo sapiens chromosome 7 clone RP11-162O1, complete sequence
match:gi|9558609|gb|AC074338.1|AC074338 Human Chromosome 7 clone RP11-81O10, complete sequence

I need to extract every read where the first match isn't Homo sapiens/human. Does anyone have an idea how I should start?

python blast sequence archeogenetics • 780 views

Check the usage of grep -vE, with human and Homo sapiens as the pattern.
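
Since the thread is tagged python: grep -v filters line by line, so it does not look at which match comes first for a read. For the stricter "first match only" reading of the question, here is a minimal Python sketch. It assumes the file looks exactly like the sample in the question (each `query:` line followed by its `match:` lines, best match first); the names `parse_reads`, `non_human_first` and `HUMAN` are made up for this example.

```python
import re

# Whole-word match, so names like "Pediculus humanus" are not counted as human.
HUMAN = re.compile(r"\b(homo sapiens|human)\b", re.IGNORECASE)

def parse_reads(lines):
    """Yield (query_id, [match descriptions]) for each query block."""
    query, matches = None, []
    for line in lines:
        line = line.strip()
        if line.startswith("query:"):
            if query is not None:
                yield query, matches
            query, matches = line[len("query:"):].strip(), []
        elif line.startswith("match:") and query is not None:
            matches.append(line[len("match:"):].strip())
    if query is not None:
        yield query, matches

def non_human_first(lines):
    """Keep reads whose first listed match does not mention human."""
    return [(q, m[0]) for q, m in parse_reads(lines)
            if m and not HUMAN.search(m[0])]

# Lines copied from the question (as truncated there).
SAMPLE = """\
query: NB501791:62:HLKV3AFXY:4:21507:10794:16613 1:N:0:AAGCAGA
match:gi|1548994288|gb|CP034492.1| Eukaryotic synthetic construct chromosome 14 >gi|1549096114|gb|CP034517
match:gi|1351431640|ref|XM_024231364.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
query: NB501791:62:HLKV3AFXY:4:21603:8250:2021 1:N:0:AGGCAGT
match:gi|301129233|ref|NG_023414.1| Homo sapiens tetratricopeptide repeat domain 37 (TTC37), RefSeqGene (L
match:gi|21263230|gb|AC090071.3| Homo sapiens chromosome 5 clone CTD-2538A21, complete sequence
""".splitlines()

kept = non_human_first(SAMPLE)
for query, first_match in kept:
    print(query, first_match, sep="\t")
```

For a real file, pass the open file handle instead of `SAMPLE`, e.g. `non_human_first(open("hits.txt"))`, where `hits.txt` is a placeholder name.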


Thank you, it worked perfectly! I have another question. Can I make a list of queries grouped by organism with grep? For example:

    Pasteurella multocida 
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA


    Mus musculus
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA
query: NB501791:62:HLKV3AFXY:3:11601:15774:10353 1:N:0:AAGCAGA

(I know, these are the same queries; it's just for the example.)
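
Grouping like this is awkward with grep alone, so here is a rough Python sketch instead. It assumes each read should be filed under the organism of its first match only, and it guesses the organism as the first two words of the description (after dropping an optional "PREDICTED:" prefix). That guess is only a heuristic: a hit like "Eukaryotic synthetic construct" ends up under "Eukaryotic synthetic". The function names are made up for this example.

```python
from collections import defaultdict

def organism_label(match_line):
    """Heuristic: first two words of the description after the ID token."""
    parts = match_line[len("match:"):].strip().split(None, 1)
    description = parts[1] if len(parts) > 1 else ""
    if description.startswith("PREDICTED:"):
        description = description[len("PREDICTED:"):].strip()
    return " ".join(description.split()[:2])

def group_by_first_match(lines):
    """Map organism label -> list of query IDs, using each query's first match."""
    groups = defaultdict(list)
    query = None
    for line in lines:
        line = line.strip()
        if line.startswith("query:"):
            query = line[len("query:"):].strip()
        elif line.startswith("match:") and query is not None:
            groups[organism_label(line)].append(query)
            query = None  # ignore the remaining matches of this query
    return dict(groups)

# Lines copied from the original question (as truncated there).
SAMPLE = """\
query: NB501791:62:HLKV3AFXY:4:21507:10794:16613 1:N:0:AAGCAGA
match:gi|1548994288|gb|CP034492.1| Eukaryotic synthetic construct chromosome 14 >gi|1549096114|gb|CP034517
match:gi|1351431640|ref|XM_024231364.1| PREDICTED: Pongo abelii spectrin repeat containing nuclear envelop
query: NB501791:62:HLKV3AFXY:4:21603:8250:2021 1:N:0:AGGCAGT
match:gi|301129233|ref|NG_023414.1| Homo sapiens tetratricopeptide repeat domain 37 (TTC37), RefSeqGene (L
""".splitlines()

groups = group_by_first_match(SAMPLE)
for organism, queries in groups.items():
    print("    " + organism)
    for q in queries:
        print("query: " + q)
    print()
```

If every match should count rather than just the first, removing the `query = None` line would file a read under each organism it hits (possibly more than once per organism).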
