PHI annotation

PHI database introduction

PHI-base (Pathogen Host Interactions) is a database that catalogues experimentally verified pathogenicity, virulence and effector genes from fungal, oomycete and bacterial pathogens that infect animal, plant, fungal and insect hosts. The mission of PHI-base is to provide expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions. PHI-base terms are attached to genes in thousands of species in Ensembl Bacteria, Fungi and Protists. Information is also given on the target sites of some anti-infective chemistries.

DIAMOND was used to map and annotate the query sequences against PHI-base (version 4.6, updated in November 2018; 6438 genes, 11340 interactions, 263 pathogens, 194 hosts, 510 diseases) with default parameters.
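
As a rough illustration of the mapping step, the sketch below assembles DIAMOND command lines in Python. The helper build_diamond_commands and all file names are hypothetical. Using blastp for amino acid queries and blastx for nucleic acid queries follows DIAMOND's usual convention, but it is an assumption about how this tool runs the search. Apart from the tabular output format, no parameters are set, in line with the default parameters mentioned above.

```python
def build_diamond_commands(reference_fasta, db_prefix, query_fasta, out_tsv,
                           query_is_protein=True):
    """Return (makedb, search) DIAMOND commands as argument lists.

    All paths are placeholders chosen for this example.
    """
    # Build a DIAMOND database from the PHI-base protein sequences.
    makedb = ["diamond", "makedb", "--in", reference_fasta, "--db", db_prefix]

    # blastp aligns protein queries; blastx translates nucleotide queries.
    mode = "blastp" if query_is_protein else "blastx"

    # --outfmt 6 requests BLAST-style tabular output; other settings stay default.
    search = ["diamond", mode, "--db", db_prefix, "--query", query_fasta,
              "--out", out_tsv, "--outfmt", "6"]
    return makedb, search


makedb_cmd, search_cmd = build_diamond_commands(
    "phi_base.fasta", "phi_db", "query.fasta", "phi_hits.tsv",
    query_is_protein=True,
)
print(" ".join(makedb_cmd))
print(" ".join(search_cmd))
```

The commands are only printed here. Where DIAMOND is installed, each list could be passed to subprocess.run(cmd, check=True) to execute it.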

Input files

FASTA file of nucleic acid or amino acid query sequences.


Output files

1. Mapping and annotation results

2. Statistics of alignment

3. Distribution of E-values.

Example File

1. Mapping and annotation results

Query_id: ID of the query sequence

Subject_id: ID of the mapped sequence in the database

Query_start: start position of the aligned region on the query sequence

Query_end: end position of the aligned region on the query sequence

Subject_start: start position of the aligned region on the subject sequence

Subject_end: end position of the aligned region on the subject sequence

Align_length: length of the alignment

Positive: number of positive-scoring matches (bases or amino acids)

Gap: number of gaps

Coverage: percentage of the query sequence covered by the alignment to the database sequence

Identity(%): percent identity of the alignment

E_value: expect value (E-value) of the alignment; the lower the better

Score: score of the alignment; the higher the better

DB_Type: source database of the reference gene

Accession: ID of the reference gene in the source database

Gene_name: gene name

Pathogen_NCBI_ID: NCBI taxonomy ID of the pathogen species

Pathogen_species: systematic name of the pathogen species

Disease_name: name of the disease caused by the pathogen-host interaction

Host_Descripton: description of the host

Host_NCBI_ID: NCBI taxonomy ID of the host organism

Experimental_host: common name of the host organism

Function: function of the protein
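
The following sketch shows one way such a results table could be read and reduced to the best hit per query. It assumes the file is tab-separated with a header row that uses the column names listed above; the actual delimiter and header layout are not stated here. The helper names are hypothetical and the sample rows are made-up placeholders, not real PHI-base results.

```python
import csv
import io

# Columns converted to numbers; all other columns are kept as text.
NUMERIC_FIELDS = {"E_value": float, "Score": float, "Identity(%)": float}


def load_hits(handle):
    """Read a tab-separated results table into a list of dicts."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for name, cast in NUMERIC_FIELDS.items():
            if row.get(name) not in (None, ""):
                row[name] = cast(row[name])
        hits.append(row)
    return hits


def best_hit_per_query(hits):
    """For each Query_id keep the hit with the lowest E_value (ties: highest Score)."""
    best = {}
    for hit in hits:
        current = best.get(hit["Query_id"])
        rank = (hit["E_value"], -hit["Score"])
        if current is None or rank < (current["E_value"], -current["Score"]):
            best[hit["Query_id"]] = hit
    return best


# Placeholder table with a subset of the columns, for illustration only.
example = (
    "Query_id\tSubject_id\tIdentity(%)\tE_value\tScore\n"
    "query_1\tsubject_A\t98.5\t1e-50\t200\n"
    "query_1\tsubject_B\t60.0\t1e-5\t50\n"
    "query_2\tsubject_C\t75.0\t1e-20\t120\n"
)
best = best_hit_per_query(load_hits(io.StringIO(example)))
for query_id, hit in best.items():
    print(query_id, hit["Subject_id"], hit["E_value"])
```

Ranking by E_value first and Score second mirrors the "lower is better" and "higher is better" notes in the column descriptions.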

2. Statistics of alignment

Shows the number of mapped and unmapped query sequences.
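
A minimal sketch of how these two counts could be derived, assuming a query counts as mapped when its ID appears at least once in the Query_id column of the results. The function names and sample IDs are hypothetical.

```python
def read_fasta_ids(lines):
    """Yield the ID of each FASTA record (first token after '>')."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                yield header.split()[0]


def count_mapped(query_ids, mapped_ids):
    """Count queries with and without at least one hit."""
    queries = set(query_ids)
    # Only IDs present in the input are counted; repeated hits count once.
    mapped = queries & set(mapped_ids)
    return {"mapped": len(mapped), "unmapped": len(queries) - len(mapped)}


# Made-up input: three query records, two of which have hits.
fasta_lines = [">q1 example record", "ATGC", ">q2", "GGCC", ">q3", "TTAA"]
hit_query_ids = ["q1", "q1", "q3"]
stats = count_mapped(read_fasta_ids(fasta_lines), hit_query_ids)
print(stats)
```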

3. Distribution of E-values.

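The E-value is the expect value of the alignment, that is, the number of alignments with an equal or better score expected to occur by chance. The smaller the E-value, the more reliable the alignment. E-values are divided into five ranges, and the number of alignments in each range is shown in a pie chart.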
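
The binning behind such a chart can be sketched as follows. The four cutoffs used here (1e-100, 1e-50, 1e-10, 1e-5) are assumptions chosen only to produce five ranges; the ranges actually used by the tool are not stated. The function name is hypothetical.

```python
from bisect import bisect_left

# Assumed upper bounds of the first four ranges; the fifth range is open-ended.
BOUNDARIES = [1e-100, 1e-50, 1e-10, 1e-5]
LABELS = ["<= 1e-100", "(1e-100, 1e-50]", "(1e-50, 1e-10]",
          "(1e-10, 1e-5]", "> 1e-5"]


def bin_evalues(evalues):
    """Count E-values per range; each upper bound is inclusive."""
    counts = [0] * len(LABELS)
    for value in evalues:
        # bisect_left returns the index of the first boundary >= value.
        counts[bisect_left(BOUNDARIES, value)] += 1
    return dict(zip(LABELS, counts))


# Made-up E-values for illustration.
distribution = bin_evalues([0.0, 1e-120, 1e-60, 1e-50, 1e-7, 1e-3])
for label, count in distribution.items():
    print(label, count)
```

The resulting counts could then be passed to any plotting library to draw the pie chart.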