PathoYeastract

 

Tutorial

The PathoYeastract (Pathogenic Yeast Search for Transcriptional Regulators And Consensus Tracking; http://pathoyeastract.org/) database presently contains almost 37,000 regulatory associations between transcription factors (TFs) and target genes in Candida albicans and C. glabrata, based on 747 bibliographic references. Each regulatory association has been annotated manually, after examination of the relevant references. The database also contains the description of 107 specific DNA binding sites shared among 100 C. albicans and 34 C. glabrata characterized TFs. Further information about each yeast gene was obtained from the Candida Genome Database (CGD), YEASTRACT and the Gene Ontology (GO) Consortium.

The PathoYeastract database assists with two major tasks: prediction of gene transcriptional regulation, and global expression analysis according to the Candida transcription networks described in the literature. This tutorial presents two case studies that exemplify the use of different query options and utilities. Many other ways of exploiting the available options and utilities are possible.

- Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene

- Example 2: Gene expression analysis based on regulatory associations

Throughout the PathoYeastract database and this tutorial, regulatory associations are designated "Documented" or "Potential":

  • a documented association between a Transcription Factor (TF) and a target gene is supported by published data showing at least one of the following types of experimental evidence: i) a change in the expression of the target gene due to a deletion (or mutation) of the gene encoding the transcription factor, coming either from detailed gene-by-gene analysis or from genome-wide expression analysis; ii) binding of the transcription factor to the promoter region of the target gene, as supported by band-shift, footprinting or Chromatin ImmunoPrecipitation (ChIP) assays. The user is therefore urged to check the literature references provided in the database to fully understand the nature of the evidence underlying the identified regulatory associations.
  • a potential association between a TF and a target gene is based on the occurrence of the TF binding site in the promoter region of the target gene. The binding sites associated with each TF in this database are supported by published experimental evidence for the binding of the TF to the specific nucleotide sequence (data coming from footprinting or ChIP assays). Again, the user is urged to check the literature references provided in the database.

The accuracy and regular updating of the information gathered, curated and inserted in this database are crucial to PathoYeastract users. We therefore value any contribution from the yeast community towards this goal.

The results presented for this Tutorial were computed on June 30, 2016. However, due to subsequent updates the current ranking may differ from the presented one.




Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene

The functional analysis of an ORF or gene can be guided by the identification of its documented and potential transcription factors (TFs). This example describes one of the possible ways to explore the regulatory associations for the C. glabrata ORF CAGL0G08624g, encoding a Drug:H+ Antiporter of the Major Facilitator Superfamily which remained uncharacterized until recently, using various queries and utilities provided by PathoYeastract.

1.1 - Search for Documented Transcription Factors (TFs)
The "Search for Transcription Factors" query allows the identification of TFs which are Documented and/or Potential transcriptional regulators of a given gene. The search for documented transcription factors acting directly on CAGL0G08624g uncovers a single TF: Pdr1. The user may check the associated bibliographic references to learn the experimental basis for this regulatory association.

According to the CGD description of Pdr1, this regulator is involved in the control of multidrug resistance, with a special importance in the clinical acquisition of azole resistance. It is therefore of interest to examine the possible link of ORF CAGL0G08624g to these biological processes. Indeed, in a recent study CAGL0G08624g was shown to confer azole drug resistance, being involved in azole drug extrusion from within C. glabrata cells (1), and to be consistently up-regulated in clotrimazole resistant clinical isolates (2). Given its homology to the S. cerevisiae QDR2 gene, this ORF was named C. glabrata QDR2.

1.2 - Search for Potential Transcription Factors (TFs)
The "Search for Transcription Factors" query may also be used to identify the potential regulators of CAGL0G08624g. By default, all of the potential transcription factors found are displayed in tabular form. The Promoter link can be followed to see the binding sites for each TF in the promoter sequence of CAGL0G08624g. The distribution of TF binding sites in the promoter region of CAGL0G08624g can be viewed by checking the "image" option while searching.

Figure: generated image showing the TFs binding to the promoter of the target gene QDR2.

Target gene: QDR2

Transcription Factor       Consensus       Position       Strand
Yap6p       AATKACV       -733       R
Yap7p       AGTCATM       -191       R
Yap7p       MTKASTMA       -898       F
Yap7p       MTKASTMA       -889       R
Yap7p       MTKASTMA       -177       R
Pdr1p       HYCCGTGGR       -849       F
Pdr1p       TCCACGGA       -840       R
Pdr1p       TCCRYGGA       -848       F
Pdr1p       TCCRYGGA       -840       R
Ap1p       TTACAAA       -435       F
Ap1p       VDTASTAA       -867       R
Ap1p       VTTACWAAB       -436       F
Amt1p       WATHNGCTGW       -171       R

The display of potential TFs on the image can be controlled by un-checking their respective boxes in the color palette below the image and pressing the Redisplay button. The color palette displays the colors of only those TFs for which binding sites are found in the promoter region of the given gene(s). A close inspection of the image, looking for the documented regulator of CAGL0G08624g (i.e., Pdr1), reveals that a binding site for Pdr1 is indeed present, although at a relatively long distance from the START codon. In addition, binding sites for the TFs Yap1 (listed as Ap1p in the table above), Yap6, Yap7 and Amt1 can be found in the promoter region of the QDR2 gene, suggesting that they may play a role in the regulation of QDR2 expression. The predicted functions of these TFs, as regulators of oxidative stress, osmotic stress response, iron-cluster biogenesis and metallothionein genes, respectively, further hint at a possible role for QDR2 in these processes.
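The way a potential association arises from a consensus sequence can be illustrated with a minimal sketch. It assumes that the consensus strings use standard IUPAC nucleotide codes and that a site counts as a hit when it matches either strand of the promoter. The function names, the toy promoter sequence and the position convention (leftmost base of the site, counted backwards from the START codon) are assumptions made for illustration only; PathoYeastract's own search procedure and its coordinate convention for reverse-strand sites may differ.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a lookahead regex (finds overlapping sites)."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(promoter, consensus):
    """Return sorted (position, strand) pairs for every match of the consensus.

    Assumed convention (hypothetical): the last base of `promoter` is
    position -1, and each site is reported by its leftmost base on the
    forward strand, whichever strand it was found on.
    """
    promoter = promoter.upper()
    length, width = len(promoter), len(consensus)
    pattern = consensus_to_regex(consensus)
    hits = []
    for match in pattern.finditer(promoter):
        hits.append((match.start() - length, "F"))
    for match in pattern.finditer(reverse_complement(promoter)):
        # Map the index on the reverse-complemented sequence back to the
        # leftmost forward-strand index of the same site.
        forward_start = length - match.start() - width
        hits.append((forward_start - length, "R"))
    return sorted(hits)


# Toy 12-base promoter (not a real C. glabrata sequence) scanned with the
# Pdr1p consensus TCCRYGGA taken from the table above.
print(find_sites("GGTCCACGGATT", "TCCRYGGA"))  # [(-10, 'F'), (-10, 'R')]
```

In this toy case the same site is reported on both strands because the reverse complement of TCCACGGA (TCCGTGGA) also satisfies TCCRYGGA, which is one reason a consensus may appear in the table on both the F and the R strand at neighbouring positions.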

1.3 - Search for Genes
If the ORF/gene under study is predicted to encode a TF, the query "Search Regulated Genes", with the option Documented or Potential, can be used to retrieve all documented or potential targets of that TF, respectively.

For example, for the C. albicans TF Tac1, the master regulator of drug resistance, the result obtained is displayed in Table 1.

Transcription Factor       Documented Genes - Reference
Tac1p       ADH1  -  Reference
ATF1  -  Reference
C1_03510C_A  -  Reference
C1_04010C_A  -  Reference
C1_10280C_A  -  Reference
C2_01390W_A  -  Reference
C2_05570C_A  -  Reference
C2_07440C_A  -  Reference
C2_07630C_A  -  Reference
C2_08100W_A  -  Reference
C4_05810W_A  -  Reference
C5_00390C_A  -  Reference
C5_00750C_A  -  Reference
C6_00850W_A  -  Reference
C6_00920W_A  -  Reference
C6_01780C_A  -  Reference
C6_01870C_A  -  Reference
C6_02560W_A  -  Reference
C7_00770W_A  -  Reference
C7_02140W_A  -  Reference
C7_04090C_A  -  Reference
CDC23  -  Reference
CDR1  -  Reference
CDR2  -  Reference
CR_03250C_A  -  Reference
CR_05860W_A  -  Reference
CR_09100C_A  -  Reference
CR_09670C_A  -  Reference
CR_10280W_A  -  Reference
ERG1  -  Reference
ERG11  -  Reference
GPD2  -  Reference
GPX1  -  Reference
HRQ2  -  Reference
HSP12  -  Reference
IFE1  -  Reference
IFU5  -  Reference
LCB4  -  Reference
MNT1  -  Reference
OSM2  -  Reference
PDR16  -  Reference
PEX11  -  Reference
RTA3  -  Reference
SNZ1  -  Reference
ZCF8  -  Reference

Table 1 - Documented target genes of the Candida albicans Tac1 transcription factor.

Interestingly, besides the most commonly known targets of Tac1, the drug efflux pump encoding genes CDR1 and CDR2, other genes whose role is apparently unrelated to drug resistance are also Tac1 targets. For example, Adh1 and Snz1 are related to central carbon metabolism and vitamin B synthesis, respectively. This observation raises the possibility that either Tac1 plays additional roles in C. albicans biology or Adh1 and Snz1 contribute in some way to drug tolerance.




Example 2: Gene expression analysis based on regulatory associations

PathoYeastract provides tools for the classification and grouping of large lists of genes of interest, such as those found up- or down-regulated under a specific environmental cue or genetic mutation, as suggested by genome-wide expression data inspection. These analyses are based on known or algorithmically identified potential regulatory associations, deposited in the PathoYeastract database, or on shared Gene Ontology (GO) terms.

2.2 - Rank by Gene Ontology (GO)
The grouping of genes based on the GO terms they share is a feature common to a number of gene expression analysis software packages and is also implemented in PathoYeastract. To exemplify this utility, we used the list of proteins whose expression was seen to change in C. glabrata cells exposed to selenium in a Yap1-dependent manner (3). The grouping of this gene list, based on the Biological Process ontology, and considering only the GO terms associated with more than 5% of the genes in the list, is shown in the following table:

GO ID       GO term       Depth level       % in user set       % in PathoYeastract       p-value       Genes/ORF
GO:0055114       oxidation-reduction process       4       29.58%       13.82%       0.000000000000000       GPX2 GRE2(B) ERG3 SCS7 CAGL0G07271g SUR2 GLR1 GCY1 TRR1 CAGL0I02574g CAGL0J00451g CAGL0K02629g TSA1 CCP1 ADI1 CAGL0K09702g CTA1 CAGL0K10890g CAGL0K12958g CAGL0L05258g CAGL0M14047g
GO:0008150       biological_process       1       21.13%       1.45%       0.223887640794402       HSP31 CAGL0C01749g CAGL0D05236g CAGL0F04521g CAGL0G05632g CAGL0G06182g CAGL0H05951g PWP1 CAGL0J01331g CAGL0J09284g AWP2 CAGL0K07205g CAGL0K08206g CAGL0L05720g CAGL0L10362g
GO:0009405       pathogenesis       3       9.86%       4.02%       0.001478685394801       ERG3 SKN7 GLR1 CAGL0J09680g CTA1 CNB1 CAGL0M12100g
GO:0034599       cellular response to oxidative stress       5       8.45%       7.32%       0.000069570246801       SKN7 GLR1 CCP1 CTA1 CAGL0L05258g AHP1
GO:0045454       cell redox homeostasis       5       7.04%       20.00%       0.000000490437465       GPX2 GLR1 TRX2 TSA1 AHP1
GO:0044011       single-species biofilm formation on inanimate substrate       4       5.63%       6.06%       0.001392877591474       EPA6 CAGL0C03740g CAGL0F00407g CAGL0M12100g
GO:0042744       hydrogen peroxide catabolic process       4       5.63%       100.00%       0.000000000000000       GPX2 TRX2 TSA1 CTA1
GO:0035690       cellular response to drug       5       5.63%       1.77%       0.157556176114145       ERG3 SKN7 FLR1 CNB1

Based on the % of genes associated with each GO term, the first hit is "oxidation-reduction process". From a GO term enrichment perspective, however, the GO term "hydrogen peroxide catabolic process" shows up together with "oxidation-reduction process" as the terms with the lowest p-values. Both criteria nevertheless favor the idea that selenium induces oxidative imbalance in C. glabrata cells.
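The two ranking criteria discussed above can be compared with a short sketch that re-sorts the rows of the GO table. The term names, percentages and p-values are copied from that table; the 0.05 significance cutoff and the variable names are assumptions introduced here for illustration and are not settings taken from PathoYeastract.

```python
# (GO term, % in user set, p-value), copied from the table above.
go_rows = [
    ("oxidation-reduction process", 29.58, 0.000000000000000),
    ("biological_process", 21.13, 0.223887640794402),
    ("pathogenesis", 9.86, 0.001478685394801),
    ("cellular response to oxidative stress", 8.45, 0.000069570246801),
    ("cell redox homeostasis", 7.04, 0.000000490437465),
    ("single-species biofilm formation on inanimate substrate", 5.63, 0.001392877591474),
    ("hydrogen peroxide catabolic process", 5.63, 0.000000000000000),
    ("cellular response to drug", 5.63, 0.157556176114145),
]

# Criterion 1: share of the user's gene list annotated with the term.
by_coverage = sorted(go_rows, key=lambda row: row[1], reverse=True)

# Criterion 2: enrichment p-value (lower means stronger enrichment).
by_pvalue = sorted(go_rows, key=lambda row: row[2])
lowest_p = by_pvalue[0][2]
top_by_pvalue = [term for term, _, p in by_pvalue if p == lowest_p]

# Terms that cover many genes but are not enriched at the assumed cutoff.
ALPHA = 0.05  # conventional threshold, assumed here
not_enriched = [term for term, _, p in go_rows if p > ALPHA]

print(by_coverage[0][0])
print(top_by_pvalue)
print(not_enriched)
```

Under the assumed cutoff, the generic root term "biological_process" ranks second by coverage yet is not enriched, which illustrates why the two criteria are worth reading side by side.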

2.3 - Rank by Transcription Factor
The query "Rank by TF" enables automatic selection and ranking of the transcription factors potentially involved in the regulation of the genes in a list of interest. The TFs and their direct targets are presented in a table in decreasing order of a relevance score calculated for each TF, based on either the regulations or the regulatory paths targeting the genes in the list of interest that are deposited in the PathoYeastract database. Different filters can be used to steer the search towards a particular type of regulatory activity. To exemplify this utility, we used the list of proteins whose expression was seen to change in C. glabrata cells exposed to the antifungal drug clotrimazole (4). The results of this query, based on the Biological Process ontology and obtained using PathoYeastract on June 20, 2016, are shown in the following Figure.

TFs predicted to regulate this transcriptional response can be ranked by the % of genes in the list associated with them. Using such a ranking, the TF Pdr1 comes out on top, regulating about 30% of the gene set, while the TF encoded by ORF CAGL0G08844g regulates 16% of the gene list. That Pdr1 regulates the most genes in response to an azole drug is an expected result, given its known role in this process (5). The second most highly ranked TF is somewhat more surprising: its closest S. cerevisiae homolog is the TF Asg1, characterized as involved in the response to stress, particularly cell wall related stress. The appearance of this TF as the regulator of a large fraction of the clotrimazole responsive genes suggests that either this azole drug induces cell wall stress, or the function of the uncharacterized C. glabrata TF encoded by ORF CAGL0G08844g has diverged from that of its homolog in S. cerevisiae.

When ranking by statistical significance of regulations, the TF score is given by a p-value denoting the overrepresentation of regulations of the given TF targeting genes in the list of interest, relative to the regulations of that TF targeting genes in the whole PathoYeastract database. In other words, the p-value is the probability that the TF would regulate at least the number of genes found to be regulated in the list of interest if a set of genes of the same size as the list of interest were sampled from all the genes in the PathoYeastract database. This probability is modeled by a hypergeometric distribution, and the p-value is finally subjected to a Bonferroni correction for multiple testing.
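The calculation described above can be sketched as follows, as an upper-tail hypergeometric probability followed by a Bonferroni correction. This is an illustration under stated assumptions and not PathoYeastract's actual code: the function names are hypothetical, the counts in the usage example are made-up numbers rather than database values, and the number of tests is assumed to be the number of TFs evaluated.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Return P(X >= k) for a hypergeometric variable X.

    N: total number of genes in the database
    K: number of genes in the database regulated by the TF
    n: size of the user's gene list
    k: number of genes in the list regulated by the TF
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_value, n_tests):
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical toy numbers: 10 genes in the database, 4 of them TF targets,
# a list of 3 genes of which 2 are targets, and 8 TFs tested in total.
raw_p = hypergeom_tail(k=2, n=3, K=4, N=10)
print(raw_p)                 # 40 / 120, i.e. one third
print(bonferroni(raw_p, 8))  # capped at 1.0
```

Exact integer binomial coefficients are used here for clarity; for genome-scale counts a statistics library would normally be preferred for speed.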

Below is the output of the utility "Rank by TF" based on regulation enrichment for the clotrimazole dataset, using the default filtering options. In Table 3, the first column indicates the name of the TF; the second column, the % of genes in the list targeted by the TF; the third column, the % given by the ratio between the number of genes in the list targeted by the TF and the number of genes targeted by the TF in the whole PathoYeastract database; the fourth column, the enrichment p-value; and the fifth and final column, the genes from the list of interest targeted by the TF.
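The two percentage columns just described reduce to simple ratios, sketched below. The function name is hypothetical. In the usage example, the 11 listed Pdr1p targets are counted from Table 3; the list size of 37 is not stated in this tutorial but is inferred from the reported percentages (11/37 = 29.73%, 6/37 = 16.22%); and the database-wide target count of 400 is a made-up placeholder, not a PathoYeastract value.

```python
def table_percentages(targets_in_list, list_size, targets_in_database):
    """Return (% in user set, % in database) rounded to two decimals.

    % in user set : share of the user's list that the TF targets.
    % in database : share of all the TF's known targets found in the list.
    """
    pct_user_set = 100 * targets_in_list / list_size
    pct_database = 100 * targets_in_list / targets_in_database
    return round(pct_user_set, 2), round(pct_database, 2)


# Pdr1p: 11 targets in an (inferred) list of 37 genes; 400 is a placeholder.
print(table_percentages(11, 37, 400))  # (29.73, 2.75)
```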

Transcription Factor       % in user set       % in PathoYeastract       p-value       Target ORF/Genes
CAGL0G08844g       16.22%       1.76%       0.006041706224774       EFT2 CAGL0A03388g CAGL0F07073g RPS11B PGK1 CAGL0L08114g
Pdr1p       29.73%       2.76%       0.000005225178189       CAGL0F04565g GAS1 CAGL0G01078g QDR2 RIP1 SNQ2 HFD1 CAGL0L01485g CAGL0L12870g CAGL0L12936g CDR1
Ap1p       10.81%       1.49%       0.030471370646758       GAS1 RIP1 SNQ2 HFD1
Yap7p       2.70%       0.58%       0.315094387435406       CAGL0H03773g
Skn7p       5.41%       2.90%       0.010307676896314       RIP1 HFD1
CAGL0I07755g       2.70%       0.33%       0.606501953842614       CAGL0I10472g
Ace2p       5.41%       3.17%       0.008027024825852       PGK1 CAGL0L12870g
Stb5p       2.70%       2.94%       0.020860938218443       CDR1

In this case, the enrichment-based ranking of transcription factors reports essentially the same two TFs, Pdr1 and the TF encoded by CAGL0G08844g, as the highest ranking ones.

References

  1. Costa, C., Pires, C., Cabrito, T.R., Renaudin, A., Ohno, M., Chibana, H., Sá-Correia, I. and Teixeira, M.C. (2013) Candida glabrata Drug:H+ Antiporter CgQdr2 Confers Imidazole Drug Resistance, Being Activated by Transcription Factor CgPdr1. Antimicrob Agents Chemother, 57, 3159-3167.
  2. Costa, C., Ribeiro, J., Miranda, I.M., Silva-Dias, A., Cavalheiro, M., Costa-de-Oliveira, S., Rodrigues, A.G. and Teixeira, M.C. (2016) Clotrimazole Drug Resistance in Candida glabrata Clinical Isolates Correlates with Increased Expression of the Drug:H(+) Antiporters CgAqr1, CgTpo1_1, CgTpo3, and CgQdr2. Front Microbiol, 7, 526.
  3. Merhej, J., Thiebaut, A., Blugeon, C., Pouch, J., Ali Chaouche Mel, A., Camadro, J.M., Le Crom, S., Lelandais, G. and Devaux, F. (2016) A Network of Paralogous Stress Response Transcription Factors in the Human Pathogen Candida glabrata. Front Microbiol, 7, 645.
  4. Pais, P., Costa, C., Pires, C., Shimizu, K., Chibana, H. and Teixeira, M.C. (2016) Membrane Proteome-Wide Response to the Antifungal Drug Clotrimazole in Candida glabrata: Role of the Transcription Factor CgPdr1 and the Drug:H+ Antiporters CgTpo1_1 and CgTpo1_2. Mol Cell Proteomics, 15, 57-72.
  5. Tsai, H.F., Krol, A.A., Sarti, K.E. and Bennett, J.E. (2006) Candida glabrata PDR1, a transcriptional regulator of a pleiotropic drug resistance network, mediates azole resistance in clinical isolates and petite mutants. Antimicrob Agents Chemother, 50, 1384-1392.