Thursday, 14 April 2016

Navigating the ever-changing ocean of biological knowledge



The removal of annotation from biological databases is often taken to mean that the annotation was wrong in the first place. Why else would diligent biocurators remove information that had been painstakingly added to database entries? In our recent paper, 'GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations', we look at some of the diverse, data-driven changes that can underlie the deletion or update of Gene Ontology annotations in the InterPro database, and highlight some of the consequent effects of these changes on UniProt protein annotations. We also explain why these changes don't necessarily mean that the original annotations were unreliable. Instead, we argue that they signify a curation effort committed to annotation accuracy, attempting to navigate an ever-changing ocean of biological knowledge.

Alex Mitchell
on behalf of the InterPro team

Wednesday, 24 February 2016

Zika Virus and Microcephaly



You have probably been as horrified and saddened as me to see the shocking abnormality that affects newborn babies whose mothers have been infected with the Zika virus.  The skulls and brains of the babies have not grown properly, and the babies appear to have small heads, a condition known as "microcephaly".  The standard definition is that the circumference of the head is two (or three) standard deviations below average for age and sex [1,2] (Fig. 1).  

Fig. 1. Diagram to show size of a baby’s head with microcephaly compared to a normal baby’s head.  From  https://prezi.com/iwv4kvehmhbv/microcephaly-then-and-now/.

Origin and spread of the Zika virus

Zika virus has been known since the 1940s, and originally occurred in the equatorial regions of Africa. It is named after the Zika Forest near the Ugandan town of Entebbe. Analysis of the various sequenced genomes has shown an origin in central Africa (a strain from Uganda isolated in 1947 being the oldest), followed by spread elsewhere in Africa (Senegal (1984), Nigeria (1968) and the Central African Republic (1976)) and then eastwards to Malaysia (1966), Cambodia (2010), Micronesia (2007), French Polynesia (2013) and then Suriname and Brazil (2015) [http://virological.org/t/initial-Zika-phylogeography/202]. The virus is transmitted by mosquitoes such as Aedes aegypti (Fig. 2) and A. albopictus. These mosquitoes are active during the day, mainly at dawn and dusk and when the weather is cloudy, and transmit the virus from patient to patient when the females take a blood meal. A. aegypti is known as the yellow fever mosquito, and is particularly distinctive with white rings around the leg joints and white markings on the body. This mosquito originated in Africa but has since spread throughout the tropics [3]. There is also evidence that Zika virus can be transmitted sexually via the semen of an infected man [4].

Fig. 2. An Aedes aegypti mosquito (photo taken by Muhammad Mahdi Karim in Dar es Salaam, Tanzania, 2009).


Zika fever, which has mild influenza-like symptoms, had been thought to be a trivial disease. Now there are several questions that require answers. Is there a causal link between microcephaly and viral infection, or are the symptoms coincidental? If the disease causes the symptoms, is this an effect of viral enzymes, or a consequence of the body's own immunological system attacking more than just the virus?

Microcephaly in Brazil

Microcephaly is not a new condition, and can result from chromosomal abnormalities as well as environmental conditions that can affect brain growth. Mutations in the genes MCPH1, which encodes the protein microcephalin, and ASPM, which encodes abnormal spindle-like microcephaly-associated protein, can cause primary microcephaly when the gene is homozygous [5-7]. Microcephaly is associated with other viral diseases, such as chickenpox [8], but cases are rare: women seldom catch the disease when pregnant because of the immunity they acquired during childhood infection. It is possible, of course, that the same may be true of Zika virus, which would explain why microcephaly is not prevalent in Africa, because women acquire immunity as girls, and would also explain the dramatic increase in the condition in Brazil, where the disease arrived recently and pregnant women have no immunity. The rates of Zika infection and microcephaly in Brazil really are alarming. It has been estimated that 1.5 million cases of Zika fever occurred in Brazil between April 2015 and January 2016, and 3718 cases of microcephaly (38 of which led to death) [9], which is one case per 403 infections, and one case per 793 births (the population of Brazil is 204 million and the annual birth rate is 14.46 per 1000 [https://www.cia.gov/library/publications/the-world-factbook/geos/br.html]). This is considerably higher than the known incidence of microcephaly in the UK (where the Zika virus is absent): approximately 1 in 10,000 births [http://www.rightdiagnosis.com/m/microcephaly/basics.htm].
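The ratios quoted above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures given in the text, and it assumes (as the text appears to) that a full year of births is compared with case counts collected over roughly ten months, so the per-birth ratio is approximate. The variable names are my own.

```python
# Figures quoted in the text; nothing here is new data.
zika_cases = 1_500_000          # estimated Zika fever cases, April 2015 to January 2016
microcephaly_cases = 3718       # reported microcephaly cases over the same period
population = 204_000_000        # population of Brazil
birth_rate_per_1000 = 14.46     # annual births per 1000 people

# Assumption: a whole year of births is used as the denominator.
annual_births = population * birth_rate_per_1000 / 1000

infections_per_case = zika_cases / microcephaly_cases
births_per_case = annual_births / microcephaly_cases

# Comparison with the UK figure of about 1 case in 10,000 births.
fold_over_uk = 10_000 / births_per_case

print(f"Annual births: {annual_births:,.0f}")
print(f"One case per {infections_per_case:.0f} infections")
print(f"One case per {births_per_case:.0f} births")
print(f"Roughly {fold_over_uk:.1f} times the UK incidence")
```

On these numbers the Brazilian rate works out at between 12 and 13 times the UK incidence, which is the basis for describing it as considerably higher.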

Zika virus polyprotein

The Zika virus is a flavivirus, a group that includes the viruses that cause yellow fever, dengue fever, Japanese encephalitis and West Nile fever.  These viruses contain single-stranded RNA as their genetic material, and the RNA encodes a single polyprotein.  This polyprotein consists of several enzymes and structural proteins, and processing by an endogenous serine endopeptidase is required to separate the individual proteins.  By submitting the Zika virus polyprotein to InterProScan, it is possible to identify all the components.  These are shown below.  There is no component with an unknown function or one expected to affect brain development directly.

Fig. 3.  Zika virus polyprotein domains identified by InterProScan.



How polyprotein processing progresses in the Zika virus polyprotein is unknown, but some of the cleavage sites have been mapped in both the yellow fever virus and West Nile virus [10, 11]. All known cleavages are performed by the endogenous serine endopeptidase, but one cleavage can be performed by unrelated host serine endopeptidases normally responsible for processing host protein precursors [12]. The specificity for both the viral and host endopeptidases is similar: cleavage follows a pair of basic residues (lysine or arginine) and precedes glycine, serine or threonine. A pairwise alignment of the West Nile and Zika virus polyprotein sequences shows that the known cleavage sites are conserved (Fig. 4).

Fig. 4 Conservation of polyprotein cleavage sites
Sites of cleavage are indicated by an arrow.  Residues highlighted in pink are conserved between West Nile virus (W Nile) and Zika virus.  Residue numbers are shown above and below each sequence.
      60        70        80        90       100      110        
W Nile APTRAVLDRWRGVNKQTAMKHLLSFKKELGTLTSAINRRSTKQKKRGGTAGFTILLGLIA
        :. ....::  :.:. :.: : .:: ..::.   :: :.::  :::  .:. ..:.:..
Zika   KPSTGLINRWGKVGKKEAIKILTKFKADVGTMLRIINNRKTK--KRGVETGI-VFLALLV
      60        70        80        90       100          110     

     180       190       200         210 ↓     220       230      
W Nile AAGNDPEDIDCWCTKSSVYVRYGRCTK--TRHSRRSRRSLTVQTHGESTLANKKGAWLDS
           .:::.::::....... :: ::.  : ..::::::.:. .:. . : .....::.:
Zika   EPQYEPEDVDCWCNSTAAWIVYGTCTHKTTGETRRSRRSITLPSHASQKLETRSSTWLES
        180       190       200       210       220       230     

     1370     1380      1390      1400      1410      1420       
W Nile DPNRKRGWPATEVMTAVGLMFAIVGGLAELDIDSMAIPMTIAGLMFVAFVISGKSTDMWI
         ..::.:: .::::::::. ::::::.. ::: :: ::.  ::. :..:.::::.::.:
Zika   TASKKRSWPPSEVMTAVGLICAIVGGLTKTDID-MAGPMAAIGLLVVSYVVSGKSVDMYI
       1370      1380      1390       1400      1410      1420    


     1490       1500    ↓ 1510      1520      1530      1540      
W Nile ILPSVIGFW-ITLQYTKRGGVLWDTPSPKEYKKGDTTTGVYRIMTRGLLGSYQAGAGVMV
       : : . . : . ..  ::.:..:: :::.: :::.::.:::::::: :::: :.:::::
Zika   I-PFAAAAWFVYIKSGKRSGAMWDIPSPREVKKGETTAGVYRIMTRKLLGSTQVGAGVMH
         1490      1500      1510      1520      1530      1540   

      2090      2100      2110      2120       2130      2140     
W Nile ITKLGERKILRPRWADARVYSDHQALKSFKDFASGKRS-QIGLVEVLGRMPEHFMGKTWE
        ::.::.:::.::: :::. ::: .:::::.::.:::.   ::.:..: .: :.  .  :
Zika   WTKFGEKKILKPRWMDARICSDHASLKSFKEFAAGKRTIATGLIEAFGMLPGHMTERFQE
         2090      2100      2110      2120      2130      2140   

            2510      2520        2530      2540      2550 
                                          
W Nile HIMRGGWLSCLSITWTLIKNMEKPGL--KRGGAKGRTLGEVWKERLNHMTKEEFTRYRKE
       .:.::..:.  :. .:. .:    :.  ::::..:.:.:: ::::::.::  ::  :..
Zika   NIFRGSYLAGPSLIYTVTRNA---GIMKKRGGGNGETVGEKWKERLNRMTALEFYAYKRS
    2500      2510      2520         2530      2540      2550     

Is microcephalin a substrate for the Zika virus endopeptidase?

Could it be that the viral endopeptidase is processing host proteins at similar sites?  There are at least 24 human proteins known to be cleaved by viral endopeptidases.  Cleaving eukaryotic translation initiation factors and polyadenylate-binding protein 1 switches off the host cell's own protein synthesis mechanism, ensuring that only viral proteins are made, and the endopeptidases from retroviruses, enteroviruses and foot-and-mouth disease virus all cleave these proteins [13-17].   Nuclear pore glycoprotein p62 is also cleaved by the rhinovirus endopeptidase picornain 2A peptidase, and this disrupts trafficking from the nucleus to the cytoplasm [18].  Both microcephalin (http://www.uniprot.org/uniprot/Q8NEM0) and ASPM (http://www.uniprot.org/uniprot/Q8IZT6) have regions that conform to the specificity of the Zika virus endopeptidase (Fig. 5) so either could be a potential substrate and be inactivated by cleavage.  If cleavage of these proteins has the same effect as mutations in the genes, then cleavage could lead to microcephaly.

Fig. 5 Potential cleavage sites in microcephalin and ASPM
MCPH1  66  QSTWDKAQKR+GVKLVSVLWV
MCPH1 375  PPKEKCKRKR+STRRSIMPRL
MCPH1 379  KCKRKRSTRR+SIMPRLQLCR
MCPH1 467  MSDFSCVGKK+TRTVDITNFT
MCPH1 486  TAKTISSPRK+TGNGEGRATS
MCPH1 639  LIKPHEELKK+SGRGKKPTRT

ASPM  148  NAEEQKKKKR+SLWDTIKKKK
ASPM  243  ATCLPLSVRR+STTYSSLHAS
ASPM  431  VPQSPEDWRK+SEVSPRIPEC
ASPM  576  TTASVARKRK+SDGSMEDANV
ASPM  616  SEPKTSAVKK+TKNVTTPISK
ASPM  639  NREKLNLKKK+TDLSIFRTPI
ASPM  655  RTPISKTNKR+TKPIIAVAQS
ASPM 1081  FLKHTKSIKK+TISLLSCHSD
ASPM 1098  HSDDLINKKK+GKRDSGSFEQ
ASPM 1584  DRVRFLNLKK+TIIKFQAHVR
ASPM 2095  QHKEYLNLKK+TAIKIQSVYR
ASPM 2184  ASFRGVRVRR+TLRKMQTAAT
ASPM 2287  MRRRFLSLKK+TAILIQRKYR
ASPM 2712  RAKVDYETKK+TAIVVIQNYY
ASPM 3081  ERIKYIEFKK+STVILQALVR
ASPM 3252  IREENKLYKR+TALALHYLLT
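A scan of this kind can be sketched with a regular expression. The example below is a minimal illustration, not the method used to produce Fig. 5: it assumes the specificity stated earlier (two basic residues, lysine or arginine, immediately before the cut and glycine, serine or threonine immediately after), and the function name and the choice of test fragments are my own. The fragments are taken from Fig. 5 with the "+" removed.

```python
import re

# Assumed motif: [KR][KR] before the scissile bond, [GST] after it.
# Both parts are zero-width assertions, so each match marks the bond itself
# and overlapping candidate sites are all reported.
CLEAVAGE_SITE = re.compile(r"(?<=[KR]{2})(?=[GST])")


def find_cut_positions(sequence):
    """Return, for each candidate site, the number of residues before the cut."""
    return [match.start() for match in CLEAVAGE_SITE.finditer(sequence.upper())]


# Twenty-residue windows from Fig. 5 (ten residues either side of a predicted cut).
fragments = {
    "MCPH1 66": "QSTWDKAQKRGVKLVSVLWV",
    "MCPH1 375": "PPKEKCKRKRSTRRSIMPRL",
    "ASPM 148": "NAEEQKKKKRSLWDTIKKKK",
}

for name, fragment in fragments.items():
    print(name, find_cut_positions(fragment))
```

For the first fragment the function reports a single cut after residue 10 of the window, matching the "+" in the figure. The second fragment yields two cuts, after residues 10 and 14, which correspond to the separate MCPH1 375 and MCPH1 379 entries. A motif match only shows that a site is compatible with the assumed specificity; it says nothing about whether the site is accessible or actually cleaved.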

Conclusions

The incidence of microcephaly in babies born to mothers infected by the Zika virus in Brazil is not only alarmingly high, but much higher than the background incidence of microcephaly in the UK; there seems to be little doubt that the condition and Zika fever are related. Whether this relationship is because the disease is new to Brazil, mothers have no immunity and microcephaly results from the body’s own immune response, as has been observed previously in chickenpox, or because of the presence of a viral toxin, is not known. If the latter, then it is possible that the proteins derived from genes in which mutations are known to cause microcephaly are susceptible to digestion by the Zika virus polyprotein processing enzyme, which is predicted to have a specificity similar to that of host prohormone convertases: inactivating the proteins may have the same results as mutations in the genes. Further research is required to understand the mechanisms causing microcephaly, which might include characterization of the viral endopeptidase. If the symptoms are due to the response of the immune system, then microcephaly might be a transitory phenomenon, and once the population builds up immunity, such cases could become very rare in the future.

References

1. Leviton, A., Holmes, L. B., Allred, E. N. & Vargas, J. (2002). Methodologic issues in epidemiologic studies of congenital microcephaly. Early Hum. Dev. 69:91-105. doi:10.1016/S0378-3782(02)00065-8. PMID:12324187.
2. Opitz, J. M. & Holt, M. C. (1990). Microcephaly: general considerations and aids to nosology. J. Craniofac. Genet. Dev. Biol. 10:75-204. PMID:2211965.
3. Mousson, L.,  Dauga, C., Garrigues, T., Schaffner, F., Vazeille, M.  & Failloux, A. (2005). Phylogeography of Aedes (Stegomyia) aegypti (L.) and Aedes (Stegomyia) albopictus (Skuse) (Diptera: Culicidae) based on mitochondrial DNA variations. Genetics Research 86:1-11. doi:10.1017/S0016672305007627. PMID:16181519.
4. Musso, D., Roche, C., Robin, E., Nhan, T., Teissier, A. & Cao-Lormeau, V.M.  (2015) Potential sexual transmission of Zika virus.  Emerg Infect Dis 21:359-61. doi: 10.3201/eid2102.141363. PMID:25625872.
5. Jackson, A. P., Eastwood, H., Bell, S. M., Adu, J., Toomes, C., Carr, I. M., Roberts, E., Hampshire, Daniel J., et al. (2002). Identification of Microcephalin, a Protein Implicated in Determining the Size of the Human Brain. Am. J. Human Genetics 71:136-142. doi:10.1086/341283. PMC:419993. PMID:12046007.
6. Jackson, A. P., McHale, D. P., Campbell, D. A., Jafri, H., Rashid, Y., Mannan, J., Karbani, G., Corry, P., et al. (1998). Primary Autosomal Recessive Microcephaly (MCPH1) Maps to Chromosome 8p22-pter. Am. J. Human Genetics 63:541-546. doi:10.1086/301966. PMC:1377307. PMID:9683597.
7. Bond, J., Roberts, E., Mochida, G.H., Hampshire, D.J., Scott, S., Askham, J.M., Springell, K., Mahadevan, M., Crow, Y.J., Markham, A.F., Walsh, C.A. & Woods, C.G. (2002) ASPM is a major determinant of cerebral cortical size. Nat. Genet. 32:316-320.  PMID:14574646.
8. Mirlesse V. & Lebon P. (2003) [Chickenpox during pregnancy]. Arch. Pediatr. 10:1113-1118. PMID:14643554.
9. World Health Organization (8 January 2016) Microcephaly - Brazil.
10. Chappell, K. J., Stoermer, M. J., Fairlie, D. P. & Young, P. R. (2006) Insights to substrate binding and processing by West Nile Virus NS3 protease through combined modeling, protease mutagenesis, and kinetic studies. J. Biol. Chem. 281:38448-38458. PMID:17052977.
11. Shiryaev, S. A., Ratnikov, B. I., Chekanov, A. V., Sikora, S., Rozanov, D. V., Godzik, A., Wang, J., Smith, J. W., Huang, Z., Lindberg, I., Samuel, M. A., Diamond, M. S. & Strongin, A. Y. (2006) Cleavage targets and the D-arginine-based inhibitors of the West Nile virus NS3 processing proteinase. Biochem. J.  393:503-511. PMID:16229682.
12. Remacle, A. G., Shiryaev, S. A., Oh, E. S., Cieplak, P., Srinivasan, A., Wei, G., Liddington, R. C., Ratnikov, B. I., Parent, A., Desjardins, R., Day, R., Smith, J. W., Lebl, M. & Strongin, A. Y. (2008) Substrate cleavage analysis of furin and related proprotein convertases. A comparative study. J. Biol. Chem. 283:20897-20906. PMID:18505722.
13. Alvarez, E., Menéndez-Arias, L., & Carrasco, L. (2003) The eukaryotic translation initiation factor 4GI is cleaved by different retroviral proteases. J. Virol. 77:12392-12400.
14. Gradi, A., Foeger, N., Strong, R., Svitkin, Y. V., Sonenberg, N., Skern, T., Belsham, G. J. (2004) Cleavage of eukaryotic translation initiation factor 4GII within foot-and-mouth disease virus-infected cells: identification of the L-protease cleavage site in vitro. J. Virol. 78:3271-3278.
15. Gradi, A., Svitkin, Y. V., Sommergruber, W., Imataka, H., Morino, S., Skern, T. & Sonenberg, N. (2003) Human rhinovirus 2A proteinase cleavage sites in eukaryotic initiation factors (eIF) 4GI and eIF4GII are different. J. Virol. 77:5026-5029. PMID:15016848.
16. Foeger, N., Schmid, E. M. & Skern, T. (2003) Human rhinovirus 2 2Apro recognition of eukaryotic initiation factor 4GI. Involvement of an exosite. J. Biol. Chem. 278:33200-33207. PMID:12791690.
17. Kuyumcu-Martinez, N. M., Joachims, M. & Lloyd, R. E. (2002) Efficient cleavage of ribosome-associated poly(A)-binding protein by enterovirus 3C protease. J. Virol. 76:2062-2074. PMID:11836384.
18. Park, N., Skern, T. & Gustin, K. E. (2010) Specific cleavage of the nuclear pore complex protein Nup62 by a viral protease.  J. Biol Chem. 285:28796-805. doi:10.1074/jbc.M110.143404. PMID:20622012.

Tuesday, 7 April 2015

What's new in InterPro release 50.0 and 51.0


Faster InterPro member database processing:
InterPro releases 50.0 and 51.0 have brought some important developments from an InterPro production point of view, which we thought would be worth sharing. Release 50.0 saw the incorporation of a new version of PIRSF, which has importantly been migrated to use the HMMER3.1b analysis algorithm. This version of HMMER runs approximately one thousand times faster than the previous version used by PIRSF (HMMER2.0), helping to ensure that InterPro can continue to calculate UniProtKB match data in a timely manner. In a related development, as part of InterPro release 51.0, we debuted a sequence database pre-filtering heuristic to reduce the amount of time it takes to calculate matches against the HAMAP database (the heuristic is based on HMMER3.0, but the analysis still uses the core HAMAP algorithm, and is all implemented within the InterProScan software). This again speeds up our protein match generation process and helps to safeguard against future data growth. At the start of 2014, the PIRSF and HAMAP databases were identified as being the slowest databases to calculate matches against, but after work from both the database maintainers and the InterPro team, this is no longer the case.

A leaner UniProtKB: 
At the same time, the number of proteins in UniProtKB has decreased significantly: some 47 million sequences from highly redundant bacterial proteomes have been deleted (for details, see here, described halfway down the page).

Faster and fitter InterPro production:
The majority of these developments have taken place under the hood, so it is unlikely that you will have been aware of our fitter and faster production system. What we hope you will notice, however, are more regular InterPro releases and more frequent member database updates in future, as these and other optimisations come into effect.

Alex Mitchell
on behalf of the InterPro team

Tuesday, 31 March 2015

The sweetest thing

By Hsin-Yu Chang

A famous cola company launched a new product contained in a gleaming green can last year. As a regular cola drinker, I was intrigued by the packaging. After doing some research, I discovered that this variety of cola contains a sweetener called Stevia.
    Figure 1. Stevia rebaudiana
  Ethel Aardvark, Wikimedia

Stevia is extracted from a plant, Stevia rebaudiana, found in Brazil and Paraguay. The leaves of the Stevia plant have been used for hundreds of years in both countries to sweeten local teas and medicines. The sweet taste is mainly from steviol glycoside compounds, which have up to 150 times the sweetness of sugar, but zero calories 1.

The story of Stevia gave me, a protein database curator, the idea to search for the sweetest proteins to date. I found one such protein, thaumatin (IPR001938), produced by Thaumatococcus daniellii (also known as Katemfe), a shrub from West Africa. Thaumatin is around 2,000 times sweeter than sugar 2 !

Similar to Stevia, Katemfe plants have been used by the locals for a long time; they use its leaves for wrapping food and its fruits for sweetening breads, palm wine and sour food. Their sweet proteins, thaumatin I and thaumatin II, were first identified in the 1970s in the search for non-toxic, non-calorific 'natural' sweeteners to replace synthetic ones 3.

Figure 2. Katemfe plant
~from Engler et al. Marantaceae, vol. 48: [Heft 11], p. 40, fig. 8 (1902).

Why do plants like Katemfe produce extremely sweet proteins? The answer may lie in the plant defence systems. Under environmental stresses or pathogen attack, plants can produce proteins that help them stay alive. In the case of Katemfe, attack by a viroid (a sub-viral pathogen) induces thaumatin production. Thaumatin has also been shown to have antifungal activities, which suggests it may be part of a defence mechanism that prevents further pathogen attacks 4.

In fact, thaumatin shares a conserved site (IPR017949) with a group of pathogenesis-related proteins, also known as thaumatin-like proteins (TLPs) 5, including tobacco salt-induced protein osmotin 6 and maize antifungal protein zeamatin 7. Like thaumatin, this group of proteins plays an essential part in plant defence against either environment stress or pathogen attack 8.

Another question is, why does thaumatin taste sweet to us? This is down to the sweetness receptors in our taste buds on our tongues. The sweet molecules (chemicals or proteins) are perceived by G-protein-coupled receptors, consisting of  two subunits, T1R2 and T1R3. Certain amino acid residues in these subunits affect their ability to recognise the sweet molecules 9. Interestingly, apes and Old World monkeys can perceive thaumatin as a sweet protein, while New World monkeys and rodents cannot 2.  In other words, the sweet taste of thaumatin for us humans could be just an evolutionary coincidence.

Figure 3. Sweet receptor, the peptide region involved in the response for thaumatin is shown in red 2.

So far, several chemical sweeteners have been commercialised, such as aspartame, sucralose and saccharin, and many more products may yet emerge. We all know that our sugar consumption causes health problems like obesity, diabetes and tooth decay. To avoid such health issues, scientists have searched far and wide to find alternatives. Natural sweeteners, such as Stevia and thaumatin, have provided new options for us. However, with so many different products on the market, as a consumer, I am still sitting on the fence to see which ones provide the best health benefits.

Figure 4. What should be in yours?
Additional information:
I. Katemfe:
Katemfe is a 3-4 metre tall shrub from the rain forests of West Africa. It bears light purple flowers and a soft fruit containing shiny black seeds. The fruit is covered in a fleshy red aril, the part that contains thaumatin.

II. E numbers:
Thaumatin has been approved in the European Union as a sweetener, known as E957. It is usually used in processed foods and has a slight licorice aftertaste.

III. Calories:
Despite thaumatin containing 4 calories/gram (3.87 calories/gram for sucrose), the amount needed to be used in food or drink is extremely small, due to its high potency.

IV. Other sweet proteins:
Besides thaumatin, there are a few other sweet proteins such as monellin (IPR015283), pentadin, mabinlin and brazzein 10,11.

V. Further reading:
How did stevia get mainstream?  -By Tom Heyden

Are sweeteners really bad for us? -By Claudia Hammond
http://www.bbc.com/future/story/20150127-are-sweeteners-really-bad-for-us


References:
1. Cardello HM, Da Silva MA, Damasio MH., Measurement of the relative sweetness of stevia extract, aspartame and cyclamate/saccharin blend as compared to sucrose at different concentrations. Plant Foods Hum Nutr. 54(2):119-30., 1999. [PMID:10646559]

2. Masuda T, Taguchi W, Sano A, Ohta K, Kitabatake N, Tani F., Five amino acid residues in cysteine-rich domain of human T1R3 were involved in the response for sweet-tasting protein, thaumatin. Biochimie. 95(7):1502-5., 2013. [PMID:23370115]

3. van der Wel H, Loeve K., Isolation and characterization of thaumatin I and II, the sweet-tasting proteins from Thaumatococcus daniellii Benth. Eur J Biochem. 31(2):221-5., 1972. [PMID:4647176]

4. Rodrigo I, Vera P, Frank R, Conejero V., Identification of the viroid-induced tomato pathogenesis-related (PR) protein P23 as the thaumatin-like tomato protein NP24 associated with osmotic stress. Plant Mol Biol. 16(5):931-4., 1991. [PMID:1859873]

5. Liu JJ, Sturrock R, Ekramoddoullah AK., The superfamily of thaumatin-like proteins: its origin, evolution, and expression towards biological function. Plant Cell Rep. 29(5):419-36., 2010. [PMID:20204373]

6. Subramanyam K, Arun M, Mariashibu TS, Theboral J, Rajesh M, Singh NK, Manickavasagam M, Ganapathi A., Overexpression of tobacco osmotin (Tbosm) in soybean conferred resistance to salinity stress and fungal infections. Planta. 236(6):1909-25., 2012. [PMID:22936305]

7. Schimoler-O'Rourke R, Richardson M, Selitrennikoff CP., Zeamatin inhibits trypsin and alpha-amylase activities. Appl Environ Microbiol. 67(5):2365-6., 2001. [PMID:11319124]

8. Monteiro S, Barakat M, Piçarra-Pereira MA, Teixeira AR, Ferreira RB. Osmotin and thaumatin from grape: a putative general defense mechanism against pathogenic fungi. Phytopathology. 93(12):1505-12, 2003. [PMID:18943614]

9. Masuda T, Mikami B, Tani F., Atomic structure of recombinant thaumatin II reveals flexible conformations in two residues critical for sweetness and three consecutive glycine residues. Biochimie. 106:33-8, 2014. [PMID:25066915]

10. Faus I, Recent developments in the characterization and biotechnological production of sweet-tasting proteins. Appl Microbiol Biotechnol. 53(2):145-51., 2000. [PMID:10709975]

11. Masuda T, Kitabatake N. Developments in biotechnological production of sweet proteins. J Biosci Bioeng. 102(5):375-89., 2006. [PMID:17189164]

Wednesday, 26 November 2014

In the pipeline – streamlined InterPro production


You may have noticed that InterPro has had fewer releases than usual this year. It is not that we haven’t been working as hard as ever, integrating member database signatures into InterPro entries and adding Gene Ontology terms - we have! But a number of things have been going on behind the scenes, which we thought you might be interested in knowing about.

Sequence growth 
InterPro release 1.0, back in 2000, was built using a version of Swiss-Prot/TrEMBL that contained just over 300 thousand sequences. Our current InterPro release (49.0) is built using over 77 million Swiss-Prot/TrEMBL sequences. That is a massive amount of sequence growth - and even more remarkable is the fact that almost half of these sequences have been added in the last year.

A new InterPro production pipeline
As you might imagine, processing this number of sequences can cause all kinds of problems for computational pipelines that were developed when sequence data volumes were orders of magnitude smaller. To make sure that we can handle the kind of data volume growth we have been seeing - and expect to see in the future - we have been busy rebuilding our production pipeline. The new system is built entirely on InterProScan, which, for a variety of complicated historical reasons, the previous version was not. This change helps streamline the production process, removes a number of bottlenecks, and generally makes many things associated with data production a lot less complicated.

Further pipeline developments and a new data centre 
To put these changes in place, we have had to focus a lot of our efforts on pipeline development, with knock-on effects on our release schedule. As a consequence, while we have maintained our usual rate of database integrations, these have been squeezed into slightly fewer InterPro releases. And, as a further complication, we have also recently moved all of our data (in the form of hard drives on the back of a truck - no, really!) to a new data centre, as part of EMBL-EBI’s consolidation of its Web infrastructure. This has impacted our release schedule further still. However, we believe that we are now much better placed to calculate and provide match data for our users. We think we are also better prepared for future data production challenges - as the number of protein sequences hits 100 million, and beyond.

Alex Mitchell
on behalf of the InterPro team

Thursday, 6 November 2014

Protein focus: Don’t blame the cat - the toxoplasmosis effect


By Amaia Sangrador and Alex Mitchell





You may have heard about toxoplasmosis, or read about it in a newspaper or magazine. Toxoplasmosis is a condition caused by the protozoan Toxoplasma gondii, an intracellular parasite that infects a wide variety of warm-blooded animals, including humans. T. gondii has attracted the attention of both the scientific and lay communities, and with good reason. It is one of the most successful parasites, infecting over one third of the human population, with rates varying depending upon geographical location1.

Acute toxoplasmosis usually only poses a risk for immunocompromised individuals or pregnant women. However, residual parasites persist lifelong after the acute phase. Though this latent form of the infection was thought to be asymptomatic, a growing body of evidence suggests that this is not the case2. And the long term effects of infection with T. gondii seem related to the most fascinating aspect of this parasite: its ability to modify host behaviour.

Members of the cat family (Felidae) are the only definitive hosts of T. gondii, within which the parasite undergoes sexual reproduction. This culminates with the production of oocysts that are shed in the cat’s faeces. Within intermediate hosts (the cat’s natural prey, such as rodents and birds) and other incidental intermediate hosts (such as humans and domestic livestock), the parasite undergoes asexual reproduction, producing bradyzoites that can encyst in the brain and other tissues, where they remain potentially for the host’s lifetime3. Infection can occur following ingestion of oocysts via contaminated soil or water, or ingestion of tissue cysts through raw/undercooked infected meat.


Fig 1. Life-cycle of the parasite Toxoplasma gondii. Sexual reproduction can only be accomplished in felines, and results in the production of sporozoite-containing oocysts that are shed for a limited period. Within intermediate hosts, the parasite undergoes asexual reproduction, producing rapidly dividing tachyzoites - cleared by the immune system - and slowly dividing bradyzoites that can persist as tissue cysts.

Figure by Sebastien Pesseat  



Given that sexual reproduction of T. gondii can be accomplished only in felines, the parasite needs to secure eventual transmission from its intermediate host reservoir, primarily rodents, to its feline definitive host. T. gondii deals with this by manipulating the intermediate host’s behaviour. Toxoplasma infection draws rats to cat odours, increasing activity in limbic regions related to sexual attraction when the rats are exposed to cat urine, turning what should be a fear response into a ‘fatal feline attraction’4. The intriguing question is: how does the parasite manage to alter the host behaviour?



Figure by Sebastien Pesseat


One line of evidence suggests that the parasite alters neurotransmitter signals in the brain through increased dopamine levels, supported by studies showing that parasite-induced behavioural changes can be disrupted with dopamine antagonists5,6. The genome of T. gondii contains two genes encoding an enzyme capable of producing L-DOPA (3,4-dihydroxy-L-phenylalanine), the precursor to dopamine7. One of the genes, TgAaaH1, is constitutively expressed, whilst the other gene, TgAaaH2, is induced during the cyst stages. They encode dual activity amino acid hydroxylases, bi-functional enzymes that catabolise both the amino acids phenylalanine and tyrosine. Thus, they can generate tyrosine from phenylalanine and then use tyrosine to produce L-DOPA. These steps are catalysed in Metazoa by phenylalanine hydroxylase and tyrosine hydroxylase, respectively.
Fig 2. Dopamine biosynthesis pathway


In InterPro, these dual activity amino acid hydroxylases are classified as belonging to the aromatic amino acid hydroxylase family (IPR001273). They consist of two domains: an N-terminal ACT domain (IPR002912), and a C-terminal aromatic amino acid hydroxylase domain (IPR019774). The C-terminal domain is responsible for catalysis and the N-terminal domain determines the substrate specificity. You can read more about these proteins and domains on the InterPro website.

Fig 3. InterPro view for aromatic amino acid hydroxylase 1 from T. gondii 
(UniProt protein B2L7T1), the product of gene TgAaaH1.




But if T. gondii can alter the behaviour of cats’ natural prey, what happens when secondary hosts like humans are infected? Altering host behaviour in ‘inappropriate’ hosts seems to be an unnecessary but unavoidable consequence of the parasite’s strategy8. Indeed, studies have revealed a range of subtle behavioural alterations associated with T. gondii infection in humans, many of which may be comparable to those observed in infected rodents, such as increased activity and decreased reaction times. Studies indicate that some of these changes can be sex-specific, as infection has been reported to increase testosterone levels in men but decrease them in women. Consequently, infected men tend to disregard rules and are more suspicious and jealous. In women, the shift is in the opposite direction; they are more warm-hearted, extrovert and easy-going9. More worrying is the link that may exist between infection and psychiatric conditions, such as schizophrenia, in some individuals. This association is supported by several observations, starting with the prevalence of toxoplasmosis in schizophrenic patients10. Furthermore, antipsychotic drugs, known to be effective in schizophrenia, also inhibit T. gondii11. Meanwhile, raised or disrupted dopamine levels have been reported both in rodent and human T. gondii infection and in human patients with schizophrenia12,13.

And just when you think it cannot get more bizarre, well, it does. Some studies show that Toxoplasma’s modification of behaviour persists even after all parasites and cysts have been cleared14. This would contradict the cyst-centric theories, which explain the modification of host behaviour as a result of the parasite’s cysts actively modulating dopamine production or affecting neuronal activity15. According to a recent study, an explanation for the persistence of the effects induced by the parasite could reside in its ability to induce epigenetic changes in the host. A change in the methylation state of the arginine vasopressin promoter has been observed in infected animals, resulting in increased expression of this hormone and affecting a testosterone-responsive area of the brain known for its role in male sexual behaviour16. If T. gondii is capable of inducing epigenetic changes, we should consider whether other parasites and pathogens may use similar strategies.

This brings us to an existential question where philosophy and science meet: what is free will if our behaviour can be manipulated? Perhaps, as the philosopher Jose Ortega y Gasset said, we are us and our circumstances. So don’t worry about what might have been. As a precaution for the future, remember that most people get infected through ingestion of, or contact with, undercooked meat, so make sure that you wash your hands and utensils after handling raw meat. And don’t blame the cat for your behaviour!

Friend or dinner?
Picture from MorgueFile.com, modified by Hsin-Yu Chang


References

1. Flegr J, Prandota J, Sovičková M, Israili ZH. Toxoplasmosis - a global threat. Correlation of latent toxoplasmosis with specific disease burden in a set of 88 countries. PLoS One. 9(3):e90203. 2014. [PMID: 24662942]

2. Bhadra R, Cobb DA, Weiss LM, Khan IA. Psychiatric disorders in toxoplasma seropositive patients--the CD8 connection. Schizophr Bull. 39(3):485-9. 2013. [PMID: 23427221]

3. Webster JP, Kaushik M, Bristow GC, McConkey GA. Toxoplasma gondii infection, from predation to schizophrenia: can animal behaviour help us understand human behaviour? J Exp Biol. 216(Pt 1):99-112. 2013. [PMID: 23225872]

4. House PK, Vyas A, Sapolsky R. Predator cat odors activate sexual arousal pathways in brains of Toxoplasma gondii infected rats. PLoS One. 6(8):e23277. 2011. [PMID: 21858053]

5. Prandovszky E, Gaskell E, Martin H, Dubey JP, Webster JP, McConkey GA. The neurotropic parasite Toxoplasma gondii increases dopamine metabolism. PLoS One. 6(9):e23866. 2011. [PMID: 21957440]

6. Webster JP, Lamberton PH, Donnelly CA, Torrey EF. Parasites as causative agents of human affective disorders? The impact of anti-psychotic, mood-stabilizer and anti-parasite medication on Toxoplasma gondii's ability to alter host behaviour. Proc Biol Sci. 273(1589):1023-30. 2006. [PMID: 16627289]

7. Gaskell EA, Smith JE, Pinney JW, Westhead DR, McConkey GA. A unique dual activity amino acid hydroxylase in Toxoplasma gondii. PLoS One. 4(3):e4801. 2009. [PMID: 19277211]

8. Webster JP, Kaushik M, Bristow GC, McConkey GA. Toxoplasma gondii infection, from predation to schizophrenia: can animal behaviour help us understand human behaviour? J Exp Biol. 216(Pt 1):99-112. 2013. [PMID: 23225872]

9. Flegr J. Influence of latent Toxoplasma infection on human personality, physiology and morphology: pros and cons of the Toxoplasma-human model in studying the manipulation hypothesis. J Exp Biol. 216(Pt 1):127-33. 2013. [PMID: 23225875]

10. Torrey EF, Bartko JJ, Lun ZR, Yolken RH. Antibodies to Toxoplasma gondii in patients with schizophrenia: a meta-analysis. Schizophr Bull. 33(3):729-36. 2007. [PMID: 17085743]

11. Jones-Brando L, Torrey EF, Yolken R. Drugs used in the treatment of schizophrenia and bipolar disorder inhibit the replication of Toxoplasma gondii. Schizophr Res. 62(3):237-44. 2003. [PMID: 12837520]

12. Howes OD, Kapur S. The dopamine hypothesis of schizophrenia: version III--the final common pathway. Schizophr Bull. 35(3):549-62. 2009. [PMID: 19325164]

13. Flegr J. How and why Toxoplasma makes us crazy. Trends Parasitol. 29(4):156-63. 2013. [PMID: 23433494]

14. Ingram WM, Goodrich LM, Robey EA, Eisen MB. Mice infected with low-virulence strains of Toxoplasma gondii lose their innate aversion to cat urine, even after extensive parasite clearance. PLoS One. 8(9):e75246. 2013. [PMID: 24058668]

15. McConkey GA, Martin HL, Bristow GC, Webster JP. Toxoplasma gondii infection and behaviour - location, location, location? J Exp Biol. 216(Pt 1):113-9. 2013. [PMID: 23225873]

16. Hari Dass SA, Vyas A. Toxoplasma gondii infection reduces predator aversion in rats through epigenetic modulation in the host medial amygdala. Mol Ecol. 2014. [PMID: 25142402]