Tuesday, 7 April 2015

What's new in InterPro release 50.0 and 51.0

Faster InterPro member database processing:
InterPro releases 50.0 and 51.0 have brought some important developments from an InterPro production point of view, which we thought would be worth sharing. Release 50.0 saw the incorporation of a new version of PIRSF, which has importantly been migrated to use the HMMER3.1b analysis algorithm. This version of HMMER runs approximately one thousand times faster than the previous version used by PIRSF (HMMER2.0), helping to ensure that InterPro can continue to calculate UniProtKB match data in a timely manner. In a related development, as part of InterPro release 51.0, we debuted a sequence database pre-filtering heuristic to reduce the amount of time it takes to calculate matches against the HAMAP database (the heuristic is based on HMMER3.0, but the analysis still uses the core HAMAP algorithm, and is all implemented within the InterProScan software). This again speeds up our protein match generation process and helps to safeguard against future data growth. At the start of 2014, PIRSF and HAMAP were identified as the slowest databases to calculate matches against, but after work from both the database maintainers and the InterPro team, this is no longer the case.

A leaner UniProtKB: 
At the same time, the number of proteins in UniProtKB has decreased significantly: some 47 million sequences from highly redundant bacterial proteomes have been deleted (for details, see here, described halfway down the page).

Faster and fitter InterPro production:
The majority of these developments have taken place under the hood, so it is unlikely that you will have been aware of our fitter and faster production system. What we hope you will notice, however, are more regular InterPro releases and more frequent member database updates in future, as these and other optimisations come into effect.

Alex Mitchell
on behalf of the InterPro team

Tuesday, 31 March 2015

The sweetest thing

By Hsin-Yu Chang

A famous cola company launched a new product contained in a gleaming green can last year. As a regular cola drinker, I was intrigued by the packaging. After doing some research, I discovered that this variety of cola contains a sweetener called Stevia.
Figure 1. Stevia rebaudiana
Ethel Aardvark, Wikimedia

Stevia is extracted from a plant, Stevia rebaudiana, found in Brazil and Paraguay. The leaves of the Stevia plant have been used for hundreds of years in both countries to sweeten local teas and medicines. The sweet taste is mainly from steviol glycoside compounds, which have up to 150 times the sweetness of sugar, but zero calories 1.

The story of Stevia gave me, a protein database curator, the idea to search for the sweetest proteins to date. I found one such protein, thaumatin (IPR001938), produced by Thaumatococcus daniellii (also known as Katemfe), a shrub from West Africa. Thaumatin is around 2,000 times sweeter than sugar 2 !

Similar to Stevia, Katemfe plants have been used by the locals for a long time; they use its leaves for wrapping food and its fruits for sweetening breads, palm wine and sour food. Their sweet proteins, thaumatin I and thaumatin II, were first identified in the 1970s in the search for non-toxic, non-calorific 'natural' sweeteners to replace synthetic ones 3.

Figure 2. Katemfe plant
From Engler et al. Marantaceae, vol. 48: [Heft 11], p. 40, fig. 8 (1902).

Why do plants like Katemfe produce extremely sweet proteins? The answer may lie in the plant defence systems. Under environmental stresses or pathogen attack, plants can produce proteins that help them stay alive. In the case of Katemfe, attack by a viroid (a sub-viral pathogen) induces thaumatin production. Thaumatin has also been shown to have antifungal activities, which suggests it may be part of a defence mechanism that prevents further pathogen attacks 4.

In fact, thaumatin shares a conserved site (IPR017949) with a group of pathogenesis-related proteins, also known as thaumatin-like proteins (TLPs) 5, including tobacco salt-induced protein osmotin 6 and maize antifungal protein zeamatin 7. Like thaumatin, this group of proteins plays an essential part in plant defence against either environmental stress or pathogen attack 8.

Another question is, why does thaumatin taste sweet to us? This is down to the sweetness receptors in the taste buds on our tongues. Sweet molecules (chemicals or proteins) are perceived by G-protein-coupled receptors consisting of two subunits, T1R2 and T1R3. Certain amino acid residues in these subunits affect their ability to recognise the sweet molecules 9. Interestingly, apes and Old World monkeys can perceive thaumatin as a sweet protein, while New World monkeys and rodents cannot 2. In other words, the sweet taste of thaumatin for us humans could be just an evolutionary coincidence.

Figure 3. Sweet receptor, the peptide region involved in the response for thaumatin is shown in red 2.

So far, several chemical sweeteners have been commercialised, such as aspartame, sucralose and saccharin, and many more products may yet emerge. We all know that our sugar consumption causes health problems like obesity, diabetes and tooth decay. To avoid such health issues, scientists have searched far and wide to find alternatives. Natural sweeteners, such as Stevia and thaumatin, have provided new options for us. However, with so many different products on the market, as a consumer, I am still sitting on the fence to see which ones provide the best health benefits.

Figure 4. What should be in yours?
Additional information:
I. Katemfe:
Katemfe is a 3-4 metre tall shrub from the rain forests of West Africa. It bears light purple flowers and a soft fruit containing shiny black seeds. The fruit is covered in a fleshy red aril, the part that contains thaumatin.

II. E numbers:
Thaumatin has been approved by the European Union as a sweetener, known as E957. It is usually used in processed foods and has a slight licorice aftertaste.

III. Calories:
Despite thaumatin containing 4 calories/gram (3.87 calories/gram for sucrose), the amount needed to be used in food or drink is extremely small, due to its high potency.
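The arithmetic behind this point can be sketched in a few lines of Python. This is only an illustration: it uses the figures quoted in this post (4 calories/gram for thaumatin, 3.87 calories/gram for sucrose, and a sweetness of around 2,000 times that of sugar), and it assumes that the sweetness comparison is made gram for gram. The function names are made up for the example.

```python
# Figures quoted in the post.
THAUMATIN_CAL_PER_G = 4.0
SUCROSE_CAL_PER_G = 3.87
# Assumption: "2,000 times sweeter" is taken as a weight-for-weight ratio.
THAUMATIN_SWEETNESS_VS_SUCROSE = 2000.0


def sucrose_calories(sucrose_grams):
    """Calories supplied by a given mass of sucrose."""
    return sucrose_grams * SUCROSE_CAL_PER_G


def thaumatin_calories_for_same_sweetness(sucrose_grams):
    """Calories from the thaumatin needed to match the sweetness of the sucrose."""
    thaumatin_grams = sucrose_grams / THAUMATIN_SWEETNESS_VS_SUCROSE
    return thaumatin_grams * THAUMATIN_CAL_PER_G


if __name__ == "__main__":
    grams = 10.0  # hypothetical amount of sugar in a drink
    print(f"sucrose:   {sucrose_calories(grams):.2f} calories")
    print(f"thaumatin: {thaumatin_calories_for_same_sweetness(grams):.4f} calories")
```

Under these assumptions, the calorie contribution of thaumatin is smaller than that of the sugar it replaces by a factor of roughly 2,000, which is why its similar energy content per gram hardly matters in practice.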

IV. Other sweet proteins:
Besides thaumatin, there are a few other sweet proteins such as monellin (IPR015283), pentadin, mabinlin and brazzein 10,11.

V. Further reading:
How did stevia get mainstream?  -By Tom Heyden

Are sweeteners really bad for us? -By Claudia Hammond

1. Cardello HM, Da Silva MA, Damasio MH., Measurement of the relative sweetness of stevia extract, aspartame and cyclamate/saccharin blend as compared to sucrose at different concentrations. Plant Foods Hum Nutr. 54(2):119-30., 1999. [PMID:10646559]

2. Masuda T, Taguchi W, Sano A, Ohta K, Kitabatake N, Tani F., Five amino acid residues in cysteine-rich domain of human T1R3 were involved in the response for sweet-tasting protein, thaumatin. Biochimie. 95(7):1502-5., 2013. [PMID:23370115]

3. van der Wel H, Loeve K., Isolation and characterization of thaumatin I and II, the sweet-tasting proteins from Thaumatococcus daniellii Benth. Eur J Biochem. 31(2):221-5., 1972. [PMID:4647176]

4. Rodrigo I, Vera P, Frank R, Conejero V., Identification of the viroid-induced tomato pathogenesis-related (PR) protein P23 as the thaumatin-like tomato protein NP24 associated with osmotic stress. Plant Mol Biol. 16(5):931-4., 1991. [PMID:1859873]

5. Liu JJ, Sturrock R, Ekramoddoullah AK., The superfamily of thaumatin-like proteins: its origin, evolution, and expression towards biological function. Plant Cell Rep. 29(5):419-36., 2010. [PMID:20204373]

6. Subramanyam K, Arun M, Mariashibu TS, Theboral J, Rajesh M, Singh NK, Manickavasagam M, Ganapathi A., Overexpression of tobacco osmotin (Tbosm) in soybean conferred resistance to salinity stress and fungal infections. Planta. 236(6):1909-25., 2012. [PMID:22936305]

7. Schimoler-O'Rourke R, Richardson M, Selitrennikoff CP., Zeamatin inhibits trypsin and alpha-amylase activities. Appl Environ Microbiol. 67(5):2365-6., 2001. [PMID:11319124]

8. Monteiro S, Barakat M, Piçarra-Pereira MA, Teixeira AR, Ferreira RB. Osmotin and thaumatin from grape: a putative general defense mechanism against pathogenic fungi. Phytopathology. 93(12):1505-12, 2003. [PMID:18943614]

9. Masuda T, Mikami B, Tani F., Atomic structure of recombinant thaumatin II reveals flexible conformations in two residues critical for sweetness and three consecutive glycine residues. Biochimie. 106:33-8, 2014. [PMID:25066915]

10. Faus I, Recent developments in the characterization and biotechnological production of sweet-tasting proteins. Appl Microbiol Biotechnol. 53(2):145-51., 2000. [PMID:10709975]

11. Masuda T, Kitabatake N., Developments in biotechnological production of sweet proteins. J Biosci Bioeng. 102(5):375-89., 2006. [PMID:17189164]

Wednesday, 26 November 2014

In the pipeline – streamlined InterPro production

You may have noticed that InterPro has had fewer releases than usual this year. It is not that we haven’t been working as hard as ever, integrating member database signatures into InterPro entries and adding Gene Ontology terms - we have! But a number of things have been going on behind the scenes, which we thought you might be interested in knowing about.

Sequence growth 
InterPro release 1.0, back in 2000, was built using a version of Swiss-Prot/TrEMBL that contained just over 300 thousand sequences. Our current InterPro release (49.0) is built using over 77 million Swiss-Prot/TrEMBL sequences. That is a massive amount of sequence growth - and even more remarkable is the fact that almost half of these sequences have been added in the last year.
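To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses the two sequence counts mentioned above (rounded to 300 thousand and 77 million) and assumes a span of 14 years between release 1.0 (2000) and release 49.0 (2014). The average yearly rate it computes is purely illustrative: as noted above, real growth has been far from even, with almost half of the sequences arriving in the last year alone.

```python
def fold_increase(start_count, end_count):
    """How many times larger the end count is than the start count."""
    return end_count / start_count


def average_annual_growth(start_count, end_count, years):
    """Compound average growth per year, as a fraction (0.5 means 50% per year)."""
    return fold_increase(start_count, end_count) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    release_1_sequences = 300_000       # "just over 300 thousand" (rounded)
    release_49_sequences = 77_000_000   # "over 77 million" (rounded)
    years = 14                          # assumption: 2000 to 2014

    fold = fold_increase(release_1_sequences, release_49_sequences)
    rate = average_annual_growth(release_1_sequences, release_49_sequences, years)
    print(f"fold increase: about {fold:.0f}x")
    print(f"average growth: about {rate:.0%} per year")
```

With these rounded inputs the fold increase comes out at roughly 257.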

A new InterPro production pipeline
As you might imagine, processing this number of sequences can cause all kinds of problems for computational pipelines that were developed when sequence data volumes were orders of magnitude smaller. To make sure that we can handle the kind of data volume growth we have been seeing - and expect to see in the future - we have been busy rebuilding our production pipeline. The new system is built entirely on InterProScan, which, for a variety of complicated historical reasons, the previous version was not. This change helps streamline the production process, removes a number of bottlenecks, and generally makes many things associated with data production a lot less complicated.

Further pipeline developments and a new data centre 
To put these changes in place, we have had to focus a lot of our efforts on pipeline development, with knock-on effects on our release schedule. As a consequence, while we have maintained our usual rate of database integrations, these have been squeezed into slightly fewer InterPro releases. And, as a further complication, we have also recently moved all of our data (in the form of hard drives on the back of a truck - no, really!) to a new data centre, as part of EMBL-EBI’s consolidation of its Web infrastructure. This has impacted our release schedule further still. However, we believe that we are now much better placed to calculate and provide match data for our users. We think we are also better prepared for future data production challenges - as the number of protein sequences hits 100 million, and beyond.

Alex Mitchell
on behalf of the InterPro team

Thursday, 6 November 2014

Protein focus: Don’t blame the cat - the toxoplasmosis effect

By Amaia Sangrador and Alex Mitchell

You may have heard about toxoplasmosis, or read about it in a newspaper or magazine. Toxoplasmosis is a condition caused by the protozoan Toxoplasma gondii, an intracellular parasite that infects a wide variety of warm-blooded animals, including humans. T. gondii has attracted the attention of both the scientific and lay communities, and with good reason. It is one of the most successful parasites, infecting over one third of the human population, with rates varying depending upon geographical location1.

Acute toxoplasmosis usually only poses a risk for immunocompromised individuals or pregnant women. However, residual parasites persist lifelong after the acute phase. Though this latent form of the infection was thought to be asymptomatic, a growing body of evidence suggests that this is not the case2. And the long term effects of infection with T. gondii seem related to the most fascinating aspect of this parasite: its ability to modify host behaviour.

Members of the cat family (Felidae) are the only definitive hosts of T. gondii within which the parasite undergoes sexual reproduction. This culminates with the production of oocysts that are shed in the cat’s faeces. Within intermediate hosts (cats’ natural prey, such as rodents and birds) and other incidental intermediate hosts (such as humans and domestic livestock), the parasite undergoes asexual reproduction, producing bradyzoites that can encyst in the brain and other tissues, where they remain potentially for the host’s lifetime3. Infection can occur following ingestion of oocysts via contaminated soil or water, or ingestion of tissue cysts through raw/undercooked infected meat.

Fig 1. Life-cycle of the parasite Toxoplasma gondii. Sexual reproduction can only be accomplished in felines, and results in the production of sporozoite-containing oocysts that are shed for a limited period. Within intermediate hosts, the parasite undergoes asexual reproduction, producing rapidly dividing tachyzoites - cleared by the immune system - and slowly dividing bradyzoites that can persist as tissue cysts.

Figure by Sebastien Pesseat  

Given that sexual reproduction of T. gondii can be accomplished only in felines, the parasite needs to secure eventual transmission from its intermediate host reservoir, primarily rodents, to its feline definitive host. T. gondii deals with this by manipulating the intermediate host’s behaviour. Toxoplasma infection draws rats to cat odours, increasing activity in limbic regions related to sexual attraction when exposed to cat urine, turning what should be a fear response into a ‘fatal feline attraction4'. The intriguing question is: how does the parasite manage to alter the host behaviour? 

Figure by Sebastien Pesseat

One line of evidence suggests that the parasite alters neurotransmitter signals in the brain through increased dopamine levels, supported by studies showing that parasite-induced behavioural changes can be disrupted with dopamine antagonists5,6. The genome of T. gondii contains two genes encoding enzymes capable of producing L-DOPA (3,4-dihydroxy-L-phenylalanine), the precursor to dopamine7. One of the genes, TgAaaH1, is constitutively expressed, whilst the other gene, TgAaaH2, is induced during the cyst stages. They encode dual activity amino acid hydroxylases, bi-functional enzymes that catabolise both the amino acids phenylalanine and tyrosine. Thus, they can generate tyrosine from phenylalanine and then use tyrosine to produce L-DOPA. These steps are catalysed in Metazoa by phenylalanine hydroxylase and tyrosine hydroxylase, respectively.
Fig 2. Dopamine biosynthesis pathway

In InterPro, these dual activity amino acid hydroxylases are classified as belonging to the aromatic amino acid hydroxylase family (IPR001273). They consist of two domains: an N-terminal ACT domain (IPR002912), and a C-terminal aromatic amino acid hydroxylase domain (IPR019774). The C-terminal domain is responsible for catalysis and the N-terminal domain determines the substrate specificity. You can read more about these proteins and domains on the InterPro website.

Fig 3. InterPro view for aromatic amino acid hydroxylase 1 from T. gondii 
(UniProt protein B2L7T1), the product of gene TgAaaH1.

But if T. gondii can alter the behaviour of cats’ natural prey, what happens when secondary hosts like humans are infected? Altering host behaviour in ‘inappropriate’ hosts seems to be an unnecessary but unavoidable consequence of the parasite’s strategy8. Indeed, studies have revealed a range of subtle behavioural alterations associated with T. gondii infection in humans, many of which may be comparable to those observed in infected rodents – such as increased activity and decreased reaction times. Studies indicate that some of these changes can be sex-specific, as infection has been reported to increase testosterone levels in men, but decrease them in women. Consequently, infected men have a tendency to disregard rules and are more suspicious and jealous. In women, the shift in these two factors is in the opposite direction; they are more warm-hearted, extrovert and easy-going9. More worrying is the link that may exist between infection and psychiatric conditions, such as schizophrenia, in some individuals. This association is supported by several observations, starting with the prevalence of toxoplasmosis in schizophrenic patients10. Furthermore, antipsychotic drugs, known to be effective in schizophrenia, also inhibit T. gondii11. Meanwhile, raised or disrupted dopamine levels have been reported in both rodent and human T. gondii infection and within human patients with schizophrenia12,13.

And when you think it cannot get more bizarre, well, it does. Some studies show Toxoplasma’s modification of behaviour persists even after all parasites and cysts have been cleared14. This would contradict the cyst-centric theories, which explain the modification of host behaviour as a result of the parasite’s cysts actively modulating dopamine production or affecting neuronal activity15. According to a recent study, an explanation for the persistence of the effects induced by the parasite could reside in its ability to induce epigenetic changes in the host. A change in the methylation state of the arginine vasopressin promoter has been observed in infected animals, resulting in increased expression of this hormone, and affecting a testosterone-responsive area of the brain known for its role in male sexual behaviour16. If T. gondii is capable of inducing epigenetic changes, we should consider whether other parasites and pathogens may use similar strategies.

This brings us to an existential question where philosophy and science meet: what is free will if our behaviour can be manipulated? Perhaps, as the philosopher Jose Ortega y Gasset said, we are us and our circumstances. So don’t worry about what might have been. As a future precaution, remember that most people get infected through ingestion or contact with undercooked meat, so make sure that you wash your hands and utensils after handling raw meat. And don’t blame the cat for your behaviour!

Friend or dinner?
Picture from MorgueFile.com, modified by Hsin-Yu Chang


1. Flegr J, Prandota J, Sovičková M, Israili ZH. Toxoplasmosis - a global threat. Correlation of latent toxoplasmosis with specific disease burden in a set of 88 countries. PLoS One. 9(3):e90203. 2014. [PMID: 24662942]

2. Bhadra R, Cobb DA, Weiss LM, Khan IA. Psychiatric disorders in toxoplasma seropositive patients--the CD8 connection. Schizophr Bull. 39(3):485-9. 2013. [PMID: 23427221]

3. Webster JP, Kaushik M, Bristow GC, McConkey GA. Toxoplasma gondii infection, from predation to schizophrenia: can animal behaviour help us understand human behaviour? J Exp Biol. 216(Pt 1):99-112. 2013. [PMID: 23225872]

4. House PK, Vyas A, Sapolsky R. Predator cat odors activate sexual arousal pathways in brains of Toxoplasma gondii infected rats. PLoS One. 6(8):e23277. 2011. [PMID: 21858053]

5. Prandovszky E, Gaskell E, Martin H, Dubey JP, Webster JP, McConkey GA. The neurotropic parasite Toxoplasma gondii increases dopamine metabolism. PLoS One. 6(9):e23866. 2011. [PMID: 21957440]

6. Webster JP, Lamberton PH, Donnelly CA, Torrey EF. Parasites as causative agents of human affective disorders? The impact of anti-psychotic, mood-stabilizer and anti-parasite medication on Toxoplasma gondii's ability to alter host behaviour. Proc Biol Sci. 273(1589):1023-30. 2006. [PMID: 16627289]

7. Gaskell EA, Smith JE, Pinney JW, Westhead DR, McConkey GA. A unique dual activity amino acid hydroxylase in Toxoplasma gondii. PLoS One. 4(3):e4801. 2009. [PMID: 19277211]

8. Webster JP, Kaushik M, Bristow GC, McConkey GA. Toxoplasma gondii infection, from predation to schizophrenia: can animal behaviour help us understand human behaviour? J Exp Biol. 216:99-112. 2013. [PMID: 23225872]

9. Flegr J. Influence of latent Toxoplasma infection on human personality, physiology and morphology: pros and cons of the Toxoplasma-human model in studying the manipulation hypothesis. J Exp Biol. 216:127-33. 2013. [PMID: 23225875]

10. Torrey EF, Bartko JJ, Lun ZR, Yolken RH. Antibodies to Toxoplasma gondii in patients with schizophrenia: a meta-analysis. Schizophr Bull. 33(3):729-36. 2007. [PMID: 17085743]

11. Jones-Brando L, Torrey EF, Yolken R. Drugs used in the treatment of schizophrenia and bipolar disorder inhibit the replication of Toxoplasma gondii. Schizophr Res. 62(3):237-44. 2003. [PMID: 12837520]

12. Howes OD, Kapur S. The dopamine hypothesis of schizophrenia: version III--the final common pathway. Schizophr Bull. 35(3):549-62. 2009. [PMID: 19325164]

13. Flegr J. How and why Toxoplasma makes us crazy. Trends Parasitol. 29(4):156-63. 2013. [PMID: 23433494]

14. Ingram WM, Goodrich LM, Robey EA, Eisen MB. Mice infected with low-virulence strains of Toxoplasma gondii lose their innate aversion to cat urine, even after extensive parasite clearance. PLoS One. 8(9):e75246. 2013. [PMID: 24058668]

15. McConkey GA, Martin HL, Bristow GC, Webster JP. Toxoplasma gondii infection and behaviour - location, location, location? J Exp Biol. 216(Pt 1):113-9. 2013. [PMID: 23225873]

16. Hari Dass SA, Vyas A. Toxoplasma gondii infection reduces predator aversion in rats through epigenetic modulation in the host medial amygdala. Mol Ecol. 2014. [PMID: 25142402]