Research

In silico characterisation, homology modelling and structure-based functional annotation of blunt snout bream (Megalobrama amblycephala) Hsp70 and Hsc70 proteins

Ngoc Tuan Tran1,2, Ivan Jakovlić1, Wei-Min Wang1,3
1College of Fisheries, Key Lab of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education/Key Lab of Freshwater Animal Breeding, Ministry of Agriculture, Huazhong Agricultural University, Wuhan, Hubei 430070 China
3Collaborative Innovation Center for Efficient and Health Production of Fisheries in Hunan Province, Changde, 41500 China
2Center for Fish Biology and Fishery Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, 430072 China

© Tran et al. 2015. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Received: May 15, 2015 ; Accepted: Nov 28, 2015

Published Online: Dec 15, 2015

Abstract

Background

Heat shock proteins play an important role in protection from stress stimuli and metabolic insults in almost all organisms.

Methods

In this study, computational tools were used to analyse in depth the physicochemical characteristics of the blunt snout bream (Ma-) Hsp70 and Hsc70 proteins and, using homology modelling, to predict their tertiary structure. The derived three-dimensional models were then used to predict the function of the proteins.

Results

Previously published predictions regarding the protein length, molecular weight, theoretical isoelectric point and total number of positive and negative residues were corroborated. Among the new findings are: the extinction coefficient (33725/33350 M−1 cm−1 for Ma-Hsp70 and 35090/34840 M−1 cm−1 for Ma-Hsc70; the first value of each pair assumes that all cysteine pairs form cystines, the second that all cysteines are reduced), instability index (33.68/35.56; both stable), aliphatic index (83.44/80.23; both stable over a wide temperature range), half-life estimates (both relatively stable), grand average of hydropathicity (−0.431/−0.473; both hydrophilic) and amino acid composition (alanine, lysine and glycine were the most abundant in Ma-Hsp70, and glycine, lysine and aspartic acid in Ma-Hsc70; no disulphide bonds were predicted; the N-terminal residue of both proteins was methionine). Homology modelling was performed using the SWISS-MODEL program, and the proposed models were evaluated as highly reliable based on PROCHECK’s Ramachandran plot, ERRAT, PROVE, Verify3D, ProQ and ProSA analyses.

Conclusions

The research revealed a high structural similarity to Hsp70 and Hsc70 proteins from several taxonomically distant animal species, corroborating a remarkably high level of evolutionary conservation among the members of this protein family. Functional annotation based on structural similarity provides reliable additional indirect evidence for a high level of functional conservation of these two genes/proteins in blunt snout bream, but it is not sensitive enough to functionally distinguish the two isoforms.

Keywords: Hsp70; Hsc70; Physicochemical characteristics; Homology modelling; Structural similarity; Functional annotation

Background

Heat shock (or stress) proteins (HSPs) are a family of highly conserved cellular proteins that play an important role in protection from stress stimuli and metabolic insults in almost all organisms [1–4]. They include three major families: Hsp90 (85–90 kDa), Hsp70 (68–73 kDa) and low molecular-weight Hsps (16–47 kDa) [3]. The Hsp70 family is encoded by two different genes: a constitutive, “housekeeping” heat shock cognate (hsc) 70 gene, which is predominantly associated with physiological processes, and the stress-inducible hsp70 gene. The Hsc70 protein plays a key role as a molecular chaperone in a wide range of cellular processes, such as protein assembly, folding, transport through membrane channels, translocation and denaturation [4–7]. The Hsp70 protein is mainly responsible for the maintenance of cellular homeostasis during the stress response, thus protecting cells from the damage caused by environmental stress agents, such as heat shock, chemical exposure and UV or γ-irradiation [4, 8, 9]. Hsp70 is also a potential activator of innate immune system mechanisms [10–12]. Hsp70 is considered to be by far the most evolutionarily conserved protein, found in all organisms from archaebacteria and plants to humans [13]. Both proteins have a modular structure consisting of a highly conserved N-terminal ATPase domain, an adjacent well-conserved substrate-binding domain (SBD) that contains a hydrophobic pocket with a lid-like structure over it, and a conserved but more variable C-terminal domain, which plays an important role in Hsp70 functions required for cell growth. In the ATP-bound state, the substrate-binding pocket is open and rapidly exchanges substrate. ATP hydrolysis induces closing of the lid over the pocket, which stabilises substrate binding. Return to the ATP-bound state restores the open conformation, facilitating substrate release [14, 15].

In aquaculture, fish are often exposed to stressful situations, such as sudden temperature changes, high stocking density, trauma, hypoxia, as well as viral and bacterial infections, which often result in high fish mortality. Both genes have been identified and their expression characterised in many fish species, e.g., rainbow trout (Oncorhynchus mykiss), zebrafish (Danio rerio), Korean rockfish (Sebastes schlegeli), Nile tilapia (Oreochromis niloticus) and mandarin fish (Siniperca chuatsi) [16–20], and both are known to have a crucial role in the response to heat shock, hypoxia, crowding stress and bacterial pathogens in fish [3, 4, 7, 19]. However, to the best of our knowledge, a three-dimensional (3-D) model of any fish Hsp70 family protein has not been published so far.

Blunt snout bream (Megalobrama amblycephala Yih, 1955), native to the middle portions of the Yangtze River basin, is becoming an increasingly important freshwater aquaculture species in China. Due to its successful artificial propagation and high economic value, the total output of the blunt snout bream aquaculture industry reached 652 215 tons in 2010 [21–23]. Previously, Ming et al. [7] used bioinformatics tools to analyse some physicochemical characteristics of the two Ma-Hsp70 family proteins, such as molecular weight, isoelectric point, solubility (as hydrophilic property) and richness in B-cell antigenic sites. Their results indicated that Ma-Hsp70 shares more than 85 % identity with its homologs in other vertebrates, has no signal peptide or transmembrane region, and contains many protein kinase C phosphorylation sites, N-myristoylation sites, casein kinase II phosphorylation sites and N-glycosylation sites, while the predominant elements of the secondary structure are α-helix and random coil. However, as the previous study left open many questions regarding the physicochemical and structural properties (particularly the tertiary structure), this study, as a follow-up work, aims to fill this gap. Several computational tools and available web servers were used to analyse in depth the physicochemical characteristics and, using homology modelling, to predict the tertiary structure of the blunt snout bream Hsp70 and Hsc70 proteins.

Additionally, the rapidly increasing number of known gene sequences in many organisms has prompted the need for new procedures and techniques for the high-throughput functional annotation of genes. While most of the traditionally used techniques remain rather costly and work-intensive, with the rapidly growing number of protein structures deposited in the Protein Data Bank (PDB), computational structural genomics is becoming an increasingly promising tool for fast and cheap insight into protein structures, functions and interactions [24–27]. As Ming et al. [7] analysed the expression of Ma-hsp70 and Ma-hsc70 genes in order to gain insight into their functions, this study further aims to provide supplementary evidence for the conserved function of these two genes by in-depth structure analysis and functional annotation of their polypeptide products on the basis of the similarity of their tertiary structures to the available templates from other organisms. As structure-based functional annotation has seldom been used in the study of fish proteins, the aim of this study is also to test the applicability of this approach for functional annotation of fish gene sequences.

Methods

Physicochemical characterisation

Amino acid sequences of the blunt snout bream Hsp70 (Accession number: ACG63706.2) and Hsc70 (Accession number: GQ214528.1) [7] were obtained from the NCBI protein database (http://www.ncbi.nlm.nih.gov/) in FASTA format and used as the target sequences for further analyses. Physicochemical properties of the proteins, including molecular weight, amino acid composition, theoretical isoelectric point (pI), the total number of positive and negative residues, extinction coefficient (EC), instability index (II), aliphatic index (AI) and grand average of hydropathicity (GRAVY), were analysed using ExPASy’s ProtParam prediction server [28]. The SOSUI server [29] was used to determine whether each protein is soluble or transmembrane, while CYS_REC (http://linux1.softberry.com) was used to predict the presence of cysteine residues and their bonding patterns.

Comparative homology modelling

Homology modelling of the proteins was performed using the SWISS-MODEL server [30, 31], which aligns an input target with pre-existing templates to generate a series of predicted models. The most suitable template to build the 3-D model was selected on the basis of sequence identity [32]. Multiple amino acid sequence alignment was performed with ClustalW2 (http://www.ebi.ac.uk/Tools/msa/clustalw2). Stereochemical quality and accuracy of the predicted models were analysed using PROCHECK’s Ramachandran plot analysis, ERRAT, PROVE, Verify3D (all four available from the SAVES server at http://nihserver.mbi.ucla.edu), ProQ [33] and ProSA [34, 35]. Structural analysis was performed, and model figures were generated, using Swiss PDB Viewer [36].

Structural similarity and functional annotation

The COFACTOR web server was used to perform the global structure match using the TM-align algorithm, and the TM-score was calculated to assess the global structural similarity: values range from 0 to 1, where TM-score = 1 indicates a perfect match between two structures. Scores below 0.17 correspond to randomly chosen unrelated proteins, whereas a score higher than 0.5 generally implies the same fold [37]. Annotations of ligand-binding sites, gene ontology and Enzyme Commission numbers were performed by the I-TASSER suite, which structurally matches the 3-D models of Ma-Hsp70 and Ma-Hsc70 to the known templates in protein function databases [38–40].
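To make the TM-score thresholds above concrete, the following is a minimal Python sketch of how the score is defined for one fixed superposition, together with the interpretation bands quoted in the text. It is not the COFACTOR or TM-align implementation: the real algorithm searches over superpositions to maximise the score, whereas this sketch takes the per-residue distances as given. The function names (`tm_d0`, `tm_score`, `interpret_tm`) and the band labels are hypothetical and chosen for illustration only.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    if target_length <= 15:
        raise ValueError("d0 is undefined for chains of 15 residues or fewer")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("chain too short for this simplified sketch")
    return d0


def tm_score(distances, target_length: int) -> float:
    """TM-score for ONE given superposition.

    `distances` holds the distance (angstroms) between each pair of
    structurally aligned residues. TM-align additionally optimises the
    superposition itself, which is deliberately left out here.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def interpret_tm(score: float) -> str:
    """Interpretation bands as quoted in the text (labels are illustrative)."""
    if score < 0.17:
        return "random-like"
    if score > 0.5:
        return "same fold"
    return "intermediate"
```

Because every aligned pair contributes at most 1 and the sum is divided by the target length, unaligned residues lower the score, which is why a TM-score depends on coverage as well as on the deviation of the aligned residues.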

Results and discussion

Physicochemical characterisation

In this study, several computational tools and available web servers were used to analyse in depth the physicochemical characteristics and, using homology modelling, to predict the tertiary structure of the blunt snout bream Hsp70 and Hsc70 proteins. This research corroborated the previous predictions regarding the Ma-Hsp70 and Ma-Hsc70 protein length (643 and 649 amino acids), molecular weight (70517.7 and 71240.3 Da), theoretical isoelectric point (pI = 5.36 and 5.31) and the total number of negatively and positively charged residues (94 and 81 for Ma-Hsp70, 96 and 82 for Ma-Hsc70) [7]. Among the new findings, the computed pI value indicated that the proteins are acidic (pI < 7) in character, implying that they can be purified on a polyacrylamide gel by isoelectric focusing. The calculated extinction coefficient (EC) at 280 nm, which is in direct correlation with the cysteine, tryptophan and tyrosine content, was 33725 M−1 cm−1 for Ma-Hsp70 and 35090 M−1 cm−1 for Ma-Hsc70 assuming all pairs of cysteine residues form cystines, and 33350 M−1 cm−1 and 34840 M−1 cm−1, respectively, assuming all cysteine residues are reduced. The instability index (II) value was 33.68 (Ma-Hsp70) and 35.56 (Ma-Hsc70), implying that both proteins are stable (II < 40) [41]. Similarly, the aliphatic index (AI) of both proteins was high, 83.44 (Ma-Hsp70) and 80.23 (Ma-Hsc70), indicating stability over a wide temperature range [42]. The N-terminal residue of both proteins was methionine. Estimated half-life values were also the same for both proteins: 30 h in mammalian reticulocytes (in vitro), >20 h in yeast (in vivo) and >10 h in Escherichia coli (in vivo). The grand average of hydropathicity (GRAVY) of Ma-Hsp70 and Ma-Hsc70 was −0.431 and −0.473, respectively, corroborating that the proteins are hydrophilic and highly soluble in water. Amino acid composition analysis revealed high amounts of alanine (8.4 %), lysine (8.2 %) and glycine (8.1 %) in Ma-Hsp70, whereas glycine (8.5 %), lysine (8.2 %) and aspartic acid (7.6 %) were the most abundant in Ma-Hsc70 (Table 1).

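Table 1. Amino acid composition of Ma-Hsp70 and Ma-Hsc70

| Amino acid | Ma-Hsp70 N | Ma-Hsp70 % | Ma-Hsc70 N | Ma-Hsc70 % | Amino acid | Ma-Hsp70 N | Ma-Hsp70 % | Ma-Hsc70 N | Ma-Hsc70 % |
|---|---|---|---|---|---|---|---|---|---|
| Alanine | 54 | 8.4 | 47 | 7.2 | Lysine | 53 | 8.2 | 53 | 8.2 |
| Arginine | 28 | 4.4 | 29 | 4.5 | Methionine | 14 | 2.2 | 15 | 2.3 |
| Asparagine | 36 | 5.6 | 34 | 5.2 | Phenylalanine | 22 | 3.4 | 23 | 3.5 |
| Aspartic acid | 47 | 7.3 | 49 | 7.6 | Proline | 20 | 3.1 | 25 | 3.9 |
| Cysteine | 6 | 0.9 | 4 | 0.6 | Serine | 36 | 5.6 | 37 | 5.7 |
| Glutamine | 28 | 4.4 | 25 | 3.9 | Threonine | 41 | 6.4 | 48 | 7.4 |
| Glutamic acid | 47 | 7.3 | 47 | 7.2 | Tryptophan | 2 | 0.3 | 2 | 0.3 |
| Glycine | 52 | 8.1 | 55 | 8.5 | Tyrosine | 15 | 2.3 | 16 | 2.5 |
| Histidine | 7 | 1.1 | 7 | 1.1 | Valine | 44 | 6.8 | 45 | 6.9 |
| Isoleucine | 45 | 7.0 | 46 | 7.1 | Pyrrolysine | 0 | 0.0 | 0 | 0.0 |
| Leucine | 46 | 7.2 | 42 | 6.5 | Selenocysteine | 0 | 0.0 | 0 | 0.0 |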

N represents the total number and % the numeric percentage of each amino acid
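Several of the reported indices depend only on the amino acid composition, so they can be recomputed from the counts in Table 1. The sketch below is an illustration, not the ProtParam code: it uses the standard published parameters (molar absorbances of 1490, 5500 and 125 M−1 cm−1 for tyrosine, tryptophan and cystine at 280 nm; aliphatic index weights of 2.9 for valine and 3.9 for isoleucine and leucine; the Kyte-Doolittle hydropathy scale). The dictionary and function names are hypothetical.

```python
# Residue counts transcribed from Table 1 (one-letter amino acid codes).
MA_HSP70 = {"A": 54, "R": 28, "N": 36, "D": 47, "C": 6, "Q": 28, "E": 47,
            "G": 52, "H": 7, "I": 45, "L": 46, "K": 53, "M": 14, "F": 22,
            "P": 20, "S": 36, "T": 41, "W": 2, "Y": 15, "V": 44}
MA_HSC70 = {"A": 47, "R": 29, "N": 34, "D": 49, "C": 4, "Q": 25, "E": 47,
            "G": 55, "H": 7, "I": 46, "L": 42, "K": 53, "M": 15, "F": 23,
            "P": 25, "S": 37, "T": 48, "W": 2, "Y": 16, "V": 45}

# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
                  "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
                  "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
                  "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def extinction_coefficient(counts, cystines=True):
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1).

    With cystines=True every pair of cysteines is assumed to form a cystine;
    with cystines=False all cysteines are assumed to be reduced.
    """
    ec = 1490 * counts["Y"] + 5500 * counts["W"]
    if cystines:
        ec += 125 * (counts["C"] // 2)
    return ec


def aliphatic_index(counts):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    length = sum(counts.values())
    weighted = counts["A"] + 2.9 * counts["V"] + 3.9 * (counts["I"] + counts["L"])
    return 100.0 * weighted / length


def gravy(counts):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    length = sum(counts.values())
    return sum(KYTE_DOOLITTLE[aa] * n for aa, n in counts.items()) / length


def charged_residues(counts):
    """(negative, positive) residue totals: Asp + Glu and Arg + Lys."""
    return counts["D"] + counts["E"], counts["R"] + counts["K"]
```

Applied to the Table 1 counts, these functions reproduce the values reported in the text: extinction coefficients of 33725/33350 (Ma-Hsp70) and 35090/34840 (Ma-Hsc70), aliphatic indices of 83.44 and 80.23, GRAVY values of −0.431 and −0.473, and charged-residue totals of 94/81 and 96/82.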


Though six and four cysteine residues were found in Ma-Hsp70 and Ma-Hsc70 sequences, respectively, no evidence was found for the existence of disulphide bonds, which are essential for the folding of proteins and responsible for stabilisation of protein structure [43] (Table 2).

Table 2. Cysteine occurrence pattern and probability of cysteine residue pairing in Ma-Hsp70 and Ma-Hsc70 proteins

| Protein | Position | Status | Score |
|---|---|---|---|
| Ma-Hsp70 | Cys19 | no SS-bond | −58.8 |
| | Cys269 | no SS-bond | −34.4 |
| | Cys308 | no SS-bond | −24.9 |
| | Cys576 | no SS-bond | −29.6 |
| | Cys605 | no SS-bond | −27.5 |
| | Cys622 | probably no SS-bond | −8.00 |
| Ma-Hsc70 | Cys17 | no SS-bond | −55.6 |
| | Cys267 | no SS-bond | −37.8 |
| | Cys574 | no SS-bond | −32.2 |
| | Cys603 | no SS-bond | −23.6 |

Comparative homology modelling

Bovine Hsc70 (PDB ID: 4fl9.1.A) at 1.9 Å resolution was chosen as the best available template to build a 3-D model for both Ma-Hsp70 (90.24 % sequence identity) and Ma-Hsc70 (95.85 % sequence identity) proteins using homology modelling. Template-target sequence alignments and the 3-D structures of the predicted models are shown in Figs. 1 and 2, respectively. Verification of the results, using different tools, invariably indicated a good quality of the proposed models (Table 3). The Ramachandran plot analysis, where a good model would be expected to have over 90 % of residues in the most favoured regions, suggested a good quality of both homology models (Ma-Hsp70: 92.7 %; Ma-Hsc70: 92.4 %). The overall G-factor of the two models, where a value > −0.5 indicates a good model [44], was 0.16. Verify3D analysis of the models revealed that 92.55 % (Ma-Hsp70) and 92.36 % (Ma-Hsc70) of the residues had an average 3D-1D score ≥0.2, while 98.36 % and 99.82 %, respectively, had a score >0. As the cut-off score was ≥0, this implies that the predicted models are valid [45]. The overall ERRAT quality factor, expressed as the percentage of the protein for which the calculated value falls below the 95 % rejection limit, was 95.547 % and 92.963 %, respectively (Fig. 2). Good high-resolution structures generally produce values around 95 % or higher [46]. For both the Ma-Hsp70 and Ma-Hsc70 models, the LGscore indicated “very good” quality (value >5.0) and the MaxSub index indicated “correct” quality (value >0.1) [33] (Table 3). Z-scores for the Ma-Hsp70 (−11.01) and Ma-Hsc70 (−11.19) models were within the range of scores typically found for native proteins of similar size, while the plot of residue energies, where positive values correspond to problematic or erroneous parts of the input structure, revealed that most of the calculated values were negative [35] (Fig. 2). All of these validation tools strongly suggested that both proposed 3-D models could be accepted as reliable with high confidence.

Fig. 1. Alignment of the deduced Ma-Hsp70 (a) and Ma-Hsc70 (b) amino acid sequences with the bovine Hsc70 (PDB ID: 4fl9.1.A) amino acid sequence. Amino acid positions are numbered on the right, conserved substitutions are indicated by (:), semi-conserved by (.) and deletions by (−)
Fig. 2. Ma-Hsp70 (a) and Ma-Hsc70 (b) protein tertiary structure model and validation results: (a) 3-D homology model rendered by the SWISS-MODEL program. (b) Ramachandran plot analysis, indicating residues in the favoured regions (red), allowed regions (yellow), generously allowed regions (light yellow) and disallowed regions (white). (c) Overall quality of the model evaluated by the ERRAT program. On the error axis, two lines (95 and 99 %) indicate the confidence with which it is possible to reject regions that exceed that error value. Regions of the structure highlighted in grey and black can be rejected at the 95 % and 99 % confidence level, respectively. (d) Z-score (highlighted as a black dot) displayed in a plot that contains the Z-scores of all experimentally determined protein chains currently available in the Protein Data Bank. Groups of structures from different sources (X-ray and NMR) are distinguished by different colours (light- and dark-blue, respectively). (e) Plot of single residue energies, where window sizes of 40 and 10 residues are distinguished by dark- and light-green lines, respectively. Positive values indicate problematic or erroneous parts of the structure

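Table 3. Assessment of the predicted three-dimensional structures of Ma-Hsp70 and Ma-Hsc70 proteins

| Validation index | Ma-Hsp70 | Ma-Hsc70 |
|---|---|---|
| Ramachandran plot: residues in most favoured regions (%) | 92.7 | 92.4 |
| Ramachandran plot: residues in additional allowed regions (%) | 6.7 | 7.1 |
| Ramachandran plot: residues in generously allowed regions (%) | 0.4 | 0.2 |
| Ramachandran plot: residues in disallowed regions (%) | 0.2 | 0.2 |
| Ramachandran plot: overall G-factor | 0.16 | 0.16 |
| ProQ: LGscore | 5.417 | 5.467 |
| ProQ: MaxSub | 0.448 | 0.427 |
| ProSA Z-score | −11.01 | −11.19 |
| ERRAT (%) | 95.547 | 92.63 |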
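The acceptance criteria quoted in the text can be applied to the Table 3 values mechanically. The following Python sketch is illustrative only: it encodes just those cut-offs that the text states as sharp thresholds (more than 90 % of residues in the most favoured regions, G-factor above −0.5, LGscore above 5.0, MaxSub above 0.1) and leaves out ERRAT and the ProSA Z-score, for which the text gives only a typical range. The names `THRESHOLDS`, `MODELS` and `quality_checks` are hypothetical.

```python
# Lower bounds quoted in the text; a metric passes if it exceeds its bound.
THRESHOLDS = {
    "most_favoured_pct": 90.0,  # Ramachandran plot, most favoured regions
    "g_factor": -0.5,           # PROCHECK overall G-factor
    "lgscore": 5.0,             # ProQ LGscore, "very good" band
    "maxsub": 0.1,              # ProQ MaxSub, "correct" band
}

# Values transcribed from Table 3.
MODELS = {
    "Ma-Hsp70": {"most_favoured_pct": 92.7, "g_factor": 0.16,
                 "lgscore": 5.417, "maxsub": 0.448},
    "Ma-Hsc70": {"most_favoured_pct": 92.4, "g_factor": 0.16,
                 "lgscore": 5.467, "maxsub": 0.427},
}


def quality_checks(metrics):
    """Return a pass/fail flag for each metric against its quoted threshold."""
    return {name: metrics[name] > bound for name, bound in THRESHOLDS.items()}


if __name__ == "__main__":
    for model, metrics in MODELS.items():
        print(model, quality_checks(metrics))
```

Keeping the thresholds in one dictionary makes it easy to see which criteria are being applied and to adjust them if a different quality band is preferred.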

Structure similarity analysis

As TM-scores >0.5 indicate that two proteins generally have the same fold, the results (TM > 0.66) implied a very high level of structural conservation between Ma-Hsp70 and related rat, mouse, human and yeast proteins (Table 4). The TM-score value of 0.986 between both Ma-Hsp70 and Ma-Hsc70 and the Bos taurus Hsc70 3-D protein models indicates that they are structurally almost identical (Fig. 3). Somewhat surprisingly, TM-scores were identical for both protein models, while the RMSDa, IDENa and Cov. values differed between the two models. This could be explained by the fact that their tertiary structure is more similar than their primary and secondary structure, as well as by the absence of fish Hsp/Hsc70 templates in the PDB. The fact that the same PDB protein models (three Hsc and two Hsp) were indicated as the best available templates for both tested models reflects a very high level of conservation among the two proteins and other vertebrate Hsp70 family proteins (Ma-Hsp70: 86 % identity; Ma-Hsc70: 93 %), as well as between Ma-Hsp70 and Ma-Hsc70 (86.5 % identity) [7]. High structural similarity to Hsp70 protein family members from a wide range of taxonomically distant organisms additionally corroborated a remarkably high level of evolutionary conservation among the members of this protein family and provided indirect evidence for a high level of functional conservation as well.

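Table 4. Top five identified structural analogs in the Protein Data Bank (PDB) library

| Query | PDB ID | Protein | Species | TM-score | RMSDa | IDENa | Cov. |
|---|---|---|---|---|---|---|---|
| Ma-Hsp70 | 1yuwA | Hsc70 | Bos taurus | 0.986 | 0.95 | 0.896 | 0.996 |
| | 4j8fA | Hsc70 | Rattus norvegicus | 0.688 | 1.78 | 0.676 | 0.711 |
| | 3cqxB | Hsc70 | Mus musculus | 0.678 | 0.90 | 0.902 | 0.685 |
| | 3iucC | Hsp70 | Homo sapiens | 0.677 | 0.95 | 0.687 | 0.685 |
| | 3qmlA | Hsp70 | Saccharomyces cerevisiae | 0.669 | 1.17 | 0.656 | 0.682 |
| Ma-Hsc70 | 1yuwA | Hsc70 | Bos taurus | 0.986 | 1.24 | 0.958 | 1.000 |
| | 4j8fA | Hsc70 | Rattus norvegicus | 0.688 | 1.81 | 0.654 | 0.711 |
| | 3cqxB | Hsc70 | Mus musculus | 0.678 | 0.90 | 0.960 | 0.685 |
| | 3iucC | Hsp70 | Homo sapiens | 0.677 | 0.95 | 0.698 | 0.685 |
| | 3qmlA | Hsp70 | Saccharomyces cerevisiae | 0.669 | 1.17 | 0.659 | 0.682 |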

Analogs were inferred by COFACTOR analysis, based on the TM-score of the structural alignment between the query structure and known structures in the PDB. RMSDa is the average root mean square deviation between residues that are structurally aligned by TM-align; IDENa is the percentage sequence identity in the structurally aligned region; Cov. represents the coverage of the alignment by TM-align and is equal to the number of structurally aligned residues divided by length of the query protein

Fig. 3. Bos taurus Hsc70 (PDB ID: 1yuw-A) structural analog (backbone trace) superimposed upon the Ma-Hsp70 (a) and Ma-Hsc70 (b) proteins (shown in cartoon), rendered by the COFACTOR server

Function prediction on the basis of structural similarity

An ATP-binding site was predicted with a very high confidence, on the basis of structure similarity with bovine Hsc70 (PDB ID = 1kax-A; C-score = 0.98) and yeast actin (1yag-A; 0.96) for Ma-Hsp70 and Ma-Hsc70, respectively. High structural identity with yeast actin [47] reflects structural and mechanistic similarities between ATP hydrolytic mechanisms in proteins with different functions. In line with its ATPase function and previous findings [48], a phosphate ion (PO4)-binding site was also predicted with a relatively significant confidence on the basis of structural similarity with the bovine Hsc70 II (PDB ID = 1hpm-A; C-score = 0.22) for Ma-Hsp70 and the N-terminal domain of the Cryptosporidium parvum Hsp70 (PDB ID = 3l4i-B; C-score = 0.23) for Ma-Hsc70 (Table 5).

Table 5. Residue-specific ligand binding probability

| Query | PDB ID | C-score | Clust. size | Ligand | Ligand-binding site residues |
|---|---|---|---|---|---|
| Ma-Hsp70 | 1kax-A | 0.98 | 176 | ATP | 14, 15, 16, 17, 203, 204, 205, 206, 232, 270, 273, 274, 277, 340, 341, 342, 344, 345, 368 |
| | 1hpm-A | 0.22 | 33 | PO4 | 14, 15, 73, 149, 177, 231 |
| Ma-Hsc70 | 1yag-A | 0.96 | 146 | ATP | 12, 13, 14, 15, 17, 201, 202, 203, 204, 230, 268, 271, 272, 338, 339, 340, 342, 343, 366 |
| | 3l4i-B | 0.23 | 32 | PO4 | 12, 13, 71, 147, 175, 229 |

Predicted by COACH analysis, based on the C-score of the structural alignment between the query structure and known structures in the PDB. C-score is the confidence score of the prediction, where a higher score (min-0, max-1) indicates a more reliable prediction; Clust. size is the total number of templates in a cluster; ligand is the name of a possible binding ligand


In accordance with the predicted ligands, Enzyme Commission analysis (performed by the I-TASSER program) predicted, with relatively similar confidence scores, both Ma-Hsp70 and Ma-Hsc70 to be either of the isozymes hexokinase and glucokinase, both of which can transfer an inorganic phosphate group from ATP to a substrate. Somewhat confusingly, deoxyribonuclease function was also predicted for both proteins (Table 6). However, closer inspection of the 3cjc PDB entry showed that this prediction was the result of an error in the data retrieval from the PDB: the 3cjc-A chain represents actin, which is an ATPase with an adenosine-binding site, whereas deoxyribonuclease is the 3cjc-D chain in the entry.

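Table 6. Enzyme Commission (EC) predictions for Ma-Hsp70 and Ma-Hsc70 proteins

| Query | CscoreEC | PDB ID | TM-sc | RMSDa | IDENa | Cov | EC No. | EC Name |
|---|---|---|---|---|---|---|---|---|
| Ma-Hsp70 | 0.241 | 3cjc-A | 0.421 | 4.08 | 0.137 | 0.493 | 3.1.21.1 | Deoxyribonuclease I |
| | 0.230 | 1qha-A | 0.405 | 5.2 | 0.08 | 0.509 | 2.7.1.1 | Hexokinase |
| | 0.222 | 3f9m-A | 0.386 | 4.08 | 0.099 | 0.457 | 2.7.1.2 | Glucokinase |
| | 0.221 | 3hm8-A | 0.381 | 4.25 | 0.106 | 0.456 | 2.7.1.1 | Hexokinase |
| | 0.221 | 2e2o-A | 0.362 | 3.8 | 0.103 | 0.417 | 2.7.1.1 | Hexokinase |
| Ma-Hsc70 | 0.221 | 1qha-A | 0.390 | 5.27 | 0.089 | 0.491 | 2.7.1.1 | Hexokinase |
| | 0.216 | 3hm8-A | 0.378 | 4.12 | 0.111 | 0.447 | 2.7.1.1 | Hexokinase |
| | 0.215 | 3f9m-A | 0.382 | 4.08 | 0.110 | 0.452 | 2.7.1.2 | Glucokinase |
| | 0.185 | 1v4t-A | 0.281 | 4.95 | 0.064 | 0.347 | 2.7.1.2, 2.7.1.1 | Glucokinase, Hexokinase |
| | 0.168 | 3cjc-A | 0.411 | 4.13 | 0.131 | 0.484 | 3.1.21.1 | Deoxyribonuclease I |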

CscoreEC is the confidence score for the enzyme commission number (EC No.) prediction (0–1), TM-sc is the TM-score, Cov represents the coverage of global structural alignment and is equal to the number of structurally aligned residues divided by length of the query protein. See Table 4 for other term explanations


Similarly, the consensus prediction of GO terms suggested ATP binding, defined as interacting selectively and non-covalently with adenosine 5'-triphosphate (GO:0005524; ontology: molecular function), as the main function of both proteins, with very high GO-scores (0.98 for Hsc70 and 0.99 for Hsp70).

All these results are in accordance with the previously described functioning mechanisms: Hsp70s in the ATP-bound state catch and release their substrates rapidly, while Hsp70s in the ADP-bound state seize them firmly. By cycling between the ATP- and ADP-bound states, Hsp70s exert their chaperone activity [14, 49]. However, the analysis was not sensitive enough to distinguish between the functions of the constitutive (Hsc) and inducible (Hsp) isoforms. This is not a major setback, as it has been suggested that the functional difference appears to lie more in regulation of the SBD-substrate interactions than in the physical properties of the two ATPase domains [15].

Conclusions

To help better understand the functional biology of Hsp70 and Hsc70 in blunt snout bream, several computational tools were used to analyse the physicochemical properties, generate valid homology models of both proteins and predict their functions on the basis of structural similarity to other protein templates. Apart from presenting the first published homology models of Hsp70 and Hsc70 proteins in fish, this research also revealed a high structural similarity to Hsp70/Hsc70 proteins from several taxonomically distant animal species, corroborating a remarkably high level of evolutionary conservation among the members of this protein family. Functional annotation based on structural similarity provides reliable additional indirect evidence for a high level of functional conservation of these two genes/proteins in blunt snout bream, but it is not sensitive enough to distinguish between the two isoforms. In conclusion, even though gene function assignment based on protein structure similarity is at present somewhat limited by the number of available protein structures deposited in the PDB, it has a strong potential to become a very fast, cheap and relatively reliable technique for high-throughput gene function assignment in fish.

Abbreviations

3-D: three-dimensional

AI: aliphatic index

EC: extinction coefficient

GO: gene ontology

GRAVY: grand average of hydropathicity

Hsc: heat shock cognate

Hsp: heat shock protein

II: instability index

PDB: Protein Data Bank

Notes

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

NTT and IJ designed the study, analysed data and wrote the manuscript. WMW reviewed the manuscript and provided guidance. All authors read and approved the final manuscript.

Authors’ information

NTT completed his PhD degree at the Huazhong Agricultural University (China) and is presently working as a Postdoctoral Fellow at the Institute of Hydrobiology, Chinese Academy of Sciences (China). IJ is a Postdoctoral Fellow, while WMW is a Professor at the College of Fisheries, Huazhong Agricultural University (China).

Acknowledgements

The first author, Tran Ngoc Tuan, would like to thank the China Scholarship Council for providing a scholarship for the doctoral program at Huazhong Agricultural University, Wuhan, Hubei, P.R. China.

References

1.

Lindquist S, Craig EA. The Heat-Shock Proteins. Annu Rev Genet. 1988; 22(1):631-677.

2.

Mestril R, Dillmann WH. Heat shock proteins and protection against myocardial ischemia. J Mol Cell Cardiol. 1995; 27(1):45-52.

3.

Basu N, Todgham AE, Ackerman PA, Bibeau MR, Nakano K, Schulte PM, et al. Heat shock protein genes and their functional significance in fish. Gene. 2002; 295(2):173-183.

4.

Yamashita M, Yabu T, Ojima N. Stress Protein HSP70 in Fish. Aqua-BioScience Monographs. 2010; 3(4):111-141.

5.

Basu N, Nakano T, Grau EG, Iwama GK. The Effects of Cortisol on Heat Shock Protein 70 Levels in Two Fish Species. Gen Comp Endocrinol. 2001; 124(1):97-105.

6.

Boutet I, Tanguy A, Rousseau S, Auffret M, Moraga D. Molecular identification and expression of heat shock cognate 70 (hsc70) and heat shock protein 70 (hsp70) genes in the Pacific oyster Crassostrea gigas. Cell Stress Chaperones. 2003; 8(1):76-85.

7.

Ming J, Xie J, Xu P, Liu W, Ge X, Liu B, et al. Molecular cloning and expression of two HSP70 genes in the Wuchang bream (Megalobrama amblycephala Yih). Fish Shellfish Immunology. 2010; 28(3):407-418.

8.

Iwama G, Thomas P, Forsyth R, Vijayan M. Heat shock protein expression in fish. Rev Fish Biol Fish. 1998; 8(1):35-56.

9.

Padmini E, Usha RM. Impact of seasonal variation on HSP70 expression quantitated in stressed fish hepatocytes. Comp Biochem Physiol B Biochem Mol Biol. 2008; 151(3):278-285.

10.

Welch W, Feramisco J. Disruption of the three cytoskeletal networks in mammalian cells does not affect transcription, translation, or protein translocation changes induced by heat shock. Mol Cell Biol. 1985; 5(7):1571-1581.

11.

Gething M-J, Sambrook J. Protein folding in the cell. Nature. 1992; 355(6355):33-45.

12.

Wallin RP, Lundqvist A, Moré SH, Von Bonin A, Kiessling R, Ljunggren H-G. Heat-shock proteins as activators of the innate immune system. Trends Immunol. 2002; 23(3):130-135.

13.

Daugaard M, Rohde M, Jäättelä M. The heat shock protein 70 family: Highly homologous proteins with overlapping and distinct functions. FEBS Lett. 2007; 581(19):3702-3710.

14.

Bukau B, Horwich AL. The Hsp70 and Hsp60 chaperone machines. Cell. 1998; 92(3):351-366.

15.

Tutar Y, Song Y, Masison DC. Primate chaperones Hsc70 (constitutive) and Hsp70 (induced) differ functionally in supporting growth and prion propagation in Saccharomyces cerevisiae. Genetics. 2006; 172(2):851-861.

16.

Lückstädt C, Schill RO, Focken U, Köhler H-R, Becker K. Stress protein HSP70 response of Nile Tilapia Oreochromis niloticus (Linnaeus, 1758) to induced hypoxia and recovery. Verhandlungen der Gesellschaft für Ichthyologie Band. 2004; 4:137-141.

17.

Yamashita M, Hojo M. Generation of a transgenic zebrafish model overexpressing heat shock protein HSP70. Mar Biotechnol. 2004; 6:S1-S7.

18.

Ojima N, Yamashita M, Watabe S. Comparative expression analysis of two paralogous Hsp70s in rainbow trout cells exposed to heat stress. Biochimica et Biophysica Acta (BBA)-Gene Structure and Expression. 2005; 1681(2):99-106.

19.

Mu W, Wen H, Li J, He F. Cloning and expression analysis of a HSP70 gene from Korean rockfish (Sebastes schlegeli). Fish shellfish immunology. 2013; 35(4):1111-1121.

20.

Wang P, Zeng S, Xu P, Zhou L, Zeng L, Lu X, et al. Identification and expression analysis of two HSP70 isoforms in mandarin fish Siniperca chuatsi. Fish Sci. 2014; 80(4):803-817.

21.

Zhou Z, Ren Z, Zeng H, Yao B. Apparent digestibility of various feedstuffs for bluntnose black bream Megalobrama amblycephala Yih. Aquac Nutr. 2008; 14(2):153-165.

22.

CAFS. Fishery Statistic Data: Chinese Academy of Fishery Sciences, Beijing. 2010.

23.

MAPRC. Chinese fisheries yearbook: Chinese Agricultural Press, Beijing. 2010.

24.

Martí-Renom MA, Stuart AC, Fiser A, Sánchez R, Melo F, Šali A. Comparative Protein Structure Modeling of Genes and Genomes. Annu Rev Biophys Biomol Struct. 2000; 29(1):291-325.

25.

Skolnick J, Fetrow JS, Kolinski A. Structural genomics and its importance for gene function analysis. Nat Biotechnol. 2000; 18(3):283-287.

26.

Teichmann SA, Murzin AG, Chothia C. Determination of protein function, evolution and interactions by structural genomics. Curr Opin Struct Biol. 2001; 11(3):354-363.

27.

Radivojac P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, et al. A large-scale evaluation of computational protein function prediction. Nat Methods. 2013; 10(3):221-227.

28.

Gasteiger E, Hoogland C, Gattiker A, Wilkins MR, Appel RD, Bairoch A. Protein identification and analysis tools on the ExPASy server. The proteomics protocols handbook: Springer. 2005; p. 571-607.

29.

Hirokawa T, Boon-Chieng S, Mitaku S. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics. 1998; 14(4):378-379.

30.

Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res. 2003; 31(13):3381-3385.

31.

Arnold K, Bordoli L, Kopp J, Schwede T. The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics. 2006; 22(2):195-201.

32.

Fiser A. Template-based protein structure modeling. In: Fenyo D, editor. Computational Biology: Humana Press. 2004.

33.

Cristobal S, Zemla A, Fischer D, Rychlewski L, Elofsson A. A study of quality measures for protein threading models. BMC Bioinformatics. 2001; 2(1):5.

34.

Sippl MJ. Recognition of errors in three-dimensional structures of proteins. Proteins: Structure, Function, and Bioinformatics. 1993; 17(4):355-362.

35.

Wiederstein M, Sippl MJ. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 2007; 35(suppl 2):W407-W410.

36.

Guex N, Peitsch MC. SWISS-MODEL and the Swiss-Pdb Viewer: An environment for comparative protein modeling. Electrophoresis. 1997; 18(15):2714-2723.

37.

Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005; 33(7):2302-2309.

38.

Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008; 9(1):40.

39.

Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc. 2010; 5(4):725-738.

40.

Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER Suite: protein structure and function prediction. Nat Methods. 2015; 12(1):7-8.

41.

Guruprasad K, Reddy BB, Pandit MW. Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence. Protein Eng. 1990; 4(2):155-161.

42.

Ikai A. Thermostability and aliphatic index of globular proteins. J Biochem. 1980; 88(6):1895-1898.

43.

Hogg PJ. Disulfide bonds as switches for protein function. Trends Biochem Sci. 2003; 28(4):210-214.

44.

Ramachandran G, Ramakrishnan C, Sasisekharan V. Stereochemistry of polypeptide chain configurations. J Mol Biol. 1963; 7:95-99.

45.

Lüthy R, Bowie J, Eisenberg D. Assessment of protein models with three-dimensional profiles. Nature. 1992; 356(6364):83-85.

46.

Colovos C, Yeates TO. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Sci. 1993; 2(9):1511-1519.

47.

Vorobiev S, Strokopytov B, Drubin D, Frieden C, Ono S, Condeelis J, et al. The structure of nonvertebrate actin: implications for the ATP hydrolytic mechanism. Proc Natl Acad Sci. 2003; 100(10):5760-5765.

48.

Zhang Z, Cellitti J, Teriete P, Pellecchia M, Stec B. New crystal structures of HSC-70 ATP binding domain confirm the role of individual binding pockets and suggest a new method of inhibition. Biochimie. 2015; 108:186-192.

49.

Arakawa A, Handa N, Ohsawa N, Shida M, Kigawa T, Hayashi F, et al. The C-terminal BAG domain of BAG5 induces conformational changes of the Hsp70 nucleotide-binding domain for ADP-ATP exchange. Structure. 2010; 18(3):309-319.