Chitosanases: Molecular sequence data


So far, a number of chitosanases have been sequenced (mostly by deducing the amino acid sequence from the respective gene sequence). The majority of them come from Gram-positive microorganisms.

The sequenced chitosanases belong to six glycoside hydrolase families: GH5, GH7, GH8, GH46, GH75 and GH80.

NOTE: Throughout this website, as well as in most of our publications, the numbering of amino acid residues in the N174 chitosanase begins with the first a.a. of the mature, secreted protein. This differs from the numbering used in databases on the internet, where numbering usually begins with the first a.a. of the immature protein. To convert a database number into our numbering, simply subtract 40 (the length of the signal peptide) from the database number.
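
The conversion described in the note can be sketched in Python as follows. This is only an illustration: the function and constant names are hypothetical, and the 40-residue offset is the signal peptide length given above for the N174 chitosanase, so it should not be assumed for other enzymes.

```python
# Signal peptide length of the N174 chitosanase, as stated in the note above.
N174_SIGNAL_PEPTIDE_LENGTH = 40


def database_to_mature(db_position, signal_length=N174_SIGNAL_PEPTIDE_LENGTH):
    """Convert a database (immature protein) residue number to mature numbering."""
    mature = db_position - signal_length
    if mature < 1:
        raise ValueError("position lies within the signal peptide")
    return mature


def mature_to_database(mature_position, signal_length=N174_SIGNAL_PEPTIDE_LENGTH):
    """Convert a mature-protein residue number back to database numbering."""
    if mature_position < 1:
        raise ValueError("mature numbering starts at 1")
    return mature_position + signal_length


# Arbitrary illustrative position: database residue 62 is residue 22 in mature numbering.
print(database_to_mature(62))
```

Positions inside the signal peptide have no counterpart in the mature protein, which is why the sketch raises an error for them instead of returning zero or a negative number.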



THE FAMILY 46

The most studied chitosanases adhering to the Enzyme Commission definition belong to the family 46 of glycoside hydrolases (see also ref. 11). Below we show the alignment of chitosanases belonging to this family.


CLUSTAL W alignment of chitosanase sequences:
* - residue conserved in all sequences;

STR_N174, Streptomyces sp. strain N174 (12)
NOC_N106, Nocardioides sp. strain N106 (13)
AMY_CSO2, Amycolatopsis sp. strain CsO-2 (GenBank locus AB041775.1)
STR_CoelA, Streptomyces coelicolor A3(2), SCO0677 or SCF91.37, (54)
STR_CoelB, Streptomyces coelicolor A3(2), SCO2024 or SC3A3.02, (44)
BAC_SUBT, Bacillus subtilis (16) (60)
BAC_AMYL, Bacillus amyloliquefaciens (18)
BAC_KFB, Bacillus KFB-C04, thermostable chitosanase, (53) alias Bacillus sp. strain CK4 (55)
PBCV-1, Chlorella virus PBCV-1 (14, 50)
CVK2, Chlorella virus CVK2 (15)
BAC_EHIM, Bacillus ehimensis (19)(43)
BAC_COA, Bacillus coagulans (64). CAUTION: (March 2003) There are several discrepancies between the sequence published in reference 64 and the sequence presented as GenBank locus AAL40906
BUR_GLAD, Burkholderia gladioli (45), (58)
BAC_CIRC, Bacillus circulans MH-K1 (17) corrected according to (49)

Numbering on the top of the alignment refers to residue position in mature chitosanase from Streptomyces sp. strain N174.
Numbering on the bottom of the alignment refers to residue position in mature chitosanase from Bacillus circulans MH-K1.
To simplify the alignment, the N-terminal signal peptides of all sequences have been omitted, as have the N-terminal extensions of unknown function found in the chitosanases from PBCV-1, CVK2, Burkholderia gladioli and some bacilli.
The raw data obtained with ClustalW have been corrected according to the structural alignment between the chitosanase from Streptomyces sp. strain N174 and the chitosanase from Bacillus circulans MH-K1 published by Saito et al. (49).
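
Because gaps do not count as residues, the numbers printed above and below the alignment advance only on non-gap columns. The following Python sketch (our illustration; the function name is hypothetical) assigns a residue number to each column of one aligned row, using the first block of the STR_N174 row as input.

```python
def number_columns(aligned_row, first_residue=1, gap="-"):
    """Return one entry per alignment column: the residue number, or None for a gap."""
    numbers = []
    current = first_residue
    for symbol in aligned_row:
        if symbol == gap:
            numbers.append(None)      # gap columns carry no residue number
        else:
            numbers.append(current)
            current += 1
    return numbers


# First block of the STR_N174 row, copied from the alignment on this page.
row = "AGAGLDDPHKKEIAMELVSSAENSSLDWKAQYKYIEDIGDGRGYTGGIIGFCSG-----T"
numbers = number_columns(row, first_residue=1)

# The last column of the block carries residue 55, matching the top ruler.
print(numbers[-1])
```

The same function can be applied to the BAC_CIRC row with first_residue set to the first number shown on the bottom ruler of the block in question.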


                                                                            
                1       10        20        30        40        50        55
STR_N174        AGAGLDDPHKKEIAMELVSSAENSSLDWKAQYKYIEDIGDGRGYTGGIIGFCSG-----T
NOC_N106        AAVGLDDPHKKDIAMQLVSSAENSSLDWKSQYKYIEDIKDGRGYTAGIIGFCSG-----T
AMY_CSO2        ASVGLDDPAKKEIAMELVSSAENSSLDWKAQYKYIEDIGDGRGYTAGIIGFCSG-----T
STR_CoelA       RATGLDDPAKKEIAMQLVSSAENSSLDWKAQYRYIEDIGDGRGYTAGIIGFCSG-----T
STR_CoelB       LPPGLAAPAKKELAQQLVSSAENSTTKWRTAYGSIEDVGDGDGYTAGIIGFCTG-----T
BAC_SUBT        --AGLNKDQKRRA-EQLTSIFENG--TTEIQYGYVERLDDGRGYTCGRAGFTTA-----T
BAC_AMYL        --AGLNKDQKRRA-EQLTSIFENG--KTEIQYGYVEALDDGRGYTCGRAGFTTA-----T
BAC_KFB         --AGLNKDQKRRA-EQLRRICEDG--TTEMRYPYVARLDDARPSTCGPAGVTTA-----T
PBCV-1          KKLGFNTTNADTI-LSLIALPENSTTQWWKNYNYASCLKDGRGWTVTIYGACSG-----T
CVK2            KKLGFNTTNADTI-LSLIALPENSTTQWWKNYNYASCLKDGRGWTVTIYGACSG-----T
BAC_EHIM        DRTGLDGEQWNNI-MKLINKPEQDDLNWIKYYGYCEDINDERGYSIGIFGATTGGPRDTH
BUR_GLAD        DNTGLDGEQWDNI-MKLVNKPEQDSLDWTKFYGYCEDIGDDRGYTMGIFGATTGGPNDGG
BAC_COA         ARTGLDAGRWHDI-MTLMNRAEPDHWNCITYSGYCEDTNDQPAYPFAIGGASADGGRDTH
BAC_CIRC        NNTGLDGEQWNNI-MKLINKPEQDDLNWIKYYGYCEDIEDERGYTIGLFGATTGGSRDTH
                   *:    .      *    * .               *    .    *  :       
               17 20         30        40        50        60        70   75

               56  60        70        80        90              100    107
STR_N174        GDMLELVQHYTDLEPGNILAKYLPALKKVNGSASH--SGL-----GTPFTKDWATAAKD-
NOC_N106        GDMLDLVADYTDLKPGNILAKYLPALRKVNGTESH--AGL-----ASAFEKDWATAAKD-
AMY_CSO2        GDMLELVQHYTDLKPGNVLAKYLPALKKVNGTDSH--SGL-----GSAFVNDWRTAAKD-
STR_CoelA       GDMLDLVELYGERSPGNVLAPYLPALRRVDGSDSH--EGL-----DPGFPDDWRRAADQD
STR_CoelB       HDLLMLVERYTEDHPDNGLAEYLPALREVDGSDSH--EGL-----DPGFTAAWKAEAEV-
BAC_SUBT        GDALEVVEVYTKAVPNNKLKKYLPELRRLAKEESDDTSNL------KGFASAWKSLAND-
BAC_AMYL        GDALEVVEVYTKAVPNNKLKKYLPELRRLAKDESDDISNL------KGFASAWRSLGND-
BAC_KFB         RDGFEVVPVYKQAVANKKLPNYLAGLRRLEKEASDDTSKL------KGPASAWKSLADD-
PBCV-1          GDLLMVLESLQKINPNHPLVKFIPAMRKTKGDDIRGLENL---------GKVINGLGDD-
CVK2            GDLLMVLESLQKINPNHPLVKFIPAMRKTKGDDIRGLENL---------GKVINGLGDD-
BAC_EHIM        PDGPELFKAYDAAKGAGNPSVEGALKRLGINGKMKG-SILEIKDSEKVFCGKIKKLQND-
BUR_GLAD        PDGPALFKAYDAASGASNPSVQGGLARIGAHGSMQG-SILKITDSEKVFCGKVKGLQND-
BAC_COA         PDGPEVFKAYDPAKAAGNASLEAALRRLGMNGKLTG-SILCIKDSETVVCGKIKGLQPD-
BAC_CIRC        PDGPDLFKAYDAAKGASNPSADGALKRLGINGKMKG-SILEIKDSEKVFCGKIKKLQND-
                 *   :.                    :           *                    
               76  80        90       100       110        120       130

                110       120       130                 140       150    157
STR_N174        TVFQQAQNDERDRVYFDPAVSQAKADG---------LR-ALGQFAYYDAIVMHGPGNDPT
NOC_N106        SVFQQAQNDERDRSYFNPAVNQAKAS----------LR-ALGQFAYYDAIVMHGPGDSSD
AMY_CSO2        TVFQRAQNDERDRVYFNPAVKQAKAER---------LR-ALGQFVYYDAIVMHGPGSSSD
STR_CoelA       PQFRRAQDDERDRVYFDPAVRRGKEDG---------LR-TLGQFAYYDAMVMHGDGGGLG
STR_CoelB       PAFRAAQEAERDRVYFEPAVRLAKLDG---------LG-TLGQFVYYDAMVFHGPDTDAE
BAC_SUBT        KEFRAAQDKVNDHLYYQPAMKRSDNAG---------LKTALARAVMYDTVIQHGDGDDPD
BAC_AMYL        KAFRAAQDKVNDSLYYQPRNKRSENAG---------LKTALAKAVMYDTVIQHGDGDDPD
BAC_KFB         KAFRAAQDGVNDQVYYQPAMERSDNAG---------LTTALARAVMYHTVRQRGDGDDGD
PBCV-1          KEWQTAVWDIYVKLYWTFAADFSDKTGSAKNRPGPVMTSPLTRGFMVDVALNHGSNMES-
CVK2            KEWQTAVWDIYVKLYWTFAADFSDKTGSAKNRPGPVMTSPLTRGFMVDVALNHGSNMES-
BAC_EHIM        PAWRKAMWETFYNVYIRYSVEQARQRG---------FTSALTIGSFVDTALNQGATGDSN
BUR_GLAD        AAWREAMWRTFYSVYIQYSVQQARSRG---------FGSALTIGSFVDTALNQGADGGSN
BAC_COA         RAWRPAIWPTFYKVYMGYSVGQARPRG---------FTSALTIGSLVDTAGNQPASGASG
BAC_CIRC        AAWRKAMWETFYNVYIRYSVEQARQRG---------FTSAVTIGSFVDTALNQGATGGSD
                  :: *        *       .             :  .:      ..   :        
                    140       150       160                170       180

                160       170        180       190       200                 210 
STR_N174        SFGGIRKTAMKKAR-TPAQGGDETTYLNAFLDARKAAMLTEAAHD--D--------TSRVDTEQR
NOC_N106        SFGGIRKAAMKKAK-TPAQGRDEATYLKAFLAARKTVMLKEEAHS--D--------TSRVDTEQT
AMY_CSO2        SFGGIRAAAMKKAK-TPAQGGDEATYLNAFLDARKVIMKQEEAHA--D--------TSRVDTEQR
STR_CoelA       -FGSIRERALGRAR-PPAQGGDEVAYLHAFLDERVWAMKQEQAHS--D--------TSRVDTAQR
STR_CoelB       GFYGLRERAMAEAR-TPGQGGSEKAYLETFLDVRKQAMEAKRPGI--D--------TSRVDTAQR
BAC_SUBT        SFYALIKRTNKKAGGSPKDGIDEKKWLNKFLDVRYDDLMNP-ANH--DTRDEWRESVARVD-VLR
BAC_AMYL        SFYALIKRTNKKMGGSPKDGTDEKKWLNKFLDVRYDDLMNP-ADE--DTQDEWRESVARVD-VFR
BAC_KFB         SRYALIKRTPKGAGGSPKEGIDEQKCLNKFSHVRYDDLMNG-ANH--DRRDEWRESVGRVH-VLR
PBCV-1          -FSDILKRM-KNREE-----KDEAKWFLDFCETRRKLLKAGFQDL--DT----SKTGDRCT-LWA
CVK2            -FSDILKRM-KNKDE-----KDEAKWFLDFCETRRKLLKSGFQDL--DT----SKTGDRAI-LWS
BAC_EHIM        TLQGLLARS---GS-S----TNEKTFLKKFHAKRTLVVDTNEYNQPPNG-------KNRVK-QWD
BUR_GLAD        TLQGLLSRS---GN-S----TDEKTFMTSFYAQRTKVVDTHDFNQPPNG-------KNRVK-QWS
BAC_COA         TLPGLLARS---GS-R----TNEGTFLKGFHAKR-------------------------------
BAC_CIRC        TLQGLLARS---GS-S----SNEKTFMKNFHAKRTLVVDTNKYNKPPNG-------KNRVK-QWD
                    :                .*   :  *   *                        *
                    190              200       210       220              230    


              212     220       230     238
STR_N174        VFLKAGNLDLNPPLKWKTYGDPYVINS-
NOC_N106        VFLNAKNFDLNPPLKWKVYGDSYAINS-
AMY_CSO2        VFLNAKNFDLNPPLKWKVYGDPYQING-
STR_CoelA       VFLNEGNLDLEPPLDWHVYGDAYHIG--
STR_CoelB       RFLTAGNLKLATPLVWEMYGDTYRVP--
BAC_SUBT        SIAKENNYNLNGPIHVRSNEYGNFVIK-
BAC_AMYL        DIVKEKNYNLNGPIHVRSSEYGNFTIQ-
BAC_KFB         SIANQNNYNLNGGIHVRSHEYGNGVIK-
PBCV-1          NIFKEGNVGLKRPIKCYNGYWGKNIVIS
CVK2            ELFKTGNVGLKRPIKCYNGYWGKNIVIS
BAC_EHIM        TLLDMGKMNLKNVDAEIAQVT-NWEMK-
BUR_GLAD        TLMSQGITSLKNCDADIVKVT-SWTMK-
BAC_COA         TLLGMGQMNLRNVDGEIAHVAANWQMK-
BAC_CIRC        TLVDMGKMNLKNVDSEIAQVT-DWEMK-
                 :       *                  
                    240       250       259
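
The "*" marker can be recomputed from the aligned rows. Below is a minimal Python sketch (hypothetical function name) that reports the columns in which all supplied rows carry the same residue. For brevity it uses only four rows of the last block, so it is a necessary check rather than a full reproduction of the marker line: a column conserved in all fourteen sequences must also be conserved in this subset.

```python
def conserved_columns(rows, gap="-"):
    """Return 0-based indices of columns where every row has the same residue."""
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all aligned rows must have the same length")
    conserved = []
    for i in range(width):
        column = {r[i] for r in rows}
        # A column counts as conserved only if it holds one residue and no gap.
        if len(column) == 1 and gap not in column:
            conserved.append(i)
    return conserved


# Four rows of the last alignment block, copied from this page.
last_block = {
    "STR_N174": "VFLKAGNLDLNPPLKWKTYGDPYVINS-",
    "BAC_SUBT": "SIAKENNYNLNGPIHVRSNEYGNFVIK-",
    "PBCV-1":   "NIFKEGNVGLKRPIKCYNGYWGKNIVIS",
    "BAC_EHIM": "TLLDMGKMNLKNVDAEIAQVT-NWEMK-",
}
result = conserved_columns(list(last_block.values()))
print(result, [last_block["STR_N174"][i] for i in result])
```

For these four rows the only conserved column is the leucine at index 9, which is the column marked "*" in the block above.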



THE FAMILY 75


Another chitosanase sequence was published in 1996 by Shimosaka et al. (21). The producing microorganism, first described as Fusarium solani, is now called Nectria haematococca var. brevicona. This chitosanase has no detectable homology with family 46 enzymes.

The same research group has recently determined the sequence of a new fungal chitosanase from Aspergillus oryzae (published only in a database, reference 51). These two chitosanases are also similar to the enzyme from Aspergillus fumigatus (published only in a database, reference 52), and to the chitosanase from the entomopathogenic fungus Metarhizium anisopliae (59).
These four enzymes are now members of the Family 75 of glycoside hydrolases. They are significantly homologous to each other, as shown by the following alignment:



A_ORYZAE        ------------------YDLPENLKQIYE-KHKSGKCSKELQGGYDNGHSHDGKSFSYC
A_FUMIGA        ------------------------------------KCSKVLAKGFTNGDASQGKSFSYC
M_ANISOP        MRSTSLFAVVTLGAVASAYQLPANLKKIYD-QHKAGTCSNKLSGTFSGG-------ATYC
F_SOLANI        ------------------RDVPANVKTFKDSIIKQGSCKSTLATGFFSSDGDSG-TYSYC
                                                    .*.. *   : ..        :**

A_ORYZAE        GDIP---NAIYLHSSKNGGQYADMDIDCDGANRH---AGKCSNDHSGQGETRWKDEVQKL
A_FUMIGA        GDIP---GAIFISSSK-G--YTNMDIDCDGANNS---AGKCANDPSGQGETAFKSDVKKF
M_ANISOP        GDLP---NAIFLKGSN-GN-YDNMDIDCDGANNS---AGGCANDPTGQGQTAFKDTVKTY
F_SOLANI        GDHVKDYNVIYLQGKN-GK-LVNMDIDCDGVQGSPADDGRCGSSGDTQSITSFQWVLESY
                **     ..*:: ..: *    :*******.:      * *...   *. * ::  ::. 

A_ORYZAE        GID--DLDANIHPYVVFGNENDDGDDPEFDPRKHGMEPLSVMAVVCNKKLFYGIWGDTNG
A_FUMIGA        GIS--DLDANIHPYVVFGNED---HSPKFKPQSHGMQPLSVMAVVCNGQLHYGIWGDTNG
M_ANISOP        GIP--DLDANLHPYVVFGNEG---ASPSFNPQSKGMKPLSVMAVVCNNQVFYGVWGDTNG
F_SOLANI        GTSQKDLDANIHPYIVFGNEGTKKGWKTFDPEKHGIKPLSVMAVVCGNKMFYGIWGDENG
                *    *****:***:*****.       *.*..:*::*********. ::.**:*** **

A_ORYZAE        ----HTATGEASLSMAELCFPEEDPSGDSGHEPNDVLYIGFTGKEAVPGKS-ADWKADST
A_FUMIGA        ----GVSTGEASISLADLCFPNEHLDGNHGHDPNDVLFIGFTSKDAVPGAT-AKWKAKNA
M_ANISOP        ----FTSTGEASLALGKLCFPNEGLSGDNGHDPKDVHYIGFTEGDTVPGKSGANWKAKKT
F_SOLANI        DDGDQPMVGEASISLATACFG-KSMNGNFGHSDDDVLYIAFPGADAVPGAKGAKWNAKNF
                       .****:::.  **  :  .*: **. .** :*.*.  ::*** . *.*:*.. 

A_ORYZAE        ESFEESIKELGDKLVAGLKA----------------------------------------
A_FUMIGA        KEFEDSIKSIGDKLVAGLKA----------------------------------------
M_ANISOP        ADFEASIKALGDKLVARL------------------------------------------
F_SOLANI        DEFQTSITSLGDKLIKRIGGTNNGGGDTGGGSGNTCSWPGHCQGAACKTGDDCSDDLICT
                 .*: **. :****:  :                                          

A_ORYZAE        -------
A_FUMIGA        -------
M_ANISOP        -------
F_SOLANI        KGKCSPP
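
One simple way to put a number on the similarity visible in such an alignment is the percent identity between two aligned rows. The Python sketch below is only an illustration (hypothetical function name); it uses one common convention, counting identical residues over the columns where neither row has a gap. Other conventions use a different denominator and give different values. The input is a short fragment of the fourth block, so the result describes that fragment only, not the full-length enzymes.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue                  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Gap-free C-terminal fragments of two rows, copied from the alignment above.
a_oryzae = "ESFEESIKELGDKLVAGLKA"
a_fumiga = "KEFEDSIKSIGDKLVAGLKA"
print(percent_identity(a_oryzae, a_fumiga))
```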
                                                                  
The chitosanase from Fusarium solani ends with a cysteine-rich module which is not present in the enzyme isolated from Aspergillus oryzae. We can only speculate about the structural properties of Family 75 enzymes. However, prediction of the secondary structure of the chitosanase from Fusarium solani by two independent algorithms (123D, PredictProtein) revealed a high propensity to form β-sheet-like structures, while the family 46 chitosanases from Streptomyces sp. strain N174 and Bacillus circulans have essentially an α-helical fold. The chitosanase from Aspergillus sp. (61) should also belong to this family, as its N-terminal sequence (YNLPNNLKQIUDDHK) is highly similar to fragments from the sequences of the other fungal chitosanases.



THE FAMILY 80

A new chitosanase from Matsuebacter chitosanotabidus (20) has been characterized. Its sequence is highly similar to that of the chitosanase from Sphingobacterium multivorum (48). These two enzymes have been classified in the family 80 of glycoside hydrolases. An alignment analysis performed in our laboratory has shown that, most probably, these two enzymes share a common, highly conserved "active site" module with the enzymes belonging to Family 46 (56).


BAC_CIRC      77 KPEQDDLN---------WIKYYGYCEDIEDERGYTIGLFGATTGGSRDTHPDGPDLF
BAC_EHIM      78 KPEQDDLN---------WIKYYGYCEDINDERGYSIGIFGATTGGPRDTHPDGPELF
BUR_GLAD     131 KPEQDSLD---------WTKFYGYCEDIGDDRGYTMGIFGATTGGPNDGGPDGPALF
BAC_SUBT      52 IFENGTT-----------EIQYGYVERLDDGRGYTCGRAGFTTA-----TGDALEVV
BAC_AMYL      53 IFENGKT-----------EIQYGYVEALDDGRGYTCGRAGFTTA-----TGDALEVV
BAC_KFB       48 ICEDGTTE-----------MRYPYVARLDDARPSTCGPAGVTTA-----TRDGFEVV
PBCV-1       105 LPENSTTQ---------WWKNYNYASCLKDGRGWTVTIYGACSG-----TGDLLMVL
CVK2         105 LPENSTTQ---------WWKNYNYASCLKDGRGWTVTIYGACSG-----TGDLLMVL
STR_N174      60 SAENSSLD---------WKAQYKYIEDIGDGRGYTGGIIGFCSG-----TGDMLELV
STR_COEL2     63 SAENSSLD---------WKAQYRYIEDIGDGRGYTAGIIGFCSG-----TGDMLDLV
NOC_N106      61 SAENSSLD---------WKSQYKYIEDIKDGRGYTAGIIGFCSG-----TGDMLDLV
STR_COEL1     88 SAENSTTK---------WRTAYGSIEDVGDGDGYTAGIIGFCTG-----THDLLMLV
MAT_CHIT     119 YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV------GDFPDYF
SPH_MULT     111 YPENGTTNYQDPEPWRYCEVDYEASEGISDYRGNTFGPVGVTTV------GDFPDYF
                   E                  Y       D         G           D


In 2002, Shimono and coworkers (62) proposed another alignment for this region from some chitosanases belonging to families 46 and 80:



MAT_CHIT     119 YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV-GD 164
STR_N174      60 SAENSSLDWKAQ--YKYIED-------IGDGRGYTGGIIGFCSGTGD  97
NOC_N106      61 SAENSSLDWKSQ--YKYIED-------IKDGRGYTAGIIGFCSGTGD  98
BAC_CIRC      77 KPEQDDLNWIKY--YGYCED-------IEDERGYTIGLFGATTG-GS 113
PBCV-1       105 LPENSTTQWWKN--YNYASC-------LKDGRGWTVTIYGACSGTGR 142
                   E             Y            D RG T        G

This alignment highlights an important difference between families 80 and 46 discovered by Shimono et al. (62): the best candidate for a catalytic residue in the chitosanase from Matsuebacter chitosanotabidus is Glu141, an amino acid that does not seem to align with the catalytic residue from family 46 chitosanases (Asp80 in Streptomyces sp. N174). Without any evidence from crystallography, it is difficult, at present, to decide which alignment is a better representation of chitosanase sequence relationships.
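
The residue correspondence discussed here can be checked directly on the two rows of the alignment proposed by Shimono and coworkers. The Python sketch below is our illustration (hypothetical helper names); it takes the rows and their starting numbers (119 and 60) exactly as printed above, locates a numbered residue in one row, and reads what the other row has in the same column.

```python
def column_of(row, first_residue, residue_number, gap="-"):
    """Return the 0-based alignment column holding the given residue number."""
    current = first_residue
    for col, symbol in enumerate(row):
        if symbol == gap:
            continue
        if current == residue_number:
            return col
        current += 1
    raise ValueError("residue number not present in this row")


def residue_at(row, first_residue, col, gap="-"):
    """Return (symbol, residue number) at a column, or (gap, None) for a gap."""
    if row[col] == gap:
        return gap, None
    number = first_residue + sum(1 for s in row[:col] if s != gap)
    return row[col], number


# Rows and starting numbers as printed in the alignment above.
MAT_CHIT = ("YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV-GD", 119)
STR_N174 = ("SAENSSLDWKAQ--YKYIED-------IGDGRGYTGGIIGFCSGTGD", 60)

glu141_col = column_of(*MAT_CHIT, 141)
asp80_col = column_of(*STR_N174, 80)

print(MAT_CHIT[0][glu141_col], residue_at(*STR_N174, glu141_col))
print(STR_N174[0][asp80_col], residue_at(*MAT_CHIT, asp80_col))
```

In this alignment the column of Glu141 falls on a gap in the STR_N174 row, while the column of Asp80 holds an aspartate in the MAT_CHIT row, in line with the observation that Glu141 does not align with the family 46 catalytic residue.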



THE FAMILY 8

This family includes enzymes with more diversified substrate specificities than the families listed above. Besides chitosanases, it includes cellulases, licheninases and endo-xylanases. The best characterized chitosanase from this family is the enzyme from Bacillus sp. No.7M (27), the only known chitosanase whose cleavage specificity is restricted to the GlcN-GlcN linkage. Other extensively studied chitosanases belonging to this family are produced by Bacillus circulans WL-12 (8) and Paenibacillus fukuinensis strain D2 (63).



THE FAMILY 5

In 2003, Tanabe and coworkers (65) characterized two chitosanases from Streptomyces griseus HUT 6037. These chitosanases (ChoI and ChoII) hydrolyzed not only chitosan but also carboxymethylcellulose with retention of the anomeric form. Both chitosanases also catalyzed a transglycosylation reaction. The chitosanase II-encoding gene (choII) has been cloned and sequenced. The sequence is highly homologous to that of the endoglucanase E-5 of Thermomonospora fusca. Both enzymes belong to the family 5 of glycoside hydrolases.



This page was created by Ryszard Brzezinski and Andrzej Neugebauer.
Questions? Proposals? Comments? Write to Ryszard.Brzezinski@USherbrooke.ca
Last updated: 2008/08/29