
So far, a number of chitosanases have been sequenced (mostly by deducing the amino acid sequence from the respective gene sequence). The majority of them come from Gram-positive microorganisms.
The sequenced chitosanases belong to six glycoside hydrolase families: GH5, GH7, GH8, GH46, GH75 and GH80.

NOTE: Throughout this website, as well as in most of our publications, the numbering of amino acid residues in the N174 chitosanase begins with the first amino acid of the mature, secreted protein. This differs from the numbering used in internet databases, which usually begins with the first amino acid of the immature protein. To convert database numbering into our numbering, simply subtract 40 (the length of the signal peptide) from the database residue number.
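The numbering conversion described in the note can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the only value taken from the text is the 40-residue signal peptide of the N174 chitosanase.

```python
# Length of the N174 chitosanase signal peptide, as stated in the note.
SIGNAL_PEPTIDE_LENGTH = 40


def database_to_mature(database_position: int) -> int:
    """Convert database (immature protein) numbering to mature-protein numbering."""
    mature_position = database_position - SIGNAL_PEPTIDE_LENGTH
    if mature_position < 1:
        # Positions 1..40 belong to the signal peptide and have no mature number.
        raise ValueError("position lies within the signal peptide")
    return mature_position


def mature_to_database(mature_position: int) -> int:
    """Convert mature-protein numbering back to database numbering."""
    if mature_position < 1:
        raise ValueError("mature numbering starts at 1")
    return mature_position + SIGNAL_PEPTIDE_LENGTH


if __name__ == "__main__":
    # A residue listed at position 80 in a database entry is residue 40 here.
    print(database_to_mature(80))
    # Residue 22 of the mature protein is residue 62 in a database entry.
    print(mature_to_database(22))
```

The offset applies to the N174 enzyme only; other chitosanases have signal peptides of different lengths.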
1 10 20 30 40 50 55
STR_N174 AGAGLDDPHKKEIAMELVSSAENSSLDWKAQYKYIEDIGDGRGYTGGIIGFCSG-----T
NOC_N106 AAVGLDDPHKKDIAMQLVSSAENSSLDWKSQYKYIEDIKDGRGYTAGIIGFCSG-----T
AMY_CSO2 ASVGLDDPAKKEIAMELVSSAENSSLDWKAQYKYIEDIGDGRGYTAGIIGFCSG-----T
STR_CoelA RATGLDDPAKKEIAMQLVSSAENSSLDWKAQYRYIEDIGDGRGYTAGIIGFCSG-----T
STR_CoelB LPPGLAAPAKKELAQQLVSSAENSTTKWRTAYGSIEDVGDGDGYTAGIIGFCTG-----T
BAC_SUBT --AGLNKDQKRRA-EQLTSIFENG--TTEIQYGYVERLDDGRGYTCGRAGFTTA-----T
BAC_AMYL --AGLNKDQKRRA-EQLTSIFENG--KTEIQYGYVEALDDGRGYTCGRAGFTTA-----T
BAC_KFB --AGLNKDQKRRA-EQLRRICEDG--TTEMRYPYVARLDDARPSTCGPAGVTTA-----T
PBCV-1 KKLGFNTTNADTI-LSLIALPENSTTQWWKNYNYASCLKDGRGWTVTIYGACSG-----T
CVK2 KKLGFNTTNADTI-LSLIALPENSTTQWWKNYNYASCLKDGRGWTVTIYGACSG-----T
BAC_EHIM DRTGLDGEQWNNI-MKLINKPEQDDLNWIKYYGYCEDINDERGYSIGIFGATTGGPRDTH
BUR_GLAD DNTGLDGEQWDNI-MKLVNKPEQDSLDWTKFYGYCEDIGDDRGYTMGIFGATTGGPNDGG
BAC_COA ARTGLDAGRWHDI-MTLMNRAEPDHWNCITYSGYCEDTNDQPAYPFAIGGASADGGRDTH
BAC_CIRC NNTGLDGEQWNNI-MKLINKPEQDDLNWIKYYGYCEDIEDERGYTIGLFGATTGGSRDTH
*: . * * . * . * :
17 20 30 40 50 60 70 75
56 60 70 80 90 100 107
STR_N174 GDMLELVQHYTDLEPGNILAKYLPALKKVNGSASH--SGL-----GTPFTKDWATAAKD-
NOC_N106 GDMLDLVADYTDLKPGNILAKYLPALRKVNGTESH--AGL-----ASAFEKDWATAAKD-
AMY_CSO2 GDMLELVQHYTDLKPGNVLAKYLPALKKVNGTDSH--SGL-----GSAFVNDWRTAAKD-
STR_CoelA GDMLDLVELYGERSPGNVLAPYLPALRRVDGSDSH--EGL-----DPGFPDDWRRAADQD
STR_CoelB HDLLMLVERYTEDHPDNGLAEYLPALREVDGSDSH--EGL-----DPGFTAAWKAEAEV-
BAC_SUBT GDALEVVEVYTKAVPNNKLKKYLPELRRLAKEESDDTSNL------KGFASAWKSLAND-
BAC_AMYL GDALEVVEVYTKAVPNNKLKKYLPELRRLAKDESDDISNL------KGFASAWRSLGND-
BAC_KFB RDGFEVVPVYKQAVANKKLPNYLAGLRRLEKEASDDTSKL------KGPASAWKSLADD-
PBCV-1 GDLLMVLESLQKINPNHPLVKFIPAMRKTKGDDIRGLENL---------GKVINGLGDD-
CVK2 GDLLMVLESLQKINPNHPLVKFIPAMRKTKGDDIRGLENL---------GKVINGLGDD-
BAC_EHIM PDGPELFKAYDAAKGAGNPSVEGALKRLGINGKMKG-SILEIKDSEKVFCGKIKKLQND-
BUR_GLAD PDGPALFKAYDAASGASNPSVQGGLARIGAHGSMQG-SILKITDSEKVFCGKVKGLQND-
BAC_COA PDGPEVFKAYDPAKAAGNASLEAALRRLGMNGKLTG-SILCIKDSETVVCGKIKGLQPD-
BAC_CIRC PDGPDLFKAYDAAKGASNPSADGALKRLGINGKMKG-SILEIKDSEKVFCGKIKKLQND-
* :. : *
76 80 90 100 110 120 130
110 120 130 140 150 157
STR_N174 TVFQQAQNDERDRVYFDPAVSQAKADG---------LR-ALGQFAYYDAIVMHGPGNDPT
NOC_N106 SVFQQAQNDERDRSYFNPAVNQAKAS----------LR-ALGQFAYYDAIVMHGPGDSSD
AMY_CSO2 TVFQRAQNDERDRVYFNPAVKQAKAER---------LR-ALGQFVYYDAIVMHGPGSSSD
STR_CoelA PQFRRAQDDERDRVYFDPAVRRGKEDG---------LR-TLGQFAYYDAMVMHGDGGGLG
STR_CoelB PAFRAAQEAERDRVYFEPAVRLAKLDG---------LG-TLGQFVYYDAMVFHGPDTDAE
BAC_SUBT KEFRAAQDKVNDHLYYQPAMKRSDNAG---------LKTALARAVMYDTVIQHGDGDDPD
BAC_AMYL KAFRAAQDKVNDSLYYQPRNKRSENAG---------LKTALAKAVMYDTVIQHGDGDDPD
BAC_KFB KAFRAAQDGVNDQVYYQPAMERSDNAG---------LTTALARAVMYHTVRQRGDGDDGD
PBCV-1 KEWQTAVWDIYVKLYWTFAADFSDKTGSAKNRPGPVMTSPLTRGFMVDVALNHGSNMES-
CVK2 KEWQTAVWDIYVKLYWTFAADFSDKTGSAKNRPGPVMTSPLTRGFMVDVALNHGSNMES-
BAC_EHIM PAWRKAMWETFYNVYIRYSVEQARQRG---------FTSALTIGSFVDTALNQGATGDSN
BUR_GLAD AAWREAMWRTFYSVYIQYSVQQARSRG---------FGSALTIGSFVDTALNQGADGGSN
BAC_COA RAWRPAIWPTFYKVYMGYSVGQARPRG---------FTSALTIGSLVDTAGNQPASGASG
BAC_CIRC AAWRKAMWETFYNVYIRYSVEQARQRG---------FTSAVTIGSFVDTALNQGATGGSD
:: * * . : .: .. :
140 150 160 170 180
160 170 180 190 200 210
STR_N174 SFGGIRKTAMKKAR-TPAQGGDETTYLNAFLDARKAAMLTEAAHD--D--------TSRVDTEQR
NOC_N106 SFGGIRKAAMKKAK-TPAQGRDEATYLKAFLAARKTVMLKEEAHS--D--------TSRVDTEQT
AMY_CSO2 SFGGIRAAAMKKAK-TPAQGGDEATYLNAFLDARKVIMKQEEAHA--D--------TSRVDTEQR
STR_CoelA -FGSIRERALGRAR-PPAQGGDEVAYLHAFLDERVWAMKQEQAHS--D--------TSRVDTAQR
STR_CoelB GFYGLRERAMAEAR-TPGQGGSEKAYLETFLDVRKQAMEAKRPGI--D--------TSRVDTAQR
BAC_SUBT SFYALIKRTNKKAGGSPKDGIDEKKWLNKFLDVRYDDLMNP-ANH--DTRDEWRESVARVD-VLR
BAC_AMYL SFYALIKRTNKKMGGSPKDGTDEKKWLNKFLDVRYDDLMNP-ADE--DTQDEWRESVARVD-VFR
BAC_KFB SRYALIKRTPKGAGGSPKEGIDEQKCLNKFSHVRYDDLMNG-ANH--DRRDEWRESVGRVH-VLR
PBCV-1 -FSDILKRM-KNREE-----KDEAKWFLDFCETRRKLLKAGFQDL--DT----SKTGDRCT-LWA
CVK2 -FSDILKRM-KNKDE-----KDEAKWFLDFCETRRKLLKSGFQDL--DT----SKTGDRAI-LWS
BAC_EHIM TLQGLLARS---GS-S----TNEKTFLKKFHAKRTLVVDTNEYNQPPNG-------KNRVK-QWD
BUR_GLAD TLQGLLSRS---GN-S----TDEKTFMTSFYAQRTKVVDTHDFNQPPNG-------KNRVK-QWS
BAC_COA TLPGLLARS---GS-R----TNEGTFLKGFHAKR-------------------------------
BAC_CIRC TLQGLLARS---GS-S----SNEKTFMKNFHAKRTLVVDTNKYNKPPNG-------KNRVK-QWD
: .* : * * *
190 200 210 220 230
212 220 230 238
STR_N174 VFLKAGNLDLNPPLKWKTYGDPYVINS-
NOC_N106 VFLNAKNFDLNPPLKWKVYGDSYAINS-
AMY_CSO2 VFLNAKNFDLNPPLKWKVYGDPYQING-
STR_CoelA VFLNEGNLDLEPPLDWHVYGDAYHIG--
STR_CoelB RFLTAGNLKLATPLVWEMYGDTYRVP--
BAC_SUBT SIAKENNYNLNGPIHVRSNEYGNFVIK-
BAC_AMYL DIVKEKNYNLNGPIHVRSSEYGNFTIQ-
BAC_KFB SIANQNNYNLNGGIHVRSHEYGNGVIK-
PBCV-1 NIFKEGNVGLKRPIKCYNGYWGKNIVIS
CVK2 ELFKTGNVGLKRPIKCYNGYWGKNIVIS
BAC_EHIM TLLDMGKMNLKNVDAEIAQVT-NWEMK-
BUR_GLAD TLMSQGITSLKNCDADIVKVT-SWTMK-
BAC_COA TLLGMGQMNLRNVDGEIAHVAANWQMK-
BAC_CIRC TLVDMGKMNLKNVDSEIAQVT-DWEMK-
: *
240 250 259
A_ORYZAE ------------------YDLPENLKQIYE-KHKSGKCSKELQGGYDNGHSHDGKSFSYC
A_FUMIGA ------------------------------------KCSKVLAKGFTNGDASQGKSFSYC
M_ANISOP MRSTSLFAVVTLGAVASAYQLPANLKKIYD-QHKAGTCSNKLSGTFSGG-------ATYC
F_SOLANI ------------------RDVPANVKTFKDSIIKQGSCKSTLATGFFSSDGDSG-TYSYC
.*.. * : .. :**
A_ORYZAE GDIP---NAIYLHSSKNGGQYADMDIDCDGANRH---AGKCSNDHSGQGETRWKDEVQKL
A_FUMIGA GDIP---GAIFISSSK-G--YTNMDIDCDGANNS---AGKCANDPSGQGETAFKSDVKKF
M_ANISOP GDLP---NAIFLKGSN-GN-YDNMDIDCDGANNS---AGGCANDPTGQGQTAFKDTVKTY
F_SOLANI GDHVKDYNVIYLQGKN-GK-LVNMDIDCDGVQGSPADDGRCGSSGDTQSITSFQWVLESY
** ..*:: ..: * :*******.: * *... *. * :: ::.
A_ORYZAE GID--DLDANIHPYVVFGNENDDGDDPEFDPRKHGMEPLSVMAVVCNKKLFYGIWGDTNG
A_FUMIGA GIS--DLDANIHPYVVFGNED---HSPKFKPQSHGMQPLSVMAVVCNGQLHYGIWGDTNG
M_ANISOP GIP--DLDANLHPYVVFGNEG---ASPSFNPQSKGMKPLSVMAVVCNNQVFYGVWGDTNG
F_SOLANI GTSQKDLDANIHPYIVFGNEGTKKGWKTFDPEKHGIKPLSVMAVVCGNKMFYGIWGDENG
* *****:***:*****. *.*..:*::*********. ::.**:*** **
A_ORYZAE ----HTATGEASLSMAELCFPEEDPSGDSGHEPNDVLYIGFTGKEAVPGKS-ADWKADST
A_FUMIGA ----GVSTGEASISLADLCFPNEHLDGNHGHDPNDVLFIGFTSKDAVPGAT-AKWKAKNA
M_ANISOP ----FTSTGEASLALGKLCFPNEGLSGDNGHDPKDVHYIGFTEGDTVPGKSGANWKAKKT
F_SOLANI DDGDQPMVGEASISLATACFG-KSMNGNFGHSDDDVLYIAFPGADAVPGAKGAKWNAKNF
.****:::. ** : .*: **. .** :*.*. ::*** . *.*:*..
A_ORYZAE ESFEESIKELGDKLVAGLKA----------------------------------------
A_FUMIGA KEFEDSIKSIGDKLVAGLKA----------------------------------------
M_ANISOP ADFEASIKALGDKLVARL------------------------------------------
F_SOLANI DEFQTSITSLGDKLIKRIGGTNNGGGDTGGGSGNTCSWPGHCQGAACKTGDDCSDDLICT
.*: **. :****: :
A_ORYZAE -------
A_FUMIGA -------
M_ANISOP -------
F_SOLANI KGKCSPP
The chitosanase from Fusarium solani ends with a cysteine-rich module which is not present in the enzyme isolated
from Aspergillus oryzae.
We can only speculate about the structural properties of Family 75 enzymes. However, prediction of the secondary structure
of the chitosanase from Fusarium solani by two independent algorithms (123D, PredictProtein) revealed a high propensity
to form β-sheet-like structures, while the family 46 chitosanases from Streptomyces sp. strain N174
and Bacillus circulans have essentially an α-helical fold.
The chitosanase from Aspergillus sp. (61) should also belong to this family, as its N-terminal
sequence (YNLPNNLKQIUDDHK) is highly similar to fragments from the sequences of the other fungal chitosanases.
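As a rough illustration of this similarity, the N-terminal peptide can be compared position by position with the corresponding fragment of the Aspergillus oryzae sequence from the alignment above. The sketch below is an assumption-laden simplification, not the method used by the authors: the helper name is hypothetical, the alignment gap in the A. oryzae fragment is simply removed, and the peptide is used exactly as quoted, including the "U", which matches nothing.

```python
def ungapped_identity(first: str, second: str) -> tuple[int, int]:
    """Count identical positions between two equal-length peptide strings."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches, len(first)


# N-terminal sequence of the Aspergillus sp. chitosanase, as quoted in the text.
n_terminal = "YNLPNNLKQIUDDHK"

# Start of the A_ORYZAE row in the Family 75 alignment, with the gap removed.
a_oryzae_fragment = "YDLPENLKQIYE-KHK".replace("-", "")

if __name__ == "__main__":
    matches, length = ungapped_identity(n_terminal, a_oryzae_fragment)
    print(f"{matches} identical positions out of {length}")
```

A proper comparison would use a scoring matrix and allow gaps; the plain identity count is used here only because it can be checked by eye against the alignment.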
BAC_CIRC 77 KPEQDDLN---------WIKYYGYCEDIEDERGYTIGLFGATTGGSRDTHPDGPDLF
BAC_EHIM 78 KPEQDDLN---------WIKYYGYCEDINDERGYSIGIFGATTGGPRDTHPDGPELF
BUR_GLAD 131 KPEQDSLD---------WTKFYGYCEDIGDDRGYTMGIFGATTGGPNDGGPDGPALF
BAC_SUBT 52 IFENGTT-----------EIQYGYVERLDDGRGYTCGRAGFTTA-----TGDALEVV
BAC_AMYL 53 IFENGKT-----------EIQYGYVEALDDGRGYTCGRAGFTTA-----TGDALEVV
BAC_KFB 48 ICEDGTTE-----------MRYPYVARLDDARPSTCGPAGVTTA-----TRDGFEVV
PBCV-1 105 LPENSTTQ---------WWKNYNYASCLKDGRGWTVTIYGACSG-----TGDLLMVL
CVK2 105 LPENSTTQ---------WWKNYNYASCLKDGRGWTVTIYGACSG-----TGDLLMVL
STR_N174 60 SAENSSLD---------WKAQYKYIEDIGDGRGYTGGIIGFCSG-----TGDMLELV
STR_COEL2 63 SAENSSLD---------WKAQYRYIEDIGDGRGYTAGIIGFCSG-----TGDMLDLV
NOC_N106 61 SAENSSLD---------WKSQYKYIEDIKDGRGYTAGIIGFCSG-----TGDMLDLV
STR_COEL1 88 SAENSTTK---------WRTAYGSIEDVGDGDGYTAGIIGFCTG-----THDLLMLV
MAT_CHIT 119 YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV------GDFPDYF
SPH_MULT 111 YPENGTTNYQDPEPWRYCEVDYEASEGISDYRGNTFGPVGVTTV------GDFPDYF
E Y D G D
In 2002, Shimono and coworkers (62) proposed another alignment for this region from some chitosanases
belonging to families 46 and 80:
MAT_CHIT 119 YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV-GD 164
STR_N174 60 SAENSSLDWKAQ--YKYIED-------IGDGRGYTGGIIGFCSGTGD 97
NOC_N106 61 SAENSSLDWKSQ--YKYIED-------IKDGRGYTAGIIGFCSGTGD 98
BAC_CIRC 77 KPEQDDLNWIKY--YGYCED-------IEDERGYTIGLFGATTG-GS 113
PBCV-1 105 LPENSTTQWWKN--YNYASC-------LKDGRGWTVTIYGACSGTGR 142
E Y D RG T G
This alignment highlights an important difference between families 80 and 46 discovered by Shimono et al. (62):
the best candidate for a catalytic residue in the chitosanase from Matsuebacter chitosanotabidus is Glu141, an amino acid
that does not seem to align with the catalytic residue from family 46 chitosanases (Asp80 in Streptomyces sp. N174). Without
any evidence from crystallography, it is difficult, at present, to decide which alignment is a better representation of chitosanase
sequence relationships.
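The relationship between Glu141 and Asp80 can be checked directly on the MAT_CHIT and STR_N174 rows of the alignment by Shimono and coworkers, by converting residue numbers to alignment columns and back. The following is a minimal sketch with hypothetical helper names; the two rows and their starting numbers (119 and 60) are copied from that alignment, and columns are counted from 0.

```python
from typing import Optional, Tuple

GAP = "-"

# Rows copied from the alignment proposed by Shimono and coworkers.
MAT_CHIT = "YPENGTTNYQEVGPWRYCEVDYEAAQGISDYRGDTFGPVGVTTV-GD"  # starts at residue 119
STR_N174 = "SAENSSLDWKAQ--YKYIED-------IGDGRGYTGGIIGFCSGTGD"  # starts at residue 60


def residue_to_column(aligned: str, start: int, residue_number: int) -> int:
    """Return the 0-based alignment column that holds the given residue."""
    number = start - 1
    for column, symbol in enumerate(aligned):
        if symbol != GAP:
            number += 1
            if number == residue_number:
                return column
    raise ValueError("residue number is outside this aligned fragment")


def column_to_residue(aligned: str, start: int, column: int) -> Optional[Tuple[int, str]]:
    """Return (residue number, letter) at a column, or None if the column is a gap."""
    if aligned[column] == GAP:
        return None
    residues_before = sum(1 for symbol in aligned[:column] if symbol != GAP)
    return start + residues_before, aligned[column]


if __name__ == "__main__":
    glu141_column = residue_to_column(MAT_CHIT, 119, 141)
    print(glu141_column, column_to_residue(STR_N174, 60, glu141_column))
    asp80_column = residue_to_column(STR_N174, 60, 80)
    print(asp80_column, column_to_residue(MAT_CHIT, 119, asp80_column))
```

Under this alignment, Glu141 of MAT_CHIT falls in a column where STR_N174 has a gap, whereas Asp80 of STR_N174 lies opposite Asp148 of MAT_CHIT, which is consistent with the observation that the two candidate catalytic residues do not align.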