Carbohydrate-binding module
In molecular biology, a carbohydrate-binding module (CBM) is a protein domain found in carbohydrate-active enzymes (for example glycoside hydrolases). The majority of these domains have carbohydrate-binding activity. Some of these domains are found on cellulosomal scaffoldin proteins. CBMs were previously known as cellulose-binding domains. CBMs are classified into numerous families, based on amino acid sequence similarity. As of June 2011, there were 64 families of CBM in the CAZy database.
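As a rough illustration of what "sequence similarity" can mean in practice, the sketch below computes percent identity between two sequences that have already been aligned. It is only a minimal example: the function name and the toy sequences are hypothetical, the sequences are made up and are not real CBM sequences, and the article does not state which similarity measure or thresholds CAZy actually uses to assign families.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    # Return 0.0 when there are no comparable columns, to avoid dividing by zero.
    return 100.0 * matches / compared if compared else 0.0


# Toy, made-up aligned fragments (assumption: NOT real CBM sequences).
toy_a = "ACWG-KCT"
toy_b = "ACWGQRCT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Gap columns are excluded from the denominator here; other conventions exist, and the choice affects the reported value.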

CBMs of microbial glycoside hydrolases play a central role in the recycling of photosynthetically fixed carbon through their binding to specific plant structural polysaccharides. CBMs can recognise both crystalline and amorphous cellulose forms. CBMs are the most common non-catalytic modules associated with enzymes active in plant cell-wall hydrolysis. Many putative CBMs have been identified by amino acid sequence alignments, but only a few representatives have been shown experimentally to have a carbohydrate-binding function.

CBM1

Carbohydrate-binding module family 1 (CBM1) consists of 36 amino acids. This domain contains four conserved cysteine residues which are involved in the formation of two disulfide bonds.

CBM2

Carbohydrate-binding module family 2 (CBM2) contains two conserved cysteines, one at each extremity of the domain, which have been shown to be involved in a disulfide bond. There are also four conserved tryptophans, two of which are involved in cellulose binding.

CBM3

Carbohydrate-binding module family 3 (CBM3) is involved in cellulose binding and is found associated with a wide range of bacterial glycosyl hydrolases. The structure of this domain is known; it forms a beta sandwich.

CBM4

Carbohydrate-binding module family 4 (CBM4) includes the two cellulose-binding domains, CBD(N1) and CBD(N2), arranged in tandem at the N terminus of the 1,4-beta-glucanase, CenC, from Cellulomonas fimi. These homologous CBMs are distinct in their selectivity for binding amorphous rather than crystalline cellulose. Multidimensional heteronuclear nuclear magnetic resonance (NMR) spectroscopy was used to determine the tertiary structure of the 152-amino-acid N-terminal cellulose-binding domain from C. fimi 1,4-beta-glucanase CenC (CBDN1). The tertiary structure of CBDN1 is strikingly similar to that of the bacterial 1,3-1,4-beta-glucanases, as well as other sugar-binding proteins with jelly-roll folds. CBM4 and CBM9 are closely related.

CBM5

Carbohydrate-binding module family 5 (CBM5) binds chitin. CBM5 and CBM12 are distantly related.

CBM6

Carbohydrate-binding module family 6 (CBM6) is unusual in that it contains two substrate-binding sites, cleft A and cleft B. Cellvibrio mixtus endoglucanase 5A contains two CBM6 domains; the CBM6 domain at the C-terminus displays distinct ligand-binding specificities in each of the substrate-binding clefts. Both cleft A and cleft B can bind cello-oligosaccharides; laminarin preferentially binds in cleft A; xylooligosaccharides bind only in cleft A; and beta-1,4-beta-1,3-mixed linked glucans bind only in cleft B.
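The cleft specificities described for the C-terminal CBM6 of Cellvibrio mixtus endoglucanase 5A can be laid out as a small lookup table. The sketch below is only a restatement of the paragraph in data form: the names `CLEFT_BINDING` and `ligands_for` are hypothetical, and laminarin is listed under cleft A alone because the text reports only a preference for that cleft.

```python
# Ligand -> clefts, transcribed from the description of the C-terminal CBM6
# of Cellvibrio mixtus endoglucanase 5A. Names are illustrative only.
CLEFT_BINDING = {
    "cello-oligosaccharides": ("A", "B"),                # binds in both clefts
    "laminarin": ("A",),                                 # preferential for cleft A
    "xylooligosaccharides": ("A",),                      # cleft A only
    "beta-1,4-beta-1,3-mixed linked glucans": ("B",),    # cleft B only
}


def ligands_for(cleft: str) -> list[str]:
    """Return the ligands recorded for a given cleft, in alphabetical order."""
    return sorted(
        ligand for ligand, clefts in CLEFT_BINDING.items() if cleft in clefts
    )


for cleft in ("A", "B"):
    print(cleft, ligands_for(cleft))
```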

CBM9

Carbohydrate-binding module family 9 (CBM9) binds to crystalline cellulose. CBM4 and CBM9 are closely related.

CBM10

Carbohydrate-binding module family 10 (CBM10) is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates), but in anaerobic fungi they are protein-binding domains, referred to as dockerin domains. The dockerin domains are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.

In anaerobic bacteria that degrade plant cell walls, exemplified by Clostridium thermocellum, the dockerin domains of the catalytic polypeptides can bind equally well to any cohesin from the same organism. More recently, anaerobic fungi, typified by Piromyces equi, have been suggested to also synthesise a cellulosome complex, although the dockerin sequences of the bacterial and fungal enzymes are completely different. For example, the fungal enzymes contain one, two or three copies of the dockerin sequence in tandem within the catalytic polypeptide. In contrast, all the C. thermocellum cellulosome catalytic components contain a single dockerin domain. The anaerobic bacterial dockerins are homologous to EF hands (calcium-binding motifs) and require calcium for activity, whereas the fungal dockerin does not require calcium. Finally, the interaction between cohesin and dockerin appears to be species specific in bacteria, whereas there is almost no species specificity of binding within fungal species and no identified sites that distinguish different species.

The structure of dockerin from P. equi contains two helical stretches and four short beta-strands which form an antiparallel sheet structure adjacent to an additional short twisted parallel strand. The N- and C-termini are adjacent to each other.

CBM11

Carbohydrate-binding module family 11 (CBM11) is found in a number of bacterial cellulases. One example is the CBM11 of Clostridium thermocellum Cel26A-Cel5E; this domain has been shown to bind both β-1,4-glucan and β-1,3-1,4-mixed linked glucans. CBM11 has a beta-sandwich structure with a concave side forming a substrate-binding cleft.

CBM12

Carbohydrate-binding module family 12 (CBM12) comprises two beta-sheets, consisting of two and three antiparallel beta strands respectively. It binds chitin via the aromatic rings of tryptophan residues. CBM5 and CBM12 are distantly related.

CBM14

Carbohydrate-binding module family 14 (CBM14) is also known as the peritrophin-A domain. It is found in chitin-binding proteins, particularly the peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. It is an extracellular domain that contains six conserved cysteines that probably form three disulfide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains.

CBM15

Carbohydrate-binding module family 15 (CBM15), found in bacterial enzymes, has been shown to bind to xylan and xylooligosaccharides. It has a beta-jelly roll fold, with a groove on the concave surface of one of the beta-sheets.

CBM17

Carbohydrate-binding module family 17 (CBM17) appears to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs. Sequence and structural conservation in families CBM17 and CBM28 suggests that they have evolved through gene duplication and subsequent divergence. CBM17 does not compete with CBM28 modules when binding to non-crystalline cellulose. Different CBMs have been shown to bind to different sites in amorphous cellulose; CBM17 and CBM28 recognise distinct non-overlapping sites.

CBM18

Carbohydrate-binding module family 18 (CBM18) (also known as chitin binding 1 or chitin recognition protein) is found in a number of plant and fungal proteins that bind N-acetylglucosamine (e.g. solanaceous lectins of tomato and potato, plant endochitinases, the wound-induced proteins hevein, win1 and win2, and the Kluyveromyces lactis killer toxin alpha subunit). The domain may occur in one or more copies and is thought to be involved in recognition or binding of chitin subunits. In chitinases, as well as in the potato wound-induced proteins, this 43-residue domain directly follows the signal sequence and is therefore at the N terminus of the mature protein; in the killer toxin alpha subunit it is located in the central section of the protein.

CBM19

Carbohydrate-binding module family 19 (CBM19), found in fungal chitinases, binds chitin.

CBM21

Carbohydrate-binding module family 21 (CBM21), found in many eukaryotic proteins involved in glycogen metabolism, binds to glycogen.

CBM25

Carbohydrate-binding module family 25 (CBM25) binds alpha-glucooligosaccharides, particularly those containing alpha-1,6 linkages, and granular starch.

CBM27

Carbohydrate-binding module family 27 (CBM27) binds to beta-1,4-mannooligosaccharides, carob galactomannan, and konjac glucomannan, but not to cellulose (insoluble and soluble) or soluble birchwood xylan. CBM27 adopts a beta sandwich structure comprising 13 beta strands with a single, small alpha-helix and a single metal atom.

CBM28

Carbohydrate-binding module family 28 (CBM28) does not compete with CBM17 modules when binding to non-crystalline cellulose. Different CBMs have been shown to bind to different sites in amorphous cellulose; CBM17 and CBM28 recognise distinct non-overlapping sites. CBM28 has a "beta-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families CBM17 and CBM28 suggests that they have evolved through gene duplication and subsequent divergence.

CBM33

Carbohydrate-binding module family 33 (CBM33) is a chitin-binding domain. It has a budded fibronectin type III fold consisting of two beta-sheets, arranged as a beta-sheet sandwich, and a bud consisting of three short helices, located between beta-strands 1 and 2. It binds chitin via conserved polar amino acids. This domain is found in isolation in baculoviral spheroidin and spindolin proteins.

CBM48

Carbohydrate-binding module family 48 (CBM48) is often found in enzymes containing glycosyl hydrolase family 13 catalytic domains. It is found in a range of enzymes that act on branched substrates, i.e. isoamylase, pullulanase and branching enzyme. Isoamylase hydrolyses 1,6-alpha-D-glucosidic branch linkages in glycogen, amylopectin and dextrin; 1,4-alpha-glucan branching enzyme functions in the formation of 1,6-glucosidic linkages of glycogen; and pullulanase is a starch-debranching enzyme. CBM48 binds glycogen.

CBM49

Carbohydrate-binding module family 49 (CBM49) is found at the C-terminus of cellulases, and in vitro binding studies have shown that it binds to crystalline cellulose.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 