CD44 is encoded by a single gene located on human chromosome 11 and mouse chromosome 2. The CD44 gene comprises 20 exons in both mice and humans (Figure 1b). The first five exons, which code for the extracellular domain, are constant, whereas the next ten exons are subject to alternative splicing, which generates the variable region. Note that exon 6 (or v1) is not expressed in humans. Constant exons 16 and 17, together with part of exon 15, encode the membrane-proximal region; constant exon 18 encodes the transmembrane domain. Differential utilization of exons 19 and 20 generates the short version (3 amino acids) and the long version (70 amino acids) of the cytoplasmic tail, respectively.
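
The exon layout described above can be sketched as a small data structure. This is only an illustrative model under stated assumptions: the names `assemble_isoform` and `variant_name` are hypothetical, labeling the variable exons 6-15 as v1-v10 is a naming convention assumed here, and tail usage is reduced to a single choice between exon 19 (short) and exon 20 (long), which ignores the actual splicing mechanics.

```python
# Exon layout as described in the text (exon numbers 1-20).
CONSTANT_5PRIME = [1, 2, 3, 4, 5]        # constant exons of the extracellular domain
VARIABLE = list(range(6, 16))            # exons 6-15: alternatively spliced variable region
MEMBRANE_PROXIMAL = [16, 17]             # constant exons of the membrane-proximal region
TRANSMEMBRANE = [18]                     # constant exon of the transmembrane domain
TAIL_EXON = {"short": 19, "long": 20}    # simplification: one exon per tail version


def variant_name(exon):
    """Map a variable exon number (6-15) to a variant label (assumed v1-v10)."""
    if exon not in VARIABLE:
        raise ValueError(f"exon {exon} is not in the variable region")
    return f"v{exon - 5}"


def assemble_isoform(variant_exons=(), tail="long", species="human"):
    """Return the ordered exon list of a hypothetical CD44 isoform.

    An empty `variant_exons` gives the standard form (CD44s).
    """
    chosen = sorted(set(variant_exons))
    for exon in chosen:
        if exon not in VARIABLE:
            raise ValueError(f"exon {exon} is not in the variable region")
        if species == "human" and exon == 6:
            raise ValueError("exon 6 is not expressed in humans")
    return (CONSTANT_5PRIME + chosen + MEMBRANE_PROXIMAL
            + TRANSMEMBRANE + [TAIL_EXON[tail]])


cd44s = assemble_isoform()                      # standard form, long tail
cd44v = assemble_isoform([8, 9, 10], "short")   # a variant form, short tail
print(cd44s)
print([variant_name(e) for e in cd44v if e in VARIABLE])
```

Keeping the constant blocks as fixed lists and passing only the variable exons as an argument mirrors the way the text separates the constant and alternatively spliced parts of the gene.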

Standard CD44 (CD44s), which lacks the entire variable region (Figure 1c), is preferentially expressed on hematopoietic cells, and is therefore also designated CD44H. Northern blot analysis of CD44 RNA isolated from different hematopoietic cells revealed three major transcripts in humans (~1.6, 2.2 and 4.8 kb) and mice (~1.6, 3.3 and 4.6 kb). Human CD44s mRNA is translated into 361 (mouse 363) amino acids. The predicted size of the core protein is 37-38 kDa. Post-translational modification doubles the molecular size of the human and mouse primary protein, bringing it to 80-95 kDa. The mature CD44s is a single-chain molecule composed of a distal extracellular domain, a membrane-proximal region (together containing 248 amino acids in humans and 250 amino acids in the mouse), a transmembrane domain (23 amino acids), and a cytoplasmic tail that is either short (3 amino acids) or long (70 amino acids), the long form being much more abundant. The N-terminal domain contains six cysteine residues, which possibly form one or more globular domains. This region, which includes the ligand (HA) binding sites of the molecule, displays ~30% homology with cartilage link protein and proteoglycan core protein, both of which also bind HA. The N-terminus, transmembrane domain and cytoplasmic tail, but not the membrane-proximal region, exhibit high (~80-90%) interspecies homology. The cytoplasmic tail contains several potential phosphorylation sites (Figure 1a).
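
The domain lengths quoted above can be checked with simple arithmetic. The sketch below sums the stated domain sizes and compares them with the stated translated lengths; the function names are hypothetical, and it assumes that the 361 and 363 amino acid figures refer to the long-tail form. In both species the listed domains fall 20 residues short of the translated length. The text does not assign these residues to a domain; one plausible reading, offered here only as an assumption, is that they correspond to a signal peptide that is absent from the mature molecule.

```python
# Lengths in amino acids, taken from the text.
DOMAINS = {
    "human": {"extracellular": 248, "transmembrane": 23, "translated": 361},
    "mouse": {"extracellular": 250, "transmembrane": 23, "translated": 363},
}
TAIL = {"short": 3, "long": 70}


def mature_length(species, tail="long"):
    """Sum of the domain lengths listed for mature CD44s."""
    d = DOMAINS[species]
    return d["extracellular"] + d["transmembrane"] + TAIL[tail]


def unassigned_residues(species):
    """Translated length minus the long-tail domain sum.

    Assumption: the translated length refers to the long-tail form.
    """
    return DOMAINS[species]["translated"] - mature_length(species, "long")


for species in ("human", "mouse"):
    print(species, mature_length(species, "long"), unassigned_residues(species))
```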

Differential utilization of the variable region exons yields at least 20 different CD44 isoforms; those expressing alternatively spliced exons are designated CD44v (v stands for variant) (Figure 1c). The CD44s and CD44v repertoire is further enriched by N- and O-glycosylation and by the attachment of glycosaminoglycans (heparan sulfate and chondroitin sulfate). The multi-structural nature of CD44 (its molecular weight ranges from 85 to 230 kDa) may extend the ligand inventory of this molecule and further increase its potential functions.

