| Project Description |
|
The primary goal of this project is to demonstrate the utility of a novel technology, 'massively parallel signature sequencing' (MPSS), for quantifying gene expression in plants. MPSS rapidly produces 17-bp sequence tags that represent the population of messenger RNAs in a given tissue. Each 17-bp tag is derived from the 3' end of a messenger RNA, or 'transcript', and provides a virtually unique, experimentally derived identifier for the expressed gene. The number of identical tags in a library reflects the expression level of the corresponding gene. MPSS sequence data thus provide quantitative, or 'digital', expression information for the entire 'transcriptome', avoiding problems inherent in microarray analysis such as cross-hybridization, pre-selection of probe sequences and low signal. Statistical methods for the analysis of quantitative expression data have demonstrated that these data are robust. MPSS sequence data are most informative when the tags are compared to either a completely sequenced genome or a large collection of ESTs. One library has been generated from the grape cultivar Cabernet Sauvignon (Stage II berries). To take full advantage of the MPSS technology, the MPSS tags were compared to the EST databases produced by UC Davis (Cook lab). The comparison identifies the individual ESTs from which the tags are most likely derived; it is possible that some tags are derived from, or also occur in, genes that have not been identified among the ESTs. The MPSS data can be used to quantify and confirm EST expression in grape berries. |
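The tag-to-EST comparison described above can be illustrated with a minimal Python sketch. It is not the pipeline used for this project: the function names, the toy tags and the EST identifiers are hypothetical, and matching is assumed to be an exact, sense-strand substring search. Tags with no hit correspond to genes that may be missing from the EST collection, and tags with several hits correspond to sequences shared by more than one EST.

```python
from collections import Counter


def count_tags(signatures):
    """Collapse raw signatures into a {tag: count} table."""
    return Counter(s.upper() for s in signatures)


def match_tags_to_ests(tag_counts, ests):
    """Return {tag: [ids of ESTs whose sequence contains the tag]}.

    Assumption: exact, sense-strand substring matching only.
    """
    upper = {est_id: seq.upper() for est_id, seq in ests.items()}
    return {
        tag: [est_id for est_id, seq in upper.items() if tag in seq]
        for tag in tag_counts
    }


# Hypothetical toy data: three 17-bp tags and three short "ESTs".
TAG1 = "GATCAAGGCTTACGTAC"
TAG2 = "GATCTTTGGACCATGCA"
TAG3 = "GATCGGGAAATTTCCCA"
signatures = [TAG1] * 3 + [TAG2] * 2 + [TAG3]
ests = {
    "EST_A": "CCGGTT" + TAG1 + "AAAAAA",
    "EST_B": "ACGT" + TAG2 + "TTAA",
    "EST_C": "GG" + TAG2 + "CC",
}

counts = count_tags(signatures)
hits = match_tags_to_ests(counts, ests)
for tag, n in counts.most_common():
    print(tag, n, hits[tag] or "no EST match")
```

A real comparison would also need to handle reverse-complement ESTs and sequencing errors, which this sketch deliberately ignores.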
| In Vitro Cloning On Beads |
|
Solexa (www.solexa.com; Hayward, CA) has invented technologies to clone cDNAs on beads and to sequence, in parallel, hundreds of thousands of tags from these cDNAs. A complex mixture of cDNAs is cloned onto microbeads such that the representation of molecules on the beads is identical to that in the original sample. The beads can then be queried, sorted and sequenced. To clone cDNAs on beads, the Solexa "Megaclone" technology uses a system of uniquely hybridizing tags and anti-tags. The anti-tags are 32-mers synthesized directly on 5-µm microbeads by "mix-and-divide" combinatorial synthesis, in which 4-bp "words" are added in 8 steps (Brenner et al., 2000b). Each bead then carries ~10^6 copies of a single, unique oligomer (the "anti-tag"). The combinatorial synthesis produces up to 17 million unique oligomers. In a separate cloning step, complementary "tags" are appended to the molecules of a cDNA library, which is then amplified by PCR. Each cDNA molecule receives a unique tag; the amplification product contains thousands of copies of each cDNA-tag hybrid. Importantly, the abundance of cDNA-tag molecules remains proportional to the representation of each cDNA in the original tissue. The 32-mers that comprise the tags and anti-tags are designed so that the entire set of ~17 million unique tags shares a single Tm. At 68°C, mismatched tag:anti-tag pairs dissociate, while perfect matches do not. When the amplified cDNA and the beads are mixed at 68°C, thousands of copies of each tagged cDNA hybridize to their cognate anti-tagged bead and are later ligated to it. In this way, a library of millions of cDNAs is transferred to beads, with each bead carrying thousands of copies of a single cDNA (Brenner et al., 2000b). |
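The size of the anti-tag repertoire follows from the combinatorics of mix-and-divide synthesis. The Python sketch below assumes a vocabulary of eight 4-bp words per step; that number is an assumption inferred from the ~17 million figure (8^8 = 16,777,216), and the word sequences in the small demonstration are hypothetical placeholders, not the published word set.

```python
WORDS_PER_STEP = 8  # assumption: inferred from the ~17 million figure
STEPS = 8           # from the text: words are added in 8 steps
WORD_LENGTH = 4     # from the text: 4-bp words

n_antitags = WORDS_PER_STEP ** STEPS
antitag_length = STEPS * WORD_LENGTH
print(n_antitags, antitag_length)


def mix_and_divide(words, steps):
    """Enumerate every oligomer made by `steps` rounds of split-and-pool synthesis.

    In each round the growing oligomers are divided among the words,
    extended by one word, and pooled again.
    """
    pools = [""]
    for _ in range(steps):
        pools = [oligo + word for oligo in pools for word in words]
    return pools


# Small demonstration with two hypothetical words and three steps.
demo = mix_and_divide(["AATC", "TTAC"], 3)
print(len(demo), demo[0], demo[-1])
```

Enumerating the full repertoire is possible in principle but unnecessary here; the closed-form count is enough to recover the figures quoted in the text.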
| Massively Parallel Signature Sequencing (MPSS) |
|
A critical advance in bead-based expression analysis is the ability to sequence 'tags' on hundreds of thousands of microbeads simultaneously. This 'Massively Parallel Signature Sequencing' (MPSS) technology has the potential to define all the expressed genes in a given tissue; the data reflect individual gene expression levels and represent an essentially unbiased count of the mRNAs in that tissue. The results are similar to those of SAGE expression analysis, although the tags are longer and therefore more specific. The number of tags obtained per library is extremely large (>1,000,000), making the technology sensitive to genes expressed at low levels. In the MPSS procedure, a sequence signature of ~16-20 bp is identified from each bead; routinely, 17 bp of high-quality sequence are obtained (Brenner et al., 2000a). This is performed in parallel, and approximately 1,000,000 sequence signatures are obtained per experiment (Brenner et al., 2000a; Solexa, unpublished). Although it is also a 'tag', the signature is distinct from the tag/anti-tag attached when the cDNA is cloned on a bead; the signature is derived from the 3'-most Sau3A site 5' to the poly-A site of the original cDNA molecule. (The words "tag" and "signature" will be used interchangeably from this point; the tag/anti-tag described above is used only in the cloning step and is irrelevant for the rest of the proposal.) The sequencing reaction identifies sets of four bases by hybridization to one of 256 fluorescently labeled linker-probes and then removes that set of four bases using a type IIS restriction enzyme whose recognition site is contained in the linker (Brenner et al., 2000a). The restriction enzyme binds to the linker-probe but cleaves within the cDNA, exposing the next four bases for decoding. 
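The origin of the signature can be mimicked in silico. The following Python sketch predicts the signature of a known cDNA by locating its 3'-most Sau3A site (GATC) and reading 17 bp from there. It assumes that the signature includes the GATC itself and that a site lying too close to the 3' end yields no usable signature; the function name and the example sequence are hypothetical.

```python
def extract_signature(cdna, site="GATC", length=17):
    """Return the signature starting at the 3'-most `site`, or None.

    Assumptions: the signature includes the restriction site itself,
    and a site with fewer than `length` bases downstream gives no signature.
    """
    seq = cdna.upper()
    pos = seq.rfind(site)  # 3'-most occurrence of the site
    if pos == -1 or pos + length > len(seq):
        return None
    return seq[pos:pos + length]


# Hypothetical cDNA with two GATC sites followed by a poly-A tail.
cdna = "ATGGATCCCGTTAGCTAGGATCAAGGCTTACGTACTTGAAAAAAAAAA"
print(extract_signature(cdna))
```

Only the downstream site contributes to the predicted signature; the upstream GATC is ignored, which is why a single transcript normally yields a single tag.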
These reactions occur in consecutive steps while the beads are immobilized in a flow cell beneath a high-power microscope, so that the reagents flow over and around the beads and no gels or capillaries are required. The image files of the fluorescence from each step are then processed to derive the complete signature for each bead. The procedure is completely parallel, facilitating large-scale sequencing; the efficiency of some steps may be low, but starting with several million beads, enough full-length tags are produced (e.g. ~1,000,000) to approach saturation of a given library in just a few days of sequencing. For an animated explanation of MPSS and cloning on beads, please see the 'technology' web page at Solexa, Inc. Like the sequence data contained in ESTs, data derived from MPSS experiments have many uses. The expression level of a particular gene can be determined quantitatively; the counted frequency of its tag is representative of the expression level of the gene in the analyzed tissue. The completion of genomes such as those of yeast, C. elegans, and rice permits the direct comparison of tags to genomic sequence and further extends the utility of MPSS data. Genes are identified and their transcriptional activity assessed by aligning the tags to genomic sequence. The location of the polyadenylation site of each transcript can be determined to within ~256 bp, because the tag is derived from the 4-bp restriction site nearest the poly-A site on its 5' side, and a given 4-bp site occurs on average once every 4^4 = 256 bp in random sequence. Several distinct tags matching different sites within a single gene are indicative of alternative 3' termination. With MPSS, differential expression may be detected simply by sequencing entire libraries and comparing them, without hybridization or the sorting of beads. Libraries are derived from distinct tissues or treatments. Each library of cDNAs-on-beads is sequenced to sufficient depth that the transcript count for a given gene can be compared among libraries. 
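Because libraries differ in the total number of tags sequenced, raw counts are usually rescaled before libraries are compared. The Python sketch below expresses each tag as transcripts per million (TPM); this normalization is an assumption introduced for illustration and is not specified in the text, and the library contents are hypothetical.

```python
def to_tpm(tag_counts):
    """Rescale raw tag counts to transcripts per million (TPM)."""
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {tag: 1_000_000 * n / total for tag, n in tag_counts.items()}


# Hypothetical raw counts for the same two tags in two libraries.
lib_a = {"GATCAAGGCTTACGTAC": 250, "GATCTTTGGACCATGCA": 750}
lib_b = {"GATCAAGGCTTACGTAC": 100, "GATCTTTGGACCATGCA": 100}

tpm_a = to_tpm(lib_a)
tpm_b = to_tpm(lib_b)
for tag in lib_a:
    print(tag, tpm_a[tag], tpm_b[tag])
```

After rescaling, the first tag accounts for a quarter of library A but half of library B, a difference that raw counts of 250 and 100 would obscure.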
Basic statistical tests are then used to identify genes that are present in significantly different amounts in two libraries. Quantitative methods for the analysis of tag frequencies and detection of differences among libraries have been published and incorporated into public databases for SAGE data (Audic and Claverie, 1997; Greller and Tobin, 1999; Lash et al., 2000; Stekel et al., 2000). |
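As one concrete example of such a test, the statistic of Audic and Claverie (1997) gives the probability of observing y tags for a gene in a library of N2 tags, given x tags in a library of N1 tags: p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The Python sketch below is an illustrative implementation, not the code used by any of the cited resources; the function names are hypothetical, the p-value is a one-sided tail, and no multiple-testing correction is applied.

```python
from math import exp, lgamma, log


def ac_prob(y, x, n1, n2):
    """P(y | x): y tags in library 2 (n2 tags total) given x in library 1 (n1 total)."""
    r = n2 / n1
    log_p = (
        y * log(r)
        + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(1 + r)
    )
    return exp(log_p)


def ac_pvalue(x, y, n1, n2):
    """One-sided tail: the smaller of P(Y <= y | x) and P(Y >= y | x)."""
    below = sum(ac_prob(k, x, n1, n2) for k in range(y))  # P(Y < y | x)
    lower = below + ac_prob(y, x, n1, n2)
    upper = max(0.0, 1.0 - below)  # guard against float round-off
    return min(lower, upper)


# Hypothetical counts for one tag in two libraries of 1,000,000 tags each.
print(ac_pvalue(10, 40, 1_000_000, 1_000_000))  # large difference
print(ac_pvalue(10, 11, 1_000_000, 1_000_000))  # similar counts
```

Working in log space with `lgamma` avoids overflow in the factorials when tag counts reach the hundreds or thousands.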
| References |
|
Audic, S. and Claverie, J.-M. (1997) The significance of digital gene expression profiles. Genome Res. 7:986-995.
Brenner, S., Johnson, M., Bridgham, J., Golda, G., Lloyd, D.H., Johnson, D., Luo, S., McCurdy, S., Foy, M., Ewan, M., Roth, R., George, D., Eletr, S., Albrecht, G., Vermaas, E., Williams, S.R., Moon, K., Burcham, T., Pallas, M., DuBridge, R.B., Kirchner, J., Fearon, K., Mao, J., and Corcoran, K. (2000a) Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol. 18:630-634.
Brenner, S., Williams, S.R., Vermaas, E.H., Storck, T., Moon, K., McCollum, C., Mao, J.I., Luo, S., Kirchner, J.J., Eletr, S., DuBridge, R.B., Burcham, T., and Albrecht, G. (2000b) In vitro cloning of complex mixtures of DNA on microbeads: physical separation of differentially expressed cDNAs. Proc. Natl. Acad. Sci. USA 97:1665-1670.
Greller, L.D. and Tobin, F.L. (1999) Detecting selective expression of genes and proteins. Genome Res. 9:282-296.
Lash, A.E., Tolstoshev, C.M., Wagner, L., Schuler, G.D., Strausberg, R.L., Riggins, G.J., and Altschul, S.F. (2000) SAGEmap: a public gene expression resource. Genome Res. 10:1051-1060.
Stekel, D.J., Git, Y. and Falciani, F. (2000) The comparison of gene expression from multiple cDNA libraries. Genome Res. 10:2055-2061. |