For each contig, the cDNA containing the largest transcript was identified. These, together with all singleton cDNAs, were used to construct a unigene set of 8,950 sequences. The relative contribution of each cDNA library to the pool of identified ESTs is summarized in Table 2. Notably, the distribution of ESTs across the original cDNA libraries was not uniform: the highest proportion of sequences was associated with endosperm tissue and the lowest with 8-day-old embryo. EST sequences were analyzed with the BLAST2GO software. In a first phase, homology searches against public-domain non-redundant databases identified significantly homologous sequences for 48.4% of the ESTs considered. These ESTs comprised 3,090 single-hit and 1,240 multiple-hit sequences.
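The selection rule described above can be illustrated with a minimal sketch. The contig and cDNA identifiers, the transcript lengths, and the function name `build_unigene_set` are hypothetical and serve only as an example; they are not the data or the software used in this study. The final lines check that the reported hit counts agree with the stated 48.4%, assuming the "ESTs considered" are the 8,950 unigene sequences.

```python
from typing import Dict, List, Tuple

# Hypothetical toy input: contig ID -> list of (cDNA ID, transcript length in bp).
# Identifiers and lengths are illustrative only, not data from the study.
Contigs = Dict[str, List[Tuple[str, int]]]


def build_unigene_set(contigs: Contigs, singletons: List[str]) -> List[str]:
    """Keep one representative cDNA per contig (the one with the largest
    transcript) and add every singleton cDNA unchanged."""
    representatives = [
        max(members, key=lambda member: member[1])[0]  # largest transcript wins
        for members in contigs.values()
    ]
    return representatives + list(singletons)


contigs: Contigs = {
    "contig_A": [("cdna_A1", 640), ("cdna_A2", 1210), ("cdna_A3", 980)],
    "contig_B": [("cdna_B1", 870), ("cdna_B2", 450)],
}
singletons = ["cdna_S1", "cdna_S2"]

unigenes = build_unigene_set(contigs, singletons)
print(unigenes)  # ['cdna_A2', 'cdna_B1', 'cdna_S1', 'cdna_S2']

# Consistency check using only figures reported in the text.
# Assumption: the percentage is taken over the 8,950 unigene sequences.
single_hit, multiple_hit, unigene_total = 3090, 1240, 8950
with_hits = single_hit + multiple_hit
print(with_hits, round(100 * with_hits / unigene_total, 1))  # 4330 48.4
```

Under this assumption, the 4,330 sequences with at least one hit correspond to 48.4% of the unigene set, in agreement with the reported value.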
In a second phase, an attempt was made to associate biological processes with each of the ESTs showing sequence homology, using the Gene Ontology and KEGG databases. Approximately 85% of these unigenes could be assigned a functional annotation, with the remainder having an ambiguous or unknown function. Figure 2 summarizes the assignment of biological processes and molecular functions. Twenty-four distinct groups were identified, reflecting the complex regulatory hierarchies that orchestrate the dynamic metabolic, transport, and control processes occurring in developing endosperm. This classification is consistent with the many functions of maize endosperm and is comparable with that reported by other workers.
It appears that our maize endosperm gene set is rather comprehensive and provides a good representation of the entire transcriptome, including genes linked to the accumulation of storage products and to energy supply. More specifically, a large number of transcripts appeared to be involved in carbohydrate metabolism, followed by those participating in storage protein synthesis, translation and transcription, nucleotide metabolism, and RNA processing. Among physiological processes, transcripts implicated in protein turnover, energy metabolism, electron transport, amino acid metabolism, amino acid and sugar transport (the latter being intrinsically linked to the accumulation of storage protein and starch), nucleic acid metabolism, lipid and fatty acid metabolism, and secondary metabolites were represented in our EST collection.
Moreover, genes encoding proteins involved in cell wall, cytoskeleton, and stress and defence functions appear to be related to relevant cellular processes assigned in the functional classification. Finally, the assignment of other important classes of transcripts, such as DNA and protein folding, transcription regulators, and signal transducers, provides new perspectives for data mining and for studies of coordinated gene regulation in developing maize endosperm.