This is just a post for disseminating information. While researching another post, I came across this paper, released in December of last year (2003). The results are pretty interesting: about 8,000 mapped processed pseudogenes in all, compared to about 30,000 functional genes in the human genome. According to the abstract, the results can be seen at Pseudogene.org.
Abstract:
Genome Res. 2003 Dec;13(12):2541-58.
Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome.
Zhang Z, Harrison PM, Liu Y, Gerstein M.
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520-8114, USA.
Processed pseudogenes were created by reverse-transcription of mRNAs; they provide snapshots of ancient genes existing millions of years ago in the genome. To find them in the present-day human, we developed a pipeline using features such as intron-absence, frame-disruption, polyadenylation, and truncation. This has enabled us to identify in recent genome drafts approximately 8000 processed pseudogenes (distributed from Pseudogene.org). Overall, processed pseudogenes are very similar to their closest corresponding human gene, being 94% complete in coding regions, with sequence similarity of 75% for amino acids and 86% for nucleotides. Their chromosomal distribution appears random and dispersed, with the numbers on chromosomes proportional to length, suggesting sustained "bombardment" over evolution. However, it does vary with GC-content: Processed pseudogenes occur mostly in intermediate GC-content regions. This is similar to Alus but contrasts with functional genes and L1-repeats. Pseudogenes, moreover, have age profiles similar to Alus. The number of pseudogenes associated with a given gene follows a power-law relationship, with a few genes giving rise to many pseudogenes and most giving rise to few. The prevalence of processed pseudogenes agrees well with germ-line gene expression. Highly expressed ribosomal proteins account for approximately 20% of the total. Other notables include cyclophilin-A, keratin, GAPDH, and cytochrome c.
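The power-law statement in the abstract is easier to picture with a small example. Below is a minimal Python sketch of one way to check for that kind of relationship: count pseudogenes per parent gene, tabulate how many genes have each count, and take the slope of the log-log plot. This is not the authors' method, which the abstract does not describe. The function names are made up for illustration, the data are synthetic (built so that the number of genes with k pseudogenes falls off as 1/k^2), and a least-squares fit on log-log axes is only a rough diagnostic.

```python
import math
from collections import Counter

def pseudogenes_per_gene(parent_of):
    """Map each parent gene to the number of pseudogenes assigned to it."""
    return Counter(parent_of.values())

def size_distribution(per_gene):
    """Return {k: number of genes that have exactly k pseudogenes}."""
    return Counter(per_gene.values())

def loglog_slope(dist):
    """Least-squares slope of log(number of genes) against log(k)."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(n) for n in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic toy data, NOT taken from the paper or from Pseudogene.org:
# 64 genes with 1 pseudogene, 16 with 2, 4 with 4, and 1 with 8.
parent_of = {}
for k in (1, 2, 4, 8):
    for g in range(64 // (k * k)):
        gene = f"gene_k{k}_{g}"
        for p in range(k):
            parent_of[f"{gene}_pg{p}"] = gene

per_gene = pseudogenes_per_gene(parent_of)
dist = size_distribution(per_gene)
slope = loglog_slope(dist)

print(sorted(dist.items()))  # [(1, 64), (2, 16), (4, 4), (8, 1)]
print(round(slope, 3))       # close to -2 by construction
```

With real data, `parent_of` would be built from a table that pairs each pseudogene with its closest functional gene. A roughly straight line on the log-log plot is what "a few genes giving rise to many pseudogenes and most giving rise to few" looks like in practice.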