EvC Forum: DNA is not English

Topic: DNA is not English

AnswersInGenitals

Member (Idle past 173 days)

Posts: 673

Joined: 07-20-2006

Message 9 of 26 (372389)
12-27-2006 2:16 AM

Reply to: Message 1 by platypus
12-26-2006 1:55 AM

Useful analogy, but just an analogy.

Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae we do not raed ervey lteter by itslef but the wrod as a wlohe.'

This is a demonstration that (written) language is actually very 'sloppy'. It is highly redundant and full of extraneous elements. This is actually very important to the utility of the language, particularly in the face of typos, transpositions and other insults. This is equally true of the DNA code and translation machinery. Creationists are fond of stating or implying that a protein has to have an exact sequence of amino acids to perform its function and that any deviation from this sequence renders the protein ineffectual.
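For anyone who wants to reproduce the scrambled-text effect quoted above, here is a minimal sketch. It assumes the simplest possible rule (shuffle the interior letters of each word, leave the first and last letters alone); the function names `scramble_word` and `scramble_text` are made up for this illustration.

```python
import random
import re

def scramble_word(word, rng):
    """Shuffle the interior letters of a word, keeping first and last in place."""
    if len(word) <= 3:
        # Words of three letters or fewer have nothing to rearrange.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def scramble_text(text, seed=0):
    """Scramble every run of letters in text; spaces and punctuation pass through."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)

print(scramble_text("According to a researcher at an English university"))
```

The exact output depends on the seed, so none is shown here; every word keeps its length, its letters, and its first and last characters.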

I often wonder how many of these creationists are diabetics. If they are diabetics, they most likely take daily insulin shots. The insulin they inject comes from cows or sheep from the slaughterhouse. (I also wonder how many animal rights activists are diabetic and use insulin from slaughtered animals.) The insulin from sheep and cows does not have the exact same sequence as human insulin, and each differs from human insulin, as well as from the other, in several amino acids. But they are perfectly functional for their intended purpose. Insulin is used in control of glucose metabolism across the entire range of animal species, and if you compare the AA sequences between several distantly related species, what molecular biologists call a 'homology map', you find that only a small fraction of the amino acids are common. There is even a considerable range of lengths to the insulin chain amongst the various species.

In very strong support of the evolutionary hypothesis is that the differences in the insulin sequences between species (and those of all other proteins) are comparable to their 'evolutionary' distance based on morphology and paleontology. In fact, these differences are used to estimate when species diverged along different evolutionary branches, and these estimates are in good general agreement with other dating methods.
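The kind of comparison described above can be sketched in a few lines. This is only a toy: the three peptide strings are invented for illustration (they are NOT real insulin sequences), they are assumed to be already aligned to the same length, and `count_differences` is a hypothetical helper name.

```python
from itertools import combinations

def count_differences(seq_a, seq_b):
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

# Made-up aligned peptide fragments, NOT real insulin sequences.
toy = {
    "species_A": "MKTAYIAKQR",
    "species_B": "MKTAYIVKQR",
    "species_C": "MKSAYLVKHR",
}

for name_1, name_2 in combinations(toy, 2):
    diffs = count_differences(toy[name_1], toy[name_2])
    identity = 1 - diffs / len(toy[name_1])
    print(f"{name_1} vs {name_2}: {diffs} differences, {identity:.0%} identical")
```

In this toy data A and B differ at one position while C differs from A at four, so A and B would be read as the more closely related pair. Real analyses first align the sequences and use far more careful distance models than a raw mismatch count.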

This message is a reply to:
	Message 1 by platypus, posted 12-26-2006 1:55 AM		platypus has not replied

Replies to this message:
	Message 14 by 12345, posted 12-28-2006 5:26 PM		AnswersInGenitals has replied

AnswersInGenitals

Message 15 of 26 (372639)
12-28-2006 6:43 PM

Reply to: Message 13 by Fosdick
12-27-2006 7:57 PM

Re: Genetic "language" and syntax

Maynard Smith & Eörs Szathmáry writes:

In language, the meanings of sentences depend on the rules of syntax. These rules are formal and logical. In contrast, the 'meaning' of the genetic message cannot be derived by logical reasoning.

I have to disagree with Maynard Smith & Eörs Szathmáry, although I do so with trepidation since I have an almost reverential respect for these two. One of the fastest growing and most important fields in molecular biology and genetics is called bioinformatics. One of the major thrusts of this field is to use a set of very high-powered computer programs to scan through DNA codes and do exactly what Maynard Smith & Eörs Szathmáry say cannot be done: derive by logical reasoning the genetic content of that code, the gene interactions, and to some extent the structure of the encoded proteins of the DNA sequence. This is possible precisely because the DNA code does have a fairly rigorous syntax, and a lot of the current work in genetics is devoted to deciphering that syntax.

The human genome has about six billion base pairs (the As, Ts, Gs, and Cs) encoding about 20,000 genes. The genes average about 1000 base pairs to encode the typical 300 amino acid protein. So, only 20 million of the six billion base pairs, or about 0.3% of the DNA, is used to encode proteins. Well, that isn't quite true. Many of the genes occur multiple times, there are also DNA sequences near these genes that regulate when the gene is expressed, several RNA molecules of various types are also encoded, and there are a few other short functional sequences. Still, only about 1.5% of the genome is used. Researchers have been able to excise DNA segments several million base pairs long from the genomes of fertilized mouse eggs, and the resultant adult mice appeared normal in all respects. In addition, that typical 1000 base pair gene sequence is hardly ever in one contiguous piece. It is almost always broken into several pieces, called exons because they are the expressed parts (that is, they provide the code for the protein that results from that gene), separated by non-coding base pair sequences, called introns, that can be up to several thousand base pairs long. (Note that the OP is in error on this point. An intron is not like a period. That is the function of the three stop codons. An intron is like what you get when your cat walks across your keyboard while you're typing.)
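The back-of-envelope figure at the start of the paragraph above can be checked directly. This sketch uses only the round numbers quoted in the post (six billion base pairs, 20,000 genes, about 1000 base pairs per gene, 300 amino acids per protein); the variable names are just labels for this illustration.

```python
# Round numbers as quoted in the post; treat them as rough estimates.
genome_bp = 6_000_000_000      # total base pairs
gene_count = 20_000            # protein-coding genes
bp_per_gene = 1_000            # average coding length per gene
amino_acids_per_protein = 300  # typical protein length

# Three bases (one codon) per amino acid, which is why ~1000 bp is enough.
min_bp_needed = amino_acids_per_protein * 3

coding_bp = gene_count * bp_per_gene
coding_fraction = coding_bp / genome_bp

print(f"{min_bp_needed} bp needed for {amino_acids_per_protein} amino acids")
print(f"{coding_bp:,} coding base pairs")
print(f"{coding_fraction:.2%} of the genome")
```

With those inputs the coding total is 20,000,000 base pairs, or one part in 300 of the genome, which rounds to the 0.3% quoted.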

If you read an original paper on the sequencing of the DNA for some critter, once you get past the pages listing all the authors of the paper, which often number over 100, you will find charts and diagrams listing the number of genes, the numbers of the various types of proteins and RNAs encoded, how they are controlled, where they are located in the sequence, where the introns are, and a lot of other information. How was this determined? By understanding (applying logical reasoning to) the grammar and syntax of the code. The introns have specific sequences at either end that couple to special complexes called spliceosomes so they can be removed from each gene and the exons joined to make the final messenger RNA. Genes have special initiator and terminator sequences and are controlled by regulatory regions with specific characteristics (that, for example, allow certain proteins called transcription factors to latch onto them to initiate or inhibit gene transcription).
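To give a flavor of what "applying logical reasoning to the syntax" can look like, here is a deliberately crude sketch that scans a DNA string for stretches beginning with the standard start codon ATG and ending at one of the three stop codons (TAA, TAG, TGA). It is an assumption-laden toy, not how real gene-finding software works: it reads only the forward strand, ignores introns and regulatory regions entirely, and the name `find_orfs` and the test string are invented for this example.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of simple open reading frames.

    Scans the three forward reading frames only. `end` is exclusive and
    includes the stop codon. `min_codons` counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and keep scanning
    return orfs

toy = "CCATGAAATTTGGGTAACC"  # made-up sequence for illustration
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

On the toy string this reports a single frame from index 2 to 17, the stretch ATGAAATTTGGGTAA.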

You will also find enumerated a list of "pseudogenes" that have the structure of genes but no initiator and so are never expressed, as well as genes that were spliced into the genome by a virus or a bacterium at some time in the species' history. (The human genome has 139 such insertions.) How can they determine that? Amongst other indicators, viral and bacterial genomes tend to be rich in C-G pairs (more than 50% C-G and less than 50% A-T pairs) while eukaryotic genomes are slightly rich in A-T pairs. So, while all species speak the same 'language', there are definite dialectal differences.
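Taking the post's rule of thumb as given, the C-G versus A-T 'dialect' check amounts to counting bases in a window and looking for regions whose composition stands out from their surroundings. A minimal sketch follows; the function names `gc_fraction` and `sliding_gc`, the window size, and the toy sequence are all assumptions for illustration.

```python
def gc_fraction(dna):
    """Fraction of A/C/G/T bases in dna that are G or C."""
    counted = [base for base in dna.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("no A/C/G/T bases found")
    return sum(base in "GC" for base in counted) / len(counted)

def sliding_gc(dna, window, step):
    """Return (offset, GC fraction) for each full window along the sequence."""
    return [
        (i, gc_fraction(dna[i:i + window]))
        for i in range(0, len(dna) - window + 1, step)
    ]

# Made-up sequence: an AT-only background with a GC-only block in the middle.
toy = "ATATATAT" + "GCGCGCGC" + "ATATATAT"
for offset, fraction in sliding_gc(toy, window=8, step=8):
    flag = "  <- composition differs from background" if fraction > 0.5 else ""
    print(f"{offset:3d}  GC={fraction:.0%}{flag}")
```

Here the middle window stands out at 100% GC against a 0% background. Real sequences are nowhere near this clean, and base composition is only one of the "other indicators" the post mentions.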

Much more can be said about this exciting field, such as how an experienced researcher can determine a great deal about the structure of the encoded protein by examining the DNA code, but the point is that the 'language' analogy goes very deep and includes syntax and grammar as well as alphabet and words. These rules are so functional and powerful that a genome is more poetry than prose.

Finally, I must confess that I don't have any background in this area; I'm merely a fascinated bystander with his nose pressed against the window, so I would greatly appreciate anyone more knowledgeable correcting or amplifying what I have posted.

This message is a reply to:
	Message 13 by Fosdick, posted 12-27-2006 7:57 PM		Fosdick has replied

Replies to this message:
	Message 24 by Fosdick, posted 12-29-2006 1:24 PM		AnswersInGenitals has not replied

AnswersInGenitals

Message 16 of 26 (372640)
12-28-2006 6:46 PM

Reply to: Message 14 by 12345
12-28-2006 5:26 PM

Re: Useful analogy, but just an analogy.

Thank you for the correction and update. Has bacterial and synthetic insulin totally replaced that derived from animals? In either case, my point is still made, even amplified, that the genomic language has a lot of flexibility and nuance in it, a characteristic of any 'real' language.

This message is a reply to:
	Message 14 by 12345, posted 12-28-2006 5:26 PM		12345 has replied

Replies to this message:
	Message 17 by 12345, posted 12-28-2006 7:26 PM		AnswersInGenitals has replied

AnswersInGenitals

Message 19 of 26 (372691)
12-29-2006 2:09 AM

Reply to: Message 18 by Chiroptera
12-28-2006 10:21 PM

What WOULD linguists say?

But the real question here, Chiro, is what, if anything, would linguists say? What would they add to our understanding of the usefulness of the analogy between genetic codes and languages: where is it enlightening, and where is it deceptive? Do you have any input from, or references to, the linguists' take on this issue? In particular, can the linguists shed any light on the very confusing and confused issue of discerning the information content of the code or a language? Whoops! I mentioned information content, which means we'll soon get a post from Brad McF. That's usually about when I bow out of a thread. Brad certainly adds some interesting insights to the evolution of language.

This message is a reply to:
	Message 18 by Chiroptera, posted 12-28-2006 10:21 PM		Chiroptera has not replied

AnswersInGenitals

Message 20 of 26 (372693)
12-29-2006 2:16 AM

Reply to: Message 17 by 12345
12-28-2006 7:26 PM

Re: Useful analogy, but just an analogy.

12345 writes:

The apple falls to the ground. Do we say that the mass of the apple is the code or word that is translated by the laws of physics into an acceleration?

But the apple doesn't fall to the ground. It just follows its world line through space as that space is warped by the mass of the ground (earth), until it is stopped by its interaction with the ground (or some Englishman's head). So the question is: are the geometric constraints of space analogous to language syntax? Or maybe the real question is: have we gotten off the road to total enlightenment and onto the road to total silliness?

This message is a reply to:
	Message 17 by 12345, posted 12-28-2006 7:26 PM		12345 has not replied

Replies to this message:
	Message 21 by alexcj, posted 12-29-2006 6:22 AM		AnswersInGenitals has replied

AnswersInGenitals

Message 22 of 26 (372762)
12-29-2006 12:37 PM

Reply to: Message 21 by alexcj
12-29-2006 6:22 AM

Re: Useful analogy, but just an analogy.

To pre-empt brad

I know Brad McF. I have read Brad McF's posts. And you, Sir, are no Brad McF. I understand everything you say in your post. In fact, I agree with everything you say.

This message is a reply to:
	Message 21 by alexcj, posted 12-29-2006 6:22 AM		alexcj has not replied

Date format: mm-dd-yyyy

Timezone: ET (US)
