Author Topic:   How do "novel" features evolve?
Percy
Member
Posts: 22392
From: New Hampshire
Joined: 12-23-2000
Member Rating: 5.2


Message 256 of 314 (662184)
05-13-2012 7:57 AM
Reply to: Message 255 by zaius137
05-12-2012 1:22 PM


Re: Information
Hi Zaius,
Your quoted definition is telling you the same thing we've already been telling you.
Your inability to understand these definitions and explanations of Shannon entropy stems from your assumption of an inverse relationship between randomness and information content. You think that the greater the randomness the less the information content. In fact the reverse is true.
An informally stated law of information theory is that you can't tell someone something he already knows. For example, if I already know it's lunchtime, then you have communicated no information when you tell me it's lunchtime.
If you want to know the exact positions of all the atoms in a fixed amount of gas in a square container at a point in time, because they are randomly oriented it would take a great many bits to communicate this information. The entropy is very high and so the amount of information needed to communicate all this random information is also very high.
But if you want to know the exact positions of the same number of atoms in a square crystal all you need to communicate is the position of one corner and the orientation. This is because the crystal has a regular non-random structure. Its entropy is very low, and the information communicated by sending the position of each and every atom is redundant and unnecessary, communicating little that was not already known.
More generally, if I already know the next bit is going to be a 1, then you have communicated no information when you tell me the next bit is 1. The information content is 0 bits and the entropy is 0 bits.
But if I have no idea whether the next bit will be 1 or 0 and you tell me it is 1, then the information content and the entropy is 1 bit.
If you're communicating the letters of words then the probability of the next letter depends upon the previous letter. If the previous letter is "b" and there are 9 letters that can follow "b", then I have a 1/9 chance of guessing the next letter, and so the entropy is high and the amount of information communicated is high when I receive the next letter.
Or say the previous letter was "z" and there are only 6 letters that can follow "z", then I have a 1/6 chance of guessing the next letter, and so the entropy is lower and the amount of information communicated is lower when I receive the next letter.
But if the previous letter is "q" then there is only one letter that can follow "q", and I have a 1/1 chance of guessing that next letter correctly. The entropy is 0 and the amount of information communicated when I receive "u" as the next letter is also 0.
As my odds of guessing the next letter have risen from 1/9 to 1/6 to 1/1, in other words as the randomness has declined, the entropy of each next letter and the information communicated has also declined. It is a direct relationship.
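To make these numbers concrete, here is a minimal Python sketch (added for illustration; it assumes, as the example above does, that every allowed next letter is equally likely). It computes the entropy of the next letter when 9, 6, or 1 continuations are possible.
[code]
import math

def entropy_uniform(n_choices):
    """Shannon entropy, in bits, of a uniform distribution over n_choices outcomes."""
    p = 1.0 / n_choices
    return sum(-p * math.log2(p) for _ in range(n_choices))

for prev, n in [("b", 9), ("z", 6), ("q", 1)]:
    print(f"after '{prev}': {n} possible next letter(s) -> {entropy_uniform(n):.3f} bits")

# after 'b': 9 possible next letter(s) -> 3.170 bits
# after 'z': 6 possible next letter(s) -> 2.585 bits
# after 'q': 1 possible next letter(s) -> 0.000 bits
[/code]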
--Percy
Edited by Percy, : Typo.

This message is a reply to:
 Message 255 by zaius137, posted 05-12-2012 1:22 PM zaius137 has replied

Replies to this message:
 Message 257 by zaius137, posted 05-14-2012 12:59 AM Percy has replied

  
zaius137
Member (Idle past 3409 days)
Posts: 407
Joined: 05-08-2012


Message 257 of 314 (662238)
05-14-2012 12:59 AM
Reply to: Message 256 by Percy
05-13-2012 7:57 AM


Re: Information
Percy writes:
As my odds of guessing the next letter have risen from 1/9 to 1/6 to 1/1, in other words as the randomness has declined, the entropy of each next letter and the information communicated has also declined. It is a direct relationship.
I think we are viewing the same elephant from different angles. When you say the randomness of the system declined, I say the innate information of the system has increased. Yes, then the number of bits needed to quantify the system would decrease. As the information of the system increases, entropy decreases (it is an inverse relationship from that perspective).
I believe I can say we agree here.

This message is a reply to:
 Message 256 by Percy, posted 05-13-2012 7:57 AM Percy has replied

Replies to this message:
 Message 258 by PaulK, posted 05-14-2012 2:23 AM zaius137 has not replied
 Message 259 by Dr Adequate, posted 05-14-2012 2:25 AM zaius137 has not replied
 Message 261 by Percy, posted 05-14-2012 9:00 AM zaius137 has replied

  
PaulK
Member
Posts: 17822
Joined: 01-10-2003
Member Rating: 2.2


Message 258 of 314 (662253)
05-14-2012 2:23 AM
Reply to: Message 257 by zaius137
05-14-2012 12:59 AM


Re: Information
quote:
I think we are viewing the same elephant from different angles. When you say the randomness of the system declined, I say the innate information of the system has increased. Yes, then the number of bits needed to quantify the system would decrease. As the information of the system increases, entropy decreases (it is an inverse relationship from that perspective).
I don't think that that is true at all.
If I understand correctly your claim is that the less information in the message, the more information in the source of the message ("the system"). It certainly makes no sense to say that the information in the message goes up as the information in the message declines !
But this seems obviously false. A system that is only capable of producing one message can be very simple. Simpler than a system which produces two distinct messages. How can we then say that the first system has more information in it than the second ?
I would argue that the important distinction is between meaningful messages and random noise. But assuming the production of meaningful messages, we come back to the relationship that higher entropy = more information. Shannon information does not deal with the issue of meaning so it seems that the entropy of the signal is the only useful measure of information that it has to offer.

This message is a reply to:
 Message 257 by zaius137, posted 05-14-2012 12:59 AM zaius137 has not replied

  
Dr Adequate
Member (Idle past 284 days)
Posts: 16113
Joined: 07-20-2006


Message 259 of 314 (662254)
05-14-2012 2:25 AM
Reply to: Message 257 by zaius137
05-14-2012 12:59 AM


Re: Information
zaius137 writes:
I think we are viewing the same elephant from different angles. When you say the randomness of the system declined, I say the innate information of the system has increased. Yes, then the number of bits needed to quantify the system would decrease. As the information of the system increases, entropy decreases (it is an inverse relationship from that perspective).
Well in that case you're using words like information and entropy in the exact opposite way to Shannon. And if that doesn't perturb you, consider this: according to your way of doing things, a string consisting of no bases of DNA would have maximal information --- surely this can't be what you intend?
Shannon was a genius; he didn't invent information theory one evening when he was drunk. Perhaps you should consider following his lead.

This message is a reply to:
 Message 257 by zaius137, posted 05-14-2012 12:59 AM zaius137 has not replied

  
Dr Adequate
Member (Idle past 284 days)
Posts: 16113
Joined: 07-20-2006


Message 260 of 314 (662256)
05-14-2012 2:31 AM
Reply to: Message 255 by zaius137
05-12-2012 1:22 PM


Re: Information
zaius137 writes:
Remember where Shannon entropy is most appropriate. It gives insight into how many bits are needed to convey an independent variable by communication. As the randomness of that variable increases (less innate information) ...
More information. Look at your citation:
As an example consider some English text, encoded as a string of letters, spaces and punctuation (so our signal is a string of characters). Since some characters are not very likely (e.g. 'z') while others are very common (e.g. 'e') the string of characters is not really as random as it might be. On the other hand, since we cannot predict what the next character will be, it does have some 'randomness'. Entropy is a measure of this randomness, suggested by Claude E. Shannon in his 1949 paper A Mathematical Theory of Communication.
So according to Shannon, an encyclopedia has more 'randomness' (entropy, information) than a book of the same length consisting only of millions of instances of the letter A. And the former is indeed a great deal more informative than the latter.
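As a rough illustration of that contrast, here is a short Python sketch (the sample strings are made up for the purpose, not taken from the thread) comparing the empirical character-level entropy of a varied sentence with that of a string consisting of nothing but the letter A.
[code]
from collections import Counter
import math

def char_entropy(text):
    """Empirical Shannon entropy of a string, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

varied = "the quick brown fox jumps over the lazy dog"  # made-up sample text
monotone = "A" * len(varied)                            # same length, all A's

print(f"varied text: {char_entropy(varied):.2f} bits/char")   # roughly 4.4 bits/char
print(f"all A's:     {char_entropy(monotone):.2f} bits/char")  # 0.00 bits/char
[/code]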

This message is a reply to:
 Message 255 by zaius137, posted 05-12-2012 1:22 PM zaius137 has not replied

  
Percy
Member
Posts: 22392
From: New Hampshire
Joined: 12-23-2000
Member Rating: 5.2


Message 261 of 314 (662271)
05-14-2012 9:00 AM
Reply to: Message 257 by zaius137
05-14-2012 12:59 AM


Re: Information
zaius137 writes:
I believe I can say we agree here
Except for the fact that your understanding is backwards, sure, we agree. Why don't you respond to the very specific examples I provided? They reveal how precisely backwards your understanding is. Shannon entropy is a measure of the predictability of the next bit. As that predictability declines, the entropy increases and the amount of information also increases.
Here's an example of how you're thinking about information. We have a book on our computer that contains information. We run the book through a program that randomly scrambles all the characters. You think the book now has less information, and that's where you've gone wrong.
The fact of the matter is that the book now has more information than it had before because we're less able to predict the next character. For example, if I saw the letter "q" in the original book I would know that the next letter was "u". When I find out that the next letter is "u" I haven't learned anything. No information has been communicated.
But if I saw the letter "q" in the scrambled book I would have no idea what the next letter could be. When I find out the next letter is "f" I have learned something I could not possibly have known. Information has definitely been communicated.
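A small sketch of this idea (the file name and the shuffling below are illustrative assumptions, not anything from the thread): estimate the entropy of the next character given the previous one, for a text before and after its characters are scrambled.
[code]
import math
import random
from collections import Counter

def conditional_entropy(text):
    """Estimate H(next char | previous char), in bits, from bigram counts."""
    pair_counts = Counter(zip(text, text[1:]))
    prev_counts = Counter(text[:-1])
    total = len(text) - 1
    h = 0.0
    for (prev, nxt), count in pair_counts.items():
        p_pair = count / total                          # P(prev, next)
        p_next_given_prev = count / prev_counts[prev]   # P(next | prev)
        h -= p_pair * math.log2(p_next_given_prev)
    return h

text = open("book.txt").read()                          # hypothetical input file
scrambled = "".join(random.sample(text, len(text)))     # same characters, random order

print(f"original:  {conditional_entropy(text):.2f} bits per next character")
print(f"scrambled: {conditional_entropy(scrambled):.2f} bits per next character")
# For ordinary English text the scrambled version comes out higher, because the
# next character is much harder to predict from the previous one.
[/code]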
Your original point was that "creationists like Myers" have defined "information in the genome", but you have as yet offered no evidence whatsoever of this, and the fact that you yourself misunderstand information underscores this point.
--Percy

This message is a reply to:
 Message 257 by zaius137, posted 05-14-2012 12:59 AM zaius137 has replied

Replies to this message:
 Message 262 by zaius137, posted 05-14-2012 3:20 PM Percy has replied

  
zaius137
Member (Idle past 3409 days)
Posts: 407
Joined: 05-08-2012


Message 262 of 314 (662306)
05-14-2012 3:20 PM
Reply to: Message 261 by Percy
05-14-2012 9:00 AM


Re: Information
Percy my friend, we are making headway.
Percy writes:
Here's an example of how you're thinking about information. We have a book on our computer that contains information. We run the book through a program that randomly scrambles all the characters. You think the book now has less information, and that's where you've gone wrong.
The fact of the matter is that the book now has more information than it had before because we're less able to predict the next character. For example, if I saw the letter "q" in the original book I would know that the next letter was "u". When I find out that the next letter is "u" I haven't learned anything. No information has been communicated.
But if I saw the letter "q" in the scrambled book I would have no idea what the next letter could be. When I find out the next letter is "f" I have learned something I could not possibly have known. Information has definitely been communicated.
Your original point was that "creationists like Myers" have defined "information in the genome", but you have as yet offered no evidence whatsoever of this, and the fact that you yourself misunderstand information underscores this point.
quote:
Although entropy is often used as a characterization of the information content of a data source, this information content is not absolute: it depends crucially on the probabilistic model.
http://turing.une.edu.au/~cwatson7/I/ConditionalEntropy.html.
I still think that you are confusing the information in the data source with the method and result of Shannon entropy (the amount of information needed to transmit that information). Remember I said the entire exercise of using Shannon entropy was to expose a system containing innate information to the power of statistics. Also I mentioned that the principal use of maximum entropy was to avoid the problem of not knowing what exactly that information is (contained in the source data).
I don't think you are arguing that Shannon entropy cannot be used to infer information. But you are rather unsure about what the probabilistic model might be assessing.
The entire validity of using Shannon entropy rests on how you define the probability.
In your book example, scrambling the letters of the entire book will not change the entropy if your probability is broad enough. You might ask where the actual information is in that book. The information in that book would be conveyed in the order of the characters or letters forming the words and sentences. Therefore, if you set the probability at the level of the order of letters within the words, then your entropy would certainly change. Please read the following citation and you will see that order is a consideration in the genome.
http://pnylab.com/pny/papers/cdna/cdna/index.html

This message is a reply to:
 Message 261 by Percy, posted 05-14-2012 9:00 AM Percy has replied

Replies to this message:
 Message 263 by PaulK, posted 05-14-2012 3:42 PM zaius137 has not replied
 Message 264 by Percy, posted 05-14-2012 4:36 PM zaius137 has replied
 Message 265 by Dr Adequate, posted 05-14-2012 6:59 PM zaius137 has not replied

  
PaulK
Member
Posts: 17822
Joined: 01-10-2003
Member Rating: 2.2


Message 263 of 314 (662310)
05-14-2012 3:42 PM
Reply to: Message 262 by zaius137
05-14-2012 3:20 PM


Re: Information
quote:
I still think that you are confusing the information in the data source with the method and result of Shannon entropy (the amount of information needed to transmit that information). Remember I said the entire exercise of using Shannon entropy was to expose a system containing innate information to the power of statistics.
I would say that the confusion is on your part. Complex communication requires high entropy. You say that LOW entropy is a measure of innate information. But that would mean that the INABILITY to communicate complex information would indicate the presence of complex information ! That's absurd.
quote:
The entire validity of using Shannon entropy rests on how you define the probability.
Usually it's defined on the basis of the predictability of the next term in the sequence based on the previous terms and the structure of the messages. That was the basis used in calculating the entropy of English. That's useful (very useful).
quote:
In your book example, scrambling the letters of the entire book will not change the entropy if your probability is broad enough.
Actually it would increase it using the basis I suggest above. For instance, in English there is a high probability of 'u' following 'q'. You'd lose that if you scrambled the letters.
But where's your measure of probability that supports your idea that low entropy = high information ? A source that can only give one message has zero entropy by any reasonable standard. But how does that indicate a high information content ?

This message is a reply to:
 Message 262 by zaius137, posted 05-14-2012 3:20 PM zaius137 has not replied

  
Percy
Member
Posts: 22392
From: New Hampshire
Joined: 12-23-2000
Member Rating: 5.2


Message 264 of 314 (662318)
05-14-2012 4:36 PM
Reply to: Message 262 by zaius137
05-14-2012 3:20 PM


Re: Information
Hi Zaius,
I think you're confusing yourself with your own jargon, and you're drawing a distinction that doesn't exist between information content and information communication. This sentence from the Wikipedia article on Information Theory clearly indicates a direct relationship between information and entropy:
Wikipedia writes:
For example, specifying the outcome of a fair coin flip (two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a die (six equally likely outcomes).
Did you get that? Less information == lower entropy.
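A quick numerical check of the quoted sentence, under its own assumption of equally likely outcomes (this sketch is added for illustration):
[code]
import math

def entropy_equally_likely(n_outcomes):
    """Entropy, in bits, of n equally likely outcomes."""
    p = 1.0 / n_outcomes
    return sum(-p * math.log2(p) for _ in range(n_outcomes))

print(f"fair coin flip (2 outcomes): {entropy_equally_likely(2):.3f} bits")  # 1.000
print(f"fair die roll  (6 outcomes): {entropy_equally_likely(6):.3f} bits")  # 2.585
[/code]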
--Percy

This message is a reply to:
 Message 262 by zaius137, posted 05-14-2012 3:20 PM zaius137 has replied

Replies to this message:
 Message 266 by zaius137, posted 05-15-2012 3:13 AM Percy has replied

  
Dr Adequate
Member (Idle past 284 days)
Posts: 16113
Joined: 07-20-2006


Message 265 of 314 (662325)
05-14-2012 6:59 PM
Reply to: Message 262 by zaius137
05-14-2012 3:20 PM


Re: Information
In any case, if you want information to be the exact opposite of Shannon entropy, we can accommodate you by answering your original question about adding information to the genome in those terms as well.
After all, in my first post on this subject, I explained how a mutation could increase Shannon entropy, as follows:
Dr A writes:
Well, if your choice is Shannon entropy, then creating information is easy. Any insertion would do it, since the insertion increases the number of bits in the genome, and since the content of these bits is not completely predictable from their context.
So, for example, consider a "toy genome" (real genomes are of course longer) of the form GTACT_ACTCTA, where the _ represents a base that has just been added by insertion, the identity of which I am not going to tell you.
Can you deduce with complete certainty what base is represented by the _, based only on the knowledge that it is preceded by GTACT and followed by ACTCTA?
Of course not. Therefore, it makes a non-zero contribution to the total Shannon entropy of the genome.
So if your choice of a measure of information is now the opposite of Shannon entropy, then all I need to do is reverse the argument as follows:
Well, if your choice is the exact opposite of Shannon entropy, then creating information is easy. Any deletion would do it, since the deletion decreases the number of bits in the genome, and since the content of these bits was not completely predictable from their context.
So, for example, consider a "toy genome" (real genomes are of course longer) of the form GTACT_ACTCTA, where the _ represents a base that has just been removed by deletion, the identity of which I am not going to tell you.
Can you deduce with complete certainty what base was represented by the _, based only on the knowledge that it was preceded by GTACT and followed by ACTCTA?
Of course not. Therefore, it made a non-zero contribution to the total Shannon entropy of the old genome, and so the new genome GTACTACTCTA has, by your criterion, more information than the original.
---
There you go. Your way of quantifying information may be the silliest thing since King Olaf the Silly's "Decree of Custard" back in 947, but clearly "information" so quantified can be increased by mutation.
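For what it's worth, here is a tiny calculation in the spirit of the argument above, assuming (purely for illustration) that the unknown base is equally likely to be A, C, G, or T:
[code]
import math

bases = ["A", "C", "G", "T"]
p = 1.0 / len(bases)

# Entropy contributed by one base that cannot be predicted from its context,
# under the equal-likelihood assumption stated above.
h_per_base = sum(-p * math.log2(p) for _ in bases)
print(f"entropy of one unpredictable base: {h_per_base:.1f} bits")  # 2.0 bits
[/code]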
Edited by Dr Adequate, : No reason given.

This message is a reply to:
 Message 262 by zaius137, posted 05-14-2012 3:20 PM zaius137 has not replied

  
zaius137
Member (Idle past 3409 days)
Posts: 407
Joined: 05-08-2012


Message 266 of 314 (662364)
05-15-2012 3:13 AM
Reply to: Message 264 by Percy
05-14-2012 4:36 PM


Re: Information
Percy my friend,
quote:
For example, specifying the outcome of a fair coin flip (two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a die (six equally likely outcomes).
The comparison here is between a dice throw and a coin flip. The coin flip needs only a one-bit transmission to convey the message, whereas the dice roll takes about 2.6 bits of transmission to convey the message portion. The "provides less information" only refers to the transmission information.
Remember I acknowledge that the information in the message is independent of the amount of information that is required to transmit the message.
I am saying directly that less uncertainty in the message implies more information in the message; if you like, greater negentropy.
"Negentropy" is a term coined by Erwin Schrdinger in his popular-science book "What is life?" (1943).
quote:
Schrdinger introduced that term when explaining that a living system exports entropy in order to maintain its own entropy at a low level. By using the term "Negentropy", he could express this fact in a more "positive" way: A living system imports negentropy and stores it.
Edited by zaius137, : No reason given.
Edited by zaius137, : No reason given.
Edited by zaius137, : No reason given.
Edited by zaius137, : Edit is not taking.
Edited by zaius137, : No reason given.

This message is a reply to:
 Message 264 by Percy, posted 05-14-2012 4:36 PM Percy has replied

Replies to this message:
 Message 267 by PaulK, posted 05-15-2012 4:40 AM zaius137 has not replied
 Message 268 by Dr Adequate, posted 05-15-2012 6:00 AM zaius137 has not replied
 Message 269 by Percy, posted 05-15-2012 9:12 AM zaius137 has replied

  
PaulK
Member
Posts: 17822
Joined: 01-10-2003
Member Rating: 2.2


Message 267 of 314 (662370)
05-15-2012 4:40 AM
Reply to: Message 266 by zaius137
05-15-2012 3:13 AM


Re: Information
So you are literally claiming that the less information in the message, the more information in the message.
How can you not notice the contradiction?

This message is a reply to:
 Message 266 by zaius137, posted 05-15-2012 3:13 AM zaius137 has not replied

  
Dr Adequate
Member (Idle past 284 days)
Posts: 16113
Joined: 07-20-2006


Message 268 of 314 (662373)
05-15-2012 6:00 AM
Reply to: Message 266 by zaius137
05-15-2012 3:13 AM


Re: Information
"Negentropy" is a term coined by Erwin Schrdinger in his popular-science book "What is life?" (1943).
quote:
Schrdinger introduced that term when explaining that a living system exports entropy in order to maintain its own entropy at a low level. By using the term "Negentropy", he could express this fact in a more "positive" way: A living system imports negentropy and stores it.
Note that Shannon entropy and thermodynamic entropy are not the same thing. Note also that since Schroedinger was writing five years before the publication of Shannon's paper, he was talking about the latter and not the former. Note further that there is no sense in which a living system imports negative Shannon entropy and stores it, since that is not even a concept to which one can attach a meaning.
Perhaps you should spend less time quoting from articles for beginners about information theory, and more time actually reading them.
Edited by Dr Adequate, : No reason given.

This message is a reply to:
 Message 266 by zaius137, posted 05-15-2012 3:13 AM zaius137 has not replied

  
Percy
Member
Posts: 22392
From: New Hampshire
Joined: 12-23-2000
Member Rating: 5.2


Message 269 of 314 (662390)
05-15-2012 9:12 AM
Reply to: Message 266 by zaius137
05-15-2012 3:13 AM


Re: Information
Hi Zaius,
Let's dispense with this misunderstanding first:
zaius137 writes:
Remember I acknowledge that the information in the message is independent of the amount of information that is required to transmit the message.
The amount of information required to transmit some information content over a lossless channel is equal to the amount of information content. If you have a 2 megabyte disk file on your computer then it will take 2 megabytes of information to transmit that file over a lossless channel.
In reality information content and information transmitted are the same thing. All the same concepts of information measures and entropy and so forth apply to both. They're just slightly different perspectives of the same thing. If you have a book then the measures of its information and entropy are the same whether it's sitting on your hard drive or being transmitted over the Internet.
Now let's dispense with this misunderstanding:
zaius137 writes:
The comparison here is between a dice throw and a coin flip. The coin flip needs only a one-bit transmission to convey the message, whereas the dice roll takes about 2.6 bits of transmission to convey the message portion. The "provides less information" only refers to the transmission information.
If your message set size is 2, let's say the message set is {0, 1}, and each message has equal probability (.5), then the amount of information conveyed by sending one message is 1 bit:
log2(2) = 1 bit
The entropy of the information (as given by the Wikipedia article on Information Theory) is:
H = - Σ p(i) log2(p(i)), summed over all messages i in the message set
Plugging in the values we see that the entropy is (I'll be using log2):
- ((.5)(log2(.5)) + (.5)(log2(.5)))
= - ((.5)(-1) + (.5)(-1))
= - ((-.5) + (-.5))
= - (-1)
= 1
Now let's say your message set size is 4 and that the message set is {00, 01, 10, 11}, and each message again has equal probability (.25), then the amount of information conveyed by sending one message is 2 bits:
log2(4) = 2 bits
Plugging in the values to our entropy equation:
- ((.25)(log2(.25)) + (.25)(log2(.25)) + (.25)(log2(.25)) + (.25)(log2(.25)))
= - ((.25)(-2) + (.25)(-2) + (.25)(-2) + (.25)(-2))
= - ((-.5) + (-.5) + (-.5) + (-.5))
= - (-2)
= 2
When the information content was 1 bit per message then the entropy was 1. When the information content was 2 bits per message then the entropy was 2. See how the entropy is increasing with the information content?
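The two calculations above can be checked in a few lines of Python (same equal-probability assumptions as in the post):
[code]
import math

def entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))                # 1.0 bit  -- message set {0, 1}
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits -- message set {00, 01, 10, 11}
[/code]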
--Percy
Edited by Percy, : Fix typo in one of the equations.
Edited by Percy, : Putzed with the equations a little bit to improve readability.

This message is a reply to:
 Message 266 by zaius137, posted 05-15-2012 3:13 AM zaius137 has replied

Replies to this message:
 Message 270 by zaius137, posted 05-15-2012 1:48 PM Percy has replied
 Message 273 by NoNukes, posted 05-15-2012 10:37 PM Percy has replied

  
zaius137
Member (Idle past 3409 days)
Posts: 407
Joined: 05-08-2012


Message 270 of 314 (662417)
05-15-2012 1:48 PM
Reply to: Message 269 by Percy
05-15-2012 9:12 AM


Re: Information
Percy my friend,
Great, but what about an unfair coin? The message set is still {0, 1}, but say the probability is 70% heads and 30% tails (more predictable, higher negentropy). The entropy is:
-(.7) log2(.7) - (.3) log2(.3)
= -(.7)(-0.515) - (.3)(-1.737)
= 0.360 + 0.521
= 0.881 bits
This gives the minimum number of bits needed to transmit the message and the theoretical limit of compression. As for your hard drive data, it can be processed by a number of compression algorithms and transmitted in fewer bits (my guess).
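The biased-coin figure can be checked with the binary entropy function (same 70%/30% probabilities as in the post; the sketch is illustrative):
[code]
import math

def binary_entropy(p):
    """Entropy, in bits, of a coin that lands heads with probability p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(f"{binary_entropy(0.7):.3f} bits")  # 0.881 bits
[/code]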

This message is a reply to:
 Message 269 by Percy, posted 05-15-2012 9:12 AM Percy has replied

Replies to this message:
 Message 271 by PaulK, posted 05-15-2012 2:06 PM zaius137 has replied
 Message 272 by Percy, posted 05-15-2012 9:29 PM zaius137 has not replied

  