Author Topic:   Recent paper with an ID spin? Abel and Trevors (2005).
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 1 of 85 (245947)
09-23-2005 11:28 AM


In a recent paper in Theoretical Biology and Medical Modelling, a rather obscure member of the BioMed Central family of open access publications, Abel and Trevors discuss the nature of information and complexity in biopolymers (Abel and Trevors, 2005).
Genetic algorithms instruct sophisticated biological organization. Three qualitative kinds of sequence complexity exist: random (RSC), ordered (OSC), and functional (FSC). FSC alone provides algorithmic instruction. Random and Ordered Sequence Complexities lie at opposite ends of the same bi-directional sequence complexity vector. Randomness in sequence space is defined by a lack of Kolmogorov algorithmic compressibility. A sequence is compressible because it contains redundant order and patterns. Law-like cause-and-effect determinism produces highly compressible order. Such forced ordering precludes both information retention and freedom of selection so critical to algorithmic programming and control. Functional Sequence Complexity requires this added programming dimension of uncoerced selection at successive decision nodes in the string. Shannon information theory measures the relative degrees of RSC and OSC. Shannon information theory cannot measure FSC. FSC is invariably associated with all forms of complex biofunction, including biochemical pathways, cycles, positive and negative feedback regulation, and homeostatic metabolism. The algorithmic programming of FSC, not merely its aperiodicity, accounts for biological organization. No empirical evidence exists of either RSC of OSC ever having produced a single instance of sophisticated biological organization. Organization invariably manifests FSC rather than successive random events (RSC) or low-informational self-ordering phenomena (OSC).
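As a rough illustration of the compressibility idea in the abstract above, here is a minimal sketch. It assumes that zlib's compressed size is an acceptable crude proxy for Kolmogorov complexity (which is not computable in general); the function name and both example sequences are made up for illustration and do not come from the paper.

```python
import random
import zlib


def compression_ratio(seq: str) -> float:
    """Compressed size divided by raw size; lower means more compressible."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(0)  # fixed seed so the run is repeatable
ordered = "AC" * 1000   # highly patterned string, in the spirit of "OSC"
shuffled = "".join(rng.choice("ACGT") for _ in range(2000))  # in the spirit of "RSC"

print(f"ordered : {compression_ratio(ordered):.3f}")
print(f"random  : {compression_ratio(shuffled):.3f}")
```

The patterned string should compress far better than the random one. Neither number says anything about whether a sequence does anything, which is the gap the paper's FSC category is meant to fill.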
There is a lot of highly jargonistic terminology in the paper, presumably the sort of things that information theorists are familiar with, which doesn't mean a whole lot to me.
Some of it sounds rather IDist however, for example...
Naturalistic science has always sought to reduce chemistry to nothing more than dynamics. In such a context, chemistry cannot explain a sequencing phenomenon that is dynamically inert. If, on the other hand, chemistry possesses some metaphysical (beyond physical; beyond dynamics) transcendence over dynamics, then chemistry becomes philosophy/religion rather than naturalistic science.
One of the authors works at the Origin-of-Life Foundation which, apart from having a pretty crappy website, seems to be an abiogenesis version of the JREF prize. I am slightly worried that they feel they have to state in their description of themselves ...
The Origin-of-Life Foundation should not be confused with "creation science" groups.
I'm not sure why they feel the need for this clarification unless they expect their line of enquiry to look suspiciously like those followed by 'creation science' and even more like those of ID proponents like Dembski.
Anyone care to comment on the information theory aspects of this? Pretty much all the actual biology simply seems to be linked to assertions that certain types of information cannot arise by particular mechanisms and I don't feel qualified to judge those claims.
Am I just being paranoid about the ID nature of this paper?
TTFN,
WK

Replies to this message:
 Message 3 by Percy, posted 09-24-2005 8:51 AM Wounded King has not replied
 Message 4 by Brad McFall, posted 09-24-2005 8:54 AM Wounded King has not replied
 Message 5 by nwr, posted 09-24-2005 2:22 PM Wounded King has not replied
 Message 20 by Archer Opteryx, posted 11-02-2006 10:35 AM Wounded King has not replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 14 of 85 (246456)
09-26-2005 5:31 AM
Reply to: Message 11 by Percy
09-25-2005 3:36 AM


I'm not sure pubmed would be the right people to approach. Wouldn't you be better off contacting the publishers if you have a concern?
I've been a big fan of Open Access models of journal publishing, but I suspect that BioMed Central's new line of allowing interested groups of scientists to effectively open their own journals may be open to abuse, or is at least likely to greatly increase the number of very low impact, niche-specific journals into which thinly disguised ID could be insinuated. I doubt that 'Theoretical Biology and Medical Modelling' really wants to become the open access equivalent of 'Rivista di Biologia', but given that this is one of the 3 reviews it has published so far, I hope they were careful with the review process.
This inaugural editorial suggests that the journal is looking for some more outré theoretical ideas, so maybe they consider shades of ID to be within that remit.
TTFN,
WK

This message is a reply to:
 Message 11 by Percy, posted 09-25-2005 3:36 AM Percy has replied

Replies to this message:
 Message 15 by Percy, posted 09-26-2005 9:34 AM Wounded King has not replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 18 of 85 (360734)
11-02-2006 9:50 AM


Abel and Trevors at it again
This time it is in that auspicious publication 'PHYSICS OF LIFE REVIEWS'.
Once again I can't tell if this paper is brilliant or some sort of post-modern Sokalian hoax. They go so far this time as to reference Dembski's 'No Free Lunch' as one source of an estimate of a universal probability bound.
Self-organization vs. self-ordering events in life-origin models
David L. Abel, and Jack T. Trevors
Self-ordering phenomena should not be confused with self-organization. Self-ordering events occur spontaneously according to natural “law” propensities and are purely physicodynamic. Crystallization and the spontaneously forming dissipative structures of Prigogine are examples of self-ordering. Self-ordering phenomena involve no decision nodes, no dynamically-inert configurable switches, no logic gates, no steering toward algorithmic success or “computational halting”. Hypercycles, genetic and evolutionary algorithms, neural nets, and cellular automata have not been shown to self-organize spontaneously into nontrivial functions. Laws and fractals are both compression algorithms containing minimal complexity and information. Organization typically contains large quantities of prescriptive information. Prescriptive information either instructs or directly produces nontrivial optimized algorithmic function at its destination. Prescription requires choice contingency rather than chance contingency or necessity. Organization requires prescription, and is abstract, conceptual, formal, and algorithmic. Organization utilizes a sign/symbol/token system to represent many configurable switch settings. Physical switch settings allow instantiation of nonphysical selections for function into physicality. Switch settings represent choices at successive decision nodes that integrate circuits and instantiate cooperative management into conceptual physical systems. Switch positions must be freely selectable to function as logic gates. Switches must be set according to rules, not laws. Inanimacy cannot “organize” itself. Inanimacy can only self-order. “Self-organization” is without empirical and prediction-fulfilling support. No falsifiable theory of self-organization exists. “Self-organization” provides no mechanism and offers no detailed verifiable explanatory power. 
Care should be taken not to use the term “self-organization” erroneously to refer to low-informational, natural-process, self-ordering events, especially when discussing genetic information.
I think I need to relax my brain after just reading through the abstract.
TTFN,
WK
Edited by Wounded King, : Added link to article abstract on journal site, may not work without subscription.

Replies to this message:
 Message 19 by nwr, posted 11-02-2006 10:29 AM Wounded King has not replied
 Message 21 by Silent H, posted 11-02-2006 11:06 AM Wounded King has not replied
 Message 29 by Halbwertszeit, posted 12-19-2006 11:11 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 30 of 85 (371065)
12-20-2006 2:09 AM
Reply to: Message 29 by Halbwertszeit
12-19-2006 11:11 PM


Re: Abel and Trevors at it again
According to the Dembski blog the paper gives a reference to Dawkins:
[38] Dawkins R. Climbing mount impossible. 1996
Well Dawkins did not write this book, did he?
No, but he did write 'Climbing Mount Improbable'.
TTFN,
WK

This message is a reply to:
 Message 29 by Halbwertszeit, posted 12-19-2006 11:11 PM Halbwertszeit has replied

Replies to this message:
 Message 31 by Halbwertszeit, posted 12-20-2006 9:24 AM Wounded King has not replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 34 of 85 (516765)
07-27-2009 11:50 AM
Reply to: Message 33 by Smooth Operator
07-24-2009 12:54 PM


Hi Smooth Operator,
That's an interesting extension of the previous work, and I'm glad to say much more understandable than the original Abel and Trevors paper.
I think it is a worthy attempt to try and integrate a bioinformatic approach to biological function into their research into sequence complexity.
The problem as I see it with this approach is that the actual functionality component is so subjective. The functionality variable is pretty much the only thing to differentiate this approach from any other purely sequence comparison/conservation based approach. Unfortunately it is essentially left as an exercise for the user to decide what to do with that variable. There seems to be considerable scope for sequences chosen on the basis of some functional criteria to produce misinterpretation of results due to the fact that there are multiple functional criteria of the types discussed in the paper which could be chosen. For instance if we choose a set of sequences based on their functional ability to bind a specific DNA sequence how do we know that it isn't simply generic DNA binding ability or some entirely other commonality of function effect that is being identified?
This is both the most important and by far the least satisfactory part of their methodology, since in their own dataset analysis they don't actually tell us what the functionality variables for the different protein families were or what they were derived from. As far as I can tell from looking at the Python program they include, the function variable is simply derived from the sequence analysis. This seems to make the whole thing nothing more than an exercise in saying 'I think all these sequences share some common function'. You have to question how well such an approach would be suited to analysing anything other than exactly the sort of protein sequence sets they look at in the paper, and even then the extent to which you could ascribe any similarities to the specific shared functional criteria.
Isn't it better to leave the imputation of functionality to specific elements to downstream analysis of sequences identified as significant through the initial analysis? If 'Fits' and 'FSC' are an effective way of identifying common functionally significant sites between sequences then that is all to the good. The presupposing of exactly what function you are looking for seems an unnecessary additional step, especially when PFAM family membership is apparently the only selection criterion used for discriminating function.
They also leave open the problem they recognise in terms of only having a best guess at what the whole functional sequence space is based on the collated functional sequences. The possibility of unknown functional sequences means that an underestimation of the likelihood of any particular function arising is almost unavoidable. That is even before considering functionally equivalent but genetically/structurally distinct proteins.
This definitely seems like a worthwhile endeavour if it could discriminate something more than traditional sequence analysis methods do but certainly doesn't offer much support for ID, at least beyond being the prelude to an argument from big numbers based on a calculation of something like the "FSC value for an entire prokaryotic cell where the genome has been sequenced and all translated proteins are known" which they mention in the discussion. Such a calculation would of course serve to compound any errors arising from the other more subjective elements of the process.
There are other sophisticated information theoretic methods for identifying functional sequences from alignments, see for example Capra and Singh (2007). I'm not sure that this does anything more sophisticated or usable than those methods, or that it would discriminate a specific functionality any more readily.
TTFN,
WK

This message is a reply to:
 Message 33 by Smooth Operator, posted 07-24-2009 12:54 PM Smooth Operator has replied

Replies to this message:
 Message 35 by Smooth Operator, posted 08-03-2009 10:22 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 36 of 85 (518117)
08-04-2009 3:54 AM
Reply to: Message 35 by Smooth Operator
08-03-2009 10:22 PM


It's not really subjective. Anyone who examines a specific part of the genome will find the same objective function there. Biological functions are objective.
If our knowledge were perfect I might agree, but there is a high possibility of our failing to recognise a function. The function is not subjective, but our knowledge and understanding of it can be.
We select them by which biological function they perform. Binding is not a function. Coding for a eye is. Coding for a flagellum is. Since those structures perform biological functions.
Have you even read the Durston et al. paper? If you had, surely you would understand that the very data they test their approach with were selected based on their assignment to PFAM families, which is based on similarities of functional structural domains for activities such as protein-protein or protein-DNA interactions; they are certainly not classified as 'Coding for a eye' or 'Coding for a flagellum'.
If you know the protein performs a function, you just need to find it's sequence on the genome and you found teh functional part of the DNA sequence for that biological function.
That isn't information science, that is simply basic biology. You identify a functional change has occurred from a change in phenotype and then use genomic sequence comparisons or classical genetics to identify where the change occurred and therefore the functional sequence which was altered leading to the change. That isn't what Durston et al. did.
Maybe we should see if we can get the program up and running and try it out in a whole set of sequences from something like a common developmental pathway? How does that sound?
I'm skeptical myself since their method seems to rely so heavily on sequence alignment. I'm happy to accept that they can identify conserved functional sites within a protein family; so can many sequence analysis methods. I am doubtful their method will work for a heterogeneous set of sequences linked by a function such as 'build an eye' or 'build a flagellum'. Do you think it would work?
Any information based research is an indirect support for ID, becasue id is based on the idea that design is only the product of a designing mind that produces information.
It is only support if that is actually true, you keep missing the point that not everyone already accepts your assumptions, if they did we wouldn't be having a discussion.
TTFN,
WK

This message is a reply to:
 Message 35 by Smooth Operator, posted 08-03-2009 10:22 PM Smooth Operator has replied

Replies to this message:
 Message 37 by Smooth Operator, posted 08-04-2009 3:31 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 38 of 85 (518253)
08-04-2009 5:49 PM
Reply to: Message 37 by Smooth Operator
08-04-2009 3:31 PM


I mean, are you saying that the beating of the heart is subjective?
No, I'm saying that different levels of understanding of the functioning of the heart can give rise to subjectivity in discussion of it, and that the same is true of functions at the molecular level.
So where do you see the problem?
In you saying that it isn't protein-binding function that is being looked at, when anyone who had read the paper would see that those are exactly the sort of functions they were looking at and 'Coding for a flagellum' was not.
They already know, everybody does, that protein coding sequences have biological functions.
But they don't know which sequences have what function in all cases, even when they know what process the protein is involved in they may still not understand its molecular function.
Why should in not work, since it is well known that protein coding sequences are coding for biological functions?
Because only an idiot would think that the point of the paper is to point at protein coding sequences and say 'these have some function'. The point is to identify regions in the proteins with functional information and from that to be able to derive a value for the functional information of the whole protein and possibly from there to higher levels of organisation up to whole genomes.
If you don't understand the paper that is fine, but if you do understand it then I'm not sure why you are making such facile statements rather than a coherent argument based on the paper's methodology.
The reason I don't think it will work is that their method is heavily reliant on protein sequence alignment. That will obviously work for PFAM families, but all the proteins involved in a specific function will be hugely structurally diverse, making such an alignment virtually impossible. How, therefore, would such an analysis proceed? Durston et al. unfortunately leave this question simply hanging.
Perhaps their idea is that you derive a value for each constituent part based on an analysis of its close structural relatives. But then you are bringing into your analysis multiple genes which are not involved in your function of interest at all and your derived values will be almost totally divorced from that specific function.
I have got the program up and running and recapitulated their results with their test data, unfortunately they use a very clunky format for the sequence data, and the paper does not make it very clear how it can be adapted for other sets of data.
TTFN,
WK

This message is a reply to:
 Message 37 by Smooth Operator, posted 08-04-2009 3:31 PM Smooth Operator has replied

Replies to this message:
 Message 41 by Smooth Operator, posted 08-06-2009 10:46 AM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 39 of 85 (518475)
08-06-2009 9:36 AM


For those interested in bioinformatics
Smooth Operator doesn't particularly seem to care about actually discussing the details of the paper and whether or not this is actually research which supports ID, but the Durston et al. paper piqued my interest so I downloaded their programs to give it a go.
The 'program' consists of a series of Python scripts and it is very easy to get those working. It is slightly harder to get the sequence data in quite the right format. The program expects a single line containing all the data; every sequence entry should be the same length and may include a name field of a set length. It took me a while comparing their test data, an alignment of P53 relatives from PFAM, with my own, an alignment of all the Helix-Loop-Helix proteins in PFAM, to work out that the length of the name is included in the length of the whole sequence.
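The input layout described above can be sketched as a small parser. This is only an illustration under stated assumptions: the function name, the placement of the name field at the start of each entry, and the example entries are all hypothetical and are not taken from the Durston et al. scripts.

```python
def split_records(data: str, record_len: int, name_len: int):
    """Split one long line into (name, sequence) pairs.

    record_len is the full width of each entry INCLUDING the name field,
    which is the detail that is easy to get wrong.
    """
    data = data.rstrip("\n")
    if record_len <= name_len:
        raise ValueError("record_len must be larger than name_len")
    if len(data) % record_len != 0:
        raise ValueError("data length is not a multiple of record_len")
    records = []
    for start in range(0, len(data), record_len):
        chunk = data[start:start + record_len]
        # Assumption: the name field comes first and is space-padded.
        records.append((chunk[:name_len].strip(), chunk[name_len:]))
    return records


# Two made-up entries: 6-character name field + 8 aligned residues = 14.
line = "P53_A MEEPQSDP" "P53_B MEE-QSDL"
print(split_records(line, record_len=14, name_len=6))
```

Checking that the total length is a multiple of the record width catches the most likely mistake, which is giving the sequence length without the name field.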
Apart from sequence length there are a few variables you can alter. These all need to be changed within the Python script; even the data file needs to be put into the script as it stands. I'm sure it wouldn't take long to make the scripts take command line options and therefore be much more flexible, but as it stands changing the main program is the simplest way.
The program runs fine and outputs a few things at the end, including a fitness value for each residue in the sequence. I think a few changes to the script that prints out these values might help make them more suitable for importing into a sequence alignment program, but at the moment it takes a little while to hoik them out and reformat them. The program is also supposed to work with DNA sequences but it would take another little bit of re-writing to set up.
The output from my HLH family was that it had 52 amino acid sites which had high enough conservation to be above the program's set cutoff.
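A command line front end of the kind mentioned above might look something like the following sketch, using Python's standard argparse module. Every option name here (`--record-len`, `--name-len`, `--cutoff`) is hypothetical; none of them exists in the original scripts, which take their settings from variables edited in the source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical options standing in for the variables edited in the script."""
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper around an alignment analysis script."
    )
    parser.add_argument("datafile", help="path to the single-line alignment file")
    parser.add_argument("--record-len", type=int, required=True,
                        help="width of each entry, name field included")
    parser.add_argument("--name-len", type=int, default=0,
                        help="width of the name field (0 if there is none)")
    parser.add_argument("--cutoff", type=float, default=None,
                        help="conservation cutoff; None keeps the script default")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["hlh.txt", "--record-len", "70", "--name-len", "10"])
print(args.datafile, args.record_len, args.name_len, args.cutoff)
```

The file name and the numbers in the example call are placeholders, not the values used for the HLH or P53 alignments.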
It has a total Hf of 93, I'm still not sure what this represents. The Hf is supposed to 'quantify the change in functional uncertainty between two biopolymeric states with regard to biological functionality'. It isn't clear how this relates to any specific sequence however, or to the entire alignment.
The Shannon uncertainty for the alignment is measured as 224.
The FSC value for the family comes out as 504 fits.
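For a feel of the arithmetic behind numbers like these, here is a minimal sketch of a per-site calculation. It rests on assumptions rather than on the actual scripts: the functional uncertainty at a site is taken to be the Shannon entropy of the residues in that alignment column, the ground state is taken to be all 20 amino acids being equally likely (log2(20), about 4.32 bits), and the value for the family is the sum of the per-site differences. The gap handling, the function names and the toy alignment are made up for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column, gaps ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0  # simplification: an all-gap column is treated as zero entropy
    total = len(residues)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(residues).values())


def site_fits(alignment: list[str], alphabet_size: int = 20) -> list[float]:
    """Per-site drop in uncertainty relative to an equiprobable ground state."""
    ground = math.log2(alphabet_size)
    return [ground - column_entropy("".join(col)) for col in zip(*alignment)]


toy = ["MKV", "MKL", "MRI", "MKA"]  # made-up aligned sequences
fits = site_fits(toy)
print([round(f, 2) for f in fits], round(sum(fits), 2))
```

A fully conserved column scores the maximum and a highly variable one scores much less, which is why a measure built this way behaves so much like an ordinary conservation score. For nucleotide alignments the same sketch would use `alphabet_size=4`.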
With a little bit of text manipulation the per-residue Fits value can be incorporated as an annotation track in the multiple sequence alignment program Jalview. Jalview runs from Java Web Start and can be accessed at https://www.jalview.org/download/ . My project file for the HLH family can be found here. If you load it in you will see the sequence alignment for the HLH family and the Fits annotation track. The Fits track seems to agree with the other measures, especially 'Quality'. In many ways Jalview's 'Quality' measurement is a more sophisticated method than that of Durston et al. As well as taking into account the variation at the site, 'Quality' is based on the BLOSUM62 matrix, which takes into account the physicochemical make-up of the amino acids and weights substitutions accordingly. This gives a similar measure of functional conservation at a site to that from the Fits calculation, but seems more tied into the actual biochemistry of the protein.
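The text manipulation involved can be sketched as follows. This assumes the tab-separated annotations file layout described in the Jalview documentation (a `JALVIEW_ANNOTATION` header line, then rows of graph type, label, description and pipe-separated values); that layout should be checked against the current Jalview documentation before relying on it. The function name and the values are made up for illustration.

```python
def jalview_bar_graph(label: str, description: str, values) -> str:
    """Build the text of an annotations file holding one bar-graph row.

    Assumed layout: header line, then tab-separated fields with one
    pipe-separated value per alignment column.
    """
    cells = "|".join(f"{v:.3f}" for v in values)
    row = "\t".join(["BAR_GRAPH", label, description, cells])
    return "JALVIEW_ANNOTATION\n" + row + "\n"


fits = [4.32, 3.51, 2.32]  # made-up per-column values
text = jalview_bar_graph("Fits", "Per-residue fits (illustrative)", fits)
print(text)
```

The number of values has to match the number of columns in the alignment, so any sites dropped by the program's cutoff would need to be padded back in before loading.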
I am trying to reconstitute the p53 data set they give as test data to get it into the sequence alignment program. If I have any luck I'll let you know.
TTFN,
WK

Replies to this message:
 Message 40 by Percy, posted 08-06-2009 10:16 AM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 42 of 85 (518500)
08-06-2009 10:55 AM
Reply to: Message 40 by Percy
08-06-2009 10:16 AM


Re: For those interested in bioinformatics
This is the way it feels to me, and that's why I've paid little attention to Abel and Trevors et. al., but if you're looking into this then it tells me you must think there's something to it.
The thing is, I don't think there is anything to it that hasn't been done before and done better by mainstream bioinformatics researchers, but I'm prepared to have a look in case I'm wrong.
I've just been considering tying this into the other information thread and seeing what their program says about fitness in the DNA gyrases, but I still don't see how you can compare mutant strains when all the calculations are being performed on alignments. Does one swap out the wild type for the mutant in the alignment? That might be a worthwhile experiment, but it still doesn't say anything about a change in function. I worry that Durston et al.'s approach suffers from the same sort of platonic thinking as SO's that there is some ideal sequence (or set of sequences) and anything outwith that represents a reduction in information, regardless of the actual effect, or lack thereof, on the function of the mutant.
I think they are just re-inventing the wheel of using Shannon entropies and amino acid conservation or characteristics to detect conserved functional structures or residues. But they don't seem to provide anything beyond that, other than the idea that you can add up all the values from various different functional parts of a whole system and come out with some meaningful overall measure of functional complexity, which just sounds like another prelude to an IDist argument from big numbers by multiplying a whole lot of things together and saying, 'see how complex this is!! It couldn't possibly evolve!!'
TTFN,
WK

This message is a reply to:
 Message 40 by Percy, posted 08-06-2009 10:16 AM Percy has seen this message but not replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 43 of 85 (518503)
08-06-2009 10:58 AM
Reply to: Message 41 by Smooth Operator
08-06-2009 10:46 AM


And why exactly should their difference in sequence be an obstacle?
Because they only show how to measure FSC for a set of aligned sequences with a shared putative function.
TTFN,
WK

This message is a reply to:
 Message 41 by Smooth Operator, posted 08-06-2009 10:46 AM Smooth Operator has replied

Replies to this message:
 Message 44 by Smooth Operator, posted 08-06-2009 12:07 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 45 of 85 (518648)
08-07-2009 3:48 AM
Reply to: Message 44 by Smooth Operator
08-06-2009 12:07 PM


And you thought that FSC was a perfect method that can meassure anything from it's first go?
They claim you can apply their method to similarly functional but molecularly dissimilar proteins. I think that if they make some claims they should be able to support them with something. I can see how you wouldn't think so though, since you absolutely refuse to support any of your own claims beyond bare repetition.
You didn't think that the scientists needed time to improve their models to be able to do more?
Well, have they? It's been 2 and a half years since the paper was published.
TTFN,
WK

This message is a reply to:
 Message 44 by Smooth Operator, posted 08-06-2009 12:07 PM Smooth Operator has replied

Replies to this message:
 Message 46 by Smooth Operator, posted 08-08-2009 4:51 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 47 of 85 (518883)
08-09-2009 5:43 AM
Reply to: Message 46 by Smooth Operator
08-08-2009 4:51 PM


I always support my claims, but when I get same questions asked over and over again, what else should I do?
Well, not to put too fine a point on it, you should do exactly what you just claimed you do do, but anyone looking at either this thread or the other one about information can see that simply isn't true.
Are we talking about the same 2005 paper? Anyway, you have got to give people some time.
No, we are talking about the 2007 Durston et al. paper that you introduced to the discussion. I understand that it takes time to do research but shouldn't they have done the research before claiming they could do something?
TTFN,
WK
Edited by Admin, : Bad quoting, and it Looked like some extraneous material was included so I deleted it. I've included the original version in a hide in case I've fixed this incorrectly, but I had to fix the incorrectly quoted portion, possibly incorrectly.

This message is a reply to:
 Message 46 by Smooth Operator, posted 08-08-2009 4:51 PM Smooth Operator has replied

Replies to this message:
 Message 48 by Smooth Operator, posted 08-19-2009 5:57 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 49 of 85 (520349)
08-21-2009 6:07 AM
Reply to: Message 48 by Smooth Operator
08-19-2009 5:57 PM


So you are saying I never ever posted alink that I used as an argument.
I'm not sure what you are saying here. I don't think you should post a link as an argument, you should be able to make the argument in your own words and use references/links as supporting evidence. You certainly haven't posted links to any supporting evidence in the discussion we have been having on this thread, not since you originally brought up the Durston paper.
Listen, it's very simple. They know what they are doing.
How do you know? How do you know that what they are doing is what they think they are doing? How do you know that they know that they can do what they say they can do?
The way we know these things from a scientific paper is that the paper presents the methods and data in such a way that we can know them as well as the authors do, barring the possibility of fraud.
But that doesn't mean it's flat out wrong, and that they can't use the papaer they wrote for anyhting.
It doesn't mean that it is right either or that they can use their method for actually doing what they say it can.
Have Durston et al. expanded Abel and Trevors' work to get an algorithm for producing some measure of complexity from sequence alignments? Yes, but not any better than half a dozen already extant methods. Have they shown how to use this method to analyse a heterogeneous set of sequences? No. Have they given us usable criteria for 'Function' that we can use for selection? No.
As far as evidence supportive of ID goes all they have done is assume that they can factor all of these 'FSC' measurements together to get a meaningful overall value for a system or organism. But they give us no reason to assume that this is actually the case along with them. Even if we allow this it only helps ID if we accept it as a measure of probability and further accede to concepts like the 'universal probability bound' being relevant.
TTFN,
WK

This message is a reply to:
 Message 48 by Smooth Operator, posted 08-19-2009 5:57 PM Smooth Operator has replied

Replies to this message:
 Message 50 by Smooth Operator, posted 08-21-2009 8:10 PM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 53 of 85 (520552)
08-22-2009 4:15 AM
Reply to: Message 50 by Smooth Operator
08-21-2009 8:10 PM


This is their method of computing the FSC. They used the protein sequences from the PFAM database and than they were compared and measured. What do you think is wrong with this method?
So where are the usable criteria for function? A PFAM family is based upon structural similarity not function per se. I think the problem with this method is that they claim that it can be used for genes/proteins with similar functions but highly divergent molecular structures, but given their method is based on sequence similarity they give no explanation of how one could do this. Their method seems redundant when compared to more sophisticated metrics like BLOSUM which take into account the physicochemical properties of the amino acids.
Why would they not be able to put all those values together?
Well of course they can. There is nothing creationists and IDers love better than multiplying a whole lot of probabilities together to come up with one really tiny probability and then declaring that it shows evolution is impossible. The question is whether doing so actually has any sort of meaning or utility.
The FSC is really nothing more than a slightly modified measure of conservation. It in no way accounts for all the possible unknown sequences which can also fulfil a particular function. So as usual the probability measurement is only ever going to be for the exact set of genetic structures that have been fed into it. This means that rather than the likelihood of any molecule evolving to perform a particular function you are only looking at the likelihood of the exact molecular set you were studying evolving. So you will no doubt have a very small number but sets of die rolls with infinitesimally small probabilities are generated every day.
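The die-roll point can be put into numbers with a short sketch. It takes Dembski's commonly quoted universal probability bound of 10^-150 as given and asks how many fair six-sided die rolls are needed before the exact sequence rolled is less probable than that. The function names are made up for illustration.

```python
import math

UPB_LOG10 = -150  # Dembski's commonly quoted bound, 10^-150 (taken as given here)


def log10_prob_specific_rolls(n_rolls: int, sides: int = 6) -> float:
    """log10 of the probability of one exact sequence of fair die rolls."""
    return -n_rolls * math.log10(sides)


def rolls_to_beat_bound(bound_log10: float = UPB_LOG10, sides: int = 6) -> int:
    """Smallest number of rolls whose exact outcome is less probable than the bound."""
    return math.floor(-bound_log10 / math.log10(sides)) + 1


n = rolls_to_beat_bound()
print(n, round(log10_prob_specific_rolls(n), 2))
```

A couple of hundred rolls is an evening's board gaming, so an outcome falling below the bound says little on its own. What matters is how many other outcomes would have counted as doing the same job, and that is exactly what a measure built only on the known sequences cannot supply.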
TTFN,
WK

This message is a reply to:
 Message 50 by Smooth Operator, posted 08-21-2009 8:10 PM Smooth Operator has replied

Replies to this message:
 Message 64 by Smooth Operator, posted 09-06-2009 9:39 AM Wounded King has replied

  
Wounded King
Member
Posts: 4149
From: Cincinnati, Ohio, USA
Joined: 04-09-2003


Message 56 of 85 (521267)
08-26-2009 5:06 PM


Durston et al. program
I have the Durston et al. programs working with nucleotide sequences as well now. I may tinker a bit more to see if I can get it to accept inputs from the command line rather than needing to modify the core program every time I set up a new analysis.
The Fits profile is moderately different from the straight conservation profile. In contrast to the protein alignments, Jalview doesn't have any more sophisticated measures for nucleotide sequences. I might look at some other nucleotide sequence conservation metrics and see what they produce. One interesting methodology is based on DNA topologies, allowing higher level DNA structure to influence the detection of conserved regions.
The file link I made previously is broken, I'll see if I can find somewhere stable to put the analysis files.
TTFN,
WK

Replies to this message:
 Message 57 by Percy, posted 08-27-2009 9:52 AM Wounded King has replied

  