
Sequence comparisons (Bioinformatics?)


derwood

Member (Idle past 1897 days)

Posts: 1457

Joined: 12-27-2001

Message 39 of 42 (222974)
07-10-2005 3:54 PM

I would recommend the PAUP* package. It is not free ($85 when I purchased mine 2 years ago), but it is easy to use and offers a wide variety of programs and parameters with which to analyze data (nucleotide, protein, or morphological (coded)).

Ned asked how such data is used to generate trees.
In short, parsimony and likelihood algorithms analyze the data for patterns of nucleotide substitution. Indeed, the overall degree of similarity is ignored by such programs, because sites that are identical across all of the sequences (nucleotide or amino acid) are irrelevant.
Distance methods do use 'similarity', but I do not use such methods much.
Amino acid sequence data is not used in phylogenetic analyses nearly as much as it used to be, for a couple of reasons. For one, DNA data can provide at least 3 times the phylogenetic information that amino acid sequence data can (hypothetically, and provided we are only using protein coding sequence), since each amino acid is encoded by three nucleotides.
Of course, non-coding DNA is usually much more phylogenetically informative in that it can accumulate more substitutional change than can conserved sequence (such as protein coding sequence).
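
To make the point about identical sites concrete, here is a rough Python sketch (not how PAUP* does it internally, just an illustration). It labels each column of a toy alignment as constant, variable but uninformative, or parsimony-informative (at least two different states, each present in at least two sequences), and for contrast computes the simple proportion of differing sites that a distance method would start from. The function names and the sequences are made up for the example.

```python
from collections import Counter

def classify_sites(alignment):
    """Label every alignment column for parsimony purposes.

    alignment: dict of taxon name -> aligned sequence (equal lengths).
    Gaps ('-') and missing data ('?') are ignored when counting states.
    """
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    labels = []
    for column in zip(*alignment.values()):
        counts = Counter(base for base in column if base not in "-?")
        # States carried by two or more taxa can group taxa together.
        shared = [base for base, n in counts.items() if n >= 2]
        if len(counts) <= 1:
            labels.append("constant")
        elif len(shared) >= 2:
            labels.append("informative")
        else:
            labels.append("variable-uninformative")
    return labels

def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap or '?'."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b)
             if a not in "-?" and b not in "-?"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

# Invented toy data, not real sequences.
toy = {
    "human":   "ACGTACGA",
    "chimp":   "ACGTACGA",
    "gorilla": "ACGAACTA",
    "orang":   "ACGAGCTC",
}

labels = classify_sites(toy)
print(Counter(labels))
print(p_distance(toy["human"], toy["orang"]))
```

In this toy case half of the columns are constant and contribute nothing to a parsimony analysis, even though they count toward overall "similarity".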

Making up the input files for these programs can be tedious and frustrating. PAUP, for example, will produce an error message if you have misplaced punctuation (certain symbols are used in the input files - e.g., a ";" is used to denote the end of a data block), but it will not tell you where the missing symbol is (at least the earlier versions did not - I think the new one does).
Someone had mentioned making plain text files - that usually works.
I am pretty lazy, so when I am making a new input file, I usually just use an old one that I know works and cut and paste the new data into it.
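
The same "start from a template that works" idea can be scripted. The sketch below writes a plain text file in the general NEXUS layout from a dictionary of aligned sequences, so the semicolons always land in the same place. The function name, the taxon names, and the exact format line are my assumptions for illustration; check them against the documentation for whatever program and version you are using. Taxon names containing spaces or punctuation would need quoting, which this sketch does not handle.

```python
def write_nexus(alignment, datatype="dna"):
    """Return the text of a minimal NEXUS data block.

    alignment: dict of taxon name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    if not names:
        raise ValueError("alignment is empty")
    nchar = len(alignment[names[0]])
    if any(len(alignment[name]) != nchar for name in names):
        raise ValueError("all sequences must have the same aligned length")

    width = max(len(name) for name in names)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(names)} nchar={nchar};",
        f"  format datatype={datatype} missing=? gap=-;",
        "  matrix",
    ]
    # One row per taxon: padded name, two spaces, then the sequence.
    lines += [f"    {name.ljust(width)}  {alignment[name]}" for name in names]
    # The lone ';' closes the matrix; 'end;' closes the block.
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"

# Invented example data.
nexus_text = write_nexus({"taxon_A": "ACGT", "taxon_B": "AC-T"})
print(nexus_text)

# Saving as plain text, as suggested earlier in the thread:
# with open("example.nex", "w") as handle:
#     handle.write(nexus_text)
```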

As for the supposedly anomalous trees using cytochrome C and B, the use of amino acid data immediately tells me not to put much stock in them, plus the fact that, as has been mentioned, they represent only two small loci (mitochondrial loci at that, which are known in general to mutate faster than nuclear genes).

derwood

Member (Idle past 1897 days)

Posts: 1457

Joined: 12-27-2001

Message 40 of 42 (222976)
07-10-2005 4:00 PM

Reply to: Message 37 by Wounded King
06-13-2005 4:58 AM

I agree with WK re: the use of Clustal for analysis.

For one thing, we are assuming that the alignment Clustal produced is optimal or at least very good.
In my experience with Clustal, it produces good starting alignments that then need to be re-done by eye. Of course, I am used to doing alignments with 20 to 45 species, each with up to 12 thousand nucleotides. It may work perfectly for 30 to 100 amino acids, but when you start tossing in big indels and such in huge nucleotide files, it starts spitting out weird results. I recall once putting in just 2 sequences and the result it gave me was one entire sequence in a row, followed by the second entire sequence - no alignment at all. Whoever wrote it is right - there are all sorts of parameters you can fiddle with that can help avoid problems like that. Even so, with big files, I have found alignment programs of several types only good for getting a starting point.

And, if you are using a questionable alignment, then you should expect the results of any analysis to be odd.
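
A quick sanity check can flag the "one sequence followed by the other" kind of result before it goes into an analysis. This is only a sketch under my own assumptions: the function name is made up, rows are plain strings with "-" for gaps, and what counts as "too low" is arbitrary. It is no substitute for looking at the alignment by eye.

```python
def overlap_fraction(row_a, row_b, gap="-"):
    """Fraction of the shorter sequence's residues that sit opposite
    a residue (not a gap) in the other aligned row."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    paired = sum(1 for a, b in zip(row_a, row_b) if a != gap and b != gap)
    shorter = min(len(row_a.replace(gap, "")), len(row_b.replace(gap, "")))
    return paired / shorter if shorter else 0.0

# Degenerate output: first sequence, then the second, nothing lined up.
bad = overlap_fraction("ACGTAC------", "------ACGAAC")
# A more sensible output for the same two sequences.
good = overlap_fraction("ACGTAC", "ACGAAC")
print(bad, good)
```

A value near zero means essentially nothing was aligned, and any tree built from that file should be treated with suspicion.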


Date format: mm-dd-yyyy

Timezone: ET (US)


Copyright 2001-2023 by EvC Forum, All Rights Reserved
