Author Topic:   Social Statistics (How many samples are enough/too much to speak about a population?)
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 31 of 53 (301440)
04-06-2006 5:00 AM
Reply to: Message 29 by crashfrog
04-05-2006 12:34 PM


semantics 1
I see on most of this we are in agreement, except for some sort of semantic hangup on what I am "blaming" for the failure. Well, there are some rather large gaps in your understanding of methodology, but that is beside the point for this thread, though I mention it toward the end of this post.
Again, you seem to have missed my previous posts, but here is what you quoted as being pertinent...
Then I gave you a general example suggesting how, in a population containing many unique environments where the soc question being asked would be affected by those environments, a small N could very well force a person to miss many unique environments, which could affect conclusions. The sdev in any small selection could not be thought to compare well to some sigma of the entire population.
to which you say...
pardon me for mistaking your argument as being directed towards the mathematical basis of a statistical sample but if that wasn't the discussion you wanted to be part of, it's hard to tell from your posts.
The section I highlighted does not address probability theory. It is about complications found in sampling that remove the automatic assumptions required for probability theory to hold... and so to draw conclusions.
If my argument was against probability theory itself, the mathematical basis, then the thread title would not have been specified to social stats, and I would not have included the caveats that I had.
No, they're based on reality - the reality is, people are fairly well mixed up. Walk into a room full of people - say, draft day for the NBA, or SakuraCon, or the national meeting of the Association for Certified Public Accountants - and start measuring heights, and you're going to find a distribution of heights that fits the normal curve. Every time.
You have absolutely no idea what you are talking about. You really have never taken a research methods course, or done actual research have you? No need to answer because I can tell from the above answer. As it is, I already said height is separate from social prefs, though in some cases height will actually act as a sorter.
Let me explain this in a bit more detail. A convention is not the same as where people choose to live and associate with each other on a permanent basis. A walk through the south side of Chicago will fill you in quickly that while "height" might be randomly distributed, other social elements are not.
As it stands, even within things like conventions unless people have absolutely no idea who others are, they tend to group based on some familiar aspects, and may start grouping on superficial estimates of social order. At a high school assembly for example, one may very well find boys separated from girls and social classes sorted within those.
As I mentioned height could very well play a role, if it was culturally significant. In Rwanda height defined the vying factions and massacres sometimes focused on that as a factor.
What's an "ideal assumption" is your example.
It's a hypothetical. Yes it is designed to present a challenge. Yes I am not suggesting that it is supposed to mirror what the real world is actually like in some graphic detail. It is a hypothetical meant to show that mathematical models are prone to failure where their ideal assumptions are not met. That's it.
You can say that the failure in studies to get a truly random sample is a methods problem all you want. Fine. But the reality is that sample size, however sufficient in theory according to pure statistics, can create the methods problem you are talking about.
To say it is the fault of methods rather than assumptions of prob theory holding true in all cases, is sort of a tomato tomahto issue. The POINT is that for any study, one must look at what is being studied and the factors that may come into play to judge appropriate sample size, instead of looking at pure mathematical models.
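To put a rough number on the "small N can miss whole environments" point, here is a minimal sketch. It assumes a hypothetical population split into equally sized environments and sampling with replacement (a fair approximation when the population is huge). The function name and the figure of 500 environments are made up for illustration and do not come from any study.

```python
def expected_environments_hit(num_envs: int, sample_size: int) -> float:
    """Expected number of distinct environments represented in a simple
    random sample, assuming equally sized environments and sampling
    with replacement (an approximation for a very large population)."""
    # Probability that one particular environment is missed by every draw.
    miss_prob = (1 - 1 / num_envs) ** sample_size
    return num_envs * (1 - miss_prob)


# Hypothetical scenario: 500 distinct social environments.
for n in (50, 200, 2000):
    hit = expected_environments_hit(num_envs=500, sample_size=n)
    print(f"n={n:5d}: about {hit:.0f} of 500 environments represented")
```

Under those assumptions the count of environments reached depends on how many environments there are, not just on n, which is the factor a pure sample-size table does not know about.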
Yes, I really do think they contain some bias. Bias is pretty hard to escape.
This is sort of funny. You seemed to be suggesting there is no bias. For some reason I accidentally left out a "not" in my question, asking as if you had suggested there was. My question was supposed to be "You really think telephone surveys do NOT contain potential biases (including on those lines)?" My typing mistake. I certainly DO believe that telephone surveys carry the potential for bias, and suggested as much in the other thread.
Do I think the fact that it was a telephone survey constitutes a major bias? No, I don't. I don't expect the attitudes of the population that has telephones to be significantly different than the population as a whole.
I have no idea if it did or it didn't but it certainly can. Although it is becoming less of an issue now, rural and poor people are less likely to have phones than others. Wealthy and some social "types" are less likely to have their numbers listed for use in surveys. Certain types are less likely to deal with a survey when called, and some people are more likely to. Depending on time of calls one can change the social dynamic of who one will reach (assuming these are home numbers).
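How much this matters depends on two things: how many people the phone frame misses, and how different they are. A minimal sketch of that arithmetic follows, using the standard coverage-bias identity (bias = share not covered times the difference between covered and uncovered means). The numbers are invented purely for illustration.

```python
def coverage_bias(share_without_phone: float,
                  mean_with_phone: float,
                  mean_without_phone: float) -> float:
    """Bias of a phone-only estimate relative to the whole population:
    (share not covered) * (covered mean - uncovered mean)."""
    return share_without_phone * (mean_with_phone - mean_without_phone)


# Hypothetical: 5% of people are unreachable by phone, and 60% of phone
# owners hold some opinion versus 40% of everyone else.
bias = coverage_bias(0.05, 0.60, 0.40)
print(f"phone-only estimate is off by about {bias:.3f}")
```

If the two groups hold the same views the bias is zero no matter how many people lack phones, so whether the phone frame is harmless is an empirical question about the topic being surveyed.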
Why?
Now that's just ridiculous. Any good research will not just throw darts at a dartboard, or in this case dial randomly and assume that nets a "truly random sample". It has to be random to the population one is hoping to measure, which is NOT synonymous with random geographic or numerical distribution of points.
Thus when one knows one has only 2000 samples to use out of a population of >20 million spread over an entire continent with many different cultures, selection becomes very important. You don't think that if by random chance more calls ended up in the midwest or the south, there would be a difference in result? Or that if they missed whole states, that might skew results?
If you can't imagine a technique or an algorithm that would input a list of telephone numbers and choose n of them at random, I can't help you.
If you think that computed random phone number selection equates to gaining an appropriately random selection of population for a research study, I can't help you. My time is rapidly growing short and I am not about to hold your hand through basic soc res methods.
All your suggested method does is guarantee a random phone number.
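The difference between the two ideas can be sketched in a few lines. The first function is the "choose n numbers at random" algorithm. The second is proportional stratified allocation, which is one common way (not the only one, and not necessarily what any particular survey did) to make sure each region is represented in line with its size. The phone book, region names and region sizes are all hypothetical.

```python
import random
from collections import Counter


def simple_random_sample(phone_book, n, seed=None):
    """Choose n entries uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(phone_book, n)


def proportional_allocation(region_sizes, n):
    """Split n interviews across regions in proportion to region size,
    using largest remainders so the counts add up to exactly n."""
    total = sum(region_sizes.values())
    exact = {r: n * size / total for r, size in region_sizes.items()}
    counts = {r: int(share) for r, share in exact.items()}
    leftover = n - sum(counts.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - counts[r], reverse=True)
    for r in by_remainder[:leftover]:
        counts[r] += 1
    return counts


# Hypothetical phone book: (number, region) pairs with made-up region sizes.
region_sizes = {"northeast": 3000, "midwest": 2500, "south": 3500, "west": 1000}
phone_book = [(f"{r}-{i}", r) for r, size in region_sizes.items() for i in range(size)]

sample = simple_random_sample(phone_book, 200, seed=1)
print("random draw: ", dict(Counter(region for _, region in sample)))
print("proportional:", proportional_allocation(region_sizes, 200))
```

The random draw is random over phone numbers, so its regional mix wobbles from run to run. The allocation fixes the regional mix by design and then samples at random within each region.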
This message has been edited by holmes, 04-06-2006 11:05 AM

holmes
"Some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the light into the peace and safety of a new dark age." (Lovecraft)

This message is a reply to:
 Message 29 by crashfrog, posted 04-05-2006 12:34 PM crashfrog has replied

Replies to this message:
 Message 34 by crashfrog, posted 04-06-2006 9:11 AM Silent H has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 32 of 53 (301446)
04-06-2006 5:55 AM
Reply to: Message 30 by nwr
04-05-2006 4:05 PM


semantics 2
In your OP, you defined the problem as related to sample size. Then you gave an example. Was I supposed to guess that you already went off topic in your OP?
Now you are misrepresenting? Hey man, that specific example was NOT in my OP, and only came up after chiro's post set out some jargon and an example of what stats can measure.
My OP had to do with sample size being a practical issue in social research. My later post's specific example was showing how one factor regarding social research (measurement of prefs) can affect assumptions. The breakdown in assumptions can affect how sample and population are compared regardless of size. Admittedly the effect of this would make larger sample sizes better, but that was not my point in that discussion.
You did not show that. In spite of repeated prompting, you have failed to refine your "specific example" to the point where it is specific enough to see how you would get numeric values.
Right, I did not show that. It was a challenge to YOU. You show how you can measure those numerically such that you can create meanings of sdev and sigma. I claimed such comparison studies can create means (at best) but no meaningful sdevs or sigmas.
It would be silly for me to have to show something can't be done.
I understand the claim that is presented in the OP. Whether that is what you intended to claim, I cannot tell.
The specific example developed from more detailed discussion after the OP and did not deal with sample size alone, though it would affect what kinds of sample sizes are more definitive for meaning. Only the general example dealt directly with the sample size issue as per the OP.
Let me hold your hand and walk you through this. In the specific example, if a study cannot generate meaningful concepts of sdev, then conclusions regarding populations based purely on statistical extrapolations are not likely to be accurate or meaningful. In the general example, even where one can get meaningful measurements, the realities of social environments will dictate a greater sample size than assumed in pure probability theory in order to ensure a truly random sample.
The act of randomly choosing a person and asking the question is a random trial. We count the number k of TRUE responses from among our sample of n. This type of sample, assuming randomness, is theoretically modelled
That's enough right there. There's the handwaving. Yes ASSUMING randomness theoretical models hold. There is a gap between assuming and getting. For some realities to GET a sufficiently random sample, one might NEED a larger sample size than required based on theory.
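One way to see the gap between assuming and getting is to compare the textbook margin of error for the k-of-n model with what is left once respondents cluster. The sketch below uses the normal approximation to the binomial and the standard Kish design-effect approximation. The design effect is brought in here as an illustration and is not something anyone in this thread proposed, and the cluster size and intraclass correlation are made-up values.

```python
import math


def margin_of_error(p: float, n: float, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion under simple
    random sampling (normal approximation to the binomial)."""
    return z * math.sqrt(p * (1 - p) / n)


def effective_sample_size(n: int, cluster_size: float, icc: float) -> float:
    """Kish approximation: n / (1 + (m - 1) * rho), where m is the average
    number of respondents per cluster and rho the intraclass correlation."""
    return n / (1 + (cluster_size - 1) * icc)


n = 2000
# Hypothetical clustering: 20 respondents per neighbourhood, rho = 0.05.
n_eff = effective_sample_size(n, cluster_size=20, icc=0.05)
print(f"nominal n = {n}, margin = {margin_of_error(0.5, n):.4f}")
print(f"effective n = {n_eff:.0f}, margin = {margin_of_error(0.5, n_eff):.4f}")
```

With no clustering (rho of zero, or one respondent per cluster) the two margins coincide, which is the ideal case the theory assumes.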
What you have not substantiated, is that an extremely large population makes that problem more difficult.
And again with the strawman. Population size alone is NOT the problem. I have already discussed other factors which go into this. Without a large population size (or gap between sample and population) the other factors will not contribute to many problems, but with a large size (or gap) they can.
It's like a merry-go-round with you.
I'm not interested in playing games... Where it goes wrong is right at the beginning where you realize that you cannot measure it because you have not defined it.
But you are playing games. Just because you can measure something does not mean you will get appropriately random samples, nor even that the measurement is able to be easily extrapolated based on numeric measurements given what they quantify.
I am willing to agree that the latter may be considered a problem in actually being able to measure something at all, rather than an extrapolation issue. Thus if it could be measured properly it should be extrapolatable. I think that's semantics, but I'd rather switch than fight.
HOWEVER, the former is NOT related to measurability at all. The general example is unrelated to measurement, and solely addresses ability to achieve a practical random sample.
You are looking at the wrong issue. The bounds are still solid. In the case of different minorities, you have to be careful interpreting the results for those minorities
I agreed with everything you said in this part. I was not trying to discuss the problem you were discussing. I realize it could be read that way but the emphasis was supposed to be on the "not as simple as 0 to 1" rather than a discussion of minority effects.
We can drop this as being under the "not able to be properly measured" issue. Again I think it's semantics, but I am willing to agree that we blame "measurability" if its numbers have no extrapolatable quality.
I would guess that is wrong. But you would have to ask somebody who works in public opinion surveying.
Holy shit, I wonder what I looked at during my res methods, what kind of person instructed my res methods course, and what I did for some of my work? Can you guess?
As mentioned above, there are tests for randomness. If we use a sample size of 4000, those tests will be more stringent than with a sample size of 2000. My best guess would be that increasing the sample size makes it more difficult to get a sufficiently random sample.
I assume what you are discussing is several small tests in order to discover potential issues so that one can get at a better random sample from the population? I agree that increasing the size of those would not be necessary.
However, in any specific study increasing the sample size does not hurt a study, and certainly does not inherently or practically reduce the randomness of the sample.

holmes

This message is a reply to:
 Message 30 by nwr, posted 04-05-2006 4:05 PM nwr has replied

Replies to this message:
 Message 42 by nwr, posted 04-08-2006 12:51 AM Silent H has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 33 of 53 (301449)
04-06-2006 5:59 AM


note
I thought I would have been pretty busy starting on Monday, but I wasn't given work I had expected. That said, I am starting in on some other work and still expect to get that previous work any day now which will only add to my total work load.
What that means is that my time is rapidly shrinking, and I may be forced more into a lurker status for a while... perhaps a long while.
If I can't hit things on a daily basis, or even monthly, that is why. Just a notice in advance so that people don't wonder why I dropped out when I do.

holmes

  
crashfrog
Member (Idle past 1493 days)
Posts: 19762
From: Silver Spring, MD
Joined: 03-20-2003


Message 34 of 53 (301471)
04-06-2006 9:11 AM
Reply to: Message 31 by Silent H
04-06-2006 5:00 AM


Re: semantics 1
You have absolutely no idea what you are talking about.
I'm going to ask you for the last time to refrain from personal attacks, Holmes. I don't know what your problem is that requires you to respond to a fairly polite post with these scurrilous personal insults, but you need to get a handle on it. This behavior is simply beyond the pale.
Yes I am not suggesting that it is supposed to mirror what the real world is actually like in some graphic detail. It is a hypothetical meant to show that mathematical models are prone to failure where their ideal assumptions are not met. That's it.
Then as an example it hardly supports the conclusion that statistical methods are ideals that fail under real-world situations, now does it?
But the reality is that sample size, however sufficient in theory according to pure statistics, can create the methods problem you are talking about.
Nice job of completely misrepresenting your opposition.
My time is rapidly growing short and I am not about to hold your hand through basic soc res methods.
If your time is so short perhaps it would behoove you to spend more time addressing the actual position of your opponents and a lot less time developing personal attacks against them.
Just a thought, Holmes. Discussions with you would be much more fruitful if you could, for once in your life, refrain from making it personal and calling your opponents idiots. It's pretty clear that there are at least 2 or 3 people in this thread, now, who are getting fed up with your relentless ad hominem.

This message is a reply to:
 Message 31 by Silent H, posted 04-06-2006 5:00 AM Silent H has replied

Replies to this message:
 Message 35 by Silent H, posted 04-06-2006 2:56 PM crashfrog has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 35 of 53 (301630)
04-06-2006 2:56 PM
Reply to: Message 34 by crashfrog
04-06-2006 9:11 AM


Re: semantics 1
I'm going to ask you for the last time to refrain from personal attacks, Holmes. I don't know what your problem is that requires you to respond to a fairly polite post with these scurrilous personal insults, but you need to get a handle on it. This behavior is simply beyond the pale.
If it seemed to be just a case of name calling then I'd apologize. But it wasn't. This is what you said...
No, they're based on reality - the reality is, people are fairly well mixed up. Walk into a room full of people - say, draft day for the NBA, or SakuraCon, or the national meeting of the Association for Certified Public Accountants - and start measuring heights, and you're going to find a distribution of heights that fits the normal curve. Every time.
If a person wrote "All doctors know that the heart is where we think and feel, because as you can experience when you get emotional you feel it in your heart", it would not be merely name calling to say that person clearly has no idea what he is talking about. And your statement above is practically the same thing.
It seems to me that the only person capable of making that statement has not had the experiences I mentioned, and I said so. I also discussed why you were in error. If you are trying to argue that I should have simply stuck with the explanation, and skipped the analysis, I'll take that under advisement, but it seems a bit PCKB.
I mean I still remember (and chuckle) at that animated gif you displayed after cutting up someone's argument. Mine was quite tame by comparison.
Then as an example it hardly supports the conclusion that statistical methods are ideals that fail under real-world situations, now does it?
Yes it does. I said it is a hypothetical that does not mirror "what the real world is actually like in some graphic detail". That is different than saying it does not offer an example of how ideal assumptions can break down in the real world.
Once again, it is hypothetical. Yes it is artificial and exaggerated. That is what one does in order to guarantee an issue is brought out into the open. Notice that chiro also mentioned similar types of examples, and I believe he mentioned one was straight out of a stats text.
Nice job of completely misrepresenting your opposition.
I thought you suggested it was a methodology problem rather than a stats problem? If I got that wrong then you will have to explain what your position is.
If your time is so short perhaps it would behoove you to spend more time addressing the actual position of your opponents and a lot less time developing personal attacks against them.
I think I had two statements which were overtly negative. The rest (meaning the majority) was explanation. I really don't spend much time trying to insult people. In fact, it takes no time at all to come up with an insult. Why would it?
Discussions with you would be much more fruitful if you could, for once in your life, refrain from making it personal and calling your opponents idiots. It's pretty clear that there are at least 2 or 3 people in this thread, now, who are getting fed up with your relentless ad hominem
Who did I call an idiot? I didn't even call you an idiot. I said that that statement meant you didn't know what you were talking about, and I'm pretty sure I'm right. And by the way, you and nwr were both the first to use negative language against me personally. I'd be more than happy to drop it. Anytime the attacks stop coming, I'll change.
And what's with the exaggeration? Relentless? Beyond the pale? Come on.

holmes

This message is a reply to:
 Message 34 by crashfrog, posted 04-06-2006 9:11 AM crashfrog has replied

Replies to this message:
 Message 36 by crashfrog, posted 04-06-2006 9:55 PM Silent H has replied

  
crashfrog
Member (Idle past 1493 days)
Posts: 19762
From: Silver Spring, MD
Joined: 03-20-2003


Message 36 of 53 (301794)
04-06-2006 9:55 PM
Reply to: Message 35 by Silent H
04-06-2006 2:56 PM


Re: semantics 1
I also discussed why you were in error.
Well, no - what you did was call me ignorant, and then proceeded to rebut an argument I didn't make.
Did you just not understand my statement, or what? If so it would have been better for you to ask, not to call me names. I'll try to restate it for your benefit.
It's a true fact that if you were to measure the height of everyone in a roomful of people, and then graph them, you would see this pattern: [image: a normal (bell) curve]
It's statistical fact, Holmes. I apologize if the point was so basic that you didn't understand that's what I was saying, but there's hardly a reason to assert that I'm an ignoramus for stating a simple fact from the first week of any basic statistics course.
That is different than saying it does not offer an example of how ideal assumptions can break down in the real world.
No, it's not different. You can't use an obviously artificial example to try to prove that a technique doesn't work in the real world. If your example doesn't happen in the real world, it proves nothing.
I mean that's obvious. So obvious, I guess, you had no refuge but name-calling. Quite predictable.
I said that that statement meant you didn't know what you were talking about, and I'm pretty sure I'm right.
Actually all you've proven is that you didn't know what I was talking about. Again, not surprising.
And by the way, you and nwr were both the first to use negative language against me personally.
To which statements of mine are you referring? Specific cites, please.

This message is a reply to:
 Message 35 by Silent H, posted 04-06-2006 2:56 PM Silent H has replied

Replies to this message:
 Message 37 by Silent H, posted 04-07-2006 5:30 AM crashfrog has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 37 of 53 (301846)
04-07-2006 5:30 AM
Reply to: Message 36 by crashfrog
04-06-2006 9:55 PM


Re: semantics 1
It's statistical fact, Holmes. I apologize if the point was so basic that you didn't understand that's what I was saying, but there's hardly a reason to assert that I'm an ignoramus for stating a simple fact from the first week of any basic statistics course.
Man, if ALL you were saying is that you'd get a bell curve if you measured the heights of everyone in a room, then you were clearly rebutting an argument I never made (well it wouldn't be so clean, but the idea is there).
I assumed you were trying to suggest that populations were generally mixed and gave the conference examples. Like a conference room is an appropriate example for cities or nations. If not, I am sorry to have assumed that you were trying to make a point against something I said.
What's more, the height bell curve example was already used by chiro. If you followed that then you know there could be issues. And in some real life cases a bell curve will not appear. Let's take a basketball player convention. My guess is that might turn out to have two peaks. Place it in Asia and I'd be pretty certain.
You can't use an obviously artificial example to try to prove that a technique doesn't work in the real world. If your example doesn't happen in the real world, it proves nothing.
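The two-peaks guess can be checked with a quick sketch. It treats the room as a mixture of two normally distributed groups and counts the local maxima of the combined density on a grid. All the heights, spreads and group shares below are hypothetical, and the function names are just for this illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def mixture_pdf(x, groups):
    """groups: list of (weight, mean, sd); weights should sum to 1."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in groups)


def count_peaks(groups, lo=140.0, hi=230.0, step=0.1):
    """Count local maxima of the mixture density on a grid of heights (cm)."""
    steps = int(round((hi - lo) / step))
    ys = [mixture_pdf(lo + i * step, groups) for i in range(steps + 1)]
    return sum(1 for i in range(1, len(ys) - 1)
               if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1])


# Hypothetical convention: half players (mean 200 cm), half other attendees
# (mean 170 cm), both with a 7 cm standard deviation.
convention = [(0.5, 200.0, 7.0), (0.5, 170.0, 7.0)]
# Hypothetical ordinary room: one group only.
ordinary_room = [(1.0, 175.0, 7.0)]

print("peaks at the convention:", count_peaks(convention))
print("peaks in an ordinary room:", count_peaks(ordinary_room))
```

For two equally sized groups with the same spread, the combined curve only splits into two peaks when the group means are more than two standard deviations apart, so a modest difference between groups still looks like a single bell.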
I really have to explain a hypothetical to you? Look at what chiro gave, and stated from a stats textbook. That it is exaggerated for effect, does not remove the point. It shows how a theoretical breaks down in application to nonideal environments.
Actually all you've proven is that you didn't know what I was talking about. Again, not surprising.
For a guy demanding civility you show absolutely no inclination to show any yourself. Yeah, I guess I didn't understand that you were arguing against a strawman that would have been recognized as a strawman if you had read much of the beginning of this thread. Shall I say, again not surprising given that you have admitted to not reading everything I write, or is that too offensive for you?
I'll tell you what... you write what it is I am actually arguing (in your own words) and what you have to go against it, and if you can get my position right then we can continue.
I'm certainly not going to put up with someone coming in late, lobbing arguments against positions I have not taken, and then telling me I'm the one inventing strawmen when I assume the arguments lobbed had something to do with my actual position.
This message has been edited by holmes, 04-07-2006 11:31 AM

holmes

This message is a reply to:
 Message 36 by crashfrog, posted 04-06-2006 9:55 PM crashfrog has replied

Replies to this message:
 Message 38 by crashfrog, posted 04-07-2006 9:18 AM Silent H has replied

  
crashfrog
Member (Idle past 1493 days)
Posts: 19762
From: Silver Spring, MD
Joined: 03-20-2003


Message 38 of 53 (301900)
04-07-2006 9:18 AM
Reply to: Message 37 by Silent H
04-07-2006 5:30 AM


Re: semantics 1
Man, if ALL you were saying is that you'd get a bell curve if you measured the heights of everyone in a room, then you were clearly rebutting an argument I never made (well it wouldn't be so clean, but the idea is there).
That's just it, Holmes. I wasn't. If you'll read my post I offered that fact not as rebuttal to your argument, but in defense of my own - that statistics is built on the real world, not theoretical assumptions. My example was a simple defense of that.
I really have to explain a hypothetical to you?
I have to explain the concept of "plausible" to you?
It shows how a theoretical breaks down in application to nonideal environments.
Which I've already stated is something else entirely from stating that it breaks down under real-world conditions.
For a guy demanding civility you show absolutely no inclination to show any yourself.
Why bother? My attempt to address you with civility was met with the assertion that "I have absolutely no idea what I am talking about." I notice that you couldn't support your assertion that "I started it."

This message is a reply to:
 Message 37 by Silent H, posted 04-07-2006 5:30 AM Silent H has replied

Replies to this message:
 Message 39 by Silent H, posted 04-07-2006 10:41 AM crashfrog has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 39 of 53 (301926)
04-07-2006 10:41 AM
Reply to: Message 38 by crashfrog
04-07-2006 9:18 AM


Re: semantics 1
If you'll read my post I offered that fact not as rebuttal to your argument, but in defense of my own - that statistics is built on the real world, not theoretical assumptions. My example was a simple defense of that.
Heheheh... That was not a defense that statistics is built on the real world. And indeed if your point was that, then ironically it still shows some amount of ignorance, and my challenges to it remain.
That statistics CAN apply to certain real world situations does not mean that it is based on real world assumptions. It is math, it is theoretical, and it is based on theoretical assumptions. Certainly I have not seen any stats book which read that the theorems were derived by looking at people in conventions or something to that effect.
I have to explain the concept of "plausible" to you?
By which I guess you mean "yes".
In an earlier post (not to you) I actually described how it could be made more plausible or reflect real life. My guess is you missed that. In any case, that it isn't what is on the planet at this time, or may not at any time, does not change the fact that it pinpoints realworld issues. They were exaggerated to bring them out, but the issues stand.
Which I've already stated is something else entirely from stating that it breaks down under real-world conditions.
It shows that stats are only as good as their ideal assumptions are fulfilled. The world is not ideal and so stats do break down under real world conditions.
I notice that you couldn't support your assertion that "I started it."
Not couldn't, didn't. Why repost material on a side issue of no importance to the thread and not that far back? People can look upthread and figure it out for themselves if they care about it. They obviously know both our opinions and can check it out.
What's hilarious is that you are still claiming my statement about what your position suggests about your level of knowledge, was name calling... and an assertion to boot! Hey I called it as I saw it, and I gave the criteria. It appears that I am right. If it seems insulting to you personally, what can I say?
When a person tells me something that indicates they are talking out their ass, I'm going to mention it (as you have in so many other cases).

holmes

This message is a reply to:
 Message 38 by crashfrog, posted 04-07-2006 9:18 AM crashfrog has replied

Replies to this message:
 Message 40 by crashfrog, posted 04-07-2006 12:31 PM Silent H has replied

  
crashfrog
Member (Idle past 1493 days)
Posts: 19762
From: Silver Spring, MD
Joined: 03-20-2003


Message 40 of 53 (302003)
04-07-2006 12:31 PM
Reply to: Message 39 by Silent H
04-07-2006 10:41 AM


Re: semantics 1
Certainly I have not seen any stats book which read that the theorems were derived by looking at people in conventions or something to that effect.
Say what? Every stats book I've ever read relates the classic story that Blaise Pascal developed the mathematics of probability because he was trying to help a friend who liked to play dice.
I mean, you're way out in left field on this, Holmes. Everybody recognizes that the purpose of statistical methods and probabilistic mathematics is to model the real world, and the theoretical foundations of the mathematics were developed to solve real-world problems. The normal curve wasn't developed from equations and assumptions; it was developed from observations of falling balls on a pinboard.
I mean, c'mon. When you say that statistics isn't a mathematics based on the real world, it makes me think you aren't interested in an honest debate; that you aren't taking this seriously. When someone says something so obviously at odds with the facts and with the history, it's like saying
quote:
"All doctors know that the heart is where we think and feel, because as you can experience when you get emotional you feel it in your heart."
In an earlier post (not to you) I actually described how it could be made more plausible or reflect real life.
Ah, yes. Another Holmes classic - recourse to "earlier posts." If you could, please provide a quick summary of how that earlier post actually addresses my position, or cite a specific post number. In the past you've repeatedly made reference to an "earlier post" in order to dodge points you don't want to appear to be ignoring, so I can hardly trust you when you say you've dealt with this in an "earlier post."
The world is not ideal and so stats do break down under real world conditions.
Unsupported conclusion from the premises. If statistics has been developed to explain real-world phenomena, then by definition, the real world would tend to be the ideal condition for statistics. The fact that it breaks down under impossible, hypothetical circumstances that could never occur in the real world has absolutely nothing to do with the power of statistics to explain the real world.
It's simple reasoning, Holmes. It's a very simple point that is true no matter how many times you assert that an impossible counterexample disproves the utility of statistics in the real world. The two have nothing to do with each other. If you want to prove that stats can fail in the real world, give a real example of stats failing in the real world. Not of stats failing in a world that could never exist.
What's hilarious is that you are still claiming my statement about what your position suggests about your level of knowledge, was name calling...
It was name-calling, Holmes. These are two very different statements:
"Your knowledge of statistics appears flawed."
"You have absolutely no idea what you're talking about."
The first would be an acceptable, if still somewhat rude, way to point out that your opponent appears to lack some prerequisite knowledge. And that happens often in discussions of a technical nature.
The second is an insulting ad hominem, and, if you'll notice, your exact words. Name-calling. It's just that simple, Holmes.
When a person tells me something that indicates they are talking out their ass, I'm going to mention it
Again, not simply a factual statement about your opponent lacking crucial knowledge, but an insulting ad hominem attack against their honesty and intelligence. It's simply unbelievable that you would pretend like you can't tell the difference.

This message is a reply to:
 Message 39 by Silent H, posted 04-07-2006 10:41 AM Silent H has replied

Replies to this message:
 Message 43 by Silent H, posted 04-08-2006 6:14 AM crashfrog has replied

  
wj
Inactive Member


Message 41 of 53 (302221)
04-07-2006 7:44 PM
Reply to: Message 1 by Silent H
04-01-2006 5:06 PM


Holmes
Your arguments by message #39 seem to have moved a long way from your OP without you acknowledging that your initial issue has been addressed.
holmes writes:
My argument is that size of population and geographic region covered (as well as how populations group within that region) complicate sampling such that one cannot simply assert that 2000 people are enough.
It should have been amply demonstrated to anyone's satisfaction by now that a sample of 2,000 will give an accurate measure of a mean value with 95% level of confidence. That is simply a product of the mathematics involved. That level is generally accepted as sufficiently significant to give a meaningful answer. However if you wish to go to a 99% level of confidence then a very much larger minimum sample is required. This is usually not practical.
As someone has already pointed out to you, one cannot do a meaningful analysis of subgroups within a 2,000 point sample as the subgroup size is obviously less than the required minimum size to have a meaningful level of confidence. If, however, the initial sample was 10,000 points then one could have an adequate level of confidence in analysing a subgroup with 2,000 data points. So the original case of polling Americans' attitude to atheists may be valid as a description of Americans' overall attitude but could not be relied on to describe a sampled subgroup's (e.g. Christians') attitude.
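A minimal sketch of the arithmetic behind these figures, assuming a simple random sample, a yes/no question, and the usual normal approximation for a proportion. The function names `margin_of_error` and `required_n` are hypothetical, chosen only for illustration, and the worst-case value p = 0.5 is an assumption rather than anything taken from the poll under discussion.

```python
import math

# Two-sided standard normal quantiles for the two confidence levels discussed.
Z = {0.95: 1.959964, 0.99: 2.575829}

def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the normal-approximation interval for a proportion.

    p=0.5 is the worst case (largest variance) for a yes/no question.
    """
    return Z[confidence] * math.sqrt(p * (1 - p) / n)

def required_n(margin, confidence=0.95, p=0.5):
    """Smallest sample size whose margin of error is at most `margin`."""
    return math.ceil(Z[confidence] ** 2 * p * (1 - p) / margin ** 2)

if __name__ == "__main__":
    full = margin_of_error(2000)             # whole sample at 95%
    subgroup = margin_of_error(200)          # a subgroup of 200 at 95%
    n99 = required_n(full, confidence=0.99)  # same margin, but at 99%
    print(f"n=2000 at 95%: +/- {full:.3f}")
    print(f"n=200  at 95%: +/- {subgroup:.3f}")
    print(f"n needed for the same margin at 99%: {n99}")
```

Under these assumptions the full sample of 2,000 gives a margin of roughly plus or minus 2.2 percentage points, a subgroup of 200 gives roughly 6.9 points, and holding the margin fixed while moving from 95% to 99% confidence takes a sample about 1.7 times as large. None of this depends on the size of the population, only on the sample being random.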
holmes writes:
Randomization and "true cross-section" depend on understanding demographics so that sampling "points" can be picked correctly.
I'm not sure if this is just worded poorly or reflects some incorrect premise which you have about statistics.
I'm sure that all thread participants will agree that unbiased, random sampling is essential to give a valid statistical result - the subsequent number crunching is rather trivial. Your subsequent scenarios about clustered data, whether through geography or demographic distribution, are irrelevant if basic features of unbiased, random sampling are satisfied and the number of possible values of data points is not large compared to the sample size. On the latter point, a range of possible values being measured (say 59.5001 inches to 60.4999 inches) will be grouped as a single value (60 inches). So, if there are 20,000 towns in the US with populations over 20,000 and there are only 50 possible values for a datapoint then a sample size of 2,000 would still (all other factors being equal) be valid for measuring the average value over the entire population of people living in cities over 20,000 people. It doesn't matter that you don't happen to get a sample from a particular town that has an extraordinarily high value amongst its population.
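The claim about towns can be checked with a small simulation. This is only a sketch under stated assumptions: 20,000 equal-sized towns, each nearly unanimous (30% of towns lean 95% "yes", the rest 5% "yes"), and a poll that picks 2,000 individuals completely at random. Those numbers and the function name `simulate` are invented for illustration. Whether a real polling mechanism achieves that kind of randomness is a separate question the simulation does not address.

```python
import random
import statistics

def simulate(num_towns=20_000, sample_size=2_000, polls=200, seed=1):
    """Repeat a simple random poll over a heavily clustered population."""
    rng = random.Random(seed)
    # Each town is nearly unanimous: opinions vary between towns, not within them.
    town_p = [0.95 if rng.random() < 0.3 else 0.05 for _ in range(num_towns)]
    true_mean = statistics.fmean(town_p)  # valid because towns are equal-sized

    estimates, towns_hit = [], []
    for _ in range(polls):
        # A random individual lives in a uniformly random town (equal sizes).
        picked = [rng.randrange(num_towns) for _ in range(sample_size)]
        answers = [1 if rng.random() < town_p[t] else 0 for t in picked]
        estimates.append(statistics.fmean(answers))
        towns_hit.append(len(set(picked)))
    return true_mean, estimates, towns_hit

if __name__ == "__main__":
    true_mean, estimates, towns_hit = simulate()
    print(f"true share of 'yes':        {true_mean:.3f}")
    print(f"average poll estimate:      {statistics.fmean(estimates):.3f}")
    print(f"spread of poll estimates:   {statistics.pstdev(estimates):.4f}")
    print(f"most towns hit by one poll: {max(towns_hit)} of 20000")
```

Each poll reaches at most a tenth of the towns, yet the estimates cluster around the true value with a spread close to what the textbook formula sqrt(p(1-p)/n) predicts. The result would change if people were sampled in batches from a few towns instead of individually, which is a property of the sampling method rather than of the sample size.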
The tools used for assessing some value and the method of obtaining a random sample are critical. For example, a telephone poll in Zimbabwe would only sample those people rich enough and close enough to communications to actually have a working phone in the first place. Therefore it could not be considered to be a good tool for the entire population of Zimbabwe. However a phone poll in the US might be biased for other reasons - it might exclude those too poor to have a phone connection or the younger population who might only have a mobile phone which doesn't appear in a phonebook.
As NWR (I think) also correctly pointed out, the quality which you are endeavouring to analyse statistically also needs to be defined so that it is measurable and that the tools used are actually measuring that quality or a suitable proxy.
So one does not pick and choose where to take sampling to ensure that all demographics are included. One does not decide that 20 individuals have to be sampled from rural Iowa and 200 from New York to ensure "an appropriate cross-section". One ensures that the selection mechanism is unbiased and could potentially include all of the population in a sample and then carries out the sampling.
Further discussion about the measurability of a quality, the validity of the tool used to measure that quality, the sampling techniques used to estimate the value of that quality would make for further interesting discussion rather than more name calling and chest beating. However the issue of sample size is well settled.

This message is a reply to:
 Message 1 by Silent H, posted 04-01-2006 5:06 PM Silent H has replied

Replies to this message:
 Message 45 by Silent H, posted 04-08-2006 9:04 AM wj has replied

  
nwr
Member
Posts: 6411
From: Geneva, Illinois
Joined: 08-08-2005
Member Rating: 4.5


Message 42 of 53 (302269)
04-08-2006 12:51 AM
Reply to: Message 32 by Silent H
04-06-2006 5:55 AM


Re: semantics 2
Hey man, that specific example was NOT in my OP, and only came up after chiro's post set out some jargon and an example of what stats can measure.
You are right. My mistake. I apologize for that.
Right, I did not show that. It was a challenge to YOU. You show how you can measure those numerically such that you can create meanings of sdev and sigma.
Your example was hopelessly vague. It is not up to me to fill in the missing details. I doubt that they can be filled in.
There is never a problem with meanings of sdev and sigma, when they can be computed. The only problem with them in this case, is that your example is still badly incomplete.
It would be silly for me to have to show something can't be done.
You seem to be saying that it would be silly for you to actually provide an argument to support the claim you have made.
Let me hold your hand and walk you through this. In the specific example, if a study cannot generate meaningful concepts of sdev then conclusions regarding populations based purely on statistical extrapolations are not likely to be accurate or meaningful.
I can only repeat what I have said before. The reason your "specific example" cannot generate meaningful sdev, is that your example is little more than some bullshit that you spewed. If you want to actually support your claim, you need a real example complete with the interview questions that will be asked in the survey.
That's enough right there. There's the handwaving. Yes ASSUMING randomness theoretical models hold. There is a gap between assuming and getting. For some realities to GET a sufficiently random sample, one might NEED a larger sample size than required based on theory.
Then you need to produce an argument that a larger sample would make it easier to get a sufficiently random sample. You have not yet made a case for that, and I am inclined to think it false.
It's like a merry-go-round with you.
That's only because you have repeatedly failed to make a credible case in support of your claim.

This message is a reply to:
 Message 32 by Silent H, posted 04-06-2006 5:55 AM Silent H has replied

Replies to this message:
 Message 44 by Silent H, posted 04-08-2006 7:37 AM nwr has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 43 of 53 (302295)
04-08-2006 6:14 AM
Reply to: Message 40 by crashfrog
04-07-2006 12:31 PM


Everybody recognizes that the purpose of statistical methods and probabilistic mathematics is to model the real world, and the theoretical foundations of the mathematics were developed to solve real-world problems.
Yeah, everybody including me. Of course they were developed to solve real-world problems. Some maths may have even been generated using some observations from the real world. That does not mean that the maths and ideal assumptions within them are "based on" the real world. Formulas (like stats) are based on assumptions, which may not (and usually are not) met in real life.
And once again, that was not defended nor advanced by your example, and my reply correctly countered it.
In the past you've repeatedly made reference to an "earlier post" in order to dodge points you don't want to appear to be ignoring, so I can hardly trust you when you say you've dealt with this in an "earlier post."
In the past you have admitted that you never read my entire posts. A classic "frogism", leaping over details of my arguments in order to tell me (incorrectly) what I have been saying, debating points already talked about, and escaping ever having to admit you are in error. Sorry pal, once you admitted that on more than one occasion, I gave up handholding you through everything. I have tried to lead you back to specific posts and could not make you read.
I mean gods, you started by addressing an argument that wasn't even my position!
And oh yes, the insinuation game. Because I don't serve you chop chop as you wish, I must be avoiding and trying to hide something. Note how many issues you have not dealt with since you came on.
If statistics has been developed to explain real-world phenomena, then by definition, the real world would tend to be the ideal condition for statistics. The fact that it breaks down under impossible, hypothetical circumstances that could never occur in the real world has absolutely nothing to do with the power of statistics to explain the real world.
So when I develop equations from watching springs then by definition all springs are the ideal condition for applying such equations?
And look at what you said after your "definition". I did not give an "impossible" hypothetical situation that could "never occur" in the real world. I gave a hypothetical which is not what we see and MAY never occur. There is a huge difference. A world of difference. While it is not what we see right now, there is absolutely no logical bar to it occurring, or that it might represent (even if exaggerated in number) actual situations we see today.
If we are asking the residents of 20K cities to compare what they feel about all other cities, there could be many different responses between cities, and generally uniform opinions within a city. Heck, how about taking 20K schools and asking the students' opinions on all different schools? Or 20K churches about other churches?
There is a social reality that communities can develop unique sentiments regarding other communities.
The first would be an acceptable, if still somewhat rude, way to point out that your opponent appears to lack some prerequisite knowledge. And that happens often in discussions of a technical nature.
Oh I'm sorry, I did not know that. Maybe you can point me to the objectively true handbook for correct English usage that you have.
See this is what is so great about debates on use of language, I get to have YOU tell ME what I intended to mean with a statement. I mean it's not like my telling you what I meant means anything at all... you know because that's what YOU would have meant by saying that thing. And of course if I told you what I felt was a negative statement made by you, you would declare that it wasn't! Hot dog!
Oh did I say great? I meant pointless.
I guess I will apologize for saying something that you took to be just an insult. It wasn't meant that way. But I also have to say that for a guy who routinely bashes people with insults, I do find it amazing how sensitive you can be.

holmes
"Some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the light into the peace and safety of a new dark age." (Lovecraft)

This message is a reply to:
 Message 40 by crashfrog, posted 04-07-2006 12:31 PM crashfrog has replied

Replies to this message:
 Message 46 by crashfrog, posted 04-08-2006 10:45 AM Silent H has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 44 of 53 (302303)
04-08-2006 7:37 AM
Reply to: Message 42 by nwr
04-08-2006 12:51 AM


Re: semantics 2
Your example was hopelessly vague. It is not up to me to fill in the missing details. I doubt that they can be filled in. There is never a problem with meanings of sdev and sigma, when they can be computed. The only problem with them in this case, is that your example is still badly incomplete.
I don't see how my asking you to show how you could ask questions regarding a specific subject such that you could get a meaningful sdev is hopelessly vague. I gave you the purpose and the specific groups to be involved. The rest is up to you.
If you simply don't want to spend the time working it out in more detail, you have my sympathy and I would not hold it against you.
If your point is that you don't see how that could be computed, then I am pretty much in agreement. That is of course what the atheism study was getting at, and part of my problem with it.
You seem to be saying that it would be silly for you to actually provide an argument to support the claim you have made.
No, you can't prove a negative. That's why it'd be silly to try. On the other hand you can prove a positive.
is that your example is little more than some bullshit that you spewed.
Snicker snicker. Bullshit? I essentially restated the atheist study with fewer and more specified minority groups. I then asked you to show how you could make something statistically meaningful regarding the question addressed.
Then you need to produce an argument that a larger sample would make it easier to get a sufficiently random sample. You have not yet made a case for that, and I am inclined to think it false.
Again this is a strawman. I did not say that larger samples make it easier to get a sufficiently random sample. My argument was that in some cases small sample numbers could hinder sufficiently random sampling. The general hypothetical does work toward that end. At best sampling would net 1/10th of all large "unique" preferential environments within the population.
I am certainly not trying to argue that large numbers alone = better quality data.
Since we seem to be going in circles, let's drop this. I am going to try and wrap everything up in my reply to wj's post. If you have issues with what I say there, reply to that one.

holmes
"Some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the light into the peace and safety of a new dark age." (Lovecraft)

This message is a reply to:
 Message 42 by nwr, posted 04-08-2006 12:51 AM nwr has replied

Replies to this message:
 Message 47 by nwr, posted 04-08-2006 12:01 PM Silent H has replied

  
Silent H
Member (Idle past 5845 days)
Posts: 7405
From: satellite of love
Joined: 12-11-2002


Message 45 of 53 (302313)
04-08-2006 9:04 AM
Reply to: Message 41 by wj
04-07-2006 7:44 PM


semantics 3
I'm not sure why you worded your post as you did. It seemed to make out as if I had not acknowledged, discussed, or agreed with certain points. Indeed some of the issues I had actually raised first, both here and in the original thread this came from (which is why I did not mention them as an issue to discuss here).
Again I am left with the feeling that something else is going on here. That maybe it is felt I was knocking all statistics, or their use? I don't know. But I'm hoping to wrap this up using your reply.
First I must apologize for not putting all conditions regarding my argument in my sentence stating my argument. I tried to put my argument within bounds stated (admittedly) vaguely before it, but I thought sufficiently detailed afterward. While your quote was right, my OP also said...
There may be more than one person interested in how statisticians explain that small numbers of people can be used to represent/model the feelings of large populations. This is a thread to explain/debate how this is done.
and
I wasn't talking about "many cases" and "most purposes". I made it very clear that I was talking about getting an accurate picture of social prefs in very large populations (100+mil) over very large areas such that there may be many subcultures.
Thus...
It should have been amply demonstrated to anyone's satisfaction by now that a sample of 2,000 will give an accurate measure of a mean value with 95% level of confidence. That is simply a product of the mathematics involved. That level is generally accepted as sufficiently significant to give a meaningful answer. However if you wish to go to a 99% level of confidence then a very much larger minimum sample is required. This is usually not practical.
Is something I was pretty much agreeing with. What it can't do is be used as a blanket to cover all studies as drawing valid conclusions. Even the concept of 95% confidence is a product of assumptions regarding the population and the data involved. Yes it is a product of math, but based on assumption. As the assumptions become questionable the ability to make the claim of confidence decreases. And that is a question of how data is treated/collected in reality. I think we can agree on that, correct?
I'm sure that all thread participants will agree that unbiased, random sampling is essential to give a valid statistical result - the subsequent number crunching is rather trivial.
Absolutely.
Your subsequent scenarios about clustered data, whether through geography or demographic distribution, are irrelevant if basic features of unbiased, random sampling are satisfied and the number of possible values of data points is not large compared to the sample size.
If I am reading you correctly, I also agree almost completely. Where there may be a difference will be addressed later.
So, if there are 20,000 towns in the US with populations over 20,000 and there are only 50 possible values for a datapoint then a sample size of 2,000 would still (all other factors being equal)... It doesn't matter that you don't happen to get a sample from a particular town that has an extraordinarily high value amongst its population.
Yes, though caveats are starting. But for sake of argument, let's say alrighty.
The tools used for assessing some value and the method of obtaining a random sample are critical... However a phone poll in the US might be biased for other reasons...
Well I stated this explicitly within this thread so I have to agree, absolutely. And yes I DO agree that this is a different issue than size of sample discussions.
the quality which you are endeavouring to analyse statistically also needs to be defined so that it is measurable and that the tools used are actually measuring that quality or a suitable proxy.
Absolutely agreed. And also agreed to be a side issue to the question of sample size.
However, there is a semantic issue which comes out of this. I was suggesting (when this was brought up in this thread) that some data while "measurable" or "collectable" may not fit stats assumptions and so defy meaningful extrapolation.
Thus a study could have some usable data regarding preferences, just not useful for traditional stats. I think NWR was making that sound worse, like it connotes a total nonuse, or immeasurability, than I'd agree with.
So one does not pick and choose where to take sampling to ensure that all demographics are included. One does not decide that 20 individuals have to be sampled from rural Iowa and 200 from New York to ensure "an appropriate cross-section".
Well yes, and that is not what I was trying to say. One does not try and handpick the samples from a specified list of all possible demographics. However, as the number of potentially large subpops grows, so does the number that should be sampled, and thus the number of samples.
Remember one of the counterclaims I was dealing with in my OP was that 3 was as good as 3000 which is as good as 3 million. I think it's safe to claim 3 is not ensured of hitting an adequate number of environments within the US to safely draw conclusions. I also don't believe 3 million is necessary. I think 3000 will depend on the number of environments which might exist for any particular pref under investigation.
One ensures that the selection mechanism is unbiased and could potentially include all of the population in a sample and then carries out the sampling.
Well that's not exactly true. That a selection mechanism "could potentially" is not enough. Good studies do review and put into place procedures to ensure crossing demographics. I suppose that may be called part of the mechanism, but then that seems a semantic issue to me.
I think it's safe to say that even when a mechanism "could" reach everyone, there is a potential issue if it could also (when applied randomly) end up selecting people from only a limited number of the potential demographic environments. And there certainly will be an issue if it ends up doing so.
For example, if a study is regarding US opinion on a candidate, while relatively simple, they will not have the same % confidence just because they "could" have gotten people in rural and urban areas via their method. It would be advisable to make sure some people in rural areas actually ended up in the study along with urban samples, and not just one.
Of course, I am agreeing that not all must be hit, and certainly some demographics are superfluous (did we get women who have red hair and wear dresses?). However with population size and separation ability, there are more meaningful demographics to be hit, and so a greater minimum number to be sampled from for accurate extrapolation.
Hopefully we can all agree that this is where we may have differences, and not those other issues. This is where I am claiming that reality runs headlong into statistical assumptions and demands greater numbers of samples. I think it is a semantical issue to blame the "randomness" or "bias" of mechanisms, if the reality is that more samples are needed to overcome potential bias and so achieve statistical utility. That to me is a statement that we cannot use the needs of "pure" stats to make assumptions about how many samples are necessary, or sufficient, for a study.
Let me paraphrase a question I asked in my OP. Let's say in the future that we have significant human populations (100s of millions) on Earth, Mars, the Moon, and Europa. If we are to get at what human opinion is regarding issue X, would 2000 be enough to realistically speak of what the entire human population generally feels? Would we not have to ensure that we did get opinions from all four planetary bodies? If not, why not? If despite our method potentially sampling from all planetary bodies it only grabbed from one or two, would there not be an issue with such an extrapolation? If not, why not?
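One way to put a number on the last of these questions is to ask how likely a truly random sample is to miss a group altogether. The sketch below assumes independent draws from a very large population and, for the planetary example, four equally populated bodies. Both assumptions, the 0.1% share used for comparison, and the function name `prob_group_missed` are illustrative only.

```python
def prob_group_missed(share, n):
    """Chance that none of n independent random draws lands in a group
    that makes up `share` of the population (0 < share < 1)."""
    return (1.0 - share) ** n

def expected_from_group(share, n):
    """Average number of respondents a random sample of n takes from the group."""
    return share * n

if __name__ == "__main__":
    n = 2000
    # Four equally populated bodies: each holds a quarter of all people.
    print("P(miss one of four equal bodies):", prob_group_missed(0.25, n))
    print("expected respondents per body:   ", expected_from_group(0.25, n))
    # A subculture that is only 0.1% of the population is a different story.
    print("P(miss a 0.1% subculture):       ", prob_group_missed(0.001, n))
    print("expected respondents from it:    ", expected_from_group(0.001, n))
```

With equal shares, a random sample of 2,000 takes about 500 people from each body and the chance of missing one is vanishingly small, so a sample that grabbed from only one or two bodies would point to a selection mechanism that was not random. A group holding 0.1% of the population, by contrast, is missed entirely in roughly 13.5% of such samples and contributes only about two respondents on average, which is far too few to say anything about that group on its own.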
This message has been edited by holmes, 04-08-2006 03:06 PM

holmes
"Some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the light into the peace and safety of a new dark age." (Lovecraft)

This message is a reply to:
 Message 41 by wj, posted 04-07-2006 7:44 PM wj has replied

Replies to this message:
 Message 50 by wj, posted 04-09-2006 2:25 AM Silent H has replied

  


Copyright 2001-2023 by EvC Forum, All Rights Reserved
