K so - Tactical Ninja

Sep. 12th, 2011

08:16 pm - K so

Data analysis folks - n00b question.

There's this forum, see. It has 79,621 registered users.

On the forum, a question is asked. Approximately 1216 people answer the question (32 pages @ ~38 posts/page).

How many answers would I need to analyse for the analysis to be useful?

Currently I have 203 (this covers one week of the 12 weeks the forum's existed) and it's taken about 2 hours to get that into an accessible format - I'm painfully aware I'm going to have to code all this manually (sorry geeks, qualitative analysis is like that - unless your sentiment analysis software is shit hot it won't help me much), therefore I'm not keen to retrieve all of the answers unless it's absolutely necessary.

Any suggestions?

PS I don't know whether Chemtrail Guy or "I'm here because they are offering poledancing classes for children" should get the Really Doesn't Get It award.

Comments:

(Deleted comment)
From:tatjna
Date:September 12th, 2011 09:35 am (UTC)
From my brief browsing, the later contributions contain less pole dancing, chemtrails and alien abductions - but I could be selectively avoiding clicking Teh Oddness.

I was thinking maybe first week/latest week, then I could do a compare and contrast thingy. Also, the 203 I already have are from the first week so I'm keen to use them.

Or, people who posted on a Monday or something.
From:thatgirljj
Date:September 12th, 2011 02:14 pm (UTC)
What are you trying to do?

Also, what qual program are you using? If it's Atlas.Ti I may have some suggestions to make it easier.
From:tatjna
Date:September 12th, 2011 06:29 pm (UTC)
I have two broad questions - who are these people, and why have they joined this forum? So I'm looking for four types of statements - I am, I think, I feel, and I want statements. I suspect there'll be a 5th type of statement that refers to moral shock theory, since the ones I've read so far contain a lot of personal experience trigger-type stories.

I'm looking to compare the responses to the various theories of social movements and I don't really have a thesis at the moment but I suspect it'll emerge that no one theory fits, rather a combination of all of them. I'd like to try and make predictions as to whether the maturation of Anonymous as a social movement and the influx of a lot of non-4chan type people will affect the likelihood of the success of the movement.

Not much, eh?

I'm not using any qual program at the moment, but I'm open to suggestions if you know of one that is a) free and b) capable of doing the above.
From:thatgirljj
Date:September 12th, 2011 09:56 pm (UTC)
OK, I skipped to this before thoroughly reading your most recent reply, so take this with a grain of salt.

It seems like the main thing you need right now is an argument that your sampling strategy and sample size are appropriate for what's being measured. So, for instance, 10% or 20% of the posts replying to the question (roughly 120 or 240) would be reasonable. Then decide if you can make an argument for some sort of time blocking, i.e. oversampling the more active respondents who would respond quickly within the first week vs. sampling the entire length of the thread (in which case I'd take roughly 10% of the first week, 10% of the second, etc.).
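For what it's worth, the "10% of each week" idea could be sketched roughly like this. This is only an illustration: the function name `sample_by_week`, the `(post_id, week)` input shape, and the fixed seed are all assumptions made for the example, not anything from the forum or from a particular qual package.

```python
import random
from collections import defaultdict

def sample_by_week(posts, fraction, seed=0):
    """Take about `fraction` of the posts from each week.

    `posts` is a list of (post_id, week) pairs (a hypothetical format).
    Returns a dict mapping week -> sorted list of sampled post ids.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    by_week = defaultdict(list)
    for post_id, week in posts:
        by_week[week].append(post_id)

    sample = {}
    for week in sorted(by_week):
        ids = by_week[week]
        # Always take at least one post from a week that has any posts.
        k = max(1, round(len(ids) * fraction))
        sample[week] = sorted(rng.sample(ids, k))
    return sample

# Overall targets for the 1216 replies mentioned in the post.
TOTAL_POSTS = 1216
for fraction in (0.10, 0.20):
    print(fraction, round(TOTAL_POSTS * fraction))
```

Sampling within each week rather than across the whole thread keeps every week represented in proportion to how busy it was, which is the point of the time-blocking argument.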

The one thing that confuses me, though, is that you say you have 79,621 registered users and then 1216 people answer the question (1.5%), but how do you know that each individual post/reply is an individual respondent and not someone responding more than once? This is made even more complex if you're talking about an anonymous chan.

And sadly, all the qual analysis programs I know fail on your A condition. :-( If you could get Atlas.Ti cheap through school, it's pretty handy.
From:tatjna
Date:September 12th, 2011 10:00 pm (UTC)
[edit because that wasn't clear]

In answer to the question, the forum is divided into subforums, one of which asks the question. Each reply to the question has been posted as a new topic, so counting the number of topics gives the number of replies.

The forum is protected in that you have to be a registered user to post (but not to read this section, thus public domain in terms of ethics), so individual users are identifiable by their pseudonym (this is not an anon forum). I'm able to combine any multiple posts by the same user. There are only a couple of those though.
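Combining multiple posts by the same pseudonym could look something like the sketch below. The `(username, text)` input format and the name `merge_by_user` are assumptions for illustration only; the real data would need to be in whatever shape it was exported in.

```python
def merge_by_user(posts):
    """Combine all posts by the same pseudonym into one response.

    `posts` is an iterable of (username, text) pairs (a hypothetical format).
    Returns a dict mapping username -> combined text, with users kept in
    order of first appearance and each user's posts in posting order.
    """
    grouped = {}
    for user, text in posts:
        grouped.setdefault(user, []).append(text)
    # Join a user's posts with a blank line so they stay readable when coding.
    return {user: "\n\n".join(texts) for user, texts in grouped.items()}
```

After merging, the number of keys in the result is the number of distinct respondents, which is the figure to quote as the sample size.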

Edited at 2011-09-12 10:01 pm (UTC)
From:thatgirljj
Date:September 13th, 2011 10:34 pm (UTC)
Then I think the sampling plan you outlined in your other post should be fine. Unless you can think of some systematic bias that would mean that certain types of people would NEVER post on the first week. Which might be an issue in soup kitchens, but probably not on the internet.
From:tatjna
Date:September 13th, 2011 10:38 pm (UTC)
The only thing I can think of is that the only people posting in the first week would be those 'in the know' that the site was going up. I think taking more replies from a month later offsets that reasonably well.

Thanks! ;-)