Submitted by Cahoona1984 on Sat, 02/02/2013 - 04:18
Hi! I am very interested in the Frame Component Analysis. Let's assume that I have crawled about 1000 sites which I would like to classify according to their frames. How many sites do I have to code manually before the program can use that knowledge to make reliable automatic predictions for the rest of the data?
Is there a rule of thumb or something similar? Thanks so much! Cheers, Nils
Hi - apologies for the delayed response to this.
In the following paper:
Ackland, R. and M. O'Neil (2011), "Online collective identity: The case of the environmental movement," Social Networks, 33, 177-190. http://voson.anu.edu.au/papers/OnlineCollectiveIdentity_FINAL.pdf
we used libSVM to extract the text features (meta keywords) that best classified the 167 environmental activist websites in our network. We then used these keywords to construct an 'online frame network'.
We didn't then use the support vector machine to classify unclassified sites (but that could certainly be done).
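If you did want to take that extra step, here is a minimal sketch (not the code we used for the paper) of how it might look with scikit-learn's SVC, which wraps libSVM under the hood. All of the variable names and example texts below are hypothetical placeholders for your own crawled data and manual coding.

```python
# Sketch only: train an SVM on manually coded sites, then predict frames
# for the remaining uncoded sites. Data and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

coded_texts = ["climate justice grassroots action",      # text/meta keywords of coded sites
               "conservation wildlife habitat protection"]
coded_labels = ["global-justice", "conservation"]         # manually assigned frames
uncoded_texts = ["renewable energy community campaign"]   # sites still to classify

# Turn the site text into numeric feature vectors.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(coded_texts)
X_new = vectorizer.transform(uncoded_texts)

# SVC uses libSVM internally; a linear kernel is a common choice for text.
clf = SVC(kernel="linear")
clf.fit(X_train, coded_labels)
print(clf.predict(X_new))   # predicted frame for each uncoded site
```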
So unfortunately I can't give you any guidance on how big the training dataset needs to be. But if you do a Google search on SVM training set sizes you might find an answer to this, or perhaps the libSVM documentation has something on it.
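One rough, empirical way to answer the "how many coded sites" question is to code a subset, then plot a learning curve and see where cross-validated accuracy stops improving. The sketch below uses scikit-learn's learning_curve with synthetic stand-in data; you would swap in your own feature matrix and manual frame labels.

```python
# Sketch only: estimate how accuracy grows with the number of coded sites.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import learning_curve

# Stand-in data; replace with your own TF-IDF matrix and manual frame labels.
X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           n_classes=3, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    SVC(kernel="linear"), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5)

for n, score in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n} coded sites -> cross-validated accuracy {score:.2f}")

# If accuracy has plateaued before you reach your full coded sample, you
# probably have coded enough; if it is still rising, code more sites.
```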
Regards,
Rob