tf-idf - Can TF-IDF be weighted to improve classification of sparse data in a corpus?


I am using TF-IDF prior to performing classification on a number of websites based on their content. Unfortunately, the training data is not uniform: 70% of the pre-labeled websites are news sites, while the rest (tech, arts, entertainment, etc.) are each a small minority.

My questions are the following:

  1. Is it possible to adjust TF-IDF so that it weighs different labels differently and behaves as if the data were uniform? Should I perhaps be using a different approach in this case? I am using a Gaussian naive Bayes classifier after the TF-IDF analysis; is something else better suited to this specific case?

  2. Is it possible to have TF-IDF give me a list of possible labels when the probability of a given label is below a threshold? For example, if the vector entries are close enough (< 1-2%) that one class is only marginally more probable than another, can I print both?
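
To make both questions concrete, here is a minimal sketch of the kind of behaviour being asked about. It assumes scikit-learn, which the question does not name. The toy corpus, the `candidate_labels` helper, and the 2% `margin` are made up for illustration. It also swaps the Gaussian naive Bayes classifier for `MultinomialNB` with `fit_prior=False`, which uses a uniform class prior instead of one learned from the skewed label counts. This is only one possible way to offset the imbalance, and it does nothing about the minority classes having less training text.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy corpus in which "news" dominates, mimicking the imbalance described above.
docs = [
    "election results announced by the government today",
    "breaking news on the storm and flood warnings",
    "parliament votes on the new budget this week",
    "local police report on the traffic accident",
    "president speaks to reporters about the economy",
    "minister resigns after the budget vote",
    "weather report warns of a storm tonight",
    "new laptop processor benchmark and software review",
    "open source compiler release adds software features",
    "gallery opens a painting and sculpture exhibition",
]
labels = ["news"] * 7 + ["tech"] * 2 + ["arts"]

# TF-IDF only builds the feature vectors; the class prior lives in the classifier.
# fit_prior=False asks MultinomialNB for a uniform prior over the classes.
model = make_pipeline(TfidfVectorizer(), MultinomialNB(fit_prior=False))
model.fit(docs, labels)


def candidate_labels(model, text, margin=0.02):
    """Return (label, probability) pairs for every class whose probability is
    within `margin` of the most probable class, most probable first.
    (Hypothetical helper, not part of scikit-learn.)"""
    probs = model.predict_proba([text])[0]
    best = probs.max()
    order = np.argsort(probs)[::-1]  # class indices, highest probability first
    return [
        (str(model.classes_[i]), float(probs[i]))
        for i in order
        if best - probs[i] <= margin
    ]


print(candidate_labels(model, "software review of the new budget laptop"))
```

The list of labels comes from the classifier's `predict_proba` output rather than from TF-IDF itself, so the same helper should work with any scikit-learn classifier that exposes `predict_proba`.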

