Research #121
TFKL
| Status: | Closed | Start date: | 2009-07-01 |
|---|---|---|---|
| Priority: | Normal | Due date: | 2009-12-18 |
| Assignee: | | % Done: | 100% |
| Category: | - | Spent time: | - |
| Target version: | - | Estimated time: | 50.00 hours |
Description
I am opening up this issue to track the progress of the TFKL paper.
As discussed, we need to
- redo the 2-class problems for TFRF, using 2 classifiers
- replot the curve for TF.KL using absolute value
- rewrite the intro, shifting the focus from sentiment analysis to supervised term weighting approaches, which is actually moving towards a feature-selection problem. You should also include some unsupervised term weighting approaches, such as those in "A Comparative Study on Feature Selection in Text Categorization" by Yiming Yang and Jan O. Pedersen.
- read and include some more related papers like:
  - "An Improved Term Weighting Scheme for Vector Space Model" by Yue-Heng Sun, Pi-Lian He, and Zhi-Gang Chen
  - search for "term absence" and text classification in Google Scholar
- emphasize the key advantages (selling points) of TF.KL:
  - it considers both term presence and term absence from the class distribution, and
  - in the case of 2-class problems, only 1 classifier needs to be trained, unlike tf.rf, which requires you to train C classifiers for C classes even when C=2.
- do more experiments on C>2
  - try out TFKL on the (C choose 2) summation formula you showed me for C>2
- in the literature, compare with other supervised term weighting approaches
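
For reference, here is a minimal sketch of how a KL-based term weight can use both term presence and term absence, and how it can be summed over the (C choose 2) class pairs for C>2. This issue does not spell out the TF.KL formula, so everything below is an assumption for illustration only: the symmetric Bernoulli KL divergence, the Laplace smoothing, and the function names `bernoulli_kl`, `presence_prob`, `kl_weight`, and `tf_kl` are hypothetical and may differ from what the paper actually uses.

```python
import math
from itertools import combinations


def bernoulli_kl(p, q):
    """KL(P || Q) in bits for Bernoulli distributions with success probs p and q.

    The first term covers term presence, the second covers term absence.
    Assumes 0 < q < 1 (guaranteed by the smoothing in presence_prob).
    """
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            total += a * math.log2(a / b)
    return total


def presence_prob(docs_with_term, docs_in_class, alpha=1.0):
    """Smoothed probability that a document of the class contains the term.

    Laplace smoothing (an assumption) keeps the result strictly inside (0, 1).
    """
    return (docs_with_term + alpha) / (docs_in_class + 2.0 * alpha)


def kl_weight(df_by_class, n_by_class):
    """Hypothetical class-level weight: symmetric KL summed over all class pairs.

    df_by_class[i] = number of documents in class i that contain the term
    n_by_class[i]  = number of documents in class i
    For C classes the sum runs over (C choose 2) pairs.
    """
    probs = [presence_prob(df, n) for df, n in zip(df_by_class, n_by_class)]
    return sum(
        bernoulli_kl(p, q) + bernoulli_kl(q, p)
        for p, q in combinations(probs, 2)
    )


def tf_kl(tf, df_by_class, n_by_class):
    """Term frequency scaled by the hypothetical KL-based weight."""
    return tf * kl_weight(df_by_class, n_by_class)


if __name__ == "__main__":
    # 2-class case: one pair, so one weight per term.
    print(tf_kl(3, [90, 10], [100, 100]))
    # 3-class case: summed over 3 pairs.
    print(tf_kl(3, [90, 10, 50], [100, 100, 100]))
```

Under these assumptions, C=2 gives a single class pair, so each term gets one weight that does not depend on which class is treated as positive. That is consistent with the point above that only 1 classifier needs to be trained.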
History
Updated by Thanh Tam Nguyen over 2 years ago
- % Done changed from 0 to 20
Updated by Kuiyu Chang about 2 years ago
any updates?
Updated by Kuiyu Chang about 2 years ago
- Due date changed from 2009-07-15 to 2009-12-18
Updated by Kuiyu Chang over 1 year ago
- Status changed from New to Closed
- % Done changed from 20 to 100