nlp:document-clustering [2015/04/23 21:45] (current)
ryancha created
 +<pre>
 +Warning: Do EM found without Unlabel Portion!
 +Warning: You should specify a file parser!
  
 +File List Root Directory: z:\Reuters\data.new\reduced_set
 +Data Root Directory: z:\Reuters
 +Classifier Type: NB_MN
 +EM Enabled
 +Number of Clusters: 10
 +Distribution Initializer Type: RANDOM_HARD
 +
 +Loading Training Data...
 +  Loading file list z:\Reuters\data.new\reduced_set\training\GHEA.txt...589 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\training\GVOTE.txt...1098 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\training\GDEF.txt...837 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\training\GREL.txt...280 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\training\GENT.txt...391 files.
 +Loading class labels...
 +[GVOTE, GREL, GHEA, GDEF, GENT]
 +Loading Test Data...
 +  Loading file list z:\Reuters\data.new\reduced_set\dev\GHEA.txt...73 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\dev\GVOTE.txt...137 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\dev\GDEF.txt...104 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\dev\GREL.txt...35 files.
 +  Loading file list z:\Reuters\data.new\reduced_set\dev\GENT.txt...48 files.
 +Found command line argument k
 +Unlabeled: 3544/3544 documents (target: 1.0).
 +Just created instance of Random Hard Initializer
 +No labeled documents detected, commence clustering.
 +Likelihood for round 0 = -4108512.9494642015
 +Likelihood for round 1 = -4108485.8545367457
 +Likelihood for round 2 = -4035512.0653823065
 +Likelihood for round 3 = -3966089.4853993617
 +Likelihood for round 4 = -3921910.82400111
 +Likelihood for round 5 = -3895045.5002938258
 +Likelihood for round 6 = -3881272.5391428014
 +Likelihood for round 7 = -3870279.333689569
 +Likelihood for round 8 = -3856354.415602896
 +Likelihood for round 9 = -3844625.574732625
 +Likelihood for round 10 = -3834605.0231753476
 +Likelihood for round 11 = -3824872.0415230813
 +Likelihood for round 12 = -3818767.132530139
 +Likelihood for round 13 = -3815710.052677934
 +Likelihood for round 14 = -3814792.6428085743
 +Likelihood for round 15 = -3814240.611344331
 +Likelihood for round 16 = -3813557.825200702
 +Likelihood for round 17 = -3813324.683373153
 +Likelihood for round 18 = -3813090.477829844
 +Likelihood for round 19 = -3812948.9962592535
 +Likelihood for round 20 = -3812944.094247621
 +Likelihood for round 21 = -3812945.338805639
 +Likelihood for round 22 = -3812936.51004531
 +Likelihood for round 23 = -3812772.609597657
 +Likelihood for round 24 = -3812676.4874882638
 +Likelihood for round 25 = -3812669.2891049692
 +Likelihood for round 26 = -3812665.0176783875
 +Likelihood for round 27 = -3812635.0462839566
 +Likelihood for round 28 = -3812624.948677367
 +Likelihood for round 29 = -3812568.4781594537
 +Likelihood for round 30 = -3812459.071967899
 +Likelihood for round 31 = -3812068.7494916776
 +Likelihood for round 32 = -3811346.2435353254
 +Likelihood for round 33 = -3811171.591922586
 +Likelihood for round 34 = -3811137.241213156
 +Likelihood for round 35 = -3811127.018922891
 +Likelihood for round 36 = -3811127.0375125445
 +Likelihood for round 37 = -3811127.0560666067
 +Likelihood for round 38 = -3811127.0785230165
 +Likelihood for round 39 = -3811127.1094681732
 +Likelihood for round 40 = -3811127.159344023
 +Likelihood for round 41 = -3811127.2549962364
 +Likelihood for round 42 = -3811127.0055806288
 +Likelihood for round 43 = -3811104.876000952
 +Likelihood for round 44 = -3811083.1818371853
 +Likelihood for round 45 = -3811054.1568926456
 +Likelihood for round 46 = -3811054.160615779
 +Likelihood for round 47 = -3811054.161220626
 +Likelihood for round 48 = -3811054.1613139277
 +Likelihood for round 49 = -3811054.1613279106
 +Likelihood for round 50 = -3811054.1613300266
 +Likelihood for round 51 = -3811054.1613303465
 +Likelihood for round 52 = -3811054.1613304005
 +Likelihood for round 53 = -3811054.1613304005
 +Likelihood for round 54 = -3811054.1613304056
 +Likelihood for round 55 = -3811054.1613304056
 +Likelihood for round 56 = -3811054.1613304047
 +Now testing classifier: Naive Bayes Multinomial, EM: true  Clustering: true
 +      "Cluster00"   "Cluster01"   "Cluster02"   "Cluster03"   "Cluster04"   "Cluster05"   "Cluster06"   "Cluster07"   "Cluster08"   "Cluster09"
 +GVOTE 546(0.44426)  260(0.21155)  6(0.00488)    23(0.01871)   2(0.00163)    0(0.00000)    392(0.31896)  0(0.00000)    0(0.00000)    0(0.00000)
 +GREL  19(0.06230)   73(0.23934)   21(0.06885)   164(0.53770)  0(0.00000)    0(0.00000)    28(0.09180)   0(0.00000)    0(0.00000)    0(0.00000)
 +GHEA  86(0.12991)   9(0.01360)    113(0.17069)  53(0.08006)   275(0.41541)  0(0.00000)    126(0.19033)  0(0.00000)    0(0.00000)    0(0.00000)
 +GDEF  11(0.01198)   13(0.01416)   150(0.16340)  504(0.54902)  3(0.00327)    0(0.00000)    237(0.25817)  0(0.00000)    0(0.00000)    0(0.00000)
 +GENT  86(0.20000)   18(0.04186)   189(0.43953)  49(0.11395)   72(0.16744)   0(0.00000)    15(0.03488)   0(0.00000)    1(0.00233)    0(0.00000)
 +Adjusted Rand Index: 0E+1
 +Accuracy: 0.0
 +</pre>
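The run above reports an Adjusted Rand Index of 0E+1 alongside a class-by-cluster count table. As a cross-check, the index can be recomputed directly from such a table. The following is a minimal sketch, not the code that produced the log: the function name `adjusted_rand_index` and the variable `LOG_TABLE` are hypothetical, and the counts are transcribed from the table above (rows GVOTE, GREL, GHEA, GDEF, GENT; columns Cluster00 to Cluster09). It uses the standard pair-counting definition of the index.

```python
from math import comb


def adjusted_rand_index(table):
    """Adjusted Rand Index from a contingency table.

    table[i][j] is the number of documents of true class i assigned to cluster j.
    """
    n = sum(sum(row) for row in table)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two documents: treat as trivially identical

    # Pairs of documents that share both a class and a cluster.
    same_both = sum(comb(c, 2) for row in table for c in row)
    # Pairs that share a class (row sums) or share a cluster (column sums).
    same_class = sum(comb(sum(row), 2) for row in table)
    same_cluster = sum(comb(sum(col), 2) for col in zip(*table))

    expected = same_class * same_cluster / total_pairs
    max_index = (same_class + same_cluster) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single class and a single cluster
    return (same_both - expected) / (max_index - expected)


# Counts transcribed from the log table above (hypothetical variable name).
LOG_TABLE = [
    [546, 260, 6, 23, 2, 0, 392, 0, 0, 0],    # GVOTE
    [19, 73, 21, 164, 0, 0, 28, 0, 0, 0],     # GREL
    [86, 9, 113, 53, 275, 0, 126, 0, 0, 0],   # GHEA
    [11, 13, 150, 504, 3, 0, 237, 0, 0, 0],   # GDEF
    [86, 18, 189, 49, 72, 0, 15, 0, 1, 0],    # GENT
]

if __name__ == "__main__":
    print("ARI recomputed from the table:", adjusted_rand_index(LOG_TABLE))
```

Because the index depends only on which documents are grouped together, not on cluster names, it is unaffected by the fact that labels such as Cluster00 never match class names such as GVOTE. A plain label-matching accuracy, by contrast, would be zero in that situation.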
nlp/document-clustering.txt · Last modified: 2015/04/23 21:45 by ryancha