sql - How to calculate TF-IDF in OracleSQL?


This is a text mining project. The purpose of the project is to see how every word weighs differently in different documents.

I now have 2 tables: one table with the TF information (word | wordfrequency_in_eachfile) and one table with the IDF information (word | howmanyfile_have_eachword). I am not sure which query to use for the calculation.

The math I am trying to do here is: wordfrequency_in_eachfile * (log(n / howmanyfile_have_eachword) + 1), where n is the total number of documents. Below is my code:

create table tf_idf (word, tf*idf) as select a.frequency*((log(10,132366/b.totalcount)+1)) from term_frequency a, document_frequency b where a.word=b.word;

Here 1323266 is the total number of documents, and totalcount is how many documents the word shows up in.
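Written out as a formula, the weighting described above looks like the following. The symbols tf, df and N are shorthand introduced here for wordfrequency_in_eachfile, howmanyfile_have_eachword and the total number of documents. The base-10 logarithm is taken from the log(10, ...) call in the query, and the numbers in the worked line are made up purely for illustration.

```latex
\[
\mathrm{tfidf}(w, d) = \mathrm{tf}(w, d)\,\Bigl(\log_{10}\frac{N}{\mathrm{df}(w)} + 1\Bigr)
\]
% Illustrative values only: tf = 5, N = 1000, df = 10
\[
5\,\Bigl(\log_{10}\frac{1000}{10} + 1\Bigr) = 5\,(2 + 1) = 15
\]
```

A word that appears in every document has N / df = 1, so the logarithm is 0 and the weight reduces to the plain term frequency.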

Since I am new to SQL, I would appreciate a little explanation of the code. Thanks a lot!

The calculation looks good, but there is invalid syntax.

The right variant may be as below:

create table tf_idf as
select
  a.word                                           word,
  a.frequency*( log(10, 132366/b.totalcount) + 1)  tfidf
from
  term_frequency     a,
  document_frequency b
where
  a.word=b.word
;

In a create ... as select ... statement you don't need column specifications. Column names and types are derived from the field aliases. Also, you must provide values for the word column in the new table. And one more point: there is one excess pair of brackets in the expression.
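To try the points above end to end, here is a minimal sketch for a scratch Oracle schema. The column types and the sample rows are assumptions made for illustration and are not taken from the question. The explicit JOIN ... ON form is an equivalent way of writing the comma-plus-where join.

```sql
-- Hypothetical sample tables; the column types are assumptions.
CREATE TABLE term_frequency (word VARCHAR2(100), frequency NUMBER);
CREATE TABLE document_frequency (word VARCHAR2(100), totalcount NUMBER);

-- Made-up rows, one per word, just to exercise the query.
INSERT INTO term_frequency VALUES ('oracle', 3);
INSERT INTO term_frequency VALUES ('the', 40);
INSERT INTO document_frequency VALUES ('oracle', 120);
INSERT INTO document_frequency VALUES ('the', 132366);

-- Column names and types of tf_idf come from the aliases in the SELECT list.
CREATE TABLE tf_idf AS
SELECT a.word AS word,
       a.frequency * (LOG(10, 132366 / b.totalcount) + 1) AS tfidf
FROM   term_frequency a
       JOIN document_frequency b ON a.word = b.word;

-- 'the' appears in all 132366 documents, so LOG(10, 1) = 0 and its tfidf is 40.
SELECT word, ROUND(tfidf, 3) AS tfidf
FROM   tf_idf
ORDER  BY tfidf DESC;
```

In Oracle, LOG takes the base as its first argument, so LOG(10, x) is the base-10 logarithm of x. If the document count changes, the hard-coded 132366 has to be updated to match.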

