Sunday, April 11, 2010

salton1975vector A vector space model for automatic indexing

Salton, G., Wong, A., and Yang, C. S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18(11):613-620.

= = = = = = = = = =

[das2007survey]
"In the bag-of-words representation (Salton et al., 1975) each document is represented as a sparse vector in a very large Euclidean space, indexed by words in the vocabulary V . A well-known technique in information retrieval to capture word correlation is latent semantic indexing (LSI), that aims to nd a linear subspace of dimension k jV j where documents may be approximately represented by their projections. ........ dst"
