### KWIC Concordances, Word Senses and Geometric Sums of Stationary Random Variables

Dr. Jason Stover, Georgia College Mathematics Department

#### Abstract

I will present a measure of "distance" between phrases that share the same central word and examine the probabilistic behavior of this distance. The phrases are formed by KWIC (keyword-in-context) concordancing, a method from linguistics. The distance measure is built in three steps: rare words are first replaced with a single, common artificial word; words are then matched between the two phrases with a moving window; finally, the fractions of non-matched words are used as terms in a geometric series in powers of some $\theta \in (0,1)$. The data suggest that the limiting distribution of the series is continuous for some values of $\theta$, window lengths and replacement rates. A generalized version of a theorem by Garsia \cite{garsia63} states conditions under which this distribution is singular. For the same window length and $\theta$, two particular words with different meanings are shown to have different distributions of their respective distance measures. A difference in distribution functions may therefore indicate a difference in meaning between two words.
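The three steps described above can be sketched in Python. This is only one plausible reading of the abstract, not the author's actual definition: the placeholder token `RARE`, the frequency threshold `min_count`, the position-by-position alignment of the two phrases, and the choice to weight the $k$-th window by $\theta^k$ starting at $k = 1$ are all assumptions made for illustration.

```python
from collections import Counter

RARE = "<RARE>"  # hypothetical artificial word standing in for every rare word


def replace_rare(phrases, min_count=2):
    """Replace words seen fewer than min_count times (an assumed rule) with RARE."""
    counts = Counter(word for phrase in phrases for word in phrase)
    return [[w if counts[w] >= min_count else RARE for w in phrase]
            for phrase in phrases]


def window_mismatch_fractions(a, b, window):
    """Slide a window over two aligned phrases; return the mismatch fraction per window."""
    if len(a) != len(b):
        raise ValueError("phrases must have the same length")
    if not 1 <= window <= len(a):
        raise ValueError("window must be between 1 and the phrase length")
    fractions = []
    for start in range(len(a) - window + 1):
        pairs = zip(a[start:start + window], b[start:start + window])
        mismatches = sum(x != y for x, y in pairs)
        fractions.append(mismatches / window)
    return fractions


def geometric_distance(a, b, theta, window):
    """Sum theta**k times the k-th mismatch fraction, for k = 1, 2, ... (assumed form)."""
    if not 0 < theta < 1:
        raise ValueError("theta must lie in (0, 1)")
    fractions = window_mismatch_fractions(a, b, window)
    return sum(theta ** k * f for k, f in enumerate(fractions, start=1))


# Two made-up KWIC phrases centred on the word "bank".
a = ["the", "old", "bank", "was", "closed"]
b = ["the", "river", "bank", "was", "muddy"]

raw = geometric_distance(a, b, theta=0.5, window=2)
a_r, b_r = replace_rare([a, b], min_count=2)
replaced = geometric_distance(a_r, b_r, theta=0.5, window=2)
```

In this toy example every word that differs between the two phrases is rare, so after replacement the phrases coincide and the distance drops to zero. This illustrates why the replacement rate, along with $\theta$ and the window length, would affect the distribution of the distance.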
