Abstract
This dissertation examines the importance of extralinguistic context for domain-specific language. Using publicly available documents from the regulated nuclear power industry, a representative corpus of over 9 million words was created following the methodology developed at the University of Georgia for constructing the Tobacco Documents Corpus from tobacco industry documents. The reproducibility of this methodology was confirmed by applying the two-proportion z-test to the rejection ratios across four sampling iterations. Lexical profiles describing this variety of domain-specific language were generated by analyzing 20 key terms from the industry together with the words that co-occur with them within a span of four words to the left and right of each term. The validity of these collocations was measured using their Mutual Information scores. The results indicate that, while key terms in this variety of domain-specific language carry some shared meaning, documents authored by different industry groups, as well as by individuals located in different NRC regions, differ in their use and connotation of these same key terms. This observation demonstrates that context matters for domain-specific language in much the same way as has been observed in other language varieties (e.g., speech).
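For readers unfamiliar with the two statistics named above, the following is a minimal illustrative sketch, not the dissertation's own implementation. It assumes the common pointwise form of the Mutual Information score, log2(f(node, collocate) * N / (f(node) * f(collocate))), with no span-size adjustment, and a pooled two-proportion z-test with a two-sided p-value; the abstract does not state which variants were actually used. The function names and the toy sentence are hypothetical.

```python
import math
from collections import Counter


def collocate_counts(tokens, node, span=4):
    """Count words found within `span` tokens left or right of each `node` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)
        hi = min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def mutual_information(tokens, node, collocate, span=4):
    """Pointwise MI score (assumed variant: no span-size adjustment)."""
    freq = Counter(tokens)
    joint = collocate_counts(tokens, node, span)[collocate]
    if joint == 0:
        return float("-inf")  # the pair never co-occurs within the span
    return math.log2(joint * len(tokens) / (freq[node] * freq[collocate]))


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0  # both samples are all-reject or all-accept
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Toy data (hypothetical, not drawn from the corpus).
tokens = "the reactor coolant system and the reactor vessel".split()
mi = mutual_information(tokens, "reactor", "vessel")

# Hypothetical rejection counts for two sampling iterations.
z, p = two_proportion_z(60, 100, 40, 100)
```

In this reading, a high MI score marks a collocate that appears near the key term more often than its overall frequency would predict, and a large p-value when comparing rejection ratios between two sampling iterations is consistent with the sampling procedure being reproducible.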