Using Technology to Improve Dictionaries

Richard Trench’s speech On Some Deficiencies in our English Dictionaries and the OED Proposal for the Publication of a New Dictionary demonstrate how lexicographers have kept certain principles while trying to improve dictionaries. They also exemplify how methodologies can be improved with new technology.

These articles show the continuing debate over what words (if any) should be excluded from dictionaries. Trench comments, “A Dictionary is a historical monument, the history of a nation contemplated from one point of view, and the wrong ways into which a language has wandered, or attempted to wander, may be nearly as instructive as the right ones in which it has travelled: as much as may be learned, or nearly as much, from its follies as from its wisdom” (Trench 6). The Philological Society concurs, stating “We entirely repudiate the theory, which converts the lexicographer into an arbiter of style, and leaves it to his discretion to accept or reject words according to his private notions of their comparative elegance or inelegance” (180). As discussed in class and in our text, this remains an issue, with people still debating whether certain kinds of words, such as profanities, regionalisms, or slang, belong in a dictionary.

As Landau discusses in the course text, there is no known limit to the number of words in the English language. I believe our expanding population and increased diversity have added to the number of words in our language. Society has been exposed to words from minority cultures and subcultures thanks to the expansion of media and more interaction with people from different backgrounds, be they ethnic, religious, or otherwise. For example, the plethora of television networks has led to a number of shows[1] targeted to different demographics, defined by ethnicity, religion, and age. These shows are not always watched only by their target demographics, which exposes other viewers to new cultures and subcultures, including their vocabularies (how representative of those cultures these shows are is open to debate). While society still has a long way to go in terms of acceptance of “the other,” people are less segregated than they were in the past. This has led to more exposure to words outside the mainstream, which means there are many new words to be processed and assessed.

I believe technology is expanding the way dictionaries are compiled and how much information they contain. Obviously, the fixed space of print dictionaries is still an issue, but online dictionaries offer theoretically unlimited space for words. Online dictionaries can also be tailored to users much more easily than print dictionaries can, whether they are technical, etymological, or general dictionaries. The main restriction on online dictionaries is the cost of compiling them and of determining what goes in them and what doesn’t.

Just as the Philological Society asked for help with providing word usages, I believe the publishers of online dictionaries can rely on volunteers. While crowdsourced dictionaries (dictionaries whose information is compiled by a group of people) present challenges such as accuracy and reliability, they offer a starting point for lexicographers and trained volunteers to mine data using corpus linguistics. With optical recognition technology, voice recognition technology, and improved computer processing power, lexicographers can mine media such as print, music, television, and film for new words. For example, books, newspapers, and magazines can be scanned and examined for new words or new meanings of existing words. Voice recognition technology can be used to transcribe music, television, and film so that they can be searched for new words as well.[2] As I’ve mentioned in previous blogs, corpus linguistics can be aided by improvements in optical and voice recognition technology.
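
To make the scanning idea a little more concrete, here is a minimal sketch in Python of how text might be checked for candidate new words. It assumes the text has already been turned into plain text (by optical or voice recognition) and that a list of words already in the dictionary is available. The function name, the sample sentence, and the tiny word list are all hypothetical and only for illustration; a real project would need a full word list and human review of every candidate.

```python
import re
from collections import Counter


def candidate_new_words(text, known_words, min_count=2):
    """Count words in `text` that are missing from `known_words`.

    Only words seen at least `min_count` times are returned, which
    filters out some one-off typos and recognition errors.
    """
    known = {word.lower() for word in known_words}
    # Split the lowercased text into simple alphabetic tokens,
    # allowing an internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(token for token in tokens if token not in known)
    return {word: n for word, n in counts.items() if n >= min_count}


# Hypothetical sample: a short "scanned" passage and a tiny word list.
sample_text = (
    "The crowd loved the selfie. "
    "Another selfie followed, and the crowd cheered."
)
known = {"the", "crowd", "loved", "another", "followed", "and", "cheered"}

print(candidate_new_words(sample_text, known))
```

The output is only a list of candidates. Deciding whether a candidate is truly a new word, or a new meaning of an existing word, would still fall to lexicographers and trained volunteers.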

The articles in this week’s class show that dictionaries can rely on established principles for the information they contain and use new and/or improved technology to expand their content.

[1] Exposure is not limited to television; media such as rap music have also provided exposure, with rapper Chuck D. (and others) describing rap as its own news outlet.

[2] This is conjecture on my part, as I do not know whether voice recognition technology has improved to the point where this can be accomplished in a cost-effective manner.

