A word is a unit of spoken or written language. Words can be combined to create phrases, clauses and sentences. The Portuguese word palavra itself derives originally from the Greek parabolé, borrowed into Latin, which also gave rise to "parabola".

Latin written without any separation between words in the Codex Claromontanus

However, a word is not an easy concept to define. In written language, words are easier to identify: in most writing systems, a word is marked out in the text by spaces on either side. Some languages (for example, Amharic) use word dividers, and others (for example, Sanskrit) do not separate words in writing. In ancient Latin manuscripts and in Japanese, word boundaries are optional.
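As a rough illustration of how written word boundaries depend on the writing system, the following Python sketch splits text on whitespace and on the Ethiopic wordspace character (U+1361), the traditional word divider of the script used for Amharic. The function name split_on_dividers and the sample strings are assumptions made for this example only; text written without any divider simply comes back as one unsegmented token.

```python
import re

# Ethiopic wordspace (U+1361), the traditional word divider in the
# script used for Amharic.
ETHIOPIC_WORDSPACE = "\u1361"

def split_on_dividers(text: str) -> list[str]:
    """Split text on runs of whitespace or Ethiopic wordspace characters.

    Hypothetical helper for illustration; it knows nothing about
    languages whose writing does not mark word boundaries.
    """
    pattern = "[\\s" + ETHIOPIC_WORDSPACE + "]+"
    return [token for token in re.split(pattern, text) if token]

# Space-delimited text: boundaries are explicit.
spaced = split_on_dividers("words can be combined")

# An illustrative two-word string separated by the Ethiopic wordspace.
divided = split_on_dividers("ሰላም\u1361ዓለም")

# Text written without dividers yields a single token, so a different
# method is needed to find the words inside it.
unspaced = split_on_dividers("wordscanbecombined")

print(spaced)
print(divided)
print(unspaced)
```

The last case shows why a purely orthographic definition of the word breaks down for unseparated writing.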

Even in writing systems that plainly mark word boundaries, there remains some ambiguity: for example, is breadknife one word, two (bread knife) or something in between (bread-knife)? In English, many common phrases have historically progressed from being written as two separate words (e.g. to day) to hyphenated (to-day) to a single word (today). This process is still ongoing, as in the now common 'misspelling' of at all as atall.

In synthetic languages, a single word stem (for example, love) may have a number of different forms (for example, loves, loving and loved). However, these are not usually considered to be different words, but different forms of the same word. In these languages, words are considered to be constructed from a number of morphemes (for example, love+s).
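The idea that loves, loving and loved are forms of one word rather than separate words can be sketched as a small stem-plus-suffix analysis. This is a toy example under strong assumptions: the suffix list, the stem set and the names split_form and group_forms are all hypothetical, and real morphological analysis is far more involved.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny suffix inventory for the example.
SUFFIXES = ("ing", "ed", "s")

def split_form(form: str, stems: set[str]) -> tuple[str, str]:
    """Return (stem, suffix) for a word form, or (form, "") if no
    known stem plus suffix matches."""
    if form in stems:
        return form, ""
    for suffix in SUFFIXES:
        if form.endswith(suffix):
            base = form[: -len(suffix)]
            # Also try restoring a stem-final "e" that English
            # spelling drops before some suffixes (lov + ing -> love).
            for candidate in (base, base + "e"):
                if candidate in stems:
                    return candidate, suffix
    return form, ""

def group_forms(forms: list[str], stems: set[str]) -> dict[str, list[str]]:
    """Group word forms under the stem they are analysed as."""
    groups: dict[str, list[str]] = defaultdict(list)
    for form in forms:
        stem, _ = split_form(form, stems)
        groups[stem].append(form)
    return dict(groups)

stems = {"love"}
forms = ["love", "loves", "loving", "loved"]
print([split_form(f, stems) for f in forms])
print(group_forms(forms, stems))
```

All four forms end up under the single stem love, which mirrors the view that they are one word in several forms.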

In polysynthetic languages, the number of morphemes per word can become so large that the word performs the same grammatical role as a phrase or clause in less synthetic languages (for example, in Yupik, angyaghllangyugtuq means 'he wants to acquire a big boat'). These large-construction words are still single words, because they contain only one lexeme, 'lexical morpheme', or 'dictionary-definition word': the other morphemes are grammatical particles that cannot stand alone.

Matters seem easier for analytic languages. For these languages, a word almost always consists of a single morpheme. However, even then, some concepts are expressed by combining syllables or graphemes into a single compound.

In spoken language, the distinction of individual words is even more complex: short words are often run together, and long words are often broken up. Spoken French has some of the features of a polysynthetic language: je ne le sais pas ('I do not know it') tends towards /ʒənələsepa/. As the majority of the world's languages are not written, the scientific determination of word boundaries becomes important.

There are five ways to determine where the word boundaries of spoken language should be placed:

Potential pause
A speaker is told to repeat a given sentence slowly, allowing for pauses. The speaker will tend to insert pauses at the word boundaries. However, this method is not foolproof: the speaker could easily break up polysyllabic words.
Indivisibility
A speaker is told to say a sentence out loud, and then is told to say the sentence again with 'extra words' added to it. Thus, I have lived in this village for ten years might become I and my family have lived in this little village for about ten or so years. These 'extra words' will tend to be added at the word boundaries of the original sentence. However, some languages allow for infixes: additional information added inside the 'word'.
Minimal free forms
This concept was proposed by Leonard Bloomfield. A word is thought of as the smallest meaningful unit of speech that can stand by itself. This correlates phonemes (units of sound) to lexemes (units of meaning). However, some written words are not minimal free forms, as they make no sense by themselves (for example, the and of).
Phonetic boundaries
Some languages have particular rules of pronunciation that make it easy to spot where a word boundary should be. For example, in a language that regularly stresses the last syllable of a word (like Hebrew), a word boundary is likely to fall after each stressed syllable. Another example can be seen in a language that has vowel harmony (like Turkish): the vowels within a given word share the same 'quality', so a word boundary is likely to occur whenever the vowel quality changes. However, not all languages have such convenient phonetic rules, and even those that do present the occasional exception.
Semantic units
Much like 'minimal free forms', this method breaks down a sentence into its smallest semantic units. However, language often contains words that have little semantic value (and often play a more grammatical role), or semantic units that are compound words.

In practice, linguists apply a mixture of all these methods to determine the word boundaries of any given sentence. Even with the careful application of these methods, the exact definition of a word is often still elusive.

