English vowels are subject to pre-fortis clipping, then, when they are followed by a fortis consonant within the same syllable. The /f/’s in self, selfish /ˈself.ɪʃ/, and dolphin /ˈdɒlf.ɪn/ trigger clipping, but not those in shellfish /ˈʃel.fɪʃ/ or funfair /ˈfʌn.feə/. So do the /t/ in feet and the /ʧ/ in feature, but not the /p/ in fee-paying or the /k/ in tea-kettle. The vowel /æ/ undergoes pre-fortis clipping in lap, lamp, happy /ˈhæp.ɪ/, and hamper /ˈhæmp.ə/, but not in slab or clamber.
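The clipping condition can be checked mechanically once syllable boundaries are explicit. The following Python fragment is a minimal sketch, not part of the original analysis. It assumes transcriptions that mark syllable boundaries with a full stop, as above, and it uses two illustrative and incomplete symbol sets, `VOWELS` and `FORTIS`; the function name `clipped_syllables` is likewise hypothetical. As a simplification, it treats any fortis consonant in a syllable's coda as a trigger.

```python
# Illustrative, incomplete symbol inventories (assumptions, not from the source).
VOWELS = set("iɪeæɑɒɔʊuʌɜəa")
FORTIS = set("ptkfθsʃʧ")


def clipped_syllables(transcription):
    """For each syllable, report whether its vowel is followed by a
    fortis consonant within the same syllable (pre-fortis clipping)."""
    result = []
    for syl in transcription.strip("/").split("."):
        # Position of the last vowel symbol; None for a vowelless (syllabic) syllable.
        last_vowel = max(
            (i for i, ch in enumerate(syl) if ch in VOWELS), default=None
        )
        coda = "" if last_vowel is None else syl[last_vowel + 1:]
        result.append(any(ch in FORTIS for ch in coda))
    return result
```

On this sketch the first syllable of `"ˈself.ɪʃ"` counts as clipped, while the first syllable of `"ˈʃel.fɪʃ"` does not, matching the contrast between selfish and shellfish.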
/t/ tapping
‘For some speakers [of RP], and generally in American English, /t/ is realized in weakly accented intervocalic positions as a lenis, rapid tap resembling a /d/ or one tap [ɾ], e.g. butter, latter, put it over there’ (Gimson 1980: 164-5). More generally expressed, those who tap /t/ do so when it is syllable-final. (Other constraints are that the /t/ is not preceded by an obstruent, and is immediately followed by a vowel in the next syllable.) Candidates for tapping include the /t/ of might I, but not that of my tie.
/t/ glottalling
‘Increasingly, /t/ in syllable final positions is reinforced or replaced by a glottal closure unless a vowel or syllabic /n/ or /l/ follows’ (Gimson 1980: 165). Rightly said: though nowadays a following syllabic /n/ exhibits a distinctly waning influence in this regard, and the other constraints are weakening, too. (But a preceding obstruent, as in beastly, continues to block glottalling.) A glottal stop is a distinct possibility for the syllable-final /t/’s of Atkins, Gatwick, and jointly, but not for the syllable-initial /t/’s of twice or atomic.
/r/ allophony
‘Within RP, the frictionless continuant variety [ɹ] is frequently replaced by an alveolar tap [ɾ] in intervocalic positions, e.g. very, sorry, ... for ever ...’ (Gimson 1980: 207). Those whose speech follows this pattern may also have [ɾ] in your ice, but not in your rice. That is to say, [ɾ] is a realisation of their /r/ in syllable-final position but not in syllable-initial. Others (I am surely not alone in this) have markedly greater lip action (protrusion, rounding) in syllable-initial /r/ than in syllable-final: greater labialisation in red /red/, key-ring /ˈkiː.rɪŋ/, her rice /hɜː ˈraɪs/ than in berry /ˈber.ɪ/, fearing /ˈfɪər.ɪŋ/, her ice /hɜːr ˈaɪs/. (There are also speakers who tap and/or labialise independently of syllable position.)
Plosive epenthesis
‘Few RP speakers regularly maintain the distinction between /ns/ and /nts/ ..., /nts/ tending to be used in all cases’ (Gimson 1980: 187). Yet although many people pronounce [t] in fence and dance, no-one does in inside or rain-soaked. Epenthesis happens only within a syllable, not across a syllable boundary.
Elision of /t/ and /d/
‘In PresE simplification of clusters continues to take place, especially involving the loss of the alveolars /t, d/ when medial in a cluster of three consonants’ (Gimson 1980: 236). The /t/’s in strong and mistrial prima facie conform to Gimson’s conditions for elision. Yet no-one elides them. Elision of English /t/ and /d/ is a possibility only when /t/ or /d/ is part of a syllable-final cluster. Thus in first-rate /ˌfɜːst.ˈreɪt/ the word-medial /t/ is elidable; in mistrial /ˌmɪs.ˈtraɪəl/ it is not.
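The elision condition can be stated as a test on a syllabified transcription. The Python fragment below is a rough sketch under stated assumptions: it combines Gimson's three-consonant condition with the syllable-final requirement proposed here, it uses an illustrative `VOWELS` set (with the length mark counted as part of the vowel), and the function name `elidable_positions` is hypothetical.

```python
# Illustrative vowel symbols; the length mark is treated as part of the vowel.
VOWELS = set("iɪeæɑɒɔʊuʌɜəaː")
STRESS = "ˈˌ"


def elidable_positions(transcription):
    """Return indices of syllables whose final /t/ or /d/ closes a
    syllable-final cluster and is followed by a consonant-initial syllable."""
    syls = [s.lstrip(STRESS) for s in transcription.strip("/").split(".")]
    hits = []
    for k in range(len(syls) - 1):
        cur, nxt = syls[k], syls[k + 1]
        # /t/ or /d/ must end the syllable and be preceded by a consonant.
        in_final_cluster = (
            len(cur) >= 2 and cur[-1] in "td" and cur[-2] not in VOWELS
        )
        if in_final_cluster and nxt[0] not in VOWELS:
            hits.append(k)
    return hits
```

Applied to `"ˌfɜːst.ˈreɪt"` the sketch flags the first syllable; applied to `"ˌmɪs.ˈtraɪəl"` it flags nothing, since the /t/ there is syllable-initial.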
Other duration rules
The difference between a name and an aim is well known (Gimson 1980: 295). It is easily accounted for by distinguishing between the longer, stronger allophone in syllable-initial position (annoy /ə.ˈnɔɪ/, a name) and the shorter, weaker, perhaps tapped allophone in syllable-final pre-vocalic position (penny, an aim). More subtly, /n/ is shorter in syllable-final clusters (standing /ˈstænd.ɪŋ/, brandish ‘wield, wave’ /ˈbrænd.ɪʃ/) than when a following consonant is in a separate syllable (bran-dish ‘dish for bran’ /ˈbræn.dɪʃ/).
The main syllabification principle
If allophonic rules are to be allowed to refer to syllable boundaries as part of their conditioning environments, we need a principled way of specifying the location of such boundaries. I propose that English syllabification is governed by a straightforward principle:
(1) Subject to certain conditions (discussed below), consonants are syllabified with the more strongly stressed of two flanking syllables.
Thus the /k/ in packet belongs to the first, stressed, syllable. This analysis is supported by its homophony with pack it: both are /ˈpæk.ɪt/. The /f/ of dolphin belongs in the first syllable: /ˈdɒlf.ɪn/ has the same rhythm as selfish /ˈself.ɪʃ/, where this division is supported by the morphology. The /p/ in happy belongs in the first syllable, as evidenced by its relative lack of aspiration and by the pre-fortis clipping of the /æ/: /ˈhæp.ɪ/. Both the /n/ and the /t/ of enter belong in the first syllable, since the /t/ triggers clipping of both the /e/ and the /n/. The /p/ of typing /ˈtaɪp.ɪŋ/ conditions clipping of its syllable-mate /aɪ/: compare tiepin, where the /p/ exerts no such influence. (Such clipping of the /aɪ/ as there is in this latter word falls under the different heading of ‘rhythmic clipping’, the isochronising effect of unstressed syllables on a preceding stressed syllable.)
Similarly, crisis is /ˈkraɪs.ɪs/: compare rising /ˈraɪz.ɪŋ/, with a lenis syllable-final consonant, hence less clipping. The rhythmic difference between hearty /ˈhɑːt.ɪ/ and hardy /ˈhɑːd.ɪ/ has the same explanation, and is to be referred to the durational difference between heart and hard. In driver /ˈdraɪv.ə/, as in thousands of other words, the phonology parallels the morphology (pace Fudge 1969: 20). In banker we see this even more clearly (pre-fortis clipping, /ˈbæŋk.ə/); anchor rhymes with it perfectly, but fan club has a different rhythm.
As the influence exerted by suffixes causes the stress to shift, so the syllabic affiliations of consonants change. In note and noting /ˈnəʊt.ɪŋ/ the /t/ of not(e) is syllable-final, but in notation /nəʊ.ˈteɪʃ.n/ and annotate /ˈæn.ə.teɪt/ it is syllable-initial and aspirated. In attest /ə.ˈtest/ the first /t/ is strongly aspirated, attracted into the second syllable by the stress; in attestation /ˌæt.e.ˈsteɪʃ.n/ it has less aspiration or none, since the second syllable is now unstressed while the first has secondary pre-tonic stress, which makes it capture the /t/ back. In apply /ə.ˈplaɪ/ the /l/ is voiceless, as it carries the aspiration of the syllable-initial /p/; in application /ˌæp.lɪ.ˈkeɪʃ.n/ it is less so. In magnetic /mæg.ˈnet.ɪk/ the /t/ is syllable-final and a candidate for possible tapping; in magnetism /ˈmæg.nə.₀tɪz.əm/ the tertiary (post-tonic) stress on /ɪz/ is sufficient to attract the /t/ into syllable-initial position, triggering aspiration while blocking tapping.
Stress levels
The expression ‘more strongly stressed’ in (1) has to be interpreted as referring to position on a five-point scale:
- grade 1: primary word stress;
- grade 2: pre-tonic secondary stress;
- grade 3: tertiary (post-tonic) stress;
- grade 4: unstressed but with full vowel;
- grade 5: weak (reduced) vowel.
All five grades are illustrated in the word substitution-product /ˌsʌb.stɪ.ˈtjuːʃ.ən.₀prɒd.ʌkt/, where the syllables are of grades 2, 5, 1, 5, 3, 4 respectively. In magnitude the /t/ goes with the final syllable, /ˈmæg.nɪ.tjuːd/, because the third syllable (grade 4) outranks the second (grade 5).
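The scale can be treated as an ordered ranking in which a lower number outranks a higher one. The Python fragment below is a minimal sketch of that ordering, using the grades quoted above for substitution-product; the names `Grade`, `outranks` and `substitution_product` are illustrative and not part of the original analysis.

```python
from enum import IntEnum


class Grade(IntEnum):
    PRIMARY = 1    # primary word stress
    SECONDARY = 2  # pre-tonic secondary stress
    TERTIARY = 3   # tertiary (post-tonic) stress
    FULL = 4       # unstressed but with full vowel
    WEAK = 5       # weak (reduced) vowel


def outranks(a, b):
    """True if a syllable of grade `a` is more strongly stressed than one of grade `b`."""
    return a < b


# Syllables of substitution-product with the grades given in the text.
substitution_product = [
    ("sʌb", Grade.SECONDARY),
    ("stɪ", Grade.WEAK),
    ("tjuːʃ", Grade.PRIMARY),
    ("ən", Grade.WEAK),
    ("prɒd", Grade.TERTIARY),
    ("ʌkt", Grade.FULL),
]
```

For magnitude, `outranks(Grade.FULL, Grade.WEAK)` holds, which is why the third syllable rather than the second takes the /t/.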
Adjacent syllables of equal rank
The only cases in English where immediately adjacent syllables have equal grade are those involving weak vowels (grade 5). They are governed by the principle:
(2) Where adjacent syllables are of equal grade, consonants are (again subject to stated conditions) syllabified with the leftward syllable.
The /t/ allophones in carpeting /ˈkɑːp.ɪt.ɪŋ/, covetous /ˈkʌv.ɪt.əs/ and purity /ˈpjʊər.ət.ɪ/ make this clear.
Americans seem agreed that they can tap the /t/ in quality, but not the /t/ in politics. This must be because the /ɪ/ of -ics counts as a full vowel, sufficient to outrank the weak vowel of the second syllable and thus capture the /t/; but the /ɪ/ (or /iː/) at the end of quality counts as weak, leaving the /t/ syllable-final in the second syllable, by (2), and thus tappable.
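Principles (1) and (2) together amount to a simple decision about which flanking syllable takes an intervocalic consonant. The Python fragment below is a sketch of that decision only: it takes the two stress grades as integers from 1 (strongest) to 5 (weakest) and ignores conditions (3) to (5). The function name `capturing_side` is hypothetical.

```python
def capturing_side(left_grade, right_grade):
    """Decide which flanking syllable captures an intervocalic consonant.

    Grades run from 1 (primary stress) to 5 (weak vowel).
    Principle (1): the more strongly stressed syllable wins.
    Principle (2): with equal grades, the leftward syllable wins.
    """
    if right_grade < left_grade:  # right syllable is more strongly stressed
        return "right"
    return "left"  # left is stronger, or the grades are equal
```

So the /k/ of packet (grades 1 and 5) goes left, the first /t/ of attest (5 and 1) goes right, the /t/ of quality (5 and 5) goes left by (2), and the /t/ of politics (5 and 4) goes right.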
Words like apex
According to (1), the /p/ in apex should go with the first, stressed, syllable: /ˈeɪp.eks/. This seems correct for RP, though it may well not be correct for some other varieties of English. If we consider the contrived examples A pecks (in some classification of kinds of peck, with A pecks, B pecks, C pecks ...) and a possible brand name ape-x (compare Timex, Durex), it is clear that their syllabification follows the morphology: A pecks /ˈeɪ.peks/, ape-x /ˈeɪp.eks/. And the ordinary word apex is like the second and unlike the first. The /p/ of apex does indeed condition pre-fortis clipping of the /eɪ/, and must therefore be in the first syllable.
The morpheme boundary condition
Morpheme boundaries such as those between the elements of a compound normally block the operation of (1). The /p/ of fee-paying remains initial in the second syllable, so that there is no pre-fortis clipping of the /iː/ (compare deep). The same applies in re#print (n.) /ˈriː.prɪnt/ (compare reaper /ˈriːp.ə/) and pre#suppose /ˌpriː.sə.ˈpəʊz/ (compare priest). There is pre-fortis clipping of the /aɪ/ in hyphen /ˈhaɪf.ən/, but not of that in high-faluting /ˌhaɪ.fə.ˈluːt.ɪŋ/. We need the following as a condition on the main principle:
(3) In polymorphemic words, consonants belong to the syllable appropriate to the morpheme of which they form a part. This applies only to synchronic, psychologically real morphemes.
It is this condition which explains the potential rhythmic differences between Roman /ˈrəʊm.ən/ and bow#man /ˈbəʊ.mən/, bonus /ˈbəʊn.əs/ and slow#ness /ˈsləʊ.nəs/, or highness (regal term of address) /ˈhaɪn.əs/ and high-ness (quality of being high) /ˈhaɪ.nəs/ (Sharp 1960). A recent example from the world of popular music is prima donna /ˌpriːm.ə.ˈdɒn.ə/ vs. pre-Madonna /ˌpriː.mə.ˈdɒn.ə/.
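The interaction between (1) and (3) in such pairs can be illustrated with a deliberately narrow sketch in Python. It assumes a disyllabic form with a stronger first syllable and a single medial consonant, with `#` marking a psychologically real morpheme boundary as in the examples above and stress marks omitted. The `VOWELS` set and the function name `syllabify_disyllable` are illustrative assumptions.

```python
# Illustrative vowel symbols; the length mark is treated as part of the nucleus.
VOWELS = set("iɪeæɑɒɔʊuʌɜəaː")


def syllabify_disyllable(form):
    """Place the syllable boundary in a disyllable whose first syllable is stronger."""
    if "#" in form:
        # Condition (3): the boundary follows the morphology.
        return form.replace("#", ".")
    i = 0
    while form[i] not in VOWELS:  # skip the initial onset
        i += 1
    while form[i] in VOWELS:  # skip the first nucleus
        i += 1
    # Principle (1): the medial consonant is captured by the first syllable.
    return form[: i + 1] + "." + form[i + 1:]
```

With these assumptions `"rəʊmən"` comes out as `"rəʊm.ən"` and `"bəʊ#mən"` as `"bəʊ.mən"`, reproducing the Roman and bowman contrast.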
For the many English suffixes which begin with a vowel, (3) is irrelevant. By (1), words such as bigger, oldest, putting, horses, zealous, scenic already receive syllable boundaries coinciding with the morpheme boundaries: /ˈbɪg.ə, ˈəʊld.ɪst, ˈpʊt.ɪŋ, ˈhɔːs.ɪz, ˈzel.əs, ˈsiːn.ɪk/.
Certain suffixes do not count as ‘psychologically real’ under (3). An example is the ‑ful of awful, careful, words which are pronounced with pre-fortis clipping of the stressed vowel, which vowel must therefore have captured the /f/ from the suffix. (Compare awe-ful ‘full of awe’, where the /ɔː/ is unclipped.) In proper names, ‑ton and ‑son are usually treated as if not separate morphemes, as is evidenced by the pre-fortis clipping usual in Barton /ˈbɑːt.n/ and Dawson /ˈdɔːs.n/ and by the possible epenthesis in Benson /ˈben(t)s.n/. Yet ‑ford, I think, does behave phonetically as a separate morpheme: Crayford /ˈkreɪ.fəd/. The morpheme boundary before ‑ism does not inhibit capture of the /t/ in magnetism, as we saw above; but in less familiar words the syllabification tends to follow the morphology: Bonapartism /ˈbəʊn.ə.pɑːt.ˌɪz.əm/, puppetism /ˈpʌp.ɪt.ˌɪz.əm/. It is again American tapping that shows clearly that the morpheme boundary before the suffix ‑ise does not inhibit /t/-capture in magnetise /ˈmæg.nə.taɪz/, sensitise, sonnetise. Introspecting, I find that I treat ‑dom inconsistently, saying freedom /ˈfriːd.əm/ but boredom /ˈbɔː.dəm/.
Some speakers, aware of etymology and meaning, may have an unclipped /aɪ/ in tri#pod; but not, presumably, in tripos /ˈtraɪp.ɒs/. In general, this whole area of presence/absence of phonetic correlates of morpheme boundaries is still far from fully explored.
The phonotactic condition
The main syllabification principle does not operate in such a way as to lead to consonant clusters which are phonotactically ill-formed. Thus (1) is subject to the condition:
(4) Phonotactic constraints on syllable structure (as established with reference to monosyllables) are not violated.
This means, for example, that timber is syllabified as /ˈtɪm.bə/, since /mb/ is not a possible final cluster: /b/ cannot be captured into the stressed syllable. Similarly, anger is /ˈæŋ.gə/, at least in RP. But tender is /ˈtend.ə/, /nd/ being a permitted cluster (stand). Notice how neatly this fits with permitted initial /Cl/ clusters: tumbler /ˈtʌm.blə/, English /ˈɪŋ.glɪʃ/, but chandler /ˈtʃɑːnd.lə/ (just as we have /bl‑, gl‑/, but no /dl‑/).
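Condition (4) can be pictured as capturing leftward the longest stretch of an intervocalic cluster that is still a permitted final cluster. The Python fragment below is a minimal sketch of that idea for a cluster following a stressed vowel. The set `PERMITTED_FINAL` is a tiny hypothetical sample, not an inventory of English final clusters, and the function name `split_cluster` is an assumption. The sketch checks only the coda and leaves the remaining conditions aside.

```python
# Hypothetical, deliberately tiny sample of permitted final clusters; a real
# list would be established from monosyllables, as condition (4) requires.
PERMITTED_FINAL = {"m", "n", "ŋ", "d", "nd", "mp", "ŋk", "lf"}


def split_cluster(cluster):
    """Split an intervocalic cluster after a stressed vowel into (coda, onset).

    The coda is the longest initial stretch that is a permitted final cluster.
    """
    for cut in range(len(cluster), 0, -1):
        if cluster[:cut] in PERMITTED_FINAL:
            return cluster[:cut], cluster[cut:]
    return "", cluster
```

With this sample set, `split_cluster("mb")` gives `("m", "b")` as in timber, `split_cluster("nd")` gives `("nd", "")` as in tender, and `split_cluster("ndl")` gives `("nd", "l")` as in chandler.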
Questions arise over /r/ and /ʒ/. Although final /r/ does not occur in RP in words pronounced in isolation, it does occur in connected speech, and in such a way as to make clear that /r/ can be syllable-final (see the discussion of /r/ allophony above). Linking /r/, both internal and external, is indeed syllabified with the preceding vowel. Accordingly we need not hesitate to analyse bleary, sharing as /ˈblɪər.ɪ, ˈʃeər.ɪŋ/ and hence the phonetically comparable weary, Mary as /ˈwɪər.ɪ, ˈmeər.ɪ/. Equally, sorry is /ˈsɒr.ɪ/ and spirit /ˈspɪr.ɪt/. Final /ʒ/ is admittedly restricted to loan-words (rouge, beige), but this is sufficient justification for us to accept the analysis measure /ˈmeʒ.ə/.
Notice that (1) is already sufficient to reflect the phonotactic constraint that disallows the occurrence of short vowels finally in a stressed syllable. Better is /ˈbet.ə/ rather than */ˈbe.tə/ not only because the /t/ triggers pre-fortis clipping and is tappable, but also because a syllable /ˈbe/ would not be phonotactically well-formed.
A difficulty arises in the case of words like nostalgic, posterior, fastidious. Is the correct analysis /nɒ.ˈstældʒ.ɪk/, as follows straightforwardly from (1)? Or must we protect a full (unreduced) short vowel from exposure in syllable-final position, and syllabify as /nɒs.ˈtældʒ.ɪk/? Perhaps the truth is that speakers differ, and are also inconsistent; but Davidsen-Nielsen’s investigation (1974) tends to show that the first type of syllabification, /nɒ.ˈstældʒ.ɪk/, predominates where there is no morpheme boundary after the /s/.
The example cacophony /kæ.ˈkɒf.ən.ɪ/ confirms that syllable-final short vowels are not absolutely precluded in unstressed syllables; tattoo /ˌtæ.ˈtuː/, not recorded in EPD with double stress but readily observed with this pronunciation, shows that this licence even extends to syllables having secondary stress.
The affricate condition
The last condition which we have to impose on (1) is one relating to the post-alveolar and palato-alveolar affricates:
(5) Affricates (i.e. /tr, dr, tʃ, dʒ/) are not split between syllables, but are treated as indivisible.
This can hardly be contested in the case of the palato-alveolars. For catching, teacher, allergic, courageous the predicted syllabification again parallels the morphological: /ˈkætʃ.ɪŋ, ˈtiːtʃ.ə, ə.ˈlɜːdʒ.ɪk, kə.ˈreɪdʒ.əs/. Ratchet, feature, merger, magic are equally straightforward. Adjusting for place of articulation, petrol, mattress, squadron, Audrey are allophonically parallel; yet the putative /ˈpetr.əl, ˈmætr.əs, ˈskwɒdr.ən, ˈɔːdr.ɪ/ might seem to violate the phonotactic condition. It seems, though, that we must accept these syllabifications. If petrol is not /ˈpetr.əl/, what can it be? If it were /ˈpet.rəl/ we should expect possible glottalling (glottal replacement), as in rat-race, out#right. If it were /ˈpe.trəl/ we should have a unique violation of the phonotactic constraint against stressed short vowels in syllable-final position. Given that we demand explicit syllabic boundaries, as we must if the allophonic rules based on them are to be coherent, only /ˈpetr.əl/ remains. This analysis, actually, is supported by the occurrence at the surface level of word-final /tr/ after elision in items such as matter-of-fact /ˌmætr.ə.ˈfækt/.
The other phonotactic constraints still apply: district must be /ˈdɪs.trɪkt/, with condition (4) ruling out a possible */ˈdɪst.rɪkt/ (which would be wrong both allophonically and etymologically). The rules lead us to choose /ˈeks.trə/ as the correct syllabification of the much-discussed extra. The morpheme boundary condition ensures that light-ship is /ˈlaɪt.ʃɪp/, and indeed its longish [ʃ] is clearly a token of the fricative /ʃ/, not part of the affricate /tʃ/. Similarly, board-rubber is /ˈbɔːd.ˌrʌb.ə/, although bedroom tends to be pronounced as if morphologically solid, /ˈbedr.ʊm/.
If it is accepted that petrol is /ˈpetr.əl/ and squadron /ˈskwɒdr.ən/, we must also allow that entry is /ˈentr.ɪ/ and sundry /ˈsʌndr.ɪ/. Although this may seem unlikely at first sight, notice the pre-fortis clipping exerted by the post-alveolar affricate of entry on the initial vowel and nasal. And notice the parallelism with the morphology in wintry /ˈwɪntr.ɪ/. Compare in-tray /ˈɪn.treɪ/, with a morpheme boundary separating the nasal from the affricate and no pre-fortis clipping. I would claim that the schwa-elided variant of entering is /ˈentr.ɪŋ/, again with correspondence between syllabification and morphology. Affricates preceded by /l/ must follow the same pattern: paltry /ˈpɔːltr.ɪ/, cauldron /ˈkɔːldr.ən/, but mail-drop /ˈmeɪl.drɒp/.
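Condition (5) can be sketched as a segmentation step that keeps each affricate together before any syllable division is considered. The Python fragment below is an illustration under stated assumptions: it works on morphologically simple consonant strings only, so it would wrongly fuse the /t/ and /ʃ/ of light-ship, where condition (3) applies. The names `AFFRICATES`, `units` and `possible_splits` are hypothetical.

```python
# The four affricates named in condition (5).
AFFRICATES = ("tʃ", "dʒ", "tr", "dr")


def units(cluster):
    """Segment a consonant string, keeping each affricate as one indivisible unit."""
    out, i = [], 0
    while i < len(cluster):
        if cluster[i:i + 2] in AFFRICATES:
            out.append(cluster[i:i + 2])
            i += 2
        else:
            out.append(cluster[i])
            i += 1
    return out


def possible_splits(cluster):
    """List every (coda, onset) division that does not split an affricate."""
    u = units(cluster)
    return [("".join(u[:k]), "".join(u[k:])) for k in range(len(u) + 1)]
```

For the `"tr"` of petrol the only candidates are `("", "tr")` and `("tr", "")`; the division `("t", "r")` is never generated. Choosing among the surviving candidates is left to principle (1) and the other conditions.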
Conclusion
I claim that by principle (1), together with codicil (2) and conditions (3), (4) and (5), we achieve a correct syllabification virtually throughout the English vocabulary — correct, that is, for purposes of predicting appropriate allophones where allophonic variation is sensitive to a syllable boundary. In support of this claim I can report that in the course of working on a new pronouncing dictionary I have transcribed over 50,000 entries with explicit syllable boundaries throughout: indeed, that was the task which led me to formulate the principle. Once I had discovered the principle, it constituted a convenient decision procedure for uncertain cases without, so far as I am aware, any serious untoward results.
Self-criticism
Occasional difficulties do remain. I am worried about words such as accelerate and memorise, where introspection leads me to posit /ək.ˈsel.ər.eɪt, ˈmem.ər.aɪz/ rather than the predicted /‑ə.reɪt, ‑ə.raɪz/ (compare annotate, advertise, for evidence that these endings normally attract a preceding consonant). Perhaps instrumental evidence will throw light on the accuracy of my intuitions regarding this apparently wayward behaviour of /r/.
The distinction between unstressed full and weak vowel is not always clear, since RP /ɪ/ is ambiguous as between these two categories, and so sometimes are /əʊ/ and /aɪ/. Are the final vowels in armistice and cannabis weak (/ˈɑːm.ɪst.ɪs, ˈkæn.əb.ɪs/) or full and therefore consonant-capturing (/ˈɑːm.ɪ.stɪs, ˈkæn.ə.bɪs/)? RP really offers no grounds for a decision either way. For what it is worth, Australian English does, since Australians would keep a full /ɪ/ as /ɪ/ but turn a weak one into /ə/. It turns out that Australians use /ə/ in the final syllables of armistice and cannabis, thus indirectly demonstrating RP /ˈɑːm.ɪst.ɪs, ˈkæn.əb.ɪs/. In politics, as we have seen, Americans treat the final syllable as full-vowelled; so do the Australians (/‑tɪks/). RP offers no direct evidence, but may be presumed to agree with them.
Ambisyllabicity?
So strong is the presumption among linguists for CV.CV structure as universally preferred that many writers assume it to be true for English even in the face of strong counter-evidence such as is discussed here. Fudge (1984: 21) asserts, on no substantial evidence that I can detect, that competitive, for example, has a stressed syllable /pe/; but American writers, alert to the implications for /t/ allophones, correctly insist on /.ˈpet./ (see, for example, Webster’s Ninth New Collegiate Dictionary). Grunwell (1982) assumes, equally without justification, that a word such as better is /ˈbe.tə/.
A more sophisticated idea is that a ‘left-captured’ consonant such as the /t/ in better is ambisyllabic, belonging to both syllables simultaneously (Kahn 1976: 33; Gussenhoven 1986). This notion has a respectable origin in the phonetic approach to the syllable in terms of sonority: the intervocalic consonant represents a trough of sonority and ‘belongs’ to neither peak. In modern terms, ambisyllabicity may be felt to allow us to satisfy at the same time both the putative universal preference for CV.CV and the overwhelming allophonic arguments in favour of CVC.V. The principle of Occam’s razor, though, shows that ambisyllabicity is not a useful concept. Those who believe in an absolute universal preference for unchecked (open) syllables must, I believe, accept that in English this can at best be true only of deeply abstract representations, and that by the level at which allophonic conditioning becomes relevant a resyllabification rule must have come into operation, namely the principle I propose. And this is uneconomical, since a word such as additive, morphologically /æd+ɪt+ɪv/, would have to have been switched to phonological /ˈæ.dɪ.tɪv/ before surfacing again as [ˈæd.ɪt.ɪv]. There may be occasions when the Duke of York gambit is necessary (Pullum 1976), but I do not believe this is one of them.
Acknowledgements
My debt not only to Jones and Gimson but also to Fudge and Kahn must be self-evident, even though I frequently do not agree with them. Gussenhoven (1986) was published only after I had already delivered the UCL staff seminar paper of which this article is a version; I am delighted to see that our thoughts are along the same lines.