Solution for May 2007

labeled spectrogram
"He plays oboe and clarinet." (audio: ../wav/wav0705.wav)


First the segmental, then the prosodic.

[h], IPA 146
Lower-case H
So for about 50 msec 'around' 100 msec, if you follow, there's some noise at the bottom, then nothing, then some noise at about 2200 Hz, more or less, then some more at about 2600 Hz, more or less, then a little more at 3400 Hz and so on. The noise at the bottom isn't striated, as in periodic voicing, so this is actual noise, i.e. voicelessness, i.e. airflow through the glottis uninterrupted by glottal pulsing. But the noise up above indicates an open vocal tract. So this is a glottal fricative. Note the energy up above 2000 Hz is more or less in line with the formants in the following vowel, as if (actually, there's no 'as if' about it) the resonances of the vocal tract are being excited, not by periodic energy but by the noise. This is why [h] is often equated with 'voiceless vowel'. Vowel, in the sense of an open vocal tract with resonances, but voiceless. And in English, in onsets.

[i], IPA 301
Lower-case I
So on to the vowel. F1 is low, sort of low enough to get lost in the voicing bar, so we're dealing with a high vowel. F2 is very high, up around 2200 Hz, and really only [i] (and analogous glides) ever have an F2 that high.
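
The F1/F2 reasoning used here (and again for the later vowels) can be sketched as a rough lookup. This is only an illustration, not a procedure from this page: the function name `describe_vowel` and every Hz cutoff are assumptions chosen to mirror the rules of thumb in the text (low F1 means a high vowel, high F2 means a front vowel). Real cutoffs shift from speaker to speaker, and the 300 Hz F1 in the example is an assumed stand-in for "low enough to get lost in the voicing bar".

```python
def describe_vowel(f1_hz, f2_hz):
    """Return rough (height, backness) labels from F1 and F2.

    All cutoffs are assumed values for illustration only.
    """
    # F1 varies inversely with vowel height: low F1 -> high vowel.
    if f1_hz < 400:
        height = "high"
    elif f1_hz < 600:
        height = "mid"
    else:
        height = "low"

    # F2 tracks frontness: high F2 -> front, low F2 -> back and/or round.
    if f2_hz > 1800:
        backness = "front"
    elif f2_hz > 1200:
        backness = "central"
    else:
        backness = "back"

    return height, backness


# The [i] above: F1 assumed at 300 Hz, F2 up around 2200 Hz.
print(describe_vowel(300, 2200))
```

A table like this only narrows things down. The judgment calls in the rest of this solution (transitions, diphthongs, coarticulation) are exactly what it leaves out.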

[pʰ], IPA 101 + 404
Lower-case P + Right Superscript H
So the gap starts around 200 msec, or slightly after, with a few pulses of perseverative voicing into the closure. Note right at 200 msec that F2 is pulling way, way down, all the way to about 1500 Hz, and F3 (and F4) are also pointed down into the closure. When all the formants are pointing down like that, you're likely heading into a bilabial closure. The release seems to occur at about 300 msec, and is followed by at least 50 msec of aspiration. So we've got an aspirated bilabial plosive going on here. That's plenty of aspiration to count as aspirated, but it doesn't seem to be that long, all things considered. But as it turns out, it's not just long enough to be aspirated, but lengthened.
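
The aspiration argument is really a voice onset time (VOT) measurement, and the arithmetic can be sketched as below. The function names and the 30 msec cutoff for "aspirated" are assumptions for illustration, not values taken from this page. The 350 msec voicing onset is just the release at about 300 msec plus the "at least 50 msec" of aspiration, so it is a lower bound.

```python
def voice_onset_time(release_ms, voicing_onset_ms):
    """VOT is the lag from the plosive release to the start of voicing."""
    return voicing_onset_ms - release_ms


def is_aspirated(vot_ms, threshold_ms=30.0):
    """Treat a long-lag VOT as aspirated (threshold is an assumed cutoff)."""
    return vot_ms >= threshold_ms


# Release at about 300 msec, voicing no earlier than about 350 msec.
vot = voice_onset_time(300, 350)
print(vot, is_aspirated(vot))
```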

[ɫ̥], IPA 209 + 402
Tilde L (Dark L) + Under-Ring
See how the F3 is raised coming out of the aspiration? How to account for that, you might ask? Why bother? Cuz that aspiration is longer than we might expect, which suggests that there's something going on--either the approximant degree of closure increases the duration of the noise, or the voicelessness spreads into the duration of the approximant segment, depending on what kind of explanation you're looking for. So we can account for both the raised F3 and the relatively long aspiration by positing an approximant. With a raised F3 like a lateral. Consistent also with the low F2 (dark/velarized English /l/, even in an onset).

[eɪ], IPA 302 + 319
Lower-case E + Small Capital I
So, just so we can all agree, the only really steady state portion of this is the F2 at its highest point, right? Okay, the low F2 at the start comes in part from the low F2 position of the approximant, but the fact that it doesn't immediately skip up, and instead sort of moves in a straight line up to the steady-state position, is suggestive of an intermediate 'target', i.e. a frontish but not as-front-as-the-glide Thing at the front of this vowel. The F1 is clearly at about 500 Hz, middish, and then suddenly it moves a little lower. So we have something that is middish and front that becomes higher and fronter, i.e. a diphthong [eɪ].

[z], IPA 133
Lower-case Z
So the noise starts at about 425 msec, and there's 25 msec or so of voicing accompanying the noise. So this is probably a voiced fricative. This also might explain the relative length of the preceding vowel, especially the offglide, but I don't know much about that. The noise forms a single broad band that seems to get stronger as it goes up in frequency. Even though it seems to die off below the F2, it doesn't die away completely below the F2, and there's no accompanying 'pole' or amplitude band in the F2/F3 range. Had they been there, it would suggest a post-alveolar, but since they're not, we're talking alveolar.

[ʔ], IPA 113
Glottal Stop
There's a looong duration between the offset of the noise, at about 600 msec, and the onset of regular voicing, at almost 700 msec. Too long to just have nothing, and anyway, it's not nothing. It's a buncha creaky voice. Creaky voice, especially at the beginning of a vowel like that, is usually a correlate of glottalization, and hence glottal stop. Technically it's not a stop, and eventually I'll get weird about transcribing creaky voice instead of glottal stop the way I've gotten weird about [ɐ] rather than [ʌ]. But anyway, it's the beginning of a phrase of some kind.

[o], IPA 307
Lower-case O
Diphthongized? Dunno. But it doesn't have a clear 'moment' when it changes, as opposed to just a slow and steady slide into the closure. So I didn't transcribe it that way. I could be wrong. Judg(e)ment call. Anyway, look at the F1. Mid mid mid. F2, low. Mid and back/round.

[b], IPA 102
Lower-case B
Nice longish gap with a nice clear voicing bar at the bottom. Voiced plosive. Transitions into it are F1-falling (consistent with closure), F2-falling (consistent with a bilabial), and F3 and F4 just sitting there. On the opposite side, they're the opposite. So we have a fairly good cue for bilabial and nothing pointing specifically anywhere else. Bilabial.
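
The transition reasoning can be sketched as a small heuristic over F2 and F3 at the start and end of the transition into a closure. Everything here is an assumption for illustration: the function name `place_cues`, the 100 Hz tolerance, and the Hz values in the example, which are made-up stand-ins for "F2-falling, F3 just sitting there". The velar test (F2 rising while F2 and F3 converge, the so-called velar pinch) is the standard textbook cue and is swapped in here rather than taken from this page. The alveolar fallback follows the elimination reasoning used for the nasal later on.

```python
def place_cues(f2_start, f2_end, f3_start, f3_end, tol_hz=100):
    """List the places of articulation the F2/F3 transitions are consistent with.

    Frequencies are in Hz, measured at the start and end of the
    transition into the closure. tol_hz is an assumed tolerance.
    """
    f2_delta = f2_end - f2_start
    f3_delta = f3_end - f3_start
    cues = []

    # F2 falling into the closure with F3 flat or falling: bilabial cue.
    if f2_delta < -tol_hz and f3_delta <= tol_hz:
        cues.append("bilabial")

    # F2 rising while the F2-F3 gap narrows: velar pinch.
    gap_start = abs(f3_start - f2_start)
    gap_end = abs(f3_end - f2_end)
    if f2_delta > tol_hz and gap_end < gap_start - tol_hz:
        cues.append("velar")

    # Nothing pointing anywhere else: alveolar by elimination.
    return cues or ["no strong cue (alveolar by default)"]


# Made-up numbers standing in for the [b] above: F2 falls, F3 sits there.
print(place_cues(1000, 800, 2500, 2500))
```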

[oʊ], IPA 307 + 321
Lower-case O + Upsilon
Longish stretch of vowel, with a funny dip in the F2 between 1050 and 1100 msec. So I think from 950 msec to 1150 msec, we've got three things to consider. F1 is pretty mid throughout, flat and not particularly interesting, except maybe as the amplitude changes a little. F2 starts low as before, so the first bit of this is clearly [o] again. Then there's a dip in the F2, almost as if the lips started to close or tighten or something. This is what I'm counting as the off-glide. It may not be an offglide so much as a transitional thing between the [o] and the moving bit that follows, but whatever. For just a moment, it gets more round. Or maybe back. The loss of the F3 I think points to round--smaller apertures just transmit less, so I think rounding accounts for the total drop in amplitude. And the weirdness in F4, which I just don't want to think about.

[ə], IPA 322
Schwa
So the last bit, from 1100 to 1150 msec or so. It's really short, and the F3 is moving throughout. So let's just call it a schwa and move on.

[nᵈ], IPA 116 + 104
Lower-case N + Right Superscript D
50 msec or so of strong, sonorant voicing (compare the relatively strong amplitude of voicing here with that during the previous [b]). Not much happening in terms of resonance until we get up to 2500 Hz, but there it is, a resonance. So we've definitely got something sonorant. Probably nasal, given the zeroes (absence of energy). The F3 transition into it is hanging there. F4 I'm going to continue to ignore. F2 is rising through the schwa. So we're not getting any bilabial cues, and not much in the way of velar cues. So this is probably alveolar. It would be nice if there were a resonance at F2, but oh well. The mushy release burst is an oral release, so, in my increasing IPA weirdness, I'm now transcribing these as orally-released nasals, but this is just standard for my homorganic nasal-stop sequences. So this is an underlying /nd/.

[kʰ], IPA 109 + 404
Lower-case K + Right Superscript H
There's a funny, broad, high amplitude clunk here, which I'd be tempted to ignore or declare a closure transient, since it's another 25 msec or so before we get any decent release noise going. But then there's a double-bursty looking thing in F2 at the same moment. Double bursts are usually associated with velar releases, but then I couldn't explain the delay in the noise. So I'm going to suggest that this is a rare velar closure transient, consistent with a short delay between closure(s) on each side of the velum (with the uvula in the middle), which is my (admittedly minority) explanation of the prevalence of double bursts with velars. So this is velar. Although there's nothing really useful in terms of transition on either side. Aspirated, of course. Just look at that VOT.

[ɫ̥], IPA 209 + 402
Tilde L (Dark L) + Under-Ring
But once again, there's something weird happening in the upper formants, and an otherwise unexplained low F2 during all that VOT noise to be accounted for. So again I'm jamming in an approximant /l/ and moving on.

Egad this is a long spectrogram. Pushing on: There's a funny discontinuity at about 1425 msec, which I take to be the 'moment' if there is one, when the tongue approaches the minimum coming up. So this stretch from 1375 to 1575 msec (or so) I'm dividing into three bits again. The first bit, before the discontinuity, the bit around the F3 minimum, and the bit after.

[ɛ], IPA 303
Epsilon
Okay, vowel. F1 is mid or a little bit high of mid, so we're dealing with a middish or lower-middish kind of vowel. F2 is neutral, but not nearly as low as with the [o]s previous, so I'm going to declare this central/frontish. F3 I'm going to ignore even though it's flat because it's heading somewhere. So we've got something not at all back or round (by virtue of having been declared central/frontish--Peter Ladefoged always said you had to know what you were looking at before you could look at it, and this kind of circular reasoning is what he meant). And mid or mid-low. Admittedly, there's a couple of English vowels down in that part of the vowel space but since there's some serious coarticulatory (or allophonic) issues with the upcoming segment, it hardly matters which you pick....

[ɹ], IPA 151
Turned R
So the critical thing here is to notice the F3 is way low. Not as low as it sometimes gets, but way low. Really, the only thing that pulls F3 down like that is /r/.

[ɨ], IPA 317
Barred I
Another short, moving vowel. Moving on.

[n], IPA 116
Lower-case N
Okay, this is a more canonical looking nasal. Not quite 100 msec around 1600 msec. Look at those edges. Look at that nice voicing bar. Those fabulous zeroes. And best of all, there's just a hint of resonance at 1500 Hz or so, telling us that this is an alveolar nasal. Also the transitions are consistent with that, but that 1500 Hz resonance is just beautiful.

I've been looking at these things too long.

[ɛ], IPA 303
Epsilon
F1 is relatively high, so we're dealing with something lowish. F2 is a little above neutral, so this is front. So we're dealing with a lowish frontish vowel. This one is the stressed one, so it can be a little lower and fronter compared with the previous [ɛ], and it doesn't have the same contextual difficulty. So again, we've got two vowels down in the lower fronter part of the space, and here it matters which one it is. Although how you'd tell the difference I'm not sure at this point. Hmm. Anyway, there's some glottalization at the end of this, which is partly phrase-final low pitch ...

[t], IPA 103
Lower-case T
... and partly allophonic glottalization of this consonant. So what we can see of the transitions during the glottalization is consistent with alveolar closure, which is lucky because alveolars glottalize more than velars and bilabials. The release noise is 'tilted' toward the high frequencies, like a sibilant (well, like [s]) which is consistent with an alveolar release. I'd expect a velar release to have more noise in the F2/F3 range, and a bilabial release to not have that very high frequency noise component to it. Usually. So on balance, almost undoubtedly alveolar.

So let's talk prosody. This utterance didn't work out quite the way I expected.

First the break indices. Zeroes mark the ends of reduced words, think 'clitic' although that word is loaded. Ones mark the ends of prosodic words. Anything 1 or above should have a lexical (*) accent. Twos are for something in between word and phrase, either that has a boundary tone but doesn't otherwise exhibit the features of a boundary, or as in this case something that seems to have the timing of a boundary but doesn't seem to have a tonal mark. Threes, if there were any, would mark phrase boundaries, and the four the utterance boundary. Or at least that's what they're doing here.
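
The break-index scheme as used on this page can be written down as a small lookup. The names `BREAK_INDICES` and `needs_lexical_accent` are hypothetical, and the descriptions paraphrase the paragraph above rather than any official ToBI definition.

```python
# Break indices as described on this page (not official ToBI wording).
BREAK_INDICES = {
    0: "end of a reduced word (think 'clitic')",
    1: "end of a prosodic word",
    2: "between word and phrase: boundary tone without boundary timing, "
       "or boundary timing without a tonal mark",
    3: "phrase boundary",
    4: "utterance boundary",
}


def needs_lexical_accent(break_index):
    """Anything marked 1 or above should carry a lexical (*) accent."""
    if break_index not in BREAK_INDICES:
        raise ValueError(f"unknown break index: {break_index}")
    return break_index >= 1


for index, meaning in BREAK_INDICES.items():
    print(index, needs_lexical_accent(index), meaning)
```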

Tonewise, there are two clear highs. Whether they're H*s or H*+L I'm not 100% sure, so I just marked them as H*. Probably the first one should be an H*+L, since otherwise there's no real reason for the low pitch on the following syllable. Unless you think that's really a 3], which is possible. The last H* is placed on the stressed syllable, which happens also to be final in the utterance, so it's getting kind of squished in with the low boundary tone.

That 2] was a compromise in my head. It felt (and sounded) like there was some lengthening there, but there didn't seem to be a phrasal tone. It looks like the H* there is deaccented, and I don't know how to deal with this in ToBI. Suggestions appreciated. But following (my) ToBI rules, the lexical word there is supposed to get some kind of mark. Again, I could be wrong, as far as real ToBI goes.

I obviously need to brush up on my ToBI. So this'll be it for the intonation for a while.