Lower-Case M
[m], IPA 114
Starting at 75 msec and going on until about 150 msec, we've got a nice little
sonorant happening. It's got a nice, clear voicing bar at the bottom, and resonances
at the higher frequencies. The sharpness of the edge (of the following vowel),
the overall lowered energy (relative to the following vowel), the presence of a nice
clear zero (around 750 Hz) and mostly flat (unchanging) resonating structures, are
all good pointers to a nasal stop. The pole around 1000 Hz is usually a pretty good
clue (in my voice) that it's bilabial. The F2 transition in the following vowel
is consistent with that--that F2 onset frequency is too low to be alveolar, and the
distance between F2 and F3 is atypical of velars. But it's that F3 transition that
bothers me. The F3 seems to fall into the following vowel, which is
consistent really only with alveolars. So we've got conflicting cues. Which are we
going to believe? Well, we're going to wait for a deciding vote. Once we have a
clearer idea of what the first few syllables of this utterance are, knowing it's
English and a declarative sentence, we'll use lexical access to decide whether we're
looking at an [m] or an [n]. Or something else...
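If it helps to see the conflict laid out, here is a small bookkeeping sketch of the cues just described. The cue names and the places each one is compatible with are transcribed from the paragraph above; the equal weighting of cues and the function name `tally` are assumptions made purely for illustration, not a method used in this reading.

```python
from collections import Counter

# Each cue maps to the places of articulation it is compatible with,
# as described in the text. Equal weighting is an assumption.
CUES = {
    "pole near 1000 Hz": {"bilabial"},
    "F2 onset too low for alveolar": {"bilabial", "velar"},
    "F2-F3 distance atypical of velars": {"bilabial", "alveolar"},
    "F3 falls into the following vowel": {"alveolar"},
}

def tally(cues):
    """Count how many cues are compatible with each place."""
    votes = Counter()
    for places in cues.values():
        votes.update(places)
    return votes

votes = tally(CUES)
for place, count in votes.most_common():
    print(place, count)
```

Bilabial comes out ahead, but not unanimously, which is why the decision gets deferred to lexical access rather than settled on the acoustics alone.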
Lower-Case A + Small Capital I
[aɪ], IPA 304 + 319
So the F1 onset frequency in the first full pulse is just below the 750 Hz zero
in the nasal, but it rises very quickly and reaches a peak well before 200 msec.
So ignoring the first few pulses as transition, we've got something that starts
fairly low in the vowel space. The F2 at that moment is still fairly low as well,
but some of that might be transitional. So we've got something that starts lowish
and sort of backish (or roundish?), but the F1 lowers over almost 100 msec toward
the following consonant, indicating a slow rising of the vowel, and the F2 never
stops moving up (forward in the vowel space). So what we have here is a diphthong
starting lowish and backish and moving up and forward. Again, there may be two
choices, but one is probably better than the other. (Quick, what's the other choice,
and how would you expect it to look, assuming this isn't it?)
Always be an active learner.
Lower-Case S
[s], IPA 132
Well, this is interesting. From 300 msec (a little earlier in the higher frequencies) to
almost 400 msec, there's a nice voiceless fricative. There's no hint of voicing
or anything at the low end. There's some noise into the very low frequencies, and
for some reason the amplitude hikes up a bit at about 1500 Hz. Then it stays pretty
much flat (i.e. at the same amplitude) all the way up. So this is fairly strong and
broad band, typical of sibilants. And the sudden drop off below 1500 Hz is usually
a clue that it's post-alveolar. But I'm going to suggest it's not. Partly,
it's because I know what it's supposed to be, and I'm floundering for reasons to
be right. Okay, usually a post-alveolar (rather than alveolar) sibilant has that
strongest energy in the F2-F4 range, and I think that low energy 'border' isn't
quite continuous with the F2 band in the following vowel, such as it is. So
I don't know. This is supposed to be an [s]. And I think if we followed it up to
the 6-12 kHz range, we'd see it gets really, really loud up there. So
this is an alveolar. Accept it. Move on.
Barred I
[ɨ], IPA 317
Well, for a scant 25 msec or so, there's a vowel. There is. Look at it. But
it's so short, it's hardly worth spending any time worrying about. So I won't.
Quick, why isn't it worth spending any time worrying about?
Lower-Case N
[n], IPA 116
Another nasal. Now look at this one carefully. There's a nice strong voicing
bar, and there's a band of weaker energy just above that. Now compared with
the initial one, this one is a little higher in frequency or broader in band.
So they're not quite the same. There's a zero. It's narrow, but it's a little
higher in frequency than the zero in the previous one. There's a little energy at
1000 Hz, but it's weak w/r/t the previous one. And there's that blip, or whatever
you want to call it, several pulses of resonance, or something, up just below 1500 Hz.
I point that out because it turns out that it's important. I think that's the
real pole. But I could be wrong. But what are the odds.
Lower-Case K + Superscript Lower-case H
[kʰ], IPA 109 + 404
There's what appears to be a closure transient, or maybe it's just a clunk, where
I've marked the boundary. There's some perseverative voicing, I guess, but look
at that aspiration. Even excluding the material before the second burst, that's
at least 75 msec of aspiration. Which is quite a lot for me. So this has to
be aspirated, and therefore voiceless. Now look at that double burst. Double
bursts like that, especially centered in F2/F3 like that, are typical of velar
releases. So there you go. There's not a lot of unambiguous transition information,
but the long VOT and the double burst are pretty good cues.
Script A + Rhoticity Sign
[ɑ˞], IPA 305 + 419
Well, the F3 is low, so you might be tempted to call this a syllabic /r/. But that
wouldn't explain the F2 movement. Or for that matter, the F1 movement. Which together
look like diphthongal movement, which I suppose is what this sequence is.
Lower-Case T + Superscript H
[tʰ], IPA 103 + 404
From 775 msec to about 850 msec, there's a serious gap. The few periods of voicing
leading up to 800 msec I'd say are just perseverative. Since the release at 850 is
followed by going on 75 msec of aspiration (voicelessness, VOT), there's little
doubt that this plosive is aspirated. The transitions into it are decidedly alveolar
looking, in the sense that both F2 and F3 are pointed up, but given their frequency
in the preceding segment, they have precious little choice. The aspiration noise
is the big clue. In all respects (except for the formant shaping in F2 and F3),
this looks like a sibilant, particularly [s]. (I suppose you might say it looks
like an [ʃ], but really it doesn't. There's not enough energy in the F2/F3 pole
relative to the higher ones.)
Ennyhoo, it's not an [s], it's just really heavy aspiration following an alveolar
release. So it's not 'grooved' like an [s], but the airflow is basically high pressure
being directed at the incisors, just like [s]. So this has to be alveolar.
The transitions out look vaguely velar-pinch-y, but since there's no way a velar
would have aspiration that looks like this, we can rule that out.
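The VOT arithmetic for this plosive can be written out as a minimal sketch. The release time (850 msec) and the roughly 75 msec of aspiration come from the description above; the voicing onset of 925 msec is simply their sum. The 30 msec cutoff for calling a stop aspirated is an assumption for illustration, not a figure from this reading, and the function names are hypothetical.

```python
def voice_onset_time(release_ms, voicing_onset_ms):
    """VOT: time from the stop release to the start of voicing, in msec."""
    return voicing_onset_ms - release_ms

def is_aspirated(vot_ms, threshold_ms=30.0):
    """Long-lag VOT is treated as aspiration. The threshold is an assumption."""
    return vot_ms >= threshold_ms

# Release at 850 msec, voicing resuming around 925 msec (850 + ~75).
vot = voice_onset_time(850, 925)
print(vot, is_aspirated(vot))
```

Any reasonable cutoff gives the same answer here, since 75 msec is well into long-lag territory.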
Turned M
[ɯ], IPA 316
Well, this is not good. The highest-pitched voice in the whole spectrogram.
Which probably makes this syllable the nuclear accent, or at least the focus
accent of the utterance. But in practical terms it means a) the striations are so
close together you can't tell one pulse from the next, and b) the harmonics are
widely separated (Quick--why?) and so bandwidths just increase. So it's hard to tell
exactly where F1 is. It could be that band around 500 Hz (or just below, but above
the very strong voicing bar), or it could be that band up around 800 Hz. Which
makes this either a relative mid to higher-mid kind of vowel or a very, very low one.
The F2 is a little easier. Before it fuzzes out, you can see the F2 transition in the
aspiration noise, so you know where it's headed at least. So the F2 has to be around
1200 Hz or so, depending on exactly where you measure. So knowing the answer, I
might suppose that the strength of the 'voicing bar' was actually a very low
first formant, and the two things I'd considered before are just strong harmonics.
But I don't know. It probably ain't the incredibly low vowel that it would be.
So figure not high and relatively back, but not outrageously round (or very round
but not outrageously back). And we'll try to make a word out of it later.
For the record, this is a fairly typical /u/ for me. Not at all round, fairly high, and with a front on-glide following the coronal.
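The relationship between pitch, striation spacing, and harmonic spacing mentioned in this section can be checked with a little arithmetic. This is a sketch under stated assumptions: the f0 values of 100 Hz and 200 Hz are hypothetical round numbers, not measurements from this spectrogram, and the function names are made up for the example.

```python
def pulse_period_ms(f0_hz):
    """Time between glottal pulses (striation spacing) in msec."""
    return 1000.0 / f0_hz

def harmonics(f0_hz, max_hz=1000):
    """Harmonic frequencies (integer multiples of f0) up to max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]

# Hypothetical low and high pitches for comparison.
for f0 in (100, 200):
    print(f0, pulse_period_ms(f0), harmonics(f0))
```

Doubling f0 halves the time between pulses and halves the number of harmonics available below 1000 Hz, so there are fewer points sampling the region where F1 lives, and a formant peak that falls between two harmonics is harder to pin down.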
Lower-Case N
[n], IPA 116
So I think the oral closure happens at about 1075 msec--when the zero kicks in.
Which is another contributor to the fuzziness of the preceding vowel--nasalized
vowels tend to have broader bandwidth (and more centralized formant frequencies)
than their oral counterparts. So the zeroes are a good thing, really--they tell
us this has to be a nasal. Frankly, the pole looks like it's about 1000 Hz, and
so I'd say this was bilabial. And I'd be wrong. Good guess, but if it's not
bilabial, then it has to be alveolar. No hint of velar pinch, and, well, there
is that narrow thing at 1500, which is where I'd expect the pole for an [n] to
be, in my voice. There's no hint of that in the initial nasal of this utterance,
so there's some difference. But I wish I knew what was going on at 100 Hz.
Lower-Case Z
[z], IPA 133
Well, there's a hint of voicing at the bottom, so this is probably voiced. The
noise is [s]-shaped, if you follow, and weaker (and shorter) than we'd expect
for [s], which is consistent with the idea that it's voiced.
Lower-Case I
[i], IPA 301
Well, if the previous thing is an alveolar, then we can say that the onset frequency
of F2 is in line with the alveolar locus, which means all that movement is just
transitional. Or we could suppose that it's meaningful. In the first case,
coupled with the relatively low F1, I'd be looking at that spot, just after 1300
msec where the F2 levels off for just a bit, and say that was our target F2
frequency, which would make this an [i], just because nothing else ever has
an F2 above 2200 Hz. But in the other case, we'd say this was a relatively high,
front vowel moving higher (I guess) and much much fronter, something much more like
classical [eɪ]. One or the other. One is right, the
other's a good guess.
Lower-Case T
[t], IPA 103
So with the exception of that one pulsey thing before 1450, the gap here
seems to start at about 1350 msec and go on for almost 100 msec. The transitions into it
look sort of pinchy (but very front velar, if you follow) and the burst
is slightly doubled. All of which just screams [k]. But then we wouldn't get
this spectrogram to say anything. So going on the high tilt of the burst, and the
phonotactics of the following thing, I'd say this was [t].
Esh
[ʃ], IPA 134
So here you see how much stronger the F2 pole is. And the energy below is weaker.
So this looks like an [ʃ]. This is also more consistent
with the F2/F3(/F4?) poles, which are more typical of postalveolars than alveolars.
There's just more room to couple and a longer front cavity to play in. That is, for
acoustic coupling to take place and to resonate in, respectively. Shame on you for
thinking what you were thinking!
Lower-Case I
[i], IPA 301
Well, there's a couple of odd amplitude discontinuities, but they're not really
radical, considering the length and overall energy in this vowel. So I'm thinking
it all has to do with pitch change, and therefore striation spacing and harmonic
structure. So from 1575 to 1925 msec, I'm thinking this is really all one vowel.
And since the F2 reaches 2200 Hz (i.e. 'absurdly high for anything except [i],
and then still very, very high'), I'd say this was [i]. If you were determined to
put vowels on either side, what would you do with the middle?
Lower-Case Z + Under-Ring
[z̥], IPA 133 + 402
Well, this is a lesson, so here goes. This looks like an [s] again, but it's
very weak. There's no hint of voicing, but it's weak, and it's shorter than
even the fricative in the affricate, even though it's final in utterance.
So there's something odd about it. It's not post-alveolar, because even though
it loses energy below F2, you'd still expect the F2 pole in the fricative to
be a little stronger than above it, and this is flat. The noise gets
a little better organized off the top of the spectrogram. All this points to [s].
So how do we account for the weakness? Well, voiced fricatives are almost always
weaker and shorter than their voiceless counterparts, just because the act of
voicing impedes airflow and therefore pressure build up. But this isn't voiced.
So I'll suggest it's passively devoiced. That is, rather than devoicing
by abducting the vocal folds (as with underlyingly voiceless sounds), the vocal
folds remain adducted here. But because we're at the end of an utterance, we (I)
don't have a lot of subglottal pressure to work with, and the result is the vocal
folds don't vibrate. And there you have it, devoiced [z]. As distinct from [s].