Lower-Case H
[h], IPA 146
So starting not quite 100 msec in, and going on until 225 msec or so,
there's something voiceless: no striations in the very low frequencies, in
the range of the fundamental or first harmonic, which in my voice could be
anywhere from 90 Hz to 130 or 140. So it's voiceless. There's lots of
energy up above, but it's aperiodic, or noisy. If you notice the formants
of the following vowel, there's a little more noise in those same
frequencies, which is typical of [h]. The
noise, being produced in the laryngopharynx, bounces around the vocal
tract the same way periodic energy does, and thus gains energy in the
frequencies of the vocal tract resonances and loses it in between. What's
interesting is the high F3: there's no hint of rhoticity in the noise,
until about 200 msec, when it starts to come down in frequency. We can
see that transition continue once the voicing kicks in, but then we're
well into the next segment.
Script A + Rhoticity Sign
[ɑ˞], IPA 305 + 419
So ignoring the voicing bar, which you can see is a very narrow band down
there around 150 Hz or so, the first resonance is quite high. Depending
on how you measure these things, I'm thinking it's that upper band around
850 Hz or so. If you look though, there's another, slightly fainter band
just below, closer to 600 Hz. I'm thinking that's just an
idiosyncratically strong harmonic (there's something about there all
through the spectrogram, regardless of where the F1 is). (Well, use your
imagination.) In a perfect world, we might locate the 'center' of that
formant in that slightly lower-energy space in between what I'm calling
F1 and what I'm calling that weird harmonic, since the combined width of
those two things is only a little wider than the formants above it. So I
don't know. But F1 is definitely high here, so this is a low vowel. F2
starts about 1200 Hz or a bit below, but it rises a little into the
following segment. Now look at that F3. This is the best argument for
segments (or at least sub-syllabic constituents) I've seen in a while.
The F3 in the [h] is up around 2500 Hz, dead
neutral. It comes down in the last part of the fricative and through the
'clear' part of the vowel until it approaches its low steady state in the
following segment. But if you believe a) [h]
does not have oral features/targets of its own, and b) "rhoticity"
(lowering of F3) is a feature realized on vowels before approximant /r/,
why doesn't the F3 start low in the fricative? Or at least
lower, if you believe that the F3 of the vowel is categorically affected
by rhoticity. Which it obviously is not. But here you can see the
rhoticity is a) not phonological, and b) constrained in the phonetic
grammar to the coda /r/, and is allowed to creep into (but not
take over) the F3 of the vowel. But not really at all into the fricative.
But there's nothing in the fricative to prevent it from doing so. Except
obviously there is. So there must be something 'there'.
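For what it's worth, the 'dead neutral' F3 of about 2500 Hz is what you'd expect from a plain uniform tube closed at one end. Here's a minimal sketch of that arithmetic, assuming textbook values that aren't measured from this spectrogram (a 17.5 cm vocal tract and a speed of sound of 35,000 cm/s); the function name is just made up for illustration.

```python
def neutral_formants(tract_length_cm=17.5, speed_of_sound_cm_s=35000.0, count=3):
    """Resonances of a uniform tube closed at one end (quarter-wavelength model).

    Both default values are textbook assumptions, not measurements from
    this spectrogram.
    """
    return [(2 * n - 1) * speed_of_sound_cm_s / (4 * tract_length_cm)
            for n in range(1, count + 1)]


print(neutral_formants())  # [500.0, 1500.0, 2500.0]
```

So an F3 sitting up around 2500 Hz is about where a tract with no particular constriction would put it, and anything well below that needs an explanation.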
Turned R
[ɹ], IPA 151
So I guess I've given it away, this is an approximant /r/ (properly, IPA
[ɹ]) of the North American variety. The F1
is still where it was for the vowel, the F2 (oddly enough) is raised to
approximate the low F3, and the F3 is very low, almost 800 or
900 Hz lower than it was in the beginning of the [h] (where you can see
it returns eventually). Typical of [ɹ].
Lower-Case M
[m], IPA 114
Then the amplitude falls off around 350 msec. The F2 transition in the
/r/ is diving at that moment, which suggests labial transitions. The
overall energy from 350 to 425 msec (or so) is lower than either of the
surrounding vowels, so this is relatively consonantal. And its edges are
sharp, if you see what I mean, suggesting some acoustic change that sucks
energy out of the source suddenly turns on, and then off. So this is a
typical nasal: the aforesaid sucking occurring as the nasal cavity is
opened and the oral cavity is closed, and then stopping when the
velopharyngeal port is closed and the oral closure released. There's a
nice pole around 400 Hz, which is just to be expected, but the first
'real' pole/formant in the nasal is around 1000 Hz. You can see the pole
above that (continuous with the F3 of the /r/) is rising. The frequency
of that middle pole, the one around 1000 Hz, is a good cue to this being
bilabial: if the oral closure were further back, this would be higher in
frequency. (Go back to acoustic phonetics and read about 'side cavities'
if you're not sure why.) So that's two solid cues to this being
[m], and none particularly pointing anywhere
else.
Barred I
[ɨ], IPA 317
So from 425 to about 475 msec, there's a vowel. The F1 is sort of low,
unless you believe it's still high, but it's not particularly distinct
either way. The F2 is in constant motion, almost as if it had nowhere in
particular to go. The F3 is still transitioning, so it's not helping
either. Also the F4 if it comes to that, but since we almost never look
at F4, we won't belabo(u)r the point. So we've got a short vowel of
indistinct structure that never really develops a strong identity of its
own. So call it reduced, transcribe accordingly, and move on.
Lower-Case N
[n], IPA 116
So here we have another one of these. Note its similarity, in terms of
its amplitude and edges, to the previous nasal. There's a pole I don't
think I've ever seen before at about 850 Hz, so I'm going to ignore
it.... The main pole is up around 1400 or not quite 1500 Hz. Note how
much higher it is than the 1000 Hz or so pole in the [m]. So there we go.
This one isn't bilabial, so we're stuck with alveolar or velar. There's no
hint of velar pinch in the transitions into or out of this nasal, and the
transition-end frequencies (around 1700 Hz) are consistent with the locus
of alveolar transitions.
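The pole comparison between the two nasals can be sketched as a rough rule of thumb. This is only an illustration: the 1200 Hz cutoff is my own assumption (it just sits between the two poles read off above), and the function name is made up.

```python
def nasal_place_from_pole(pole_hz, bilabial_cutoff_hz=1200.0):
    """Rough place guess from the main (middle) nasal pole.

    The 1200 Hz cutoff is an assumption: it simply sits between the
    ~1000 Hz pole of the [m] and the ~1400-1500 Hz pole of this nasal.
    """
    if pole_hz < bilabial_cutoff_hz:
        return "bilabial"
    # A higher pole means the oral closure is further back; the pole alone
    # is not used here to separate alveolar from velar.
    return "alveolar or velar"


print(nasal_place_from_pole(1000))  # bilabial
print(nasal_place_from_pole(1450))  # alveolar or velar
```

The pole only gets you as far as "not bilabial"; it's the transitions (no velar pinch, ends near the alveolar locus) that settle the rest.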
Lower-Case I
[i], IPA 301
So the F1 is still rather low. Note the voicing bar in the first
syllable. There's a strongish harmonic just below 500 Hz but the main
body of the resonance is clearly between the voicing bar and that
harmonic. So this is an exceptionally low F1. So this is an exceptionally
high vowel. The F2, once it straightens out, is exceptionally high, up
around 2100 or 2200 Hz. So this vowel is exceptionally front. And the
highest, frontest vowel you can think of? Right!
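The reasoning here (low F1 means high vowel, high F2 means front vowel) can be sketched as a toy classifier. The cutoffs below are illustrative assumptions for a voice like mine, not established category boundaries, and the 300 Hz F1 in the first example is a guess at "between the voicing bar and that harmonic".

```python
def describe_vowel(f1_hz, f2_hz, f1_cutoffs=(400.0, 700.0), f2_cutoffs=(1300.0, 1900.0)):
    """Map F1 to height (inversely) and F2 to frontness.

    The cutoff values are illustrative assumptions for a voice like the
    one in this spectrogram, not established category boundaries.
    """
    if f1_hz < f1_cutoffs[0]:
        height = "high"
    elif f1_hz < f1_cutoffs[1]:
        height = "mid"
    else:
        height = "low"

    if f2_hz < f2_cutoffs[0]:
        backness = "back"
    elif f2_hz < f2_cutoffs[1]:
        backness = "central"
    else:
        backness = "front"
    return height, backness


print(describe_vowel(300, 2150))  # ('high', 'front')
print(describe_vowel(850, 1200))  # ('low', 'back')
```

The second call uses the F1 and F2 read off the rhotic vowel earlier (about 850 Hz and 1200 Hz), which come out low and back, as you'd hope.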
Barred I
[ɨ], IPA 317
Well, another section of vowel that's mostly F2 transition. If you dismissed
it as just transition, you have to explain why this vowel is so long when
its pitch is clearly quite low (see how far apart the striations are
compared to most of the preceding vowels; each of those striations is a
glottal pulse). So I think this is actually two different
vowels/syllables. In fact, two different words. I worked hard at not
putting a glottal stop in this one, so I hope you appreciate the
duplicity involved.
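Since each striation is a glottal pulse, the spacing between striations is the pitch period, and the pitch is just its reciprocal. A minimal sketch, with a made-up function name and hypothetical pulse times (I haven't measured these off the spectrogram):

```python
def f0_from_pulses(pulse_times_ms):
    """Estimate F0 in Hz from the times of successive glottal pulses.

    Each striation is one glottal pulse, so the mean spacing between
    striations is the pitch period.
    """
    if len(pulse_times_ms) < 2:
        raise ValueError("need at least two pulses")
    span_ms = pulse_times_ms[-1] - pulse_times_ms[0]
    mean_period_ms = span_ms / (len(pulse_times_ms) - 1)
    return 1000.0 / mean_period_ms


# Hypothetical pulse times, 10 ms apart (not measured from the spectrogram).
print(f0_from_pulses([600, 610, 620, 630]))  # 100.0
```

Wider spacing means a longer period and so a lower pitch, which is all "see how far apart the striations are" amounts to.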
Lower-Case Z
[z], IPA 133
So the striations continue, albeit in weaker form, all through the
following amplitude dip (from about 700 msec to 750 msec or so?). So
whatever it is, it's a consonant and it's voiced. But up above the
voicing bar, there's no evidence of periodicity, so no resonance to speak
of. So there must be a very tight constriction somewhere. And it's noisy,
so it's a close constriction, but not a closure. So we're talking about a
fricative. Voiced, but very noisy. The noise is not particularly
organized into bands. In fact, it's one broad band. It's a trifle weaker
in the lower frequencies than the higher frequencies (note the relative
lightness of the noise just around and below 1000 Hz compared to anywhere
above), so this looks like it's tilted to the high frequencies. Very high
frequencies, without any tilt toward the F2 or F3 region. So there you
go. [s]-shaped noise, but voiced.
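The 'tilt' judgment can be sketched as a comparison of average level above and below some cutoff. This is a rough illustration, not how I actually eyeballed it: the function name and the example levels are made up, and the 1000 Hz cutoff is borrowed from the "around and below 1000 Hz" remark.

```python
def spectral_tilt_db(spectrum, cutoff_hz=1000.0):
    """Mean level above the cutoff minus mean level at or below it, in dB.

    `spectrum` is a list of (frequency_hz, level_db) pairs. A clearly
    positive result means the noise is tilted toward the high frequencies.
    """
    low = [db for hz, db in spectrum if hz <= cutoff_hz]
    high = [db for hz, db in spectrum if hz > cutoff_hz]
    if not low or not high:
        raise ValueError("need levels on both sides of the cutoff")
    return sum(high) / len(high) - sum(low) / len(low)


# Made-up levels for illustration only, not measurements.
sibilant_like = [(500, 20.0), (1000, 22.0), (3000, 35.0), (6000, 41.0)]
flat_noise = [(500, 30.0), (1000, 30.0), (3000, 30.0), (6000, 30.0)]
print(spectral_tilt_db(sibilant_like))  # 17.0
print(spectral_tilt_db(flat_noise))     # 0.0
```

A big positive number is the [s]-shaped pattern; something near zero is flat, unfiltered-looking noise.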
Barred I
[ɨ], IPA 317
And another short little vowel, overlapped in the high frequencies with a
bit of the noise from the fricative. Or maybe the noise is coming from
the upcoming closure. Or both. Hmm. So this is amazingly reduced.
Lower-Case T
[t], IPA 103
Nice sharp gap so obviously we're dealing with some kind of plosive.
There's not a lot going on in terms of transitions suggesting anything in
particular. On the other hand, if you look at the release noise burst,
it's very sharp, broad band, and evidently [s]-shaped. Although this may
be in part a product of the following frication. But whatever. Believe
it's alveolar, or at least coronal, or remain agnostic. When it comes to
parsing the upcoming fricative your choices will be limited.
Esh
[ʃ], IPA 134
So here we go. We've got some very loud friction here. No voicing bar,
but with that much noise, you wouldn't really expect any voicing. The
frication is very loud, but you'll notice it isn't one very broad band,
but has some formant-like shaping to it. It's loudest not off the top of
the spectrogram (i.e. between 4-6-8-12 kHz), but seems loudest in the
F2-F3-F4 bands. And the F2 band is pretty noisy, while below it the
energy drops off sharply. That's pretty typical of post-alveolar
[ʃ].
Lower-Case I
[i], IPA 301
So it's tough to tell where F2 is. You have to surmise from that falling
transition afterwards that it's really, really high, around 2200 Hz or
so. It's almost merged with the F3, but that's not supposed to happen, so
the combined band is still wider than you'd expect a single band to be,
but at this bandwidth there's no telling where the separation is. So the
edges of the filter overlap slightly. Get over it. So that's the F2,
where's the F1? Low low low, I say. We could argue about that, but
TMSAISTI.
Lower-Case V
[v], IPA 129
Another voiced fricative here, from 1075 to 1125 msec or thereabout. Nice
striations at the bottom, but no periodicity to speak of above. This is a
very loud fricative; it has about the same energy as the previous [z]. But
spectrally, this looks different. It doesn't have any tilt to it at all.
It just looks white, in the sense of having equal energy at all
frequencies. Sort of unfiltered. Well, probably this is louder than it
should be; I may have been spitting into the microphone or something. The
unfiltered-ness is a huge clue though. In order to be unfiltered, your
source has to be uncoupled from the resonators of the vocal tract. Which
means it has to have a tight closure, and no vocal-tract-tubey-volumes in
front for the energy to bounce around. So this has to be at the teeth or
lips. Given that this is English, the lips alone (bilabial) are unlikely. It
would be really helpful if the transitions on either side looked more
labial, but they don't. Which might make us think coronal, just by
default. But then we'd be wrong. So let's just keep both [v] and [ð] in
mind until we can make a word out of it.
Schwa
[ə], IPA 322
Very short, indeterminate vowel. Moving on.
Lower-Case B
[b], IPA 102
Another gap, this one rather long, although since we're approaching the
end of the utterance that might be lengthening of the final syllable.
There's a nice, clean gap in most frequencies, but if you look at the
bottom, there's an awful lot of perseverative voicing. More than you'd
get if there were a nice abduction gesture associated with an underlying
voiceless stop. So this is probably voiced. It's a little annoying that
the transitions are so ambiguous. The F2 in the preceding vowel seems to
be coming down, well below the 1700-1800 Hz alveolar locus we usually
look for with alveolars. So that looks labial. F3? Seems to be
high, if anything. Ya gotta love coproduction messing up all
your cues. So on balance, I'm going to say bilabial. The F2 isn't
even close to alveolar or velar looking. The F3 is ambiguous, but I'll
attribute it to coproduction with ...
Tilde L (Dark L) + Syllabicity Mark
[ɫ̩], IPA 209 + 431
... the raised F3 of this segment, which is lateral. You can tell because
of the raised F3. /r/s have greatly lowered F3s, /l/s tend to have
slightly raised F3s, and/or sometimes F4s. With an F2 below 1000 Hz, this
can only be described as back (or round), so it's dark as well. If you
believe those first few pulses with energy in F3 and F4 and above are
evidence of a separate vowel before lateral-contact, you're welcome to
insert a schwa or something. But I tried to be careful and release the
/b/ into the lateral. There are advantages to doing these things with
your own voice....