Solution to Last Month's Mystery Spectrogram

Solution for March 2003

labeled spectrogram
"Some find shiny things here."

This month's high-pedagogy mode spectrogram is heavy with fricatives (hence the extended frequency scale) and nasals. So pay attention. Things aren't going to stay this "easy" for long...

[s], IPA 132
Lower-case S
So from about 75 msec to 200 msec, there's a voiceless fricative. Note the absence of any voicing striations at the bottom, and the 'snowy' 'random' noise. The noise seems to be composed of a single wide, wide band, with no (or at least very little) formant-like 'shaping'. The band seems to be centered up above the top of this spectrogram, which goes up to about 8000 Hz. If we see this as a single huge band, centered up there somewhere, then what we're seeing is the bottom half of the bell curve--the greatest energy near the putative center, and falling away fairly quickly, but with a long tail, all the way down to the very low frequencies. Noise that loud (at its center) is characteristic of sibilants. So compare the energy here (and also in the segment from 750-900 msec, and even 1500-1600 msec) with the noise in the 400-500 msec segment, and the one around 1200 msec. That's sibilance, folks: very high frequency, and very high amplitude, noise. Noise centered up above 8000 Hz like this is characteristic of alveolars, so this is [s].
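If you'd rather put numbers on 'tilted toward the high frequencies' and 'loud', here is a minimal sketch in Python. It assumes you already have one spectrum slice as paired lists of frequencies and magnitudes. The two slices below are made up for illustration, not measured from this spectrogram, and the function names are mine.

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Amplitude-weighted mean frequency of one spectrum slice."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("spectrum slice has no energy")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


def mean_level(magnitudes):
    """Average magnitude across the slice, a crude stand-in for loudness."""
    return sum(magnitudes) / len(magnitudes)


# Hypothetical slices sampled every 1000 Hz from 1000 to 8000 Hz.
freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
s_like = [1, 2, 4, 8, 16, 24, 32, 40]  # energy keeps rising toward the top
f_like = [3, 3, 3, 3, 3, 3, 3, 3]      # weak and flat across frequencies

s_centroid = spectral_centroid(freqs, s_like)
f_centroid = spectral_centroid(freqs, f_like)
```

The sibilant-like slice comes out with both a higher centroid and a higher mean level than the flat one, which is the same comparison the eye makes between the [s] and the weaker fricatives.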

[ɐ], IPA 324
Turned A
From 200 to not quite 300 msec, we've got voicing (note the 'voicing bar' between 100-200 Hz or so, consistent with my probable F0/fundamental/first harmonic). The F1 is up around 800 Hz, the F2 just above it around 1200 Hz, and the F3, hmm, I guess around 3700 Hz. But it's kind of fuzzy. Luckily for vowels we don't usually consider F3.... So an F1 that high must be a relatively low vowel. An F2 that low should be back (or round), and I recall this was backer than I usually produce this vowel. Probably should have transcribed it as back, but I was in a hurry. This one is closer to my (now) preferred phonemic transcription for this vowel, which you won't understand until you've found this word. So it's down there and not at all front. That's the main thing.

[m], IPA 114
Lower-case M
Then at 300 msec the amplitude suddenly dies off. Where F1 was, suddenly there's nothing, the thing that looks like it used to be the F2 is now weaker, and all the way up there's just less energy. Typical of nasals (if you've been through any acoustic phonetics, you know that side cavities suck energy out of spectra--as antiresonances--rather than adding energy to them). So this almost has to be a nasal. Note the transitions in the previous vowel. F4 falls sharply; F3 isn't really doing much, or if it is, it's interrupted by the zero. F2 starts out low and if anything falls. F1 always falls into closure, so that's not really indicative of anything. So we have mostly falling formants, usually correlated with bilabiality. And if you know my voice, you know my nasal pole (the formant above the lowest zero) is usually around 1000 Hz for a bilabial, where it's closer to 1300-1500 Hz for an alveolar. So everything points to bilabial, or at least away from anything else.
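The pole rule of thumb can be written down as a tiny lookup. This is only a sketch: the 1000 Hz and 1300-1500 Hz figures are the ones quoted above for my voice, but the 100 Hz tolerance and the function name are assumptions made up for the example, and a weak or missing pole just returns None so you fall back on the transitions.

```python
def nasal_place_from_pole(pole_hz, tolerance_hz=100):
    """Guess nasal place from the pole above the lowest zero.

    Returns None when there is no usable pole or it matches neither
    range; in that case other cues (transitions) have to decide.
    """
    if pole_hz is None:
        return None
    if abs(pole_hz - 1000) <= tolerance_hz:
        return "bilabial"
    if 1300 - tolerance_hz <= pole_hz <= 1500 + tolerance_hz:
        return "alveolar"
    return None


print(nasal_place_from_pole(1000))  # bilabial
print(nasal_place_from_pole(1500))  # alveolar
print(nasal_place_from_pole(2500))  # None
```

Notice the function never says 'velar'. Nothing in the pole frequencies alone gets you there, so that call has to come from the transitions.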

[f], IPA 128
Lower-case F
Another fricative. This one is much, much weaker overall than the sibilant (it's hard to tell that from the 'below the dotted line' frequencies, unless you have a lot of experience with this sort of thing, but then that's why I've provided 'above-the-dotted-line' frequencies in this spectrogram). If you look, it's not obvious it's stronger anywhere up above 8000 Hz. While the noise in the [s] was distributed in sort of a curve, this noise is sort of flat across most frequencies. There's some shaping into formants in F4, I guess, but for the most part, this looks like a non-sibilant fricative that doesn't have any formant 'shaping' to it. That suggests it's produced in front of any useful resonating cavities, so if this is English, it's probably labiodental or (inter)dental. The F4 seems to be falling, which is sort of unaccountable. F3 might be flat. But notice that the F2 starts, if anything, below 1000 Hz and rises. Now it rises sharply throughout the following vowel, but there's nothing like anything but a labial transition into the following vowel. So this might be a labial. I suppose it could be something else, but labial is probably the best guess for now.

[ɑɪ], IPA 305 + 319
Script A + Small Capital I
So, abstracting away from the first 25-30 msec following the 500 msec mark (which I take to be mostly transitional), we've got an F1 that reaches its steady state (if you believe in steady states, or its maximum if you don't) at about 800-900 Hz. So it starts very low, but starts to transition towards the higher space (lower F1 frequency--try to keep 'vowel height' and 'formant frequency' straight in your heads at all times in these discussions) in the last half of the vowel (from before 600 msec to the vowel end at about 650 msec). When the F1 is 'steady' at its maximum, F2 is transitioning, but is still quite low, so at the beginning of this vowel, it's pretty low and quite back. The F2 rises up to almost 1900 Hz and then suddenly transitions sharply down (to about 1750 Hz) in the last 25-30 msec of the vowel, which again I take to be transitional. So where the F2 reaches its maximum, near 650 msec, the F1 is around 500 Hz, so sort of mid. Note the F1 is getting fuzzy in the second half of this vowel. This will be important later. So this is a diphthong. The nucleus is low and back, and the offglide is toward the high front (as opposed to the higher-back) space. The transcription reflects the 'reality' of the nucleus, but only the 'direction' of the offglide, which is sort of combining transcription conventions. So I'm explaining it here.
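To keep 'vowel height' and 'formant frequency' straight, it can help to see the inverse relationship spelled out. The sketch below maps an (F1, F2) pair to rough height and backness labels. The cut-off frequencies are illustrative assumptions, not values from any standard or from my own vowel space, and the nucleus F2 of 1100 Hz is a stand-in for 'still quite low'.

```python
def describe_vowel(f1_hz, f2_hz):
    """Map formant frequencies to rough (height, backness) labels.

    High F1 means a LOW vowel; high F2 means a FRONT vowel.
    All thresholds are hypothetical round numbers.
    """
    if f1_hz < 400:
        height = "high"
    elif f1_hz < 650:
        height = "mid"
    else:
        height = "low"

    if f2_hz < 1400:
        backness = "back"
    elif f2_hz < 1700:
        backness = "central"
    else:
        backness = "front"

    return height, backness


# Diphthong endpoints, loosely based on the figures in the text.
nucleus = describe_vowel(850, 1100)   # F1 at its maximum, F2 still low
offglide = describe_vowel(500, 1900)  # F2 at its maximum near 650 msec

print(nucleus)   # ('low', 'back')
print(offglide)  # ('mid', 'front')
```

The offglide only gets as far as mid front, which is why the transcription records the direction of the glide rather than where it actually ends up.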

[n^d], IPA 116 + 104
Lower-case N + Right Superscript D
Again, making up symbols on the fly will require explanation. But work with me. We've got another nasal here. See the sudden drop of amplitude? See the zero? See how there's no energy right at 1000 Hz, but some up at 1500 Hz or so? Must be alveolar. This is consistent with the transitional information. Even though the F2 is heading down, it's pointed at 1750 Hz or so, which is generally the 'locus' (if you believe in loci) of alveolar transitions. Somewhere in there. If this were a velar, there'd be more evidence of 'velar pinch' in the approaching transitions, and a bilabial would have a sharper fall (one presumes) in F2, and something like falling transitions in the upper formants. So everything points to an alveolar nasal. The fuzziness in F1 in the previous vowel I noted before is a sign of nasalization on the vowel. In spectrograms, I rarely mark contextual nasalization of vowels, unless a) it's really, really obvious (with creeping zeroes and whatnot) and/or b) the following nasal stop isn't obvious. This one is, so, English phonology being what it is, the vowel must nasalize. Compare my decision to mark rhoticity later. The 'right superscript d' diacritic is my ad hoc way of marking oral plosion. There's no real 'oral stop' phase to this, unless you count the last 10-15 msec or something right before the onset of the noise. If it ain't long enough to segment out, I'm not wasting a lot of time trying to. So I've just marked an oral release rather than a separate segment. Take that for whatever it's worth, which I don't expect is much. Anyway, this is how homorganic nasal-stop coda clusters seem to look in my voice.

[ʃ], IPA 134
Esh
Now this is obviously another fricative. But while the initial [s] in this spectrogram was 'tilted' toward the high frequencies, this one is much flatter. Still very broad band, and very high amplitude, so we're still talking sibilant. Which pretty much just leaves [ʃ], but let's suppose we didn't have an [s] to compare to. We could still identify this, a) because it isn't tilted toward the high frequencies, b) it's much stronger in the F2/F3 region than a typical [s], and c) just below the F2 region, the amplitude suddenly drops off really sharply. All of which point to [ʃ]. The noise at the bottom is pretty noisy. It's not striated into (fairly) clean vertical pulses. So this is voiceless.
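The tilt cue for telling the two sibilants apart can be roughed out as a comparison of two frequency bands. This is a sketch under loud assumptions: the band edges (1500-4000 Hz for the F2/F3 region, 4000 Hz and up for the top), the factor of 2, and both example slices are invented for illustration. It also assumes you already know the noise is loud enough to be sibilant.

```python
def band_mean(freqs_hz, magnitudes, lo_hz, hi_hz):
    """Mean magnitude of the bins with lo_hz <= frequency < hi_hz."""
    vals = [m for f, m in zip(freqs_hz, magnitudes) if lo_hz <= f < hi_hz]
    if not vals:
        raise ValueError("no bins fall inside the requested band")
    return sum(vals) / len(vals)


def guess_sibilant(freqs_hz, magnitudes, tilt_factor=2.0):
    """Return '[s]' if the top band dominates the F2/F3 region, else '[ʃ]'.

    Band edges and tilt_factor are hypothetical choices.
    """
    mid = band_mean(freqs_hz, magnitudes, 1500, 4000)
    high = band_mean(freqs_hz, magnitudes, 4000, 8001)
    return "[s]" if high > tilt_factor * mid else "[ʃ]"


# Hypothetical slices sampled every 1000 Hz from 1000 to 8000 Hz.
freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
s_like = [1, 2, 4, 8, 16, 24, 32, 40]          # tilted toward the top
esh_like = [1, 20, 22, 24, 24, 22, 20, 18]     # flat, strong from F2 up

print(guess_sibilant(freqs, s_like))    # [s]
print(guess_sibilant(freqs, esh_like))  # [ʃ]
```

The esh-like slice also shows the third cue: the level collapses between the 2000 Hz bin and the one below it.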

[ɑɪ], IPA 305 + 319
Script A + Small Capital I
Well, it's back to formants for this stretch between 900-1000 msec. F1 is kind of fuzzy, but it seems to be centered between 750 and 800 Hz. F2 is also kind of fuzzy, but it seems to start around 1300 Hz. At about 950 msec, the fuzziness starts to leave off, slightly, and the F1 may be dropping slightly. F2 is clearly rising. So we've got a vowel that moves from fairly low and more back than central to something slightly higher and definitely front. I really only have two fronting diphthongs (under normal circumstances) and only one starts anything like low.

[n], IPA 116
Lower-case N
So here's our next nasal. It's really short, just about 50 msec, but oh well. See the zero? See what would have been the F1 die? Now, where's the pole? Well, it's not at 1000 Hz. It's not at 1500 Hz. There's definitely something at 2500 Hz or so, but that's not what we're looking for. So our pole is weak, and we need other cues. So let's look at the transitions. The F2 transition in particular seems to point down into the closure, but to somewhere around 1700-1800 Hz. Alveolar locus. So that's our best cue. And the shortness is sort of consistent with that--it's verging on flapping. TMSAISTI.

[i], IPA 301
Lower-case I
F1 is fairly low, certainly lower than we've seen anywhere else, so we're dealing with a fairly high vowel. F2 is way the heck up above 2200 Hz, which is about as high as I've ever seen my F2. Which tells us this is massively front. So we've got a high, outrageously front vowel.

[θ], IPA 130
Theta
Another fricative, probably voiceless, from just before 1200 msec and lasting about 100 msec. The noise in the lower (normally visible) frequencies is very light. The noise that we can see is tilted to the high frequencies, but even up there it's not loud enough to be sibilant. So this ain't sibilant. It isn't shaped like a vowel (i.e. with noise running through the vocal tract and filtered by resonances) so it ain't [h]. So that leaves the labiodental and interdental again. So now it's time to compare the transitions with the previous non-sibilant. The F2 in the preceding vowel is too high to do anything but fall, but the F2 coming out of the fricative is just flat. And it's even near the alveolar locus. Now look at that low F2 coming out (but rising) of the [f]. It definitely starts lower than it 'needs' to. This one has room to start lower if it wanted to, but it doesn't. Probably interdental.

[ɪ], IPA 319
Small Capital I
Now here's a vowel. F1 is hard to read, but it's definitely not very high. And it's not as low as the previous vowel. So we're talking high, but not highest. F2 is quite high, but under 2000 Hz, so it's very front, but not nearly as front as the previous vowel. So if that was [i], we need to find something not quite as high and not quite as front. Could be [e], but, well, it's not.

[ŋ], IPA 119
Eng
So we've got another nasal. This one is quite long, and definitely doesn't have a pole at 1000 or 1500 Hz. It also has a pole above 2000 Hz, like the last one, but it's looking like they all do. So all we have going for us is that pinchy transition into it. Look at that F2 and F3 come together! Doesn't get any pinchier than that.
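'Pinchy' can be checked mechanically if you have formant tracks for the last few frames before closure. This is a minimal sketch, not a standard test: the 300 Hz final-gap limit, the function name, and both example tracks are assumptions made up for illustration.

```python
def shows_velar_pinch(f2_track_hz, f3_track_hz, max_final_gap_hz=300):
    """True if the F3-F2 gap narrows frame by frame and ends up small."""
    gaps = [f3 - f2 for f2, f3 in zip(f2_track_hz, f3_track_hz)]
    if len(gaps) < 2:
        raise ValueError("need at least two frames to see a trend")
    narrowing = all(later < earlier for earlier, later in zip(gaps, gaps[1:]))
    return narrowing and gaps[-1] <= max_final_gap_hz


# Hypothetical tracks over the last three frames before closure.
pinchy_f2 = [1900, 2050, 2200]   # F2 rising
pinchy_f3 = [2700, 2600, 2450]   # F3 falling to meet it
flat_f2 = [1900, 1850, 1800]     # F2 drifting toward an alveolar-like locus
flat_f3 = [2700, 2700, 2700]     # F3 not moving

print(shows_velar_pinch(pinchy_f2, pinchy_f3))  # True
print(shows_velar_pinch(flat_f2, flat_f3))      # False
```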

[z], IPA 133
Lower-case Z
But then there's a discontinuity. An alveolar-looking pole kind of comes in just before 1500 msec. The voicing also kind of dies away for a bit, but doesn't go away completely. It's atypical, but the nasal resonance changes around this moment too, which suggests something odd is happening with the coordination of my soft palate raising and the alveolar closure. Which isn't technically a closure since we're dealing with a fricative. I've tried slowing down my [ŋz] sequences and I do get some nasality over the frication. As improbable as that sounds. Anyway, it's good that I increased the visible frequencies for this spectrogram, so you can see what's really going on with the noise. The noise is quite high amplitude, but mostly at the highest frequencies. That is, it's obviously sibilant, tilted to the high frequencies, but less loud than the initial [s], so the noise dies off in the 'normally visible' frequencies much faster than for the voiceless [s].

[h], IPA 146
Lower-case H
The voicing never quite dies off, but since there's no evidence of striated formant stuff, I decided not to bother transcribing this as voiced. This in spite of the apparent resonances that creep in really early. It's nice though, because you can see the transitions into the high, front tongue position, but excited by noise rather than voicing.

[i ˞], IPA 301 + 419
Lower-case I + Rhoticity Sign
I don't usually mark rhoticity on vowels, since Keating et al. (1994) suggested it was entirely redundant. But I needed something to indicate/account for the movement in the fully voiced section of the vowel. So actually the clear high, front position is hit really early, during the noise, and by the time full voicing and resonance kicks on (where I've marked the 'beginning' of the vowel) the F3 is already transitioning to its final position (more below) dragging the F2 along with it. So even though it's moving, it was still clearly an [i] target. In the center of the vowel the F2 is closer to [ɪ]. But this is a rhotacized [i], not an allophonic [ɪ] selected before an [ɹ]. TMSAISTI.

[ɹ], IPA 151
Turned R
So following the transitions, that blob around 1750 Hz is both F2 and F3. An F3 that low can only be a [ɹ]. Nuff said?
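For what it's worth, the 'F3 that low' test is about the simplest one there is to write down. In this sketch the 2000 Hz ceiling is an assumed round number, not a measured boundary for my voice or anyone else's. The 1750 Hz and 3700 Hz inputs are the F3 figures mentioned in this write-up.

```python
def looks_rhotic(f3_hz, f3_ceiling_hz=2000):
    """True if F3 sits below the (hypothetical) ceiling for a rhotic."""
    return f3_hz < f3_ceiling_hz


print(looks_rhotic(1750))  # True: the F2/F3 blob at the end
print(looks_rhotic(3700))  # False: the F3 guessed for the first vowel
```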

So remember how these fricatives really look, so that when you can only see them up to 4000 Hz you can hypothesize how they are supposed to look.