Solution to Last Month's Mystery Spectrogram

Solution for January 2006

"Laughter can soothe and heal."

Tilde L (Dark L)
[ɫ], IPA 209
Voicing begins, without an obvious release or anything, at about 50 msec. At about 125 msec there's a sudden change in both the overall amount of energy and the frequencies of the formants. So I'd say we have a segment from 50 to 125 msec or so. Could be a nasal, but there's a little too much energy in the upper formants (and no clear zero in the space between F1 and F2, and for that matter, too clear and strong an F1). So we're probably looking at an approximant. And probably not a glide-like approximant, owing to the discontinuity with the following vowel. So this is probably /r/ or /l/. The F1 is below 500 Hz. The F2 is at about 1250 Hz. And F3 is up around 3750 Hz at least. That's a raised F3, typical of laterals. The F3 of an initial /r/ would presumably be down around 1500 or 1600 Hz. Note the F2, well below neutral, indicating velarization. This might be 'lighter' than a coda /l/, but there's no way you can interpret this as anything except velarized.
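For the code-minded, the /r/-versus-/l/ reasoning can be sketched as a rough decision rule. This is only an illustration, not a measurement procedure: the function name `classify_liquid` is hypothetical, the neutral values (1500 Hz for F2, 2500 Hz for F3) are the ones quoted elsewhere in this write-up, and the 2000 Hz cutoff for a "low" F3 is an assumption made for the example.

```python
# Rough sketch of the liquid reasoning: a lowered F3 suggests /r/,
# a raised F3 suggests /l/, and a low F2 on an /l/ suggests velarization.
# All thresholds are illustrative assumptions, not fitted values.

NEUTRAL_F2_HZ = 1500  # neutral F2 as quoted in this write-up
NEUTRAL_F3_HZ = 2500  # neutral F3 as quoted in this write-up
LOW_F3_CUTOFF_HZ = 2000  # assumed cutoff for an /r/-like F3


def classify_liquid(f2_hz, f3_hz):
    """Return (segment, velarized) for an approximant that is /r/ or /l/."""
    if f3_hz < LOW_F3_CUTOFF_HZ:
        segment = "r"
    elif f3_hz > NEUTRAL_F3_HZ:
        segment = "l"
    else:
        segment = "ambiguous"
    # Velarization only applies to the lateral reading here.
    velarized = segment == "l" and f2_hz < NEUTRAL_F2_HZ
    return segment, velarized


# The values read off this segment: F2 about 1250 Hz, F3 about 3750 Hz.
print(classify_liquid(1250, 3750))  # ('l', True)
```

An F3 that sits between the two cutoffs comes back as "ambiguous", which is about as honest as a two-number rule can be.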

Ash
[æ], IPA 325
Nice clear formants. F1 is very high, let's say 800 Hz or so. The F2 is a hair lower than neutral, let's say 1300 or so. Not quite low enough to be really, really back, but not really what you'd call amazingly front. I'll have to listen to this vowel again, but I'd say this was pretty central(ized), judging from the F2 frequency. Remember how centralized, relative to other dialects, this vowel is in the western US.

Lower-Case F
[f], IPA 128
This is an interesting lesson in acoustics. The periodicity in F3 seems to leave off at about 225 msec. But the voicing doesn't end for almost 50 more msec. So what seems to be happening is that as the constriction increases, the upper-frequency harmonics are getting suppressed. I'm not sure what lip-teeth compression does to radiation, but I'm wondering if there's either an acoustic or an aerodynamic change in spectral slope here. Anyway, there's friction starting about 225 msec or so, and clearly voiceless friction starting 275 msec or so. It's not particularly loud friction, so this isn't sibilant. There's some organization in the resonant frequencies, but not the kind of support you'd get with [h]. So this is probably a voiceless labiodental or interdental. I'm not sure how to tell the difference. The formant transitions aren't really giving us much information. Odds are against the interdental, just because it's a coda of a stressed syllable. I think.

Lower-Case T
[t], IPA 103
Nice little voiceless gap from about 300 msec to the release just after 350 msec. Interestingly enough, there seems to be an alveolar-shaped (that is, broad band, tilted to the very high frequencies) *closure* transient. There's a nice sharp release at about 350-375 msec or so. The release is a little odd, centered in the F3 region, or possibly showing signs of F3/F4 pinch. F3/F4 pinch is sometimes associated with dentality (velar pinch is F2/F3), but I haven't seen it enough to be sure about its value as a cue. But the center frequency is a little low for an alveolar burst, and might be a bit into the velar-burst range. But there's no involvement with F2, which you'd expect with a velar, there's no pinch, and the burst is sharp and fairly clean--not at all mushy or double-y-looking. So this is an alveolar burst. The lowness of the center might have to do with the upcoming low F3 (i.e. a long front cavity?) or liprounding (i.e. a long front cavity?). But I don't know.

Turned R + Syllabicity Mark
[ɹ̩], IPA 151 + 431
There's a vowel here, sandwiched between two consonants. The F1 is fairly low. The F2 is a little high of neutral. The F3 is way freaking low. Barely below 2000 Hz, which is why I never say 'below 2000 Hz', but it is. So this is an approximant /r/ in syllable nucleus/local sonority peak position.

Lower-Case K + Superscript Lower-case H
[kʰ], IPA 109 + 404
If you're a fan of these things, you know this is my voice. And you know my velar stops (and my stops in general) are kind of mushy. So the noise here is distracting. There's some low frequency, but not much. The main centers to the noise are in the pinched F2/F3 range, and up in F4, if that's what that is. So while it does have some formant shaping, it doesn't really look like an [h]. The velar pinch on both sides is a pretty strong cue, and the noise in the F2/F3 pinched range is typical of a velar (fricative?). Note also the double-yness (though quick) of the release, or whatever you want to call it.

Barred I
[ɨ], IPA 317
Tiny short vowel, barely four or five pulses long. We don't waste a lot of time on these. Transcribed as a reduced vowel, following Keating et al (1994), barred-i iff the F2 is closer to the F3 than the F1, schwa elsewise.
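The transcription rule for reduced vowels is simple enough to write down directly. Here is a minimal sketch, assuming you already have formant estimates in Hz; the function name `reduced_vowel_symbol` and the example numbers are hypothetical, and a tie is sent to schwa on the "elsewise" reading of the rule.

```python
# Sketch of the reduced-vowel rule as stated above: barred-i if F2 is
# closer to F3 than to F1, schwa otherwise (ties go to schwa).


def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from its first three formants."""
    distance_to_f3 = abs(f3_hz - f2_hz)
    distance_to_f1 = abs(f2_hz - f1_hz)
    if distance_to_f3 < distance_to_f1:
        return "ɨ"  # barred i
    return "ə"  # schwa


# Hypothetical measurements, not taken from this spectrogram.
print(reduced_vowel_symbol(400, 1900, 2500))
print(reduced_vowel_symbol(500, 1200, 2500))
```

The comparison here is in plain Hz. Whether the rule is better applied on some other frequency scale is not something this sketch takes a position on.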

Lower-Case N
[n], IPA 116
On the other hand, the voicing continues even though at about 600 msec the amplitude takes a sharp dive. This is a nice nasal-y looking thing. Reduced overall amplitude, reduced formant amplitudes, and a nice clean zero between the lower resonances. The F1 is mostly neutral or low of neutral, typical of nasals, and the F2 is nice and high (relative to nasal poles) at about 1500, which in my voice is a very nice, clean alveolar [n]. (Velars show more F2/F3 pinchiness than we see here, and labials always have their pole much lower, around 1000 Hz or even just below.) That "clunk" at about 625 msec (in the F2, and from the F3 all the way up) is a phenomenon known technically in the biz as a "clunk". Clunks can happen any time, but for some reason they often happen in nasals. They're due to something viscous (saliva or some other fluid somewhere in the vocal tract) flying around somewhere at the wrong moment. Distracting in a spectrogram, but so obviously an anomaly (unless it happens where you might be wondering if it's a release transient or something) that they can safely be ignored.
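The place cues for nasals can be sketched the same way. Treat this as a toy under stated assumptions: the 1500 Hz (alveolar) and 1000 Hz (labial) pole targets are the ones given above for my voice, while the function name `nasal_place` and the `pinch_gap_hz` threshold used to stand in for "F2/F3 pinchiness" are invented for the example.

```python
# Toy sketch of nasal place from the pole (F2) frequency.
# Pole targets come from the text; the pinch threshold is an assumption.

ALVEOLAR_POLE_HZ = 1500
LABIAL_POLE_HZ = 1000


def nasal_place(pole_hz, f3_hz=None, pinch_gap_hz=500):
    """Guess the place of a nasal from its F2 pole and, optionally, F3."""
    # A small F2-F3 gap is used here as a crude proxy for velar pinch.
    if f3_hz is not None and (f3_hz - pole_hz) < pinch_gap_hz:
        return "velar"
    # Otherwise choose whichever pole target is nearer.
    if abs(pole_hz - LABIAL_POLE_HZ) < abs(pole_hz - ALVEOLAR_POLE_HZ):
        return "labial"
    return "alveolar"


print(nasal_place(1500))        # pole as read off this segment
print(nasal_place(950))         # hypothetical labial-looking pole
print(nasal_place(2000, 2300))  # hypothetical pinched F2/F3
```

The targets are speaker-specific, so they would need adjusting for any other voice.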

Lower-Case S
[s], IPA 132
So, even if you're a beginner, you should at this point be able to tell that there's something going on from about 750 msec all the way to 900 msec. It's voiceless (no striations at the bottom). It's noisy--the energy is snowy and random, not organized in nice striations. It's very broad band--there's no formanty-organization. It's centered (darkest) in the very high frequencies. So this is very loud (dark), very high pitched (as noise goes), and very long. Sounds like a classic sibilant to me. In fact clearly an [s]. Even though the energy cuts off (sort of) below F2, if this were an esh the noise would be centered lower down, in the F3 F2 range, down to the cut-off frequency (around F2).

Turned M
[ɯ], IPA 316
If you are a beginner, or if you're not familiar with the west coast US vowel system (or Japanese...) this vowel will mystify you. But I'll try to explain. Starting about 900 msec all the way to about 1150 or 1175 msec, there's some very high pitched voicing going on. The F1 (lowest formant) is sort of low, at least lower than neutral (around 500 Hz), so this vowel is higher-than-mid. The F2 starts basically neutral (near 1500 Hz), maybe a little lower (backer) and moves a little lower (backer). F3 is nice and flat in more or less its neutral range (about 2500 Hz) as is the F4. So we've got a highish, central-to-back and moving backer-or-rounder vowel. So this is my /u/. There's nothing particularly round about it, or alternatively it might be round but then there's nothing particularly back about it. So take your pick. I've transcribed it as back and unround, but that's my intuition, not anything measurable. In southern California, the primary effect is definitely unrounding, although the 'centralizing' of the F2 is achieved in other dialects of US English by centralizing the tongue but maintaining rounding. Go fig.
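The "relative to neutral" reasoning used for vowels throughout can be sketched as a small lookup. The neutral values (500 Hz for F1, 1500 Hz for F2) are the ones quoted above. The function name `describe_vowel` and both tolerance bands are assumptions picked for illustration, and, as noted above, a low F2 cannot by itself separate backing from rounding, so the label says both.

```python
# Sketch: describe a vowel by comparing F1 and F2 to their neutral values.
# Tolerances are illustrative assumptions, not measured category boundaries.

NEUTRAL_F1_HZ = 500
NEUTRAL_F2_HZ = 1500


def describe_vowel(f1_hz, f2_hz, f1_tol_hz=100, f2_tol_hz=250):
    """Return (height, backness) labels from F1 and F2 in Hz."""
    # Low F1 means a high vowel; high F1 means a low vowel.
    if f1_hz < NEUTRAL_F1_HZ - f1_tol_hz:
        height = "high"
    elif f1_hz > NEUTRAL_F1_HZ + f1_tol_hz:
        height = "low"
    else:
        height = "mid"
    # High F2 means front; low F2 means back and/or rounded.
    if f2_hz > NEUTRAL_F2_HZ + f2_tol_hz:
        backness = "front"
    elif f2_hz < NEUTRAL_F2_HZ - f2_tol_hz:
        backness = "back or rounded"
    else:
        backness = "central"
    return height, backness


# The ash earlier in this utterance: F1 about 800 Hz, F2 about 1300 Hz.
print(describe_vowel(800, 1300))  # ('low', 'central')
# Hypothetical values for a high front vowel.
print(describe_vowel(300, 2300))  # ('high', 'front')
```

Widening or narrowing the F2 band changes how readily a vowel gets called central, which is the judgment call being made by eye in the readings here.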

Eth
[ð], IPA 131
Well, the energy in the very low frequencies is 'voicing bar', even though the frequency is such you can't really see the individual striations. So it's voiced, whatever it is. It could be a mushy stop, but the noise isn't really organized the way I'd expect. So it's probably a fricative. Voiced. Definitely not sibilant. So again we've got something that is most likely labiodental or interdental. Here, the transitions are being a little more helpful. There's definitely a 'lift' in F4, and no evidence of anything remotely labial about any of the transitions. So on balance, the (inter)dental is more likely here based on the cues, although it's pretty unlikely statistically. That should make this word really easy to identify--no near neighbors... ;-)

Schwa
[ə], IPA 322
I probably should have transcribed a glottal stop in here, as there's definitely some creaky voice going on here. But oh well. The vowel here is sort of short and the creakiness doesn't make it any easier. So in the end, given the great length of the preceding and following vowels, I'd say this was reduced and move on.

Lower-Case N
[n], IPA 116
Well, it's not as long as the last one, but this is another nasal. From about 1350 to 1400 msec. Or thereabouts. Following those three or four clear periods of voicing. The F2 again is up around 1500 Hz, at least if you can see it. So this really can only be [n].

Lower-Case D
[d], IPA 104
Keating et al (1994) distinguished closures from releases (in similar fashion to Steriade's Aperture Theory model of stops), which would be a handy thing to be able to do here. This is an oral release (see the nice sharp burst) of something that doesn't seem to have much in the way of an oral stop component. Nasal stop with oral release. But that's not an option the IPA gives us (I'm not suggesting it should, it just underscores the theoretical constraints imposed by a strictly segmental model like IPA transcription), so there you go. The release characteristics are consistent with alveolar. In case homorganicity wasn't an option. Given the following segment, it probably was.

Lower-Case H
[h], IPA 146
This will be controversial. Because there's some very clear voicing starting from the release of the previous stop, at 1400 msec, that goes on for almost 100 msec. There's a dip in the voicing amplitude from 1475 to 1550 msec or so. And then the voicing comes back up. But if you look at the upper frequencies, there's no periodicity to speak of in the formants. So what we have here is a mostly voiced [h]. Which I should have transcribed as such, but I was paying more attention to the noise than I was to the voicing bar. It's not unusual for intervocalic /h/s to be fully voiced, but this is just bizarre. But it's an [h], voiced or not. Note the formanty organization of F2 and F3.

Lower-Case I
[i], IPA 301
Well, this will be controversial too. I'm guessing the 'real' vowel is really just the beginning of this, i.e. from when the voicing kicks back on at 1550 or whenever, up to when the F2 starts to dive, around 1675 msec or so, and the rest of the vowel is just transition. But whatever. The F1 is low, the F2 is unbelievably high (especially in the preceding /h/), which is typical of [i]. The diving F2 is transition to the following consonant.

Tilde L (Dark L)
[ɫ], IPA 209
Speaking of which, this is weird again. There's a sharp discontinuity in the F3 and F4, which makes this look like a sudden aperture change, but the F2 and F1 keep their energy and maintain it longer than that. So I don't know where the 'boundary' is. There probably isn't one (again one of the limitations of the segmental model). But by the time you get to the end, the F1 has moderated to neutral, the F2 has lowered to about 1000 Hz, which clearly indicates backing or velarization, and the F3 has risen again to well above the neutral frequency it has for most of the vowels. So this is another velarized /l/.