Recall that a North American [ɹ] usually involves three different constriction gestures: a retroflex approximant, a pharyngeal approximant, and lip rounding.
Recall that the frequency of each formant is determined by one of the possible standing waves in the tube.
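As a rough numerical illustration of the standing-wave idea, the sketch below treats the vocal tract as a uniform tube closed at the glottis and open at the lips. This idealization is an assumption brought in for the example, as are the tube length (17.5 cm) and the speed of sound (35,000 cm/s), which are conventional textbook values. The function name is hypothetical.

```python
def neutral_formants(length_cm=17.5, speed_cm_s=35000.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    The n-th standing wave fits an odd number of quarter wavelengths
    into the tube, so F_n = (2n - 1) * c / (4 * L).
    """
    return [
        (2 * n - 1) * speed_cm_s / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


print(neutral_formants())  # [500.0, 1500.0, 2500.0]
```

Under these assumptions the first three formants come out at 500, 1500, and 2500 Hz, roughly the pattern of a neutral, schwa-like vowel. A real vocal tract is not uniform, so these figures are only a starting point.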
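Constricting the tube will move the frequencies of the formants: a constriction near a velocity maximum of a standing wave lowers the corresponding formant, while a constriction near a velocity minimum raises it.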
The main acoustic property of English [ɹ] is a very low F3. Retroflex approximants, pharyngeal approximants, and lip rounding are so often performed simultaneously because all three have the same desired acoustic effect: lowering F3. You can often get a reasonable approximation of an [ɹ] by doing only the lip rounding and/or the pharyngeal approximant.
In fact, lip rounding will lower every formant, since every possible standing wave has a velocity maximum at the open end of the tube. (Recall that the formant transitions that allow you to identify a bilabial stop during the first few tens of milliseconds of the following vowel also involve a lowering of both F1 and F2.)
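The locations where a constriction lowers a given formant can be sketched with the same idealized model. The code assumes a uniform tube closed at the glottis and open at the lips, and the usual perturbation rule that a constriction at a velocity maximum lowers the corresponding formant. Neither the model nor the function name comes from the discussion above; both are assumptions made for illustration.

```python
from fractions import Fraction


def velocity_maxima(n):
    """Velocity maxima of the n-th standing wave in a closed-open tube.

    Positions are fractions of the tube length measured from the
    glottis (0 = glottis, 1 = lips). The volume velocity varies as
    sin((2n - 1) * pi * x / 2), which peaks at x = (2k + 1) / (2n - 1).
    """
    odd = 2 * n - 1
    return [Fraction(2 * k + 1, odd) for k in range(n)]


for n in (1, 2, 3):
    print(f"F{n}:", [str(p) for p in velocity_maxima(n)])
# F1: ['1']
# F2: ['1/3', '1']
# F3: ['1/5', '3/5', '1']
```

Every list ends at 1, the lip end, which is why rounding lowers all the formants. For F3 the other two maxima fall about one fifth and three fifths of the way from the glottis, which in this simplified model correspond loosely to the pharyngeal and retroflex constriction sites.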
If you are pronouncing a back vowel, it will have a fairly low F2. If you round your lips at the same time, you will lower F2 even more. This is why it is nearly universal for languages to have rounded back vowels and unrounded front vowels. Lip rounding exaggerates the acoustic effect of backness, and helps make the back vowels more easily distinguishable from front vowels.