From Susan's Place Transgender Resources
Revision as of 12:42, 12 January 2011 by Renate (Talk | contribs) (Changed protection level for "Voice": Removed cascade [edit=sysop:move=sysop])


Human voice

Human voice is sound made by a person using the vocal folds for talking, singing or crying. The tone of voice may show that a sentence is a question, even if it grammatically is not, and it conveys emotions such as anger, surprise and happiness. In a request, the tone reveals how much one wants something, and whether one is asking a favor or giving something closer to an order. The tone of saying, e.g., "I am sorry" says a lot: it may vary from begging for forgiveness to "I have the right to do this even if you do not like it" (see nonverbal communication). Singers use the human voice as an instrument for creating music.

Surgery and hormones

Neither hormones nor genital surgery will 'un-break' a male voice, and voice-changing surgery is widely regarded as inadvisable due to the relatively low success rate and significant risk of permanent damage, in addition to being at best only a partial solution.

See external link for overview by Anne Lawrence.

Many if not most transsexual women are capable of achieving an F0 in the lower part of the female range through voice training alone, without voice surgery. The problem is not physiologic incapacity; the problem is finding the motivation to undertake the hard work of learning the necessary vocal skills and of practicing them consistently until they become second nature (how well I know. . .). Nevertheless, in my opinion most transsexual women should concentrate primarily on voice training, and consider voice feminization surgery only as a last resort.

Thus, speech training is necessary in order to produce a satisfactory 'female' voice, with or without voice surgery.


See Puberty

Voice registers

The human voice is a complex instrument. Humans have vocal cords which can loosen or tighten or change their thickness and over which breath can be transferred at varying pressures. The shape of chest and neck, the position of the tongue, and the tightness of otherwise unrelated muscles can be altered. Any one of these actions results in a change in pitch, volume, timbre, or tone of the sound produced. One important categorisation which can be applied to the sounds singers make relates to the register, or the "voice" which we use. Singers refer to these registers according to the part of the body in which the sound most generally resonates, and the registers have correspondingly different tonal qualities. There are widely differing opinions and theories about what a register is, how registers are produced and how many there are. The following definitions refer to the different ranges of the voice.

Chest voice

The chest voice is the register used in everyday speech. When you talk to the person next to you in a normal voice, you can feel that the sound seems to be "coming from" your upper chest. This is because lower frequency sounds have longer wavelengths, and resonate mostly in the larger cavity of the chest. When you sing notes at the bottom of your range, you are using your chest voice. The tonal qualities of the chest voice are usually described as being rich, full, deep, loud and strong.

Middle voice

The middle voice, also known as the "blend", is the term used to describe the range of notes which marks the crossover between the chest and head voices. It may be a distinct change (a passaggio) or a more gradual blending. With training, many singers can choose whether to sing notes in this range in the head or chest voice.

Head voice

The head voice is often used when we shout, or are highly excited. In these situations we tend to produce higher pitches, and these resonate in the mouth and in the bones of the skull - so the sound feels as if it is "coming from" our head. When you sing the notes at the upper end of your vocal range, you are using your head voice. The tonal qualities of the head voice are usually described as being sweet, balladic, lilting, and pure. It is usually more tonally precise but less loud than the chest voice.


Falsetto

Falsetto is a higher range than the head voice; it relies on completely relaxed vocal cords and may sound breathy. Imagine the Bee Gees singing "Stayin' Alive", or Terry Jones playing an old woman in Monty Python; that is the sound of the falsetto voice. It is generally only used by men. It is a difficult register to sing accurately in, and it tends to be rather quiet. It also requires an uncomfortable muscle effort for many men. It is a quite distinct range from the head voice, and generally when singers describe their range they exclude the falsetto voice.

Finding your voices

  • Stand up.
  • As loudly as you can, say the word "hellooooooooo" (holding the "o") in your normal speaking voice. Put your hand on your chest; you should feel it vibrating. If not, try singing a little louder or lower. This is your chest voice.
  • As loudly as you can, repeat the word "helloooooo" with as high a pitch as you can comfortably sing without any special muscle effort; you should feel your chest is no longer vibrating, but instead your skull is. This is your head voice.
  • Now say "hellooooooooooo" in as high a pitch as you can, even if it feels uncomfortable and sounds silly. This is your falsetto voice.

You may find it interesting to gradually sing up your range from the bottom and feel where you cross over from the chest voice to the head voice.


At first, it may seem hard to concentrate on all the different facets of producing a feminine voice, and lapses will happen. The only solution is to practice and practice again until it gradually becomes second nature.

The Methods


To loosen up the voice box, extend your pitch range, and help develop good control, it can be very helpful to choose a female vocalist you like, preferably one with a relatively deep voice, and sing along. The musically-minded may also wish to perform singing exercises, such as singing scales.

Decreasing male resonance

Raise the position of the laryngeal cartilage : Moving it up raises your voice pitch and decreases the characteristic male resonance. (The laryngeal cartilage is the 'movable' piece of cartilage that you can feel rising if you place a hand on your throat and sing a rising scale ("doh, re, mi, fa, sol, lah, ti, doh").) The point of this is to try to gain a higher 'baseline' pitch than you have previously used, and then increase the pitch further when placing emphasis. For example, if you take "doh" as your baseline male pitch, you might decide that raising your basic pitch to about "fa" or "sol" would be sufficient. But do not overdo the pitch-raising: a squeaky, falsetto voice sounds very inappropriate on an adult woman. The pitch adjustment is a compromise --- for the technically-minded, you should aim for above 160 Hz; if you have access to a musical instrument, the nearest note above that is the E below middle C (about 165 Hz). Of course, everyone starts out with a different original voice and some will be able to raise it more than others without sounding squeaky. You might find it slightly tiring on your voice-box at first, as you are unused to speaking in that register, but it should become comfortable with a little practice. If it does not, then you are probably trying to force your pitch up too high.
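As a rough aid for relating a target frequency to a note on an instrument, the following sketch converts a frequency in Hz to the nearest equal-tempered note. It assumes the common A4 = 440 Hz tuning; the function name `nearest_note` is only illustrative.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name with octave, frequency of that note) nearest to freq_hz."""
    # MIDI note 69 is A4; each semitone is a frequency ratio of 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    note_hz = a4_hz * 2 ** ((midi - 69) / 12)
    return name, note_hz

for hz in (165.0, 196.0, 261.63):
    name, exact = nearest_note(hz)
    print(f"{hz} Hz is closest to {name} ({exact:.2f} Hz)")
```

In this notation middle C is C4, so E3 and G3 are the E and G below middle C.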

The Glottis

Partially open the glottis when speaking : The position of the glottis controls how much air passes over the vocal cords. When breathing rather than speaking, when whispering, or when producing an 'unvoiced' sound (where the vocal cords do not vibrate, like 'hhh' or 'sss' ), the glottis is fully open and all the air bypasses the vocal cords. With the glottis firmly closed, all the air is forced over the vocal cords, producing a fully-voiced and typically male voiced sound. You need to try to find a 'semi-whispering' position that eliminates the fully-voiced sound with heavy resonance in the chest, and imparts a breathy quality to the voice. You can hear the difference between voiced and unvoiced sounds by comparing S and Z sounds (say 'sss' and 'zzz' , and feel how your vocal cords vibrate on the Z but not the S). You're trying to find a midpoint between an unvoiced (whispered) sound, and a fully-voiced 'male' sound. Try saying the word 'hay', and pay attention to how you change between the unvoiced H sound and the voiced A sound: say it very slowly ( 'hhhhhaaaay' ) and feel the change in the vocal cords as your voice slides from the unvoiced 'hhh' sound to the voiced 'aaa' vowel sound. Then try to stop before you reach the fully-voiced point, and you should be producing a soft, breathy (feminine) 'aaa' sound. Then try to learn to always use that half-open position for all voiced sounds. This is simply a matter of practice.


Place emphasis with pitch, not volume : Upward intonation places emphasis. Men place emphasis in their speech by varying the loudness, but keep their pitch within a very narrow range; on the other hand women tend to keep their loudness much more constant but vary their pitch a great deal to express emphasis.


Speak slowly, enunciate clearly : Especially consonants at the beginning and end of words. Don't mumble; clear voice requires fairly big lip movements. On the whole, women enunciate much more clearly and precisely than men.

Pace your speech

Pace your speech carefully : Start and end sentences slowly and gently; do not sound clipped. Do not 'swallow' pronouns, articles or other 'little words' at the beginning or end of sentences. Male speech tends to be characterised by what speech therapists call 'hard attack' --- the first syllable is pronounced very hard, and quickly. Women usually start a sentence more softly.

Use appropriate content

Men and women tend to talk about the same things in different ways; what you say contains gender cues, just as much as how you say it. Women tend to concentrate more on thoughts and feelings, while men concentrate on objects and actions. Men generally use more 'short cuts', colloquialisms and bad language, too. A simple illustration is to imagine someone asking a friend if they are going to go for a drink after work. A male might say something like 'Coming down the pub?' : rather abrupt, using the minimum of words and concentrating on the desired action in a rather impersonal way. A woman might say 'Do you feel like going for a drink tonight?' : concentrating on her friend's feelings and desires, personal, and not abbreviated.

Tongue positioning

Pay attention to tongue position : The tongue is higher and flatter for female than for male. This gives 'dental' sounds (ones that involve the teeth, like T and D) a softer, breathier, almost sibilant quality in the female. Say 'tttt' in male mode, then 'ssss'; find the halfway position, that is the female position for the letters T and D; likewise for a TH sound, etc. Use plenty of air to get a breathy sound.


Hold your mouth in the right shape : A slight smile helps, and is the 'resting' facial expression for a woman anyway. 'Rounder' lips (a slight pout), and good lip movement, help produce a clearly enunciated voice.

Develop head resonance

One of the biggest problems facing TS women is, after learning to produce a soft, feminine voice, to then learn how to speak loudly when necessary without the voice returning to a masculine sound. Women gain loudness by using the cavities inside the head as a 'sounding box' whereas men use the chest.

Louder feminine voice

To gain a louder feminine voice, develop head resonance rather than chest resonance --- open your mouth a little more, use more air, and 'push' your voice up into your head.

Use Feedback

Record samples of your voice and listen to yourself. Read a passage of text, listen to the recording and keep practicing. It can be helpful to actually read these notes aloud, practicing each point as you read it. Then listen to yourself and successively refine your voice.
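For those who want a number alongside what they hear, the pitch of a recorded vowel can be estimated with a simple autocorrelation search. The sketch below is only one possible approach and assumes the recording has already been loaded as a list of sample values; a synthetic 200 Hz tone stands in for a real recording, and the name `estimate_f0` and the 75 to 400 Hz search range are illustrative choices.

```python
import math

def estimate_f0(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame by autocorrelation."""
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # The signal resembles itself most strongly when shifted by one period.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Stand-in for a recording: 0.1 s of a 200 Hz tone with one overtone.
RATE = 8000
tone = [
    math.sin(2 * math.pi * 200 * n / RATE) + 0.5 * math.sin(2 * math.pi * 400 * n / RATE)
    for n in range(800)
]
print(estimate_f0(tone, RATE))  # a value close to 200
```

Real speech is noisier than this test tone, so an estimate from a single short frame should be treated as approximate.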

Loading on tissue in vocal folds

The fundamental frequency of speech for an average male is around 110 Hz and for an average female around 220 Hz. That means that for voiced sounds the vocal folds will hit together 110 or 220 times a second, respectively. Suppose then that a female is speaking continuously for an hour. Of this time perhaps five minutes is voiced speech. The folds will then hit together about 66 thousand times in that hour (220 times a second for 300 seconds). It is intuitively clear that the vocal fold tissue will tire from this large number of hits.
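The arithmetic can be checked with a few lines of code. This sketch simply multiplies the fundamental frequency by the voiced time, assuming one fold closure per cycle; the function name is illustrative.

```python
def fold_collisions(f0_hz, voiced_seconds):
    """Approximate number of vocal fold closures: one per cycle of the fundamental."""
    return round(f0_hz * voiced_seconds)

voiced = 5 * 60  # five minutes of voiced speech, the figure assumed in the text
print(fold_collisions(220, voiced))  # 66000 at the average female F0
print(fold_collisions(110, voiced))  # 33000 at the average male F0
```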

Vocal loading

Vocal loading also includes other kinds of strain on the speech organs. These include all kinds of muscular strain in the speech organs, just as any other muscles experience strain if used for an extended period of time. However, researchers' greatest interest lies in the stress exerted on the vocal folds.

Effect of speaking environment

Several studies in vocal loading show that the speaking environment does have a significant impact on vocal loading. Still, the exact details are debated. Most scientists agree on the effect of the following environmental properties:

  • Air humidity: dry air increases the stress experienced in the vocal folds.
  • Hydration: dehydration increases the effects of stress inflicted on the vocal folds.
  • Background noise: people tend to speak louder when background noise is present, even when it is not necessary. Increasing speaking volume increases the stress inflicted on the vocal folds.


The "normal" speaking style has close to optimal pitch. Using a higher or lower pitch than normal will also increase stress in the speech organs. In addition, smoking and other types of air pollution might have a negative effect on voice production organs.


Objective evaluation or measurement of vocal loading is very difficult because the psychological and physiological stress experienced are tightly coupled. However, there are some typical symptoms that can be objectively measured. Firstly, the pitch range of the voice will decrease. Pitch range indicates the possible pitches that can be spoken. When a voice is loaded, the upper pitch limit will decrease and the lower pitch limit will rise. Similarly, the volume range will decrease, in particular so that the quietest sounds can no longer be produced. Secondly, an increase in the hoarseness and strain of a voice can often be heard. Unfortunately, hoarseness and strain are difficult to measure objectively, and only perceptual evaluations can be performed.
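One common way to put a number on pitch range is to express the distance between the lowest and highest producible pitch in semitones. The sketch below does this; the rested and loaded limits are made-up values chosen only to illustrate the narrowing described above.

```python
import math

def pitch_range_semitones(low_hz, high_hz):
    """Width of a pitch range in semitones (12 semitones per doubling of frequency)."""
    return 12 * math.log2(high_hz / low_hz)

# Hypothetical limits: the loaded voice has a higher floor and a lower ceiling.
rested = pitch_range_semitones(100.0, 400.0)
loaded = pitch_range_semitones(110.0, 330.0)
print(f"rested: {rested:.1f} semitones, loaded: {loaded:.1f} semitones")
```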

Voice care

The question regularly arises of how one should use one's voice to minimise tiring of the vocal organs. Basically, a normal, relaxed way of speaking is the optimal method for voice production, in both speech and singing. Any excess force used when speaking will increase tiring. The speaker should drink enough water and the air humidity level should be normal or higher. No background noise should be present or, if that is not possible, the voice should be amplified. Smoking is discouraged.

Voice analysis

Voice analysis is the study of speech sounds for purposes other than their linguistic content, which is the concern of speech recognition. Such studies are mostly medical analysis of the voice, i.e. phoniatrics, but also include speaker identification.

Typical voice problems

A medical study of the voice can be, for instance, an analysis of the voices of patients who have had a polyp removed from their vocal cords in an operation. In order to evaluate the improvement in voice quality objectively, there has to be some measure of voice quality. An experienced voice therapist can evaluate the voice quite reliably, but this requires extensive training and is still always subjective. Another active research topic in medical voice analysis is vocal loading evaluation. The vocal cords of a person speaking for an extended period of time will tire; that is, the process of speaking exerts a load on the vocal cords that fatigues the tissue. Among professional voice users (e.g. teachers, sales people) this tiring can cause voice failures and sick leave. To evaluate these problems, vocal loading needs to be measured objectively.

Analysis methods

Voice problems that require voice analysis most commonly originate from the vocal cords, since they are the sound source and are thus most actively subject to tiring. However, analysis of the vocal cords is physically difficult. The location of the vocal cords effectively prohibits direct measurement of their movement. Imaging methods such as x-rays or ultrasound do not work because the vocal cords are surrounded by cartilage, which distorts image quality. Movements in the vocal cords are rapid: fundamental frequencies are usually between 80 and 300 Hz, which prevents the use of ordinary video. High-speed video provides an option, but in order to see the vocal cords the camera has to be positioned in the throat, which makes speaking difficult.

Inverse filtering

The most important indirect methods are inverse filtering of sound recordings and electroglottographs (EGG). In inverse filtering methods, the speech sound is recorded outside the mouth and then filtered by a mathematical method to remove the effects of the vocal tract. This method produces an estimate of the waveform of the pressure pulse, which in turn indirectly indicates the movements of the vocal cords. The other kind of indirect indication is the electroglottograph, which operates with electrodes attached to the subject's throat close to the vocal cords. Changes in the conductivity of the throat indicate indirectly how large a portion of the vocal cords is touching. It thus yields one-dimensional information about the contact area. Neither inverse filtering nor EGG is therefore sufficient to describe the glottal movement completely.
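The idea behind inverse filtering can be illustrated with a toy example. The sketch below assumes the vocal tract can be modelled as a small all-pole (recursive) filter whose coefficients are already known; in practice they have to be estimated from the recording, for example by linear prediction, which is not shown here. The pulse train, the coefficients and the function names are all made up for illustration.

```python
def all_pole_filter(excitation, coeffs):
    """Toy 'vocal tract': y[n] = x[n] + sum_k coeffs[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        y = x + sum(c * out[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(y)
    return out

def inverse_filter(signal, coeffs):
    """Undo all_pole_filter: x[n] = y[n] - sum_k coeffs[k] * y[n-1-k]."""
    return [
        y - sum(c * signal[n - 1 - k] for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        for n, y in enumerate(signal)
    ]

# Hypothetical glottal excitation: one pulse every 40 samples (200 Hz at 8 kHz).
pulses = [1.0 if n % 40 == 0 else 0.0 for n in range(200)]
tract = [1.3, -0.8]  # made-up coefficients of a stable resonance
speech = all_pole_filter(pulses, tract)
recovered = inverse_filter(speech, tract)
print(max(abs(a - b) for a, b in zip(pulses, recovered)))  # close to zero
```

With a real recording the recovered waveform is only an estimate, because the vocal tract model is itself estimated from the same signal.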


Phonetics (from the Greek word phone = sound/voice) is the study of speech sounds (voice). It is concerned with the actual nature of the sounds and their production, as opposed to phonology, which operates at the level of sound systems and linguistic units called phonemes. Discussions of meaning (semantics) do not enter at this level of linguistic analysis. Phones, the objects of study in phonetics, are actual speech sounds as uttered by human beings. While written languages and alphabets are obviously (in most cases) closely related to the sounds of speech, strictly speaking, phoneticians are more concerned with the sounds of speech than the symbols used to represent them. So close is the relationship between them however, that many dictionaries list the study of the symbols (more accurately semiotics) as a part of phonetic studies.

The Branches of Phonetics

Phonetics has three main branches:

  • Articulatory phonetics, concerned with the positions and movements of the lips, tongue, and other speech organs in producing speech.
  • Acoustic phonetics, concerned with the properties of the sound waves.
  • Auditory phonetics, concerned with speech perception.


Of all the speech sounds that a human vocal tract can create, different languages vary considerably in the number of these sounds that they use. Languages can contain from 2 (Abkhaz) to 55 (Sedang) vowels and 6 (Rotokas) to 117 (!Kung) consonants. The total number of phonemes in languages varies from as few as 10 in the Pirahã language, 11 in Rotokas (spoken in Papua New Guinea), 12 in Hawaiian and 30 in Serbian to as many as 141 in !Xu (spoken in southern Africa, in the Kalahari desert). These may range from familiar sounds like /t/, /s/ or /m/ to very unusual ones produced in extraordinary ways (see: clicks, phonation, airstream mechanism). The English language has about 13 vowel and 24 consonant phonemes (depending upon dialect), some of which have multiple allophones. This differs from the lay definition based on the Latin alphabet, where there are 21 consonants and 5 vowels (although sometimes y and w are included as vowels).

Phonetic history

Phonetics was studied as early as 2500 years ago in ancient India. The Tamil grammar book Tolkaappiyam (c. 2nd century C.E.) describes the place and manner of articulation of consonants. Most Indian languages group and order their consonants based on place and manner of articulation.

See also

External links


*Some information provided in whole or in part by http://en.wikipedia.org/

  • Editor's Note: Some information provided in whole or in part by Looking Glass Society.