When we listen to our own voices as we speak, we hear only one perspective on how we sound, not a complete picture of our vocal qualities.

The sound we produce vibrates through our head, and that is partly how we hear it. When we listen to our voice on a recording or video, on the other hand, we hear it only from the outside, which is mainly why it sounds so different.

Let’s talk in scientific terms, though.

Generally speaking, there are two ways sound reaches our inner ears, and each route changes how we perceive it. Inside the inner ear is the cochlea, a spiral-shaped, fluid-filled structure encased in bone. The cochlea separates incoming vibrations by frequency and converts them into nerve signals so the brain can interpret what we hear.

The first way for sound to reach the cochlea is bone conduction, where sound travels through the bones and tissues of the head. The second way is transmission from the environment: air-conducted sound travels through the external auditory canal to the eardrum, and then makes its way through the middle ear to the cochlea.

When you’re listening to yourself speak, vibrations from your vocal cords travel through your head directly to the cochlea. Along the way, the structures they pass through emphasize the lower frequencies, which makes your voice seem deeper. At the same time, you hear your voice externally as it spreads through the air and reaches your ears. This is what we hear when we listen to ourselves talk: a mix of internally conducted and air-transmitted sound.

To hear yourself almost entirely from within your body, simply plug your ears and start talking. The deep, low-pitched sound you hear is your body-conducted voice.

When you’re listening to a video or an audio recording, this internal pathway disappears, and you hear your voice only as it travels through the air to your ears.

Now that you know your voice really does sound the way it does on a recording, we need to talk about the obvious.