MULTIMEDIA BASICS: UNDERSTANDING SOUND

by Steven G. Estrella, Ph.D.

Today many intrepid educators are taking the plunge into multimedia. Multimedia development environments like HyperCard for the Macintosh, Toolbook for Windows, and the World Wide Web can help educators create educational programs that are motivating and fun. Creating sound for use in these environments requires a minimal understanding of the science behind sound and a knowledge of some of the jargon involved in multimedia.


What is Sound?

If a tree falls in the forest and no living creature is there to hear it, does it make a sound? The answer is no. Sound is a perceptual phenomenon only. When a tree falls, a person speaks, or a violin string vibrates, the surrounding air is disturbed causing changes in air pressure that are called sound waves. When sound waves arrive at our ears they cause small bones in our ears to vibrate. These vibrations then cause nerve impulses to be sent to the brain where they are interpreted as sound.


Figure 1 - Sound Waves
A vibrating string on a violin causes disturbances in the surrounding air.
These sound waves cause our ears to send nerve impulses to the brain
which interprets the disturbance as sound.


How is Sound Recorded?

Sound waves can be transduced (converted to another form) using a microphone. A microphone is similar to the human ear in that it has a diaphragm which vibrates in response to changes in air pressure. The movements of the diaphragm within an electromagnetic field cause changes in electrical voltage. These voltage changes can be directed to a tape recorder which alters the magnetic particles on the tape to correspond to the voltage changes. A "picture" of the sound then exists on the tape. When you press play on the tape recorder, the "picture" is read back as a series of voltage changes which are then sent to a speaker. The voltage changes cause an electromagnet within the speaker to push and pull on a diaphragm. The movement of the diaphragm then causes air pressure changes which our ears interpret as the original sound. This process is known as analog recording because the picture of the sound on the tape is analogous to the original changes in air pressure caused by the sound event.

Figure 2 - Analog Recording
When sound waves strike a microphone,
they are converted to an electrical signal
which is then recorded onto magnetic tape.

Usually we represent sound visually as a waveform. The height is called the amplitude and represents volume. The distance between cycles is called the period or wavelength. The number of cycles per second is called frequency and is interpreted by our ears as pitch. Frequency is measured in hertz (Hz) or kilohertz (kHz).
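The relationship between frequency and period can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, and it assumes that period means the time taken by one complete cycle, so that period is simply one divided by frequency.

```python
def period_seconds(frequency_hz):
    """Time taken by one complete cycle of a wave, in seconds."""
    return 1.0 / frequency_hz

# The lowest audible pitch, concert A, and the highest audible pitch
for frequency in (20, 440, 20_000):
    milliseconds = period_seconds(frequency) * 1000
    print(f"{frequency} Hz -> one cycle every {milliseconds:.3f} ms")
```

The higher the frequency, the shorter each cycle, which is why high-pitched sounds appear as tightly packed waves in a waveform display.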

 

 

The waveform above is a simple sine wave. Typical sounds are more complex in appearance. Here is a waveform of a short spoken phrase. Note the frequent changes in wavelength, amplitude, and frequency.

 

Digital recording differs from analog recording in that the "picture" of the sound is created by measuring the voltage changes coming from the microphone and assigning a number to each measurement. The term "sampling" is used to describe the process of measuring an electrical signal's voltage thousands of times per second at a given level of precision (resolution). The number of measurements per second is called the "sampling rate" and is expressed in kilohertz (kHz). A rate of 11,000 measurements per second is thus designated as 11 kHz. Sampling rates range from 5 kHz to 48 kHz, with higher rates being used for the best quality recordings. Harry Nyquist (1889-1976), a Swedish-born U.S. communications engineer, discovered that the frequency range of a digitized sound is limited to one-half the sampling rate. Since humans can hear frequencies in a range of 20 hertz to about 20 kilohertz, it is necessary to sample at more than 40 kilohertz to capture the full range of frequencies perceptible to the human ear.
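As a rough sketch of the sampling idea, the Python code below measures a sine wave at evenly spaced instants and reports the highest frequency each sampling rate can represent. The function names are invented for this illustration, and a mathematically perfect sine wave stands in for the voltage coming from a real microphone.

```python
import math

def nyquist_limit_hz(sampling_rate_hz):
    """Highest frequency a sampling rate can represent: half the rate."""
    return sampling_rate_hz / 2

def sample_sine(frequency_hz, sampling_rate_hz, duration_s):
    """Measure a sine wave at evenly spaced instants, as a digitizer would."""
    count = round(sampling_rate_hz * duration_s)
    return [math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(count)]

for rate in (11_000, 22_000, 44_000):
    limit_khz = nyquist_limit_hz(rate) / 1000
    print(f"{rate / 1000:g} kHz sampling captures frequencies up to {limit_khz:g} kHz")

# One hundredth of a second of a 440 Hz tone sampled at 11 kHz
samples = sample_sine(440, 11_000, 0.01)
print(len(samples), "measurements taken")
```

Each entry in the list is one measurement. Doubling the sampling rate doubles the number of measurements and also doubles the highest frequency that can be captured.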

The number of measurements per second, however, is only part of the picture. The degree of precision within each measurement is also important. This is known as "sampling resolution". Sampling resolution is used to divide the total range of the electrical voltage into discrete parts. Common sampling resolutions in use today are 8-bit and 16-bit. Sampling at 8 bits divides the voltage into 256 parts (2 to the 8th power). Sampling at 16 bits divides the voltage into 65,536 parts (2 to the 16th power). Using a higher sampling resolution creates cleaner recordings with less background noise. Higher sampling resolutions also capture a wider dynamic range. For example, an 8-bit digitizer can capture a dynamic range of only about 48 decibels (dB). If the recording level is set so that the quietest passages register, any portion of the sound that is more than 48 dB louder will be clipped and the resulting sample will sound distorted. 16-bit digitizers, however, capture up to 96 dB of dynamic range. The dynamic range of the human ear extends to 120 dB.
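The figures for 8-bit and 16-bit sampling can be checked with a short Python sketch. It assumes the commonly used rule of thumb that dynamic range in decibels is 20 times the base-10 logarithm of the number of levels (roughly 6 dB per bit); the function names are made up for this example.

```python
import math

def levels(bits):
    """Number of discrete parts the voltage range is divided into."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate dynamic range, using 20 * log10(number of levels)."""
    return 20 * math.log10(levels(bits))

for bits in (8, 16):
    print(f"{bits}-bit: {levels(bits):,} levels, "
          f"about {dynamic_range_db(bits):.0f} dB of dynamic range")
```

Every additional bit doubles the number of levels and adds about 6 dB to the dynamic range.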

Quantization is the term that describes the process of measuring the amplitude of a sound and rounding off the measurements according to the sampling resolution. For example, an 8-bit sound digitizer will assign integer values between 0 and 255 for the amplitude of each sample. The result is that the original smooth waveform is reconstructed as a staircase shape with only 256 discrete levels of amplitude, and noise is introduced into the signal. 16-bit digitizers, on the other hand, assign amplitude values on a scale of 0 to 65,535. At that level of precision, the reconstructed waveform is almost identical to the original and almost no noise is introduced.
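The rounding step can be illustrated with a minimal Python sketch. It assumes the incoming signal has been scaled to lie between -1.0 and 1.0 and maps that range onto the integer scale described above; the function names are hypothetical, and a real analog-to-digital converter does this work in hardware.

```python
import math

def quantize(value, bits):
    """Round an amplitude in [-1.0, 1.0] to an integer from 0 to 2**bits - 1."""
    top = 2 ** bits - 1
    return round((value + 1.0) / 2.0 * top)

def reconstruct(code, bits):
    """Turn a stored integer back into an amplitude in [-1.0, 1.0]."""
    top = 2 ** bits - 1
    return code / top * 2.0 - 1.0

def worst_error(bits, points=1000):
    """Largest rounding error found over one cycle of a sine wave."""
    worst = 0.0
    for n in range(points):
        original = math.sin(2 * math.pi * n / points)
        rebuilt = reconstruct(quantize(original, bits), bits)
        worst = max(worst, abs(original - rebuilt))
    return worst

for bits in (8, 16):
    print(f"{bits}-bit worst rounding error: {worst_error(bits):.6f}")
```

The rounding error is the noise that quantization adds to the signal. It is far smaller at 16 bits than at 8 bits, because the steps of the staircase are much finer.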

All of these measurements are made by an analog-to-digital converter. The measurements can then be stored as binary numbers in a file on a computer's hard disk. To play back the sound, the computer sends the information in the file to a digital-to-analog converter which reproduces the original electrical signal. That signal is then sent to a speaker which produces the sound as described earlier.

Maximum precision per measurement combined with maximum sampling rates produces the highest quality recordings. To describe a digital recording of a sound, therefore, one can speak of the sampling rate and resolution. For example, sound recorded at a sampling rate of 22 kHz with 8-bit resolution is considered to be of a quality similar to that of a telephone call. Sound recorded at 44 kHz and 16 bits is considered the minimum quality for compact disc recordings because it captures the full range of human hearing. In multimedia production work, 11 kHz, 8-bit sound is sometimes acceptable for speech recordings and 22 kHz, 8-bit resolution or 11 kHz, 16-bit resolution is often considered acceptable for music. For the highest-level multimedia work, however, nothing short of 44 kHz, 16-bit sound is acceptable.

Figure 3 - Digital Recording

When sound waves strike a microphone, they are converted to an electrical signal which is measured many thousand times per second by an analog-to-digital converter chip. The measurements are stored in the computer as binary numbers.

The higher the quality of sound, the more space it takes to store the sound. A compact disc can store about 74 minutes of stereo sound at 44 kHz, 16-bit. If you reduce the quality to 22 kHz, 8-bit stereo sound, however, you can store approximately 300 minutes of audio on the same disc. In other words, one minute of stereo sound takes 10 megabytes of storage at 44 kHz, 16-bit quality, and only 2.5 megabytes of storage at 22 kHz, 8-bit quality. When producing sound for multimedia, therefore, one must consider not only sound quality, but also how the sound will be distributed. If your multimedia program will be distributed on CD then you may have enough storage space to justify using the best quality. If the program will be distributed on disk or through the internet, however, you would consider using lower quality sound to avoid having to distribute many disks or subject your users to long download times.
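The storage figures above follow from simple arithmetic: sampling rate times bytes per sample times number of channels times 60 seconds. The Python sketch below works it through. It assumes the exact compact disc rate of 44,100 samples per second (rounded to 44 kHz in this article), half of that for the lower rate, and a megabyte of 1,048,576 bytes; the function name is invented for this example.

```python
MEGABYTE = 1024 * 1024  # assumed definition of a megabyte

def bytes_per_minute(sampling_rate_hz, bits, channels):
    """Uncompressed storage needed for one minute of digital audio."""
    bytes_per_sample = bits // 8
    return sampling_rate_hz * bytes_per_sample * channels * 60

cd_quality = bytes_per_minute(44_100, 16, 2)   # stereo, 16-bit
low_quality = bytes_per_minute(22_050, 8, 2)   # stereo, 8-bit

print(f"44 kHz, 16-bit stereo: {cd_quality / MEGABYTE:.1f} MB per minute")
print(f"22 kHz, 8-bit stereo:  {low_quality / MEGABYTE:.1f} MB per minute")
print(f"Ratio: {cd_quality // low_quality} to 1")
```

Halving the sampling rate and halving the resolution each cut the storage in half, which is why the lower-quality setting fits about four times as many minutes on the same disc.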

Demonstration 1 - Digital Audio Sampling Rates and Resolutions

Sound File Formats

When sound is digitally recorded to a hard disk, a file format is assigned by the recording software. Sound files are either RAM-based or disk-based. To play back a RAM-based file, your computer must have enough random access memory (RAM) to hold the entire file. For example, a computer with 8 megabytes of RAM might not be able to play a large RAM-based sound file, but a computer with 16 megabytes of RAM might have no problem with it. As a result, RAM-based sound file formats are appropriate for use with short sound samples. On the Macintosh, System 7 sound and SND resource are common RAM-based file formats. System 7 sounds are used to generate the various beeps and alert sounds used on the Macintosh. SND resources are often used as sound resources in HyperCard stacks. A Macintosh sound recording program, such as Macromedia's SoundEdit 16 or the freeware SoundHandle 1.0.3, can be used to create SND resources that can be saved directly into the resource fork of a HyperCard stack. System 7 and SND file formats are most commonly used with 22 kHz, 8-bit sound samples.

Disk-based sound file formats allow you to record music of any length and quality. You are only limited by the amount of available storage space on your hard drive. Disk-based sound file formats are ideal for longer and/or higher-quality samples. AIFF (Audio Interchange File Format) is one of the most commonly used disk-based file formats on Macintosh, Windows, and even UNIX computers. Stereo AIFF sound files recorded at 44 kHz, 16-bit quality are ideal for multimedia productions that will be distributed on CD. Monophonic AIFF sound files recorded at 22 kHz, 16-bit quality are better for multimedia productions that will be distributed via the internet because their file sizes are smaller than those of higher-quality samples. If you use the internet frequently you have probably encountered sound files in WAV and AU formats. The WAV format is used by Microsoft Windows and the AU file format is used by computers running the UNIX operating system. Sound editing software can convert among these and many other file formats.
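To give a sense of what a disk-based sound file contains, here is a minimal Python sketch that builds a one-second, monophonic, 16-bit WAV file with the standard library's wave module and then reads its settings back. WAV is used here only because Python supports it directly; the function name, the 440 Hz test tone, and the 22,050 Hz rate are choices made for this illustration, not part of any program mentioned in this article.

```python
import io
import math
import struct
import wave

def make_tone_wav(frequency_hz=440, sampling_rate_hz=22_050, seconds=1.0):
    """Return the bytes of a mono, 16-bit WAV file containing a sine tone."""
    frame_count = round(sampling_rate_hz * seconds)
    frames = bytearray()
    for n in range(frame_count):
        amplitude = math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        # Each 16-bit sample is stored as a signed, little-endian integer
        frames += struct.pack("<h", round(amplitude * 32767))
    buffer = io.BytesIO()  # an in-memory stand-in for a file on disk
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)                 # monophonic
        wav.setsampwidth(2)                 # 2 bytes = 16-bit resolution
        wav.setframerate(sampling_rate_hz)  # samples per second
        wav.writeframes(bytes(frames))
    return buffer.getvalue()

data = make_tone_wav()
with wave.open(io.BytesIO(data), "rb") as wav:
    print("channels:", wav.getnchannels())
    print("resolution:", wav.getsampwidth() * 8, "bits")
    print("sampling rate:", wav.getframerate(), "Hz")
    print("samples:", wav.getnframes())
print("file size:", len(data), "bytes")
```

The file consists of a short header recording the number of channels, the resolution, and the sampling rate, followed by the samples themselves. To save the sound to disk, write the returned bytes to a file whose name ends in .wav.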

MIDI

The Musical Instrument Digital Interface (MIDI) is a hardware and software standard that, among other things, allows users to record a complete description of a lengthy musical performance using only a small amount of disk space. Standard MIDI Files can be played back using the sound synthesis hardware of a Mac or PC. Using MIDI, Beethoven's Fifth Symphony uses about 1.3 megabytes of storage and can fit on one floppy disk. Using a digital audio file format like AIFF, the same symphony uses over 300 megabytes of hard disk storage. One problem with MIDI is that the quality of the actual sound you hear will vary depending on the quality of your computer's sound hardware. For educational applications, however, MIDI-generated sound can be used to demonstrate musical ideas quite effectively. Another problem with MIDI in the past was the lack of a standard sound set. A MIDI file designed to be played with piano and flute sounds might be realized with organ and clarinet on another person's computer. This problem was partially solved by the advent of the General MIDI standard which created a standard set of 128 sounds. Virtually all MIDI files today are distributed in General MIDI format. Still it was left to the owner of each computer to be sure their sound hardware could play the General MIDI sounds. Apple Computer solved the problem with the latest version of its QuickTime software.
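The reason MIDI files are so small is that they store instructions for playing notes instead of the sound itself. The Python sketch below builds the raw bytes of three basic MIDI messages and compares their size with one second of digital audio. The helper function names are invented for this example. The message layouts follow the MIDI standard, and the program numbers are the zero-based General MIDI numbers for piano and flute. A real Standard MIDI File also stores timing information and file headers, which are left out here.

```python
def note_on(channel, note, velocity):
    """Start a note: status byte, note number, and how hard it is struck."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Stop a note that is currently sounding."""
    return bytes([0x80 | channel, note, 0])

def program_change(channel, program):
    """Select one of the 128 General MIDI sounds (numbered from 0 here)."""
    return bytes([0xC0 | channel, program])

PIANO = 0    # General MIDI sound 1, Acoustic Grand Piano
FLUTE = 73   # General MIDI sound 74, Flute
MIDDLE_C = 60

# Choose the flute sound, then play and release middle C on channel 0
messages = program_change(0, FLUTE) + note_on(0, MIDDLE_C, 64) + note_off(0, MIDDLE_C)
print("MIDI bytes for one flute note:", len(messages))

# One second of 44.1 kHz, 16-bit stereo audio, for comparison
audio_bytes_per_second = 44_100 * 2 * 2
print("Audio bytes for one second of sound:", audio_bytes_per_second)
```

A note lasting one second needs only a handful of bytes as MIDI data, while the same second of CD-quality audio needs well over one hundred thousand bytes.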

Demonstration 2 - MIDI and QuickTime

Below is an excerpt from "Doodlin" recorded as a standard MIDI file and converted to a QuickTime movie. This file takes up only 8 kilobytes of storage and would load in a few seconds via modem.

Download this QuickTime movie (1.4 MB) which demonstrates the process of converting a standard MIDI file into a QuickTime movie using the MoviePlayer Pro application available from Apple Computer's web site (www.apple.com or quicktime.apple.com). Step through this movie frame by frame to see screen shots of the process of converting a MIDI file to a QuickTime movie.

The final product is embedded below.


You are welcome to download the original MIDI file, bachinv4.mid, for use in your music sequencing or music notation applications.

Web sites can be used to exchange MIDI files, collaborate on MIDI sequences, and engage in group compositions. If you convert your MIDI files to QuickTime movies then multiple MIDI files can be embedded in a single page, allowing visitors to participate in a jam session with the music elements you provide. The Blues Jam page is an example of this application.

Apple Computer's QuickTime Software

One of Apple Computer's most brilliant innovations is the continuing development of QuickTime. QuickTime began as a set of system extensions to Macintosh System 7 to allow users to play digitized video in a small window on the screen. Today QuickTime is a comprehensive multimedia tool for storing video, animations, and sound in a variety of formats. It is also a cross-platform tool, meaning that QuickTime movies can be viewed and heard using computers running Mac OS, Windows, or even UNIX.

So what does Apple's QuickTime technology have to offer educators? The answer is plenty. The free version of QuickTime, available from Apple's web site at www.apple.com, comes with MoviePlayer to play back QuickTime content. Content creators must purchase the "Pro" version of QuickTime for $30. The "Pro" version comes with MoviePlayer Pro which can convert standard MIDI files into QuickTime movies that can be played back by any Macintosh computer (Mac II or later) or any PC with a sound card and Windows 3.1 or later. QuickTime MIDI movies use just a little more disk storage space than the MIDI files on which they are based. The actual sound is produced by a software synthesizer that QuickTime installs on your computer's hard disk.

MoviePlayer can be used to convert audio from compact discs into QuickTime movies that can be used in multimedia presentations. MoviePlayer can also be used to add sound and text tracks to digital video. Using a video recorder, Apple's free Video Player software, and a Mac equipped with video input, you could record a movie demonstrating instrumental techniques and then use MoviePlayer to add a voiceover narrative. You could also add a descriptive voice narrative to a QuickTime MIDI movie containing a full performance of a complex work. QuickTime comes with several software CODECs (compressor/decompressor) to reduce file size while retaining quality. For music, the QDesign Music Compressor is excellent. For speech, the Qualcomm PureVoice Compressor is a good choice. For video, the Sorenson compressor does an impressive job of reducing file size for the visual portion of the video. When it is used in combination with the QDesign or Qualcomm audio compressors, file size can be made manageable for transmission over the internet. A "Fast Start" feature is also available to allow the movie to begin playing while still downloading to the user's computer. The next version of QuickTime, version 4.0, currently in beta testing, allows for streaming live content as well.

QuickTime movies can be loaded onto any web server and included in web pages by using the appropriate EMBED code.

<EMBED SRC="doodle16.mov" AUTOPLAY=FALSE WIDTH=150 HEIGHT=24>

For more detailed and advanced editing of video and audio, of course, you might purchase professional software like Adobe Premiere and Macromedia SoundEdit 16. Using free and shareware software available from Apple and others, however, you can create multimedia presentations to inspire and educate your students.




Figure 4 - QuickTime
Apple Computer's QuickTime software can be used to create movies with any combination of video, audio, MIDI data, text, and animations.



How to Get Started

To begin working with multimedia sound you will need a multimedia computer with sound input and sound output hardware. Every Apple Macintosh in production today comes with all the necessary hardware and software you will need to begin. On some models, however, the PlainTalk microphone is a $30 option. For some PCs running Windows, you may need to buy a sound card and have it properly installed by a technician.

A great place to find shareware and freeware audio software for your computer is www.shareware.com. QuickTime software and links to other multimedia software can be found at Apple Computer's QuickTime web site, http://www.apple.com/quicktime/.