by Steven G. Estrella, Ph.D.
Today many intrepid educators are taking the plunge into multimedia. Multimedia development environments like HyperCard for the Macintosh, Toolbook for Windows, and the World Wide Web can help educators create educational programs that are motivating and fun. Creating sound for use in these environments requires a minimal understanding of the science behind sound and a knowledge of some of the jargon involved in multimedia.
If a tree falls in the forest and no living creature is there to hear it, does
it make a sound? The answer is no. Sound is a perceptual phenomenon only. When
a tree falls, a person speaks, or a violin string vibrates, the surrounding
air is disturbed causing changes in air pressure that are called sound waves.
When sound waves arrive at our ears they cause small bones in our ears to vibrate.
These vibrations then cause nerve impulses to be sent to the brain where they
are interpreted as sound.
Figure 1 - Sound Waves
A vibrating string on a violin causes disturbances in the surrounding air.
These sound waves cause our ears to send nerve impulses to the brain
which interprets the disturbance as sound.
Sound waves can be transduced (converted to another form) using a microphone. A microphone is similar to the human ear in that it has a diaphragm which vibrates in response to changes in air pressure. The movements of the diaphragm within an electromagnetic field cause changes in electrical voltage. These voltage changes can be directed to a tape recorder which alters the magnetic particles on the tape to correspond to the voltage changes. A "picture" of the sound then exists on the tape. When you press play on the tape recorder, the "picture" is read back as a series of voltage changes which are then sent to a speaker. The voltage changes cause an electromagnet within the speaker to push and pull on a diaphragm. The movement of the diaphragm then causes air pressure changes which our ears interpret as the original sound. This process is known as analog recording because the picture of the sound on the tape is analogous to the original changes in air pressure caused by the sound event.
Figure 2 - analog recording
When sound waves strike a microphone,
they are converted to an electrical signal
which is then etched onto a magnetic tape.
Usually we represent sound visually as a waveform. The height of the wave is called the amplitude and represents volume. The distance between cycles is called the period or wavelength. The number of cycles per second is called the frequency and is interpreted by our ears as pitch. Frequency is measured in hertz (Hz) or kilohertz (kHz).

The waveform above is a simple sine wave. Typical sounds are more complex in appearance. Here is a waveform of a short spoken phrase. Note the frequent changes in wavelength, amplitude, and frequency.
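The relationship between frequency, period, and amplitude can be sketched in a few lines of Python. This is only an illustration: the function names, the 440 Hz tone, and the 44,100 points per second are assumptions chosen for the example, not values taken from the figures.

```python
import math

def sine_points(frequency_hz, amplitude, duration_s, points_per_second):
    """Return evenly spaced points along a sine wave as a list of floats."""
    count = round(duration_s * points_per_second)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / points_per_second)
        for n in range(count)
    ]

def period_seconds(frequency_hz):
    """The period is the time taken by one cycle: 1 / frequency."""
    return 1.0 / frequency_hz

# Hypothetical example values: a 440 Hz tone at full amplitude for 10 ms.
points = sine_points(frequency_hz=440, amplitude=1.0,
                     duration_s=0.01, points_per_second=44100)

print("number of points:", len(points))
print("largest value (amplitude):", round(max(points), 3))
print("period in milliseconds:", round(period_seconds(440) * 1000, 3))
```

Doubling the frequency halves the period, which is why higher pitches appear as more tightly packed cycles in a waveform display.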

Digital recording differs from analog recording in that the "picture" of the sound is created by measuring the voltage changes coming from the microphone and assigning a number to each measurement. The term "sampling" is used to describe the process of measuring an electrical signal's voltage thousands of times per second at a given level of precision (resolution). The number of measurements per second is called the "sampling rate" and is expressed in kilohertz (kHz). A rate of 11,000 measurements per second is thus designated as 11 kHz. Sampling rates range from 5 kHz to 48 kHz, with higher rates being used for the best quality recordings. Harry Nyquist (1889-1976), a Swedish-born U.S. communications engineer, discovered that the frequency range of a digitized sound is limited to one-half the sampling rate. Since humans can hear frequencies in a range of 20 hertz to about 20 kilohertz, it is necessary to sample at more than 40 kilohertz to capture the full range of frequencies perceptible to the human ear.
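As a rough sketch of the Nyquist limit described above, the Python below computes the highest frequency each sampling rate can capture. The function names are hypothetical. The rates 11,025, 22,050, and 44,100 Hz are an assumption: they are the exact figures conventionally behind the rounded "11 kHz", "22 kHz", and "44 kHz". The folding calculation shows the standard result that a tone above the limit is recorded as a lower, false frequency.

```python
def highest_frequency_hz(sampling_rate_hz):
    """Nyquist limit: half the sampling rate."""
    return sampling_rate_hz / 2

def covers_human_hearing(sampling_rate_hz, upper_limit_hz=20_000):
    """True when the rate is more than twice the upper limit of hearing."""
    return sampling_rate_hz > 2 * upper_limit_hz

def recorded_frequency_hz(tone_hz, sampling_rate_hz):
    """Frequency that actually ends up in the recording.

    Tones below the Nyquist limit are unchanged; tones above it
    fold back ("alias") to a lower frequency.
    """
    return abs(tone_hz - sampling_rate_hz * round(tone_hz / sampling_rate_hz))

for rate in (11_025, 22_050, 44_100, 48_000):
    print(rate, "Hz sampling ->",
          "limit", highest_frequency_hz(rate), "Hz,",
          "full hearing range:", covers_human_hearing(rate))

# A 15 kHz tone is above the limit of a 22,050 Hz recording.
print("15 kHz tone sampled at 22,050 Hz is recorded as",
      recorded_frequency_hz(15_000, 22_050), "Hz")
```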
The number of measurements per second, however, is only part of the picture. The degree of precision within each measurement is also important. This is known as "sampling resolution". Sampling resolution is used to divide the total range of the electrical voltage into discrete parts. Common sampling resolutions in use today are 8-bit and 16-bit. Sampling at 8 bits divides the voltage into 256 parts (2 to the 8th power). Sampling at 16 bits divides the voltage into 65,536 parts (2 to the 16th power). Using a higher sampling resolution creates cleaner recordings with less background noise. Higher sampling resolutions also capture a wider dynamic range. For example, an 8-bit digitizer will only capture a dynamic range of about 48 decibels (dB). Any portion of the sound that exceeds that range will be clipped and the resulting sample will sound distorted. 16-bit digitizers, however, capture a dynamic range of about 96 dB. The dynamic range of the human ear extends to 120 dB.
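The 256 and 65,536 figures, and the 48 dB and 96 dB figures, follow from simple arithmetic. The sketch below assumes the usual rule that dynamic range in decibels is 20 times the base-10 logarithm of the number of levels (about 6 dB per bit). The function names are made up for the illustration.

```python
import math

def number_of_levels(bits):
    """How many discrete parts the voltage range is divided into."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate dynamic range: 20 * log10(number of levels)."""
    return 20 * math.log10(number_of_levels(bits))

for bits in (8, 16):
    print(bits, "bits:",
          number_of_levels(bits), "levels,",
          round(dynamic_range_db(bits), 1), "dB")
```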
Quantization is the term that describes the process of measuring the amplitude of a sound and rounding off the measurements according to the sampling resolution. For example, an 8-bit sound digitizer will assign an integer value between 0 and 255 to the amplitude of each sample. The result is that the original smooth waveform is reconstructed as a staircase shape with only 256 discrete levels of amplitude, and noise is introduced into the signal. 16-bit digitizers, on the other hand, assign amplitude values on a scale of 0 to 65,535. At that level of precision, the reconstructed waveform is almost identical to the original and almost no noise is introduced.
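Quantization can be imitated in Python to see how much rounding error each resolution introduces. This is a minimal sketch, not a model of any particular digitizer: it assumes the input amplitude has been scaled to the range -1.0 to 1.0, and the function names are hypothetical.

```python
import math

def quantize(value, bits):
    """Round an amplitude in -1.0 .. 1.0 to an integer in 0 .. 2**bits - 1."""
    top = 2 ** bits - 1
    return round((value + 1.0) / 2.0 * top)

def reconstruct(code, bits):
    """Turn a stored integer back into an amplitude in -1.0 .. 1.0."""
    top = 2 ** bits - 1
    return code / top * 2.0 - 1.0

def worst_rounding_error(bits, points=1000):
    """Largest difference between one sine cycle and its quantized copy."""
    worst = 0.0
    for n in range(points):
        original = math.sin(2 * math.pi * n / points)
        rebuilt = reconstruct(quantize(original, bits), bits)
        worst = max(worst, abs(original - rebuilt))
    return worst

print("8-bit worst error: ", worst_rounding_error(8))
print("16-bit worst error:", worst_rounding_error(16))
```

The 8-bit error is a few thousandths of full scale, which is heard as noise; the 16-bit error is hundreds of times smaller.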

All of these measurements are made by an analog-to-digital converter. The measurements
can then be stored as binary numbers in a file on a computer's hard disk. To
play back the sound, the computer sends the information in the file to a digital-to-analog
converter which reproduces the original electrical signal. That signal is then
sent to a speaker which produces the sound as described earlier.
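The store-and-read-back cycle can be sketched with Python's standard `wave` and `struct` modules. Several things here are assumptions made for the illustration: the measurements are computed rather than taken from a real converter, the WAV container is just a convenient choice, and an in-memory buffer stands in for a file on a hard disk.

```python
import io
import math
import struct
import wave

RATE = 22050  # measurements per second (hypothetical choice)

# One tenth of a second of a 440 Hz tone as signed 16-bit measurements.
samples = [
    round(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / RATE))
    for n in range(RATE // 10)
]

# "Record": pack the measurements as binary numbers and store them.
buffer = io.BytesIO()  # stands in for a file on disk
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)   # mono
    out.setsampwidth(2)   # 2 bytes = 16 bits per measurement
    out.setframerate(RATE)
    out.writeframes(struct.pack("<%dh" % len(samples), *samples))

# "Play back": read the binary numbers out again.
buffer.seek(0)
with wave.open(buffer, "rb") as src:
    raw = src.readframes(src.getnframes())
    restored = list(struct.unpack("<%dh" % src.getnframes(), raw))

print("measurements stored:", len(samples))
print("bytes of sample data:", len(raw))
print("read back unchanged:", restored == samples)
```

Unlike tape, the stored numbers come back exactly as written; any loss of quality happens at the conversion steps, not in storage.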
Maximum precision per measurement combined with maximum sampling rates produces
the highest quality recordings. To describe a digital recording of a sound,
therefore, one can speak of the sampling rate and resolution. For example, sound
recorded at a sampling rate of 22 kHz with 8-bit resolution is considered to
be of a quality similar to that of a telephone call. Sound recorded at 44 kHz
and 16 bits is considered the minimum quality for compact disc recordings because
it captures the full range of human hearing. In multimedia production work,
11 kHz, 8-bit sound is sometimes acceptable for speech recordings and 22 kHz,
8-bit resolution or 11 kHz, 16-bit resolution is often considered acceptable
for music. For the highest-level multimedia work, however, nothing short of
44 kHz, 16-bit sound is acceptable.

When sound waves strike a microphone, they are converted to an electrical signal
which is measured many thousand times per second by an analog-to-digital converter
chip. The measurements are stored in the computer as binary numbers.
The higher the quality of sound, the more space it takes to store the sound.
A compact disc can store about 74 minutes of stereo sound at 44 kHz, 16-bit.
If you reduce the quality to 22 kHz, 8-bit stereo sound, however, you can store
approximately 300 minutes of audio on the same disc. In other words, one minute
of stereo sound takes 10 megabytes of storage at 44 kHz, 16-bit quality, and
only 2.5 megabytes of storage at 22 kHz, 8-bit quality. When producing sound
for multimedia, therefore, one must consider not only sound quality, but also
how the sound will be distributed. If your multimedia program will be distributed
on CD then you may have enough storage space to justify using the best quality.
If the program will be distributed on disk or through the internet, however,
you would consider using lower quality sound to avoid having to distribute many
disks or subject your users to long download times.
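The storage figures above can be checked with a short calculation: bytes per minute equal the sampling rate times the bytes per sample times the number of channels times 60. The sketch assumes the exact rates 44,100 Hz and 22,050 Hz behind the rounded "44 kHz" and "22 kHz", and the function name is hypothetical.

```python
def bytes_per_minute(rate_hz, bits, channels):
    """Uncompressed storage needed for one minute of sound."""
    return rate_hz * (bits // 8) * channels * 60

cd_quality = bytes_per_minute(44_100, 16, 2)   # 10,584,000 bytes
low_quality = bytes_per_minute(22_050, 8, 2)   # 2,646,000 bytes

print("44 kHz, 16-bit stereo:", cd_quality / 1_000_000, "million bytes per minute")
print("22 kHz, 8-bit stereo: ", low_quality / 1_000_000, "million bytes per minute")

# Halving the rate and halving the resolution cuts storage to one quarter,
# so the same disc holds about four times as many minutes.
print("minutes on a 74-minute disc at the lower quality:",
      74 * cd_quality // low_quality)
```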
Demonstration 1 - Digital Audio Sampling Rates and Resolutions
When sound is digitally recorded to a hard disk, a file format is assigned by
the recording software. Sound files are either RAM-based or Disk-based. To play
back a RAM-based file, your computer must have enough random access memory (RAM)
to hold the entire file. For example, a computer with 8 megabytes of RAM might
not be able to play a large RAM-based sound file but a computer with 16 megabytes
of RAM might have no problem with it. As a result, RAM-based sound file formats
are appropriate for use with short sound samples. On the Macintosh, System 7
sound and SND resource are common RAM-based file formats. System 7 sounds are
used to generate the various beeps and alert sounds used on the Macintosh. SND
resources are often used as sound resources in HyperCard stacks. A Macintosh
sound recording program, such as Macromedia's SoundEdit 16 or the freeware SoundHandle
1.0.3 can be used to create SND resources that can be saved directly into the
resource fork of a HyperCard stack. System 7 and SND file formats are most commonly
used with 22 kHz, 8-bit sound samples.
Disk-based sound file formats allow you to record music of any length and quality.
You are only limited by the amount of available storage space on your hard drive.
Disk-based sound file formats are ideal for longer and/or higher-quality samples.
AIFF (Audio Interchange File Format) is one of the most commonly used disk-based
file formats on Macintosh, Windows, and even Unix computers. Stereo AIFF sound
files recorded at 44 kHz, 16-bit quality are ideal for multimedia productions
that will be distributed on CD. Monophonic AIFF sound files recorded at 22 kHz,
16-bit quality are better for multimedia productions that will be distributed
via the internet because their file sizes are smaller than higher-quality samples.
If you use the internet frequently you have probably encountered sound files
in WAV and AU formats. The WAV format is used by Microsoft Windows and the AU
file format is used by computers running the UNIX operating system. Sound editing
software can convert among these and many other file formats.
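One way software tells these formats apart is by the first few bytes of the file. The sketch below relies on the well-known signatures: AIFF files begin with "FORM" and carry "AIFF" (or "AIFC" for the compressed variant) at byte 8, WAV files begin with "RIFF" and carry "WAVE" at byte 8, and AU files begin with ".snd". The function name is made up, and the sample headers are hand-built stand-ins rather than real files.

```python
def sniff_sound_format(header):
    """Guess a sound file format from the first 12 bytes of the file."""
    if len(header) >= 12 and header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:4] == b".snd":
        return "AU"
    return "unknown"

# Hand-built example headers; the four zero bytes stand in for a size field.
examples = {
    "aiff": b"FORM" + bytes(4) + b"AIFF",
    "wav": b"RIFF" + bytes(4) + b"WAVE",
    "au": b".snd" + bytes(8),
    "other": b"not a sound file",
}

for name, header in examples.items():
    print(name, "->", sniff_sound_format(header))
```

To try it on a real file, read the first 12 bytes with `open(path, "rb").read(12)` and pass them to the function.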
The Musical Instrument Digital Interface (MIDI) is a hardware and
software standard that, among other things, allows users to record a complete
description of a lengthy musical performance using only a small amount of disk
space. Standard MIDI Files can be played back using the sound synthesis hardware
of a Mac or PC. Using MIDI, Beethoven's Fifth Symphony uses about 1.3 megabytes
of storage and can fit on one floppy disk. Using a digital audio file format
like AIFF, the same symphony uses over 300 megabytes of hard disk storage. One
problem with MIDI is that the quality of the actual sound you hear will vary
depending on the quality of your computer's sound hardware. For educational
applications, however, MIDI-generated sound can be used to demonstrate musical
ideas quite effectively. Another problem with MIDI in the past was the lack
of a standard sound set. A MIDI file designed to be played with piano and flute
sounds might be realized with organ and clarinet on another person's computer.
This problem was partially solved by the advent of the General MIDI standard
which created a standard set of 128 sounds. Virtually all MIDI files today are
distributed in General MIDI format. Still it was left to the owner of each computer
to be sure their sound hardware could play the General MIDI sounds. Apple Computer
solved the problem with the latest version of its QuickTime software.
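A short Python sketch shows why MIDI data is so compact: it stores instructions such as "select the flute sound" and "start middle C" rather than measurements of the sound itself. The status bytes used here (0xC0 for program change, 0x90 for note on, 0x80 for note off) are the standard MIDI channel messages, and flute is General MIDI sound 74 (73 when counting from zero). The function names are hypothetical, and a Standard MIDI File also stores timing and header information that is not shown.

```python
def program_change(channel, program):
    """Select one of the 128 sounds (0-127) on a channel (0-15)."""
    return bytes([0xC0 | channel, program])

def note_on(channel, note, velocity):
    """Start a note; note 60 is middle C, velocity is loudness (0-127)."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Stop a note."""
    return bytes([0x80 | channel, note, 0])

FLUTE = 73      # General MIDI sound 74, counting from zero
MIDDLE_C = 60

# Everything needed to describe one flute note of any length.
messages = program_change(0, FLUTE) + note_on(0, MIDDLE_C, 64) + note_off(0, MIDDLE_C)
print("MIDI description of one note:", len(messages), "bytes")

# The same note held for one second as 44 kHz, 16-bit stereo audio.
print("one second of digital audio: ", 44_100 * 2 * 2, "bytes")
```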
Below is an excerpt from "Doodlin" recorded as a standard MIDI file and converted to a QuickTime movie. This file takes up only 8 kilobytes of storage and would load in a few seconds via modem.
Download this QuickTime movie (1.4 MB) which demonstrates the process of converting a standard MIDI file into a QuickTime movie using the MoviePlayer Pro application available from Apple Computer's web site (www.apple.com or quicktime.apple.com). Step through this movie frame by frame to see screen shots of the process of converting a MIDI file to a QuickTime movie.
The final product is embedded below.
You are welcome to download the original MIDI file, bachinv4.mid, for use in your music sequencing or music notation applications.
Web sites can be used to exchange MIDI files, collaborate on MIDI sequences, and engage in group compositions. If you convert your MIDI files to QuickTime movies then multiple MIDI files can be embedded in a single page, allowing visitors to participate in a jam session with the music elements you provide. The Blues Jam page is an example of this application.
One of Apple Computer's most brilliant innovations is the continuing
development of QuickTime. QuickTime began as a set of system extensions to Macintosh
System 7 to allow users to play digitized video in a small window on the screen.
Today QuickTime is a comprehensive multimedia tool for storing video, animations,
and sound in a variety of formats. It is also a cross-platform tool, meaning
that QuickTime movies can be viewed and heard using computers running Mac OS,
Windows, or even UNIX.
So what does Apple's QuickTime technology have to offer educators? The answer
is plenty. The free version of QuickTime, available from Apple's web site at
www.apple.com, comes with MoviePlayer to
play back QuickTime content. Content creators must purchase the "Pro" version
of QuickTime for $30. The "Pro" version comes with MoviePlayer Pro which can
convert standard MIDI files into QuickTime movies that can be played back by
any Macintosh computer (Mac II or later) or any PC with a sound card and Windows
3.1 or later. QuickTime MIDI movies use just a little more disk storage space
than the MIDI files on which they are based. The actual sound is produced by
a software synthesizer that QuickTime installs on your computer's hard disk.
MoviePlayer can be used to convert audio from compact discs into QuickTime movies
that can be used in multimedia presentations. MoviePlayer can be used to add
sound and text tracks to digital video. Using a video recorder, Apple's free
Video Player software, and a Mac equipped with video input, you could record
a movie demonstrating instrumental techniques and then use MoviePlayer to add
a voiceover narrative. You could also add a descriptive voice narrative to a
QuickTime MIDI movie containing a full performance of a complex work. QuickTime
comes with several software CODECs (compressor/decompressor) to reduce file
size while retaining quality. For music, the QDesign Music Compressor is excellent.
For speech, the Qualcomm PureVoice Compressor is a good choice. For video,
the Sorenson compressor does an impressive job of reducing file size for the
visual portion of the video. When used in combination with the QDesign or Qualcomm
audio compressors, file size can be made manageable for transmission over the
internet. A "Fast Start" feature is also available to allow the movie to begin
playing while still downloading to the user's computer. The next version of
QuickTime, version 4.0 currently in Beta testing, allows for streaming live
content as well.
QuickTime movies can be loaded onto any web server and included in web pages by using the appropriate EMBED code.
<EMBED SRC="doodle16.mov" AUTOPLAY=FALSE WIDTH=150 HEIGHT=24>
For more detailed and advanced editing of video and audio, of course, you might purchase professional software like Adobe Premiere and Macromedia SoundEdit 16. Using free and shareware software available from Apple and others, however, you can create multimedia presentations to inspire and educate your students.

Figure 4 - QuickTime
Apple Computer's QuickTime software can be used to create movies with any combination
of video, audio, MIDI data, text, and animations.
To begin working with multimedia sound you will need a multimedia computer with sound
input and sound output hardware. Every Apple Macintosh in production today comes
with all the necessary hardware and software you will need to begin. On some models,
however, the PlainTalk microphone is a $30 option. For some PCs running Windows, you may need to buy a sound card and have it properly installed by a technician.
A great place to find shareware and freeware audio software for your computer is www.shareware.com. QuickTime software and links to other multimedia software can be found at Apple
Computer's QuickTime web site, http://www.apple.com/quicktime/.