This development phase implements synthesized bongo beats that are played through the speakers, and microphones that ‘listen’ for the data being sent by the other computer.
Each computer is assigned two bongo beats that differ in pitch, making a total of 4 distinct beats. Each computer listens for the other’s beats while ignoring its own. Producing the bongo beats is not a big deal, and only takes the following code:
    for (i = 0; i < packetsize; i++) {
        a = (int)buf[i];
        /* one beat per bit, most significant bit first */
        for (h = 7; h >= 0; h--) {
            if ((1 << h) & a)
                write(dsp, hi, HI_SIZE);    /* 1 bit */
            else
                write(dsp, lo, LO_SIZE);    /* 0 bit */
        }
    }
The above code plays each packet using the two predetermined beats, one beat per bit. The ‘hi’ and ‘lo’ are unsigned char arrays that hold the raw sound streams in memory for quick access. They are loaded using the following code:
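    #include <fcntl.h>     /* open, O_RDONLY */
    #include <unistd.h>    /* read, write */

    int raw_lo, raw_hi;
    unsigned char lo[LO_SIZE], hi[HI_SIZE];

    /* load the raw samples for each beat into memory once */
    raw_lo = open("lo.raw", O_RDONLY, 0);
    raw_hi = open("hi.raw", O_RDONLY, 0);
    read(raw_lo, lo, LO_SIZE);
    read(raw_hi, hi, HI_SIZE);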
Sound Recognition
The actual ugly part comes next.
Sound recognition is a little bit tricky since it has to be done in real time, and there is no room for error. A single misread beat and the whole packet is garbage. Let’s take a quick look at a bongo waveform.
The pitch of a bongo beat remains the same for the whole beat (i.e. the periods are all the same length).
There are only very minor fluctuations in the waveform, and the pitch recognition algorithms have to take those into account. The amplitude of each beat decreases continuously. In reality, bongo beats have more fluctuation in amplitude, and look more like the following.
These fluctuations also cause problems, and once again the pitch recognition algorithms have to take care of such cases, with some sort of threshold on how much fluctuation is allowed.
The complexity in the algorithms is that they have to read data in from the microphone continuously and analyze it in real time. When first started, the Linux boxes send a type of “preamble” of alternating high and low bongo beats, which the other box listens to in order to synchronize itself to the correct pitch. The preamble was eventually changed to account for some problems, as discussed later. To determine the pitch, the zero crossings are found by looking for negative-going or positive-going data, and are placed in a sliding buffer. The distance between consecutive zero crossings is constantly recorded and compared to the previous one. If a series of these distances come out the same, the pitch found is compared to the test data and is determined to be either a logical 1 or a logical 0.
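The zero-crossing idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: it works on a buffer of signed 16-bit samples instead of a live microphone stream, it counts only negative-to-positive crossings, and the name `classify_beat` and every constant (`HI_PERIOD`, `LO_PERIOD`, `TOLERANCE`, `STABLE_RUNS`) are made up for the example.

```c
#include <stdlib.h>

/* All constants are illustrative assumptions, not values from the project. */
#define HI_PERIOD   10  /* samples between crossings for the high beat */
#define LO_PERIOD   20  /* samples between crossings for the low beat */
#define TOLERANCE    1  /* allowed jitter, in samples */
#define STABLE_RUNS  3  /* consecutive matching periods required */

/* Returns 1 for the high beat, 0 for the low beat, -1 if no stable,
   recognized pitch is found in the n samples of s. */
int classify_beat(const short *s, int n)
{
    int last_cross = -1;   /* index of the previous zero crossing */
    int last_period = -1;  /* previous distance between crossings */
    int run = 0;           /* how many periods in a row have matched */

    for (int i = 1; i < n; i++) {
        if (s[i - 1] < 0 && s[i] >= 0) {      /* negative-to-positive */
            if (last_cross >= 0) {
                int period = i - last_cross;
                if (last_period >= 0 &&
                    abs(period - last_period) <= TOLERANCE)
                    run++;
                else
                    run = 0;
                last_period = period;
                if (run >= STABLE_RUNS) {
                    if (abs(period - HI_PERIOD) <= TOLERANCE) return 1;
                    if (abs(period - LO_PERIOD) <= TOLERANCE) return 0;
                }
            }
            last_cross = i;
        }
    }
    return -1;
}
```

The tolerance plays the role of the fluctuation threshold mentioned above: too small and real beats are rejected, too large and the two pitches start to overlap.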
Now the actual pitch determination is fairly easy.
The problem lies in the ‘beat determination’ (i.e. what if we count one bongo beat twice? maybe three times?). Two different approaches were taken to solve this problem, one more successful than the other. The original idea was to also record and keep track of the amplitude. By comparing the amplitude at each pitch recording, we could assume that if the amplitude was still decreasing, then we were still listening to the same bongo beat. If the amplitude rose sharply, then we would assume that it was a new beat. This method, dubbed the ‘Decreasing Amplitude Method’, was mostly successful, but in a few cases the algorithm failed because of higher than expected amplitude modulation. The solution was not as simple as changing the threshold of the amplitude comparison, as too much slack would cause beat loss (i.e. two beats taken as only one). The occasional loss of packets was not acceptable, so a more stable method was required.
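To make the trade-off concrete, here is a small sketch of the Decreasing Amplitude idea. It is an assumption-laden illustration, not the original code: the input is taken to be one peak amplitude per pitch recording, the first recording is assumed to start a beat, and the name `count_beats` and the 50% rise threshold (`RISE_NUM`/`RISE_DEN`) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical threshold: a peak more than 3/2 times the previous
   peak is treated as a new strike. */
#define RISE_NUM 3
#define RISE_DEN 2

/* Counts beats in a sequence of per-recording peak amplitudes.
   While the amplitude keeps falling we are still in the same beat;
   a sharp rise starts a new one. */
int count_beats(const int *peak, size_t n)
{
    if (n == 0)
        return 0;
    int beats = 1;                    /* assume the first peak is a beat */
    for (size_t i = 1; i < n; i++) {
        if ((long)peak[i] * RISE_DEN > (long)peak[i - 1] * RISE_NUM)
            beats++;
    }
    return beats;
}
```

With a cleanly decaying input such as 1000, 800, 600, 400, 1000, 800, 600 this counts two beats. A single beat whose amplitude wobbles, such as 1000, 700, 1100, 600, is also counted as two, which is the kind of failure described above; loosening the threshold to avoid it risks merging two real beats into one.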
The SAW technique
As it turns out, Prof. Townsend and Dr. Keppel-Jones, a mathematics professor at the university, had developed some wave analysis techniques that could be applied to the pitch recognition I needed. The technique, called ‘SAW’ (Sliding Averaging Window), has reduced the errors to almost none, and takes care of any sort of amplitude fluctuation.
Let’s take a look at how it works. Say we have the following bongo waveform, labelled ‘A’, below:
We’ll take a sliding window (in this case of size 5) and run it over the entire waveform.
At each iteration, the middle of the window (i.e. position 3) is written with the average of all 5 values. It is basically a ‘smoothing’ out of the curve. The result does not look much different, as shown below. We’ll call this waveform ‘B’.
While both look the same, the magic happens when B is subtracted from A (shown below).
The new waveform goes to 0, except at the start of the beat.
A bongo beat is distorted at first, which is what happens when the surface of any sort of object is struck. This distortion now provides a safe way of determining when a bongo has been struck. Because the sliding window needs only a small, fixed amount of work per sample, the technique is well suited to pitch recognition in a real-time environment.
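The smoothing and subtraction steps can be sketched like this. It is a minimal illustration rather than the code used in the project: it works on a whole buffer of samples instead of a live stream, it leaves the two samples at each edge unsmoothed, and the names `saw_smooth` and `find_strike` as well as the `threshold` parameter are assumptions made for the example.

```c
#include <stdlib.h>

#define WIN 5   /* window size used in the text */

/* Builds waveform B: b[i] is the average of the WIN samples of A
   centred on i. Edge samples are copied unchanged (an assumption). */
void saw_smooth(const int *a, int *b, int n)
{
    for (int i = 0; i < n; i++) {
        if (i < WIN / 2 || i >= n - WIN / 2) {
            b[i] = a[i];
            continue;
        }
        long sum = 0;
        for (int k = -WIN / 2; k <= WIN / 2; k++)
            sum += a[i + k];
        b[i] = (int)(sum / WIN);
    }
}

/* Returns the index of the first sample where |A - B| exceeds
   threshold (taken as the strike), or -1 if there is none. */
int find_strike(const int *a, int n, int threshold)
{
    if (n <= 0)
        return -1;
    int *b = malloc((size_t)n * sizeof *b);
    if (b == NULL)
        return -1;
    saw_smooth(a, b, n);
    int hit = -1;
    for (int i = 0; i < n; i++) {
        if (abs(a[i] - b[i]) > threshold) {
            hit = i;
            break;
        }
    }
    free(b);
    return hit;
}
```

On a smooth stretch of waveform the average is close to the middle sample, so A minus B stays near zero; only the abrupt distortion at the strike leaves a large difference.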
With the pitch recognition working, the only task left to do was to replace the ethernet bridge with the microphones, speakers, and pitch
recognition. The idea was simple.
1) Take any packet from the wire and ‘play it’ through the speakers.
2) Listen for packets with the microphone, and send them out the network card.
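For step 2, the detected beats have to be turned back into bytes before they can go out the network card. The receiving code is not shown here, so the following is only a guess at its shape: it mirrors the sender by packing bits most significant bit first, and the name `beats_to_bytes` and its parameters are hypothetical.

```c
#include <stddef.h>

/* bits[] holds one detected beat per element: 1 = high, 0 = low.
   Packs them into out[], most significant bit first (mirroring the
   sender), and returns the number of complete bytes written.
   Leftover bits that do not fill a byte are dropped. */
size_t beats_to_bytes(const unsigned char *bits, size_t nbits,
                      unsigned char *out, size_t outcap)
{
    size_t nbytes = 0;
    unsigned char cur = 0;

    for (size_t i = 0; i < nbits; i++) {
        cur = (unsigned char)((cur << 1) | (bits[i] & 1));
        if (i % 8 == 7) {               /* eighth bit of this byte */
            if (nbytes == outcap)
                break;                  /* output buffer is full */
            out[nbytes++] = cur;
            cur = 0;
        }
    }
    return nbytes;
}
```

Because every bit shifts all later bits into place, a single missed or doubled beat corrupts the rest of the packet, which is why the beat determination has to be so reliable.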
With all the phases completed successfully, it only took an hour to tie all the phases together and have a successful ping reply:
    C:\DOCUME~1\SYN>ping -n 1 -l 1 -w 1000000000 199.212.55.2

    Pinging 199.212.55.2 with 1 bytes of data:

    Reply from 199.212.55.2: bytes=1 time=139274ms TTL=254

    Ping statistics for 199.212.55.2:
        Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 139274ms, Maximum = 139274ms, Average = 139274ms
Didn’t think it was THAT easy, did you?
Actually, a little bit of cheating was required. Since the link has high latency and very low bandwidth, it was found that the ARP requests were flooding the bongo link.
The ARP requests allow the router and my machine to resolve addresses (i.e. convert IP addresses to MAC addresses). To solve this problem, when the Bongo Link is first brought up, the preamble discussed above actually sends the appropriate IP address (from either the router or the PC on the other side of the Bongo Link) to the other Linux box, based on the ARP request it receives. These addresses are then taken, and appropriate generic packets are generated with their appropriate checksums. From this point on, any further ARP requests are not sent through the Bongo Link; instead the generic packet is sent back, acting as an ARP reply.
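As a rough idea of what such a canned reply contains, the sketch below fills in a standard 42-byte Ethernet frame carrying an ARP reply. This is not the project's code: the function name `build_arp_reply` and its parameters are invented for the example, and it leaves out any checksum handling (the Ethernet frame check sequence is assumed to be added by the network card).

```c
#include <string.h>

#define ARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body */

/* Fills frame[] with an ARP reply saying "owner_ip is at owner_mac",
   addressed to the host that asked (asker_mac / asker_ip). */
void build_arp_reply(unsigned char frame[ARP_FRAME_LEN],
                     const unsigned char owner_mac[6],
                     const unsigned char owner_ip[4],
                     const unsigned char asker_mac[6],
                     const unsigned char asker_ip[4])
{
    static const unsigned char fixed[10] = {
        0x08, 0x06,     /* EtherType: ARP */
        0x00, 0x01,     /* hardware type: Ethernet */
        0x08, 0x00,     /* protocol type: IPv4 */
        6, 4,           /* hardware and protocol address lengths */
        0x00, 0x02      /* opcode 2: reply */
    };

    memcpy(frame,      asker_mac, 6);   /* Ethernet destination */
    memcpy(frame + 6,  owner_mac, 6);   /* Ethernet source */
    memcpy(frame + 12, fixed, sizeof fixed);
    memcpy(frame + 22, owner_mac, 6);   /* sender hardware address */
    memcpy(frame + 28, owner_ip, 4);    /* sender protocol address */
    memcpy(frame + 32, asker_mac, 6);   /* target hardware address */
    memcpy(frame + 38, asker_ip, 4);    /* target protocol address */
}
```

Since the addresses on each side do not change while the link is up, a frame like this can be built once and replayed locally for every later ARP request, so none of them has to cross the Bongo Link.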