{"id":91,"date":"2012-03-05T23:14:42","date_gmt":"2012-03-06T07:14:42","guid":{"rendered":"http:\/\/www.earlevel.com\/main\/?p=91"},"modified":"2012-05-23T23:15:10","modified_gmt":"2012-05-24T07:15:10","slug":"convolution%e2%80%94in-words","status":"publish","type":"post","link":"https:\/\/www.earlevel.com\/main\/2012\/03\/05\/convolution%e2%80%94in-words\/","title":{"rendered":"Convolution\u2014in words"},"content":{"rendered":"<p>Convolution is a convoluted topic\u2014and that\u2019s what it means (<em>convoluted<\/em>, from Merriam-Webster : \u201cExtremely complex and difficult to follow. Intricately folded, twisted, or coiled.\u201d).<\/p>\n<p>Really, it\u2019s more difficult to explain why you would want to use convolution than it is to explain the mathematical function itself. I wrote a more technical article nearly a year ago, and it went unpublished because I didn\u2019t have time to write the interactive and animated graphs that I wanted to accompany it. Revisiting the topic, I decided it was better to explain it in words from an intuitive point of view, followed by an article on the mathematical implementation later, and audio examples.<\/p>\n<h3>Hello!<\/h3>\n<p>I hope that most people are familiar\u2014from either personal experience, or maybe a cartoon\u2014with the effect of an echo off a distant canyon wall. 
You shout, and moments later you hear your shout repeated back to you, though not as loud (the original in red, the quieter echo in blue):<\/p>\n<p><a href=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/hello.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/hello.png\" alt=\"\" title=\"hello\" width=\"124\" height=\"21\" class=\"alignnone size-full wp-image-95\" \/><\/a><\/p>\n<p>If we gave an impulse\u2014perhaps firing a starter pistol\u2014we\u2019d hear a response that has the same spacing and amplitude:<\/p>\n<p><a href=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/pop.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/pop.png\" alt=\"\" title=\"pop\" width=\"137\" height=\"63\" class=\"alignnone size-full wp-image-94\" \/><\/a><\/p>\n<p>Note that we don\u2019t need to go to that canyon to get the same results in a recording studio\u2014we could mix together a shout of \u201cHello!\u201d with an attenuated and delayed copy of it. 
Our impulse response tells us, precisely, how much to attenuate and to delay the copy.<\/p>\n<p>Now, consider what happens when you continue to shout instead of pausing to hear the reflection:<\/p>\n<p><a href=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/4score.png\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/4score.png\" alt=\"\" title=\"4score\" width=\"608\" height=\"25\" class=\"alignnone size-full wp-image-96\" srcset=\"https:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/4score.png 608w, https:\/\/www.earlevel.com\/main\/wp-content\/uploads\/2012\/03\/4score-300x12.png 300w\" sizes=\"(max-width: 608px) 100vw, 608px\" \/><\/a><\/p>\n<p>A nearby listener would hear the original speech, starting at the beginning (the first pop), and a delayed, quieter copy starting at the time of the second pop. The two speeches would be jumbled together.<\/p>\n<p>Now consider being inside of an empty gymnasium, where you hear not just one discrete echo, but many, including echoes of echoes as the sound bounces between walls. We could get an impulse response of the gym with a starter pistol, and it too would tell us where to overlap copies\u2014of whatever speech or sound might go on in the gym\u2014and their relative volumes.<\/p>\n<p>As you can imagine, piecing together a facsimile from a signal (speech, music\u2026) and the room\u2019s known impulse response gets more complicated (convoluted!) as the impulse response has more features. In the digital realm, our \u201cfeatures\u201d are individual samples, so the complexity is determined by how many samples there are in the impulse response\u2014the longer the impulse response, the greater the number of computations required to scale and add in \u201ccopies\u201d. You won\u2019t want to simulate the results of a reverberant room manually like we did with the single-echo example. 
Fortunately, we can do much better\u2014we can compute the results exactly, given an exact impulse response. We say that we <em>convolve<\/em> the signal with an impulse response\u2014the process is called <em>convolution<\/em> (just like we <em>multiply<\/em> two numbers in a process called <em>multiplication<\/em>).<\/p>\n<h3>More on getting the impulse response<\/h3>\n<p>There are many ways to generate an impulse. Have you ever gone into a nearly empty gym or warehouse and clapped your hands together sharply once, to hear the \u201csound\u201d of the room? You were analyzing its impulse response. Popping a balloon is another way. A perfect impulse has equal amounts of all frequencies\u2014like white noise condensed into a spike. It\u2019s impossible to attain this ideal impulse, but we can get close enough to handle the audio band. Often, however, impulse responses of large rooms are taken by sweeping a sine wave through the audio band\u2014a \u201cchirp\u201d\u2014because it\u2019s easier to get a more accurate result, and a better signal-to-noise ratio, than trying to make a loud impulse that\u2019s practically ideal. In essence, a chirp is an impulse spread over time.<\/p>\n<p>In the digital realm, an impulse can be readily approximated by its band-limited version\u2014a single unit sample in the midst of zero samples. To get the impulse response of a digital filter, for instance, run this single-sample impulse through the filter\u2014the impulse response is its output. For an FIR filter, the impulse response is equal to its coefficients (because, conversely, standard FIR filters are normally implemented by convolution).<\/p>\n<p>And, of course, we can compute an impulse response instead of measuring it. We do this routinely for FIR filters. And to combine two serial FIR filters into one, just convolve their impulse responses (which is to say, their taps). 
We could calculate the response of an imagined room as well, for use as a reverb effect.<\/p>\n<h3>Using convolution for audio effects<\/h3>\n<p>For changing signals such as music, longer delays have less correlation, and sound like echoes, while shorter delays cause more frequency cancellation and sound like filtering. This allows us a wide range of tonal and spatial effects for audio via convolution.<\/p>\n<p>And while I use the term \u201cimpulse response\u201d throughout this article, there\u2019s nothing stopping you from convolving any two sounds, including instruments\u2014a trumpet note convolved with a bowed cymbal, for example.<\/p>\n<h4>Limitations<\/h4>\n<p>Convolution is a useful tool for reproducing linear, time-invariant effects.<\/p>\n<p>Linear means that the output simply scales with the input at a constant ratio. An identical input signal half as loud produces the same output half as loud. Examples of linear effects are typical fixed filters and echoes. A distortion pedal is non-linear\u2014playing louder creates not just a louder version of the same sound, but a different sound.<\/p>\n<p>Time-invariant means that the impulse response doesn\u2019t change over time. If you input a signal to a time-invariant system right now, the output will sound the same as doing it five minutes from now\u2014nothing changes except the five minutes. A flanger is not time-invariant. Playing right now, your signal might start at the top of the sweep, while playing at an arbitrary time later it might start mid-sweep or at the bottom.<\/p>\n<p>Convolution is not convenient for time-varying effects, as they would require that the impulse response change constantly. You could do this\u2014cycle through changing, possibly interpolated impulse responses\u2014but that\u2019s not a practical solution for most effects.<\/p>\n<p>Likewise, convolution for non-linear effects would require a different impulse response for different instantaneous levels at the input. 
To be fully general, that would be for every possible input level (65,536 for 16-bit resolution), though more practically most effects could be done by using far fewer levels and interpolation, because good-sounding audio processes are not completely random\u2014the saturation level of a distortion effect rolls on gradually and monotonically, it doesn\u2019t jump all over the place. Still, convolution loses much of its appeal for non-linear effects, because most non-linear effects can be done more simply other ways.<\/p>\n<h4>Convolution reverb<\/h4>\n<p>Even though convolution has been used in filtering since the dawn of digital audio, most musicians are aware of the term from convolution reverb. Convolution reverb is a boon for giving people access to \u201crealistic\u201d acoustic spaces, but it shares all of the limitations, and more, with algorithmic reverb. It\u2019s an exaggeration to say that it puts your instruments in a real space\u2014more like it puts your instruments through a speaker (or speakers) in a physical space, and gives you the sound mic\u2019d at a point in that space (with multiple mics for multiple impulse responses for stereo and other multi-channel effects processing). And you lose the chance of capturing non-linearities and time variations, which may play a part in some spaces.<\/p>\n<p>Want the effect of a sound coming from within a closed cardboard box? Generate an impulse inside the box, and capture it outside the box. Need the effect of someone shouting for help from inside a storm drain for a movie without making the actor climb into the storm drain? Maybe you can lower a sound generator into a storm drain and mic it from the outside, then convolve the actor\u2019s voice with that impulse response\u2014and no one needs to get dirty. The sound of a telephone or other small speaker? 
Wire an electrical impulse directly, and mic the output to get the response.<\/p>\n<p>The web has many pre-made impulse responses, so we can use spaces that we don\u2019t have access to. Play your pipe organ sample via the room response of a large cathedral\u2014or play your guitar through a classic spring reverb, played through an antique radio&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Convolution is a convoluted topic\u2014and that\u2019s what it means (convoluted, from Merriam-Webster : \u201cExtremely complex and difficult to follow. Intricately folded, twisted, or coiled.\u201d). Really, it\u2019s more difficult to explain why you would want to use convolution than it is &hellip; <a href=\"https:\/\/www.earlevel.com\/main\/2012\/03\/05\/convolution%e2%80%94in-words\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14,4,19,15,13],"tags":[],"_links":{"self":[{"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/posts\/91"}],"collection":[{"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/comments?post=91"}],"version-history":[{"count":1,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/posts\/91\/revisions"}],"predecessor-version":[{"id":97,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/posts\/91\/revisions\/97"}],"wp:attachment":[{"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/media?parent=91"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/categories?post
=91"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.earlevel.com\/main\/wp-json\/wp\/v2\/tags?post=91"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}