So lately I'm into Optical Music Recognition (OMR), and a central part of that is staff line removal: getting rid of the staff lines that obscure the musical symbols, which makes recognition much easier. There are a lot of ways to do it, but I'm going to share how I did it (fairly easily) with Hidden Markov Models (HMMs), which also makes for a good lesson in this wonderfully useful approach.
OMR has been around for ages, and if you're interested in learning about it, [Fornes 2014] and [Rebelo 2012] are good summary articles.
The matter of Staff Line Removal has occupied dozens of researchers for as long as OMR has existed; [Dalitz 2008] gives a good overview. Basically the goal is to remove the staff lines that obscure the musical symbols, so that the symbols are easier to recognize.
But the staff lines are connected to the symbols, so simply removing them will cut up the symbols and make them hardly recognizable.
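To make the problem concrete, here is a minimal sketch of the naive approach (this is not the HMM method from the post). It assumes a made-up 5x8 binary image, and the helper names `find_staff_rows` and `remove_rows_naively` are hypothetical. It treats any mostly-inked row as a staff line and blanks it completely.

```python
# Tiny binary "score": 1 = ink, 0 = background (made-up data for illustration).
IMAGE = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],  # a staff line crossing a vertical stem in column 3
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]


def find_staff_rows(image, min_fill=0.8):
    """Return the rows whose ink ratio is at least min_fill (assumed threshold)."""
    width = len(image[0])
    return [y for y, row in enumerate(image) if sum(row) / width >= min_fill]


def remove_rows_naively(image, rows):
    """Blank every pixel in the given rows, symbol pixels included."""
    return [[0] * len(row) if y in rows else list(row)
            for y, row in enumerate(image)]


staff_rows = find_staff_rows(IMAGE)
cleaned = remove_rows_naively(IMAGE, staff_rows)

# Read the stem column top to bottom after the removal.
stem_column = [row[3] for row in cleaned]
print("staff rows:", staff_rows)
print("stem after removal:", stem_column)
```

The stem comes out with a hole exactly where the staff line crossed it, which is the "cut up" effect described above.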
So let's see how we could do this with HMMs.
Continue reading "Using Hidden Markov Models for staff line removal (in OMR) [w/code]"
So I needed to speed up / slow down an audio stream I had (speech generated with Flite TTS) and naively I thought it would suffice to simply sample it at the right intervals and interpolate.
I quickly discovered that just re-sampling won't do, because playing the samples back faster or slower also shifts the pitch proportionally. And then I discovered the world of Time Scaling in audio and its many algorithms and approaches for changing the tempo without changing the pitch.
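A small sketch shows the effect. It assumes a synthetic 200 Hz sine tone instead of real Flite output, and the helpers `resample_linear` and `estimate_frequency` are hypothetical names. It is plain interpolation, not what SoundTouch or any other time-scaling library does.

```python
import math


def resample_linear(samples, speed):
    """Naive speed change: step through the input by `speed`, interpolating linearly."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += speed
    return out


def estimate_frequency(samples, sample_rate):
    """Rough frequency estimate: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0.0 <= b)
    return crossings * sample_rate / len(samples)


SAMPLE_RATE = 8000  # assumed sample rate for the example
# One second of a 200 Hz sine tone.
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

fast = resample_linear(tone, 1.5)  # 1.5x speed-up

original_hz = estimate_frequency(tone, SAMPLE_RATE)
fast_hz = estimate_frequency(fast, SAMPLE_RATE)
print("original: %.0f Hz, %d samples" % (original_hz, len(tone)))
print("sped up:  %.0f Hz, %d samples" % (fast_hz, len(fast)))
```

The sped-up signal is shorter, as wanted, but its estimated frequency rises by roughly the same factor as the speed. On speech that is the "chipmunk" effect.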
To my surprise there were a number of ready-made free libraries that do it. The first one I tried - RubberBand - did not work out: it had too many dependencies, and I simply couldn't be bothered compiling it for the Mac. SoundTouch, on the other hand, had a Homebrew formula, so it won by default.
I wrote a simple little wrapper around it that interfaces nicely with Qt.
Let's see what's going on there.
Continue reading "Touch up your sound with SoundTouch [w/code]"
So, I've been trying to stream audio off of a USB microphone connected to an Arduino Yun.
Looking into it online I found some examples using ffserver & ffmpeg, which sounded like they could do the trick.
However, right from the start I've had many problems playing the streams on Android and iOS devices.
Seems Android likes a certain list of codecs (http://developer.android.com/guide/appendix/media-formats.html) and iOS likes a different set of codecs (Link here), but they do have one codec in common - good ol' MP3.
Unfortunately, the OpenWRT on the Arduino Yun has an ffmpeg build which does not provide MP3 encoding... it does have the MP3 muxer/container format, but streaming anything other than MP3 in it (for example MP2, which the Yun-ffmpeg does have) simply doesn't work on Android/iOS.
From experiments streaming an ffmpeg/libmp3lame MP3 stream from my PC, it looks like the mobile devices are quite happy with it - so I will need to recompile ffmpeg with Lame MP3 support to be able to stream it.
Continue reading "FFMpeg with Lame MP3 and streaming for the Arduino Yun"
This is not a proper technical thingy, but I took some time to try out some audio skills by doing a somewhat obvious mashup.
Came out pretty good, in my opinion.
Sara Bareilles's "Brave" and Katy Perry's "Roar" sound very similar. So I took the two acapellas and an instrumental and mixed them together.
Enjoy (or.. not)