from Hacker News

Show HN: RiffBank – A reverse guitar tab search engine

by ryrobes on 12/6/13, 4:06 PM with 34 comments

A little side project from the past few months: indexing ASCII guitar tab sections as pseudo-language "words" (notes, chords) and "sentences" (riffs) that can be indexed and queried with ElasticSearch (Apache Lucene).
  • by adrianh on 12/6/13, 7:33 PM

    Nice! I was thinking of adding a feature like this to my animated-guitar-tabs site Soundslice (http://www.soundslice.com/). I've got time-synced data for each note in a song, with string and fret values, so I'm planning to do stuff like "cliche lick" detection. Searching by riff would be a great addition.

    I talk about this a little bit toward the end of this tech presentation I gave: http://37signals.com/talks/soundslice

    I definitely echo what some other comments have said -- if you made it based on intervals instead of "hard-coded" notes, it'd be a lot more flexible! That seems to be how our own brains store music -- if somebody sings "Happy Birthday," for example, you can instantly join in, regardless of what key it's in. Unless you're tone deaf... :-)
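
    The interval idea can be sketched roughly as follows. This is a minimal illustration, not anything from RiffBank or Soundslice: it assumes standard tuning, and `OPEN_STRING_MIDI`, `to_intervals`, and the sample riff are hypothetical names and data made up for the example.

```python
# MIDI note numbers of the open strings in standard tuning (an assumption),
# keyed by string number with string 1 = high E.
OPEN_STRING_MIDI = {1: 64, 2: 59, 3: 55, 4: 50, 5: 45, 6: 40}

def to_intervals(notes):
    """Turn (string, fret) pairs into semitone steps between consecutive notes."""
    pitches = [OPEN_STRING_MIDI[string] + fret for string, fret in notes]
    return [b - a for a, b in zip(pitches, pitches[1:])]

# Illustrative riff, and the same riff moved up two frets (a different key).
riff = [(5, 0), (5, 0), (3, 7), (3, 8), (3, 7)]
transposed = [(string, fret + 2) for string, fret in riff]

print(to_intervals(riff))        # [0, 17, 1, -1]
print(to_intervals(transposed))  # [0, 17, 1, -1]
```

    Because only the steps between notes are kept, the same riff in another key, or fingered on different strings, would produce the same sequence.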

  • by ryrobes on 12/7/13, 2:35 AM

    OP: Thanks everyone for the feedback!

    Just a short explanation of the (still very rudimentary) query "system" (using the term loosely here)...

    A tab file gets scraped and broken down into individual passages based on how it's written (aka the "riffs", even though they might not technically be riffs).

       P.M.---|  h     P.M.  h      
       |---------------------------|
       |---------------------------|
       |--------7^8--7-------------|
       |--------------------7^8--7-|
       |-0---0-----------0---------|
       |---------------------------|
    
    becomes normalized / encoded to something like

       "5a 5a 3h 3i 3h 5a 4h 4i 4h" 
    
    and inserted into an ElasticSearch cluster, using a non-word analyzer for indexing. (Simplified a bit here for the sake of argument: I also save all spacing, symbol markup, bar sections and palm muting, but they are not currently used in search.)

       "settings": {
           "index.analysis.analyzer.nonword.type": "pattern",
           "index.analysis.analyzer.nonword.pattern": "[^\\w]+"
         }...
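
    The encoding and tokenizing steps above can be sketched like this. The real encoder is not shown in the thread, so the scheme here (string number plus a letter for the fret, a=0, h=7, i=8) is inferred from the single example, and `encode_tab` and `nonword_tokens` are hypothetical names. The second function only approximates in Python what a pattern analyzer splitting on `[^\w]+` would do; lowercasing is assumed to be the analyzer's default.

```python
import re

def encode_tab(tab_lines):
    """Encode tab lines (string 1, the high E, first) into riff tokens."""
    notes = []
    for string_no, line in enumerate(tab_lines, start=1):
        # A run of digits is one fret number, so "12" is fret 12, not 1 then 2.
        for match in re.finditer(r"\d+", line):
            notes.append((match.start(), string_no, int(match.group())))
    notes.sort()  # left-to-right reading order, by column
    # Assumed scheme: string number + letter for the fret (a=0, h=7, i=8).
    # Only frets 0..25 map onto a letter this way.
    return " ".join(f"{s}{chr(ord('a') + fret)}" for _, s, fret in notes)

def nonword_tokens(text):
    """Split on runs of non-word characters, dropping empty tokens."""
    return [tok for tok in re.split(r"[^\w]+", text.lower()) if tok]

# The six string lines of the example passage (annotation line left out).
tab = [
    "|---------------------------|",
    "|---------------------------|",
    "|--------7^8--7-------------|",
    "|--------------------7^8--7-|",
    "|-0---0-----------0---------|",
    "|---------------------------|",
]
code = encode_tab(tab)
print(code)                  # 5a 5a 3h 3i 3h 5a 4h 4i 4h
print(nonword_tokens(code))  # ['5a', '5a', '3h', '3i', '3h', '5a', '4h', '4i', '4h']
```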
    
    On search, the same encoding function is applied to the incoming text, which is then exploded into tokens and thrown into an ordered SPAN query with different levels of 'slop'...

       "query": {
        "span_near": {
          "clauses": [
            {
              "span_term": {
                "riff_code": "5a"
              }
            },
            {
              "span_term": {
                "riff_code": "5a"
              }
            },
            {
              "span_term": {
                "riff_code": "3h"
              }
            },
            {
              "span_term": {
                "riff_code": "3i"
              }
            }
          ],
          "slop": 6,
          "in_order": true
        } ....
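
    Building that request body from a list of tokens might look like the sketch below. The `riff_code` field, `slop` of 6 and `in_order` come from the snippet above; the function name `build_span_query` is hypothetical, and how the result is sent to the cluster is left out.

```python
import json

def build_span_query(tokens, slop=6, field="riff_code"):
    """Build an ordered span_near query body with one span_term per token."""
    clauses = [{"span_term": {field: token}} for token in tokens]
    return {
        "query": {
            "span_near": {
                "clauses": clauses,
                "slop": slop,       # how many positions may separate matches
                "in_order": True,   # tokens must appear in the given order
            }
        }
    }

body = build_span_query(["5a", "5a", "3h", "3i"])
print(json.dumps(body, indent=2))
```

    Trying several levels of slop would then just mean calling the function with different `slop` values.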
    
    I cut results off at a score of >1.1 or so, so that it doesn't show things that are way off.
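
    A client-side version of that cutoff could be as small as the sketch below. It assumes hits shaped like the `hits.hits` entries of a search response, each carrying a `_score`; the function name `keep_relevant` and the sample hits are made up for illustration.

```python
def keep_relevant(hits, min_score=1.1):
    """Keep only hits whose relevance score is above the cutoff."""
    return [hit for hit in hits if hit["_score"] > min_score]

# Made-up hits, only to show the shape of the data.
sample_hits = [
    {"_id": "a", "_score": 2.4},
    {"_id": "b", "_score": 1.1},
    {"_id": "c", "_score": 0.3},
]
print([hit["_id"] for hit in keep_relevant(sample_hits)])  # ['a']
```

    Elasticsearch also accepts a top-level `min_score` in the search body, which may be a simpler way to do similar filtering on the server.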

    At the time it seemed like the best way to detect patterns that are mostly similar and look decent. I also experimented with MoreLikeThis and FuzzyLikeThis query variants, but ultimately the span query gave results closer to what one would EXPECT to see (though it still has some scoring and clustering problems).

    Any Lucene / ElasticSearch gurus feel free to suggest differently.

  • by snorkel on 12/6/13, 11:10 PM

    Very nice, but I'm afraid to use it since I may discover some of my own original riffs are the same melodies already used by Nickelback and Britney Spears, and just knowing that would severely damage my creative ego.
  • by valtron on 12/6/13, 5:21 PM

    Maybe it should work by intervals because then it'll also work if you enter a riff in a different key than the original.
  • by DigitalSea on 12/6/13, 11:50 PM

    Wow, this is genius. I just searched the intro riff of Smoke on the Water and was impressed and surprised to see so many other bands that have the same riff in many different ways. Would love to know how this works below the surface. What is it built on? Are you using any APIs for this? Would love more details, or even for you to open source it.
  • by dphnx on 12/6/13, 5:16 PM

    I love this idea and would find it a very useful tool, well done.

    I think that textarea input box is key - you should invest time making it look good and easy to use. Could you make it an insert-mode text input that replaces hyphens as you type? Could it auto-expand in width when you get to the end?

    Can’t wait to see this evolve.

  • by bryans on 12/6/13, 9:35 PM

    This is very well done, though I think the results could be better organized. For example, if I enter the opening riff to Bullet Hole by The Haunted (http://bit.ly/1bnTBFg), it first lists two songs that have similar riffs, whereas The Haunted's track is listed third even though it is identical to what I entered.

    Also, even though the bitly link ends up being the exact same URL, it sometimes adds a bunch more erroneous results ahead of The Haunted, pushing it down to 8th. But clicking the search button returns it as the 3rd result again.

  • by sssbc on 12/6/13, 4:52 PM

    Works! Well done tech. Get your marketing department to sanitize for prudes, if you care for the prude market.
  • by lightyrs on 12/6/13, 9:11 PM

    This is incredibly cool. Great work.
  • by antonio0 on 12/7/13, 2:40 AM

    Very nice but it sucks on mobile.
  • by almosnow on 12/7/13, 2:11 AM

    oh god! finally!
  • by bitlord_219 on 12/6/13, 4:37 PM

    "Anal" "Goldilocks" "Sloppy"

    close tab