
Sortilio updated main window

Sortilio Update: It’s Just Better All Over.

So, like most products a developer creates with a single purpose in mind, my first iteration of Sortilio was pretty bare bones. Sure, if you had a small media collection that was named semi-coherently it worked fine (like it did for my test data), but past that it started to fall apart rather rapidly. Case in point: I let it loose on my own media collection, you know, for the purposes of eating my own dog food. It didn’t take long for it to fall flat on its face, querying The TVDB’s API so rapidly that the rate limiter kicked in almost instantaneously. There was also no way to massage the data once the automated matching had finished, and even the best automated tools still make mistakes. With that in mind I set about improving Sortilio and put the finishing touches on it yesterday.

Now the first update you’ll notice is the slightly changed main screen, with a new Options tab and two extra buttons down in the right hand corner. They all function pretty much as you’d expect: the Options tab has a few options for you to configure (only one of them, the extensions one, works currently), Save will export the current selection to a file for use later and Load will import said file back into Sortilio. The save/load functionality is quite handy if you’d like to go in there and sort out the data manually, as it’s all plain XML that I’m sure anyone with half a coding mind would be able to figure out. I put it in mostly for debugging purposes (re-running the identification process is rather slow, more on that in a bit) but I can see it being quite useful, especially with larger collections.
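To give a feel for what a plain XML sort file makes possible, here is a minimal Python sketch of a save/load round trip. The element and attribute names (`sort`, `file`, `series`, `season`, `episode`) are made up for illustration and are not Sortilio’s real file format, so treat it as an assumption-laden example rather than a description of the actual file.

```python
import xml.etree.ElementTree as ET


def save_sort(entries, path):
    """Write a list of match dictionaries to a plain XML file.

    The schema here is hypothetical; it is NOT Sortilio's real format.
    """
    root = ET.Element("sort")
    for entry in entries:
        node = ET.SubElement(root, "file", path=entry["path"])
        ET.SubElement(node, "series").text = entry["series"]
        ET.SubElement(node, "season").text = str(entry["season"])
        ET.SubElement(node, "episode").text = str(entry["episode"])
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_sort(path):
    """Read the file written by save_sort back into a list of dictionaries."""
    root = ET.parse(path).getroot()
    return [
        {
            "path": node.get("path"),
            "series": node.findtext("series"),
            "season": int(node.findtext("season")),
            "episode": int(node.findtext("episode")),
        }
        for node in root.findall("file")
    ]
```

Because the file is just text, a wrong match could be fixed in any editor between the save and the load.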

As I mentioned earlier, whilst the automated matching does a pretty good job of getting things right, there are times when it either doesn’t find anything or gets it completely wrong. To alleviate this I added the ability to double click a row to bring up the following screen:

Shown in this dialog is the series drop down, which lets you select from the series that Sortilio has already downloaded. The list is populated from the cache that Sortilio builds from its queries to The TVDB, so if it managed to match one file in the series correctly it will already have it cached and you can just select it and hit update. Sortilio will then identify other files that had the same search term and ask if you’d like to update them as well (since it will probably have got those wrong too). Should the series you’re looking for not be available you can hit the search button, which brings up this dialog:
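The "update the others too" step boils down to grouping files by the search term that produced their match. Here is a small Python sketch of that idea; `FileMatch`, `apply_correction` and the `confirm` callback are hypothetical names for illustration, not Sortilio’s actual code.

```python
from dataclasses import dataclass


@dataclass
class FileMatch:
    """One row in the results grid (hypothetical structure)."""
    path: str
    search_term: str  # the term that was used to look the series up
    series: str       # the series the file is currently assigned to


def apply_correction(matches, corrected, new_series, confirm):
    """Fix one row, then offer the same fix to rows with the same search term.

    `confirm` stands in for the "update these as well?" prompt: it receives
    the list of affected rows and returns True or False.
    """
    corrected.series = new_series
    siblings = [
        m for m in matches
        if m is not corrected and m.search_term == corrected.search_term
    ]
    if siblings and confirm(siblings):
        for match in siblings:
            match.series = new_series
    return siblings
```

Rows that came from a different search term are left alone, so a bad match on one series does not spill over into another.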

From here you can enter whatever term you want and hit search. This queries The TVDB and displays the results in a list. Select the most appropriate one, hit OK and you’ll have the new series assigned to that file.

Under the hood things have gotten quite a bit better as well. The season string matching algorithm has been improved so that it identifies seasons better than it previously did. For instance, if you had a file named something like battlestar.galactica.2003.s01e20.avi, Sortilio would (wrongly) identify that as season 20 because of the 2003 sitting before the season/episode identifier. It now prefers the right kind of identifiers and is a little better overall at getting it right, although I still think that the way I’m going about it is slightly ass backwards. Chalk that up to still figuring out how best to do string splitting based on a regex.
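To illustrate the "prefer the right kind of identifier" idea, here is a minimal Python sketch that tries an explicit sNNeNN marker first and only falls back to a bare number when nothing better is found. The patterns and the function name are assumptions made for this example and are not the regexes Sortilio actually uses.

```python
import re

# Preferred: an explicit marker such as s01e20 or S1.E2.
EXPLICIT = re.compile(r"s(\d{1,2})[ ._-]?e(\d{1,3})", re.IGNORECASE)

# Fallback: a bare 3 or 4 digit number such as 120 or 0120, read as
# season + two-digit episode. On its own this is what turns "2003"
# into season 20, episode 3.
BARE = re.compile(r"(?<!\d)(\d{1,2})(\d{2})(?!\d)")


def parse_season_episode(filename):
    """Return (season, episode) or None, trying the strictest pattern first."""
    for pattern in (EXPLICIT, BARE):
        match = pattern.search(filename)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


print(parse_season_episode("battlestar.galactica.2003.s01e20.avi"))  # (1, 20)
print(parse_season_episode("battlestar.galactica.2003.avi"))         # (20, 3)
```

The second call shows the old failure mode: with no explicit marker in the name, the year is all the fallback has to go on. A fallback this loose will also trip over things like resolution tags, so it still needs the manual correction step.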

Now, on the surface, if you were to compare this version to the previous one it would appear to run quite a bit slower. There’s a good reason for this and it all comes down to the rate limit on The TVDB API. After playing around with various values I found that the sweet spot was somewhere around a 2 second delay between searches. Without any series cached this means that every request incurs a 2 second penalty, significantly increasing the amount of time required to get the initial sort done. I’ve alleviated this somewhat by having Sortilio search its local cache first before attempting to head out to the API, but that’s still noticeably slower than it was originally. I’ve reached out to the guys behind The TVDB in the hopes that I can get an excerpt of their database to include within Sortilio, which would make the process lightning fast, but I’ve yet to hear back from them.
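The cache-first, then-throttle approach can be sketched in a few lines of Python. This is only an illustration under assumptions: `ThrottledSeriesLookup` and `search_remote` are invented names, and `search_remote` stands in for whatever function actually calls The TVDB. The 2 second default is the value mentioned above.

```python
import time


class ThrottledSeriesLookup:
    """Look a series up in a local cache first, throttling remote searches."""

    def __init__(self, search_remote, min_interval=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._search_remote = search_remote  # hypothetical API call
        self._min_interval = min_interval    # seconds between remote searches
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_request = None

    def find(self, term):
        key = term.strip().lower()
        if key in self._cache:
            # Cache hit: no API call and no delay.
            return self._cache[key]
        if self._last_request is not None:
            # Wait out whatever remains of the interval since the last search.
            wait = self._min_interval - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        result = self._search_remote(term)
        self._last_request = self._clock()
        self._cache[key] = result
        return result
```

With a structure like this only the first file of each series pays the delay, which is why a collection with many episodes of a few shows sorts much faster than one with a single episode of many shows.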

So, as always, feel free to grab it, have a play and then send me any feedback you have. I’ve already got a list of improvements to make on this version but I’d definitely call this usable, and to prove the point I have indeed used it on my own media collection. It gets about 90% of the way there, with the last 10% needing manual intervention, either within Sortilio or outside it, cleaning up after it has done its job. If you’ve used it and encountered problems please save the sort file and the debug log and send them to me at [email protected].

You can grab the latest version here.

[NOTE: There is no link currently because Gmail barfed at the file attachment I sent myself to upload this morning. Follow me on Twitter to be notified when it comes out!]