Posts Tagged ‘programming’

The Subtle Forms of Not Invented Here Syndrome.

Whilst I haven’t had a real programming job for the better part of a decade I have continued coding in my spare time for various reasons. Initially it was just a means to an end, making my life as an IT admin easier, but it quickly grew past that as I aspired to be one of those startup founders TechCrunch idolizes. Once I got over that my programming exploits took on a more personal bent, with the programs I built mostly serving my own interests or fixing a problem of mine. I recently dove back into one of my code bases (after having a brainwave about how it could have been done better) and, seeing nothing salvageable, decided to have a crack at it again. This time around, with far less time on my hands to dedicate to it, I started looking for quicker ways of doing things. It was then that I realised that for all the years I’ve been coding I had been suffering from Not Invented Here syndrome, and that was likely why I found making progress so hard.


The notion behind Not Invented Here is that using external solutions to your problems, which in programming means things like frameworks or libraries, is somehow not ideal. The reasoning that drives this is wide and varied but it often comes down to trust, not wanting to depend on others, or the idea that solutions you create yourself are better. For the most part it’s a total fallacy, as no one programmer can claim to be the expert in all fields and using third party solutions is common practice worldwide. However, like I found, it can manifest itself in all sorts of subtle ways and it’s only after taking a step back that you can see its effects.

I have often steered clear of many frameworks and libraries, mostly because I felt the learning curve for them would be far too steep. Indeed when I’ve looked at other people’s code that makes use of them I often couldn’t make sense of it due to the additional libraries they had pulled in, and so I resigned myself to recoding things from scratch. Sometimes I’d do a rudimentary search to see if there was an easy answer to my problem but more often I’d just wrangle the native libraries into doing what I wanted. The end result was code that, whilst functional, was far more verbose than it needed to be. Then when I went back to look at it I found myself wondering why I did things that way, and decided there had to be a better approach.

Fast forward to today and the new version of the application I’ve been working on makes use of numerous frameworks that have made my progress so much faster. Whilst there’s a little bloat from some frameworks (I’m looking at you, Google APIs) it’s in the form of DLLs and not code I have to maintain. Indeed much of the code that I used to write for handling edge cases and other grubby tasks is now handled much better by code written by people far more experienced with the problem space than I will ever be. Thus I’ve spent far less time troubleshooting my own work and a lot more time making progress.

I have to attribute part of this to the NuGet package management system in Visual Studio, which downloads, installs and resolves all dependencies for any framework you want to use. In the past such tasks would fall to the developer and would often mean chasing down various binaries, making sure you had the right versions and repeating the whole process when a new version was released. NuGet broke down that barrier, enabling me to experiment with various frameworks to meet my end goals. I know similar systems have existed in the past, however I really only began to appreciate them recently.

There’s still a learning curve when adopting a framework, and indeed your choice of framework will mean some design decisions are made for you. However the short time I’ve been working with the likes of HtmlAgilityPack and Json.NET has shown me that the time invested in learning them is far smaller than the time I’d spend trying to do what they do myself. I’m sure this seems obvious to seasoned programmers out there but for me, working with the mentality that I just needed to get things done, it never occurred to me that my approach was completely wrong.
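To give a sense of scale, here’s a minimal sketch (not code from my project, and the BlogPost class is made up purely for illustration) of the sort of thing these two libraries reduce to a handful of calls; both come down via NuGet (Install-Package HtmlAgilityPack / Install-Package Newtonsoft.Json):

```csharp
using System;
using HtmlAgilityPack;     // Install-Package HtmlAgilityPack
using Newtonsoft.Json;     // Install-Package Newtonsoft.Json

// Hypothetical DTO, purely for illustration.
public class BlogPost
{
    public string Title { get; set; }
    public int Comments { get; set; }
}

class LibraryDemo
{
    static void Main()
    {
        // Json.NET: one call in each direction instead of hand-rolled parsing.
        var post = JsonConvert.DeserializeObject<BlogPost>("{\"Title\":\"Hello\",\"Comments\":3}");
        Console.WriteLine(JsonConvert.SerializeObject(post, Formatting.Indented));

        // HtmlAgilityPack: XPath over messy HTML instead of wrangling strings yourself.
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><a href='http://example.com'>a link</a></body></html>");
        foreach (var link in doc.DocumentNode.SelectNodes("//a[@href]"))
            Console.WriteLine(link.GetAttributeValue("href", string.Empty));
    }
}
```

That’s the entire round trip for both libraries, which is exactly the kind of grubby plumbing I used to write by hand.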

I guess where I’m going with all this is that should you find yourself attempting to solve a problem, the first thing you should do is see if it’s been solved by someone else already. Chances are they’ve gone through the pain you’re about to put yourself through and can save you from it. Sure, you might not fully understand how they did it and they might not do it the way you wanted to, but those things mean little if they save you time and emotional capital. I know that if I had to redo everything I did previously I would be nowhere near where I am today, and would likely get no further.

Making Things Difficult For Myself (or Game Art is Hard).

I’m no stranger to game development, having dabbled in it when I was at university. It was by far my favourite course as, whilst I had a decent amount of programming knowledge, translating that into creating something playable seemed like a monumental step that required a whole lot of knowledge I simply did not have. This was long before the time when tools like Unity or GameMaker were considered viable, and our lecturer made everything easy for us by providing a simple framework on top of DirectX, allowing us to create simple 2D games without having to learn the notoriously complicated API. Since then I’ve tried my hand at Unity several times over and whilst it seems like a programmer’s dream there’s always one place I come unstuck: the art.

This isn’t exactly an unknown issue to me; all my university projects used sprites pilfered from various free game resource sites and anything extra was little more than primitive 3D objects whipped up in 3D Studio Max. For my current project however I had the bright idea of generating some terrain using one of those fancy bits of software that seem to make good looking landscapes without too much hassle. After wandering through a sea of options Bryce seemed to be the one to go for and, better yet, it could export the whole mesh so I could import it directly into Unity without too many hassles. For the $20 asking price I figured it’d be worth it to get me going and hey, should I want to go fully procedural down the line it’d be a good introduction to the things I’d need to consider.

Oh how naive I was back then…

Whilst Bryce is everything it claims to be (and the tutorials on using it are really quite good) I just couldn’t seem to get it to create the type of scenery I had in my head. This is entirely a user problem, one I’ve suffered from for a long time, where the connection between my brain and the tools I’m using just doesn’t gel well enough to produce the results I’m looking for. Whilst I was able to generate a decent looking mesh and import it into Unity it was nothing like what I wanted and, after sinking a couple of hours into it, I decided it was best left to one side lest I uninstall everything in frustration.

Realistically though the problem was one of expectations: my abilities with a program I had never used before and my vision of what I’d produce with it were completely out of alignment. After mulling it over for the past couple of days I’ve come to realise that I had set the bar way too high for what I wanted to create, and indeed creating such things at this stage of development is actually a distraction from the larger goals I’m trying to achieve. I’ve since settled on just making do with a flat plane for now (as that’s all I’ll really need for the foreseeable future) or, should I really want to put something pretty in there, I’ll just lift a 3D model from a game that’s close enough so I’ve got something to work with.

You’d think that after churning through project after project I would’ve become adept at recognising when I was pursuing something antithetical to making actual progress, but it seems even after so many years I still find myself making things far more difficult than they need to be. What I really need to do is focus on the parts where I can make good progress and, should I make enough in those areas, then look towards doing the parts that are outside my area of expertise. Of course the best solution would be to partner with a 3D artist, but I’d rather wait until I’ve got something substantial working before I try and sell someone else on my idea.

That is unless you’re a 3D artist and you’ve got nothing better to do with your time 😉

 

Computer Sorting Algorithms – Visualized.

The way I explain learning to code is that it’s akin to learning another language, one that helps you communicate with computers. It should come as little surprise that much of the terminology of linguistics and programming is shared (syntax, language, context, etc.) which can help ease people into it, especially if they’ve learnt another language in the past. However I’ve always found the best teaching tools to be visual ones, as it’s one thing to explain a system you built to someone and another thing entirely to show how it operates and links together. Visualizing algorithms can be quite tricky, however one intrepid person has taken it upon themselves to give sorting algorithms, one of the fundamental yet readily misunderstood parts of programming, a visual representation, and it’s glorious.

Whilst this isn’t a particularly scientific way of demonstrating which algorithms are the best (they’ve all been designed to run for roughly the same amount of time and you can see that the number of integers they have to sort varies wildly) it does give you a pretty amazing visual sense of how each of the sorting algorithms operates. You can see the patterns of their sorting behaviour quite easily and the iterations they go through at each step to sort the array. Probably the most fun I got out of it was reading up on Bogosort afterwards, as I couldn’t really tell what it was trying to accomplish by looking at it and the subsequent explanation is a great testament to all the things you shouldn’t do while programming. As a programmer that kind of absurdity is hilarious, unless you see it in production code of course.
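For the curious, a quick sketch of what Bogosort actually does (shuffle the array at random and check whether you got lucky) makes the absurdity obvious. This is illustrative C# of my own, not the code behind the visualisation:

```csharp
using System;

static class Bogosort
{
    static readonly Random Rng = new Random();

    static void Main()
    {
        var data = new[] { 3, 1, 4, 1, 5 };
        Sort(data);
        Console.WriteLine(string.Join(", ", data));
    }

    public static void Sort(int[] items)
    {
        // Keep generating random permutations until one happens to be sorted.
        // Expected running time grows factorially with the array length.
        while (!IsSorted(items))
            Shuffle(items);
    }

    static bool IsSorted(int[] items)
    {
        for (int i = 1; i < items.Length; i++)
            if (items[i - 1] > items[i])
                return false;
        return true;
    }

    static void Shuffle(int[] items)
    {
        // Fisher-Yates shuffle.
        for (int i = items.Length - 1; i > 0; i--)
        {
            int j = Rng.Next(i + 1);
            int temp = items[i];
            items[i] = items[j];
            items[j] = temp;
        }
    }
}
```

Five elements sort quickly enough; anything much larger and you’ll want to bring a book.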

Making The Most of BizSpark (or Page Compression, How I Yearn For Thee).

Since my side projects (including this blog) don’t really have any revenue generation potential I tend to shy away from spending a lot on them if I can avoid it. This blog is probably the most extravagant of the lot, getting its own dedicated server which, I’ll admit, is overkill, but I’d had such bad experiences with shared providers before that I’m willing to bear the cost. Cloud hosting on the other hand can get nightmarishly expensive if you don’t keep an eye on it and that was the exact reason I shied away from it for any of my side projects. That was until I got accepted into the Microsoft BizSpark program, which came with a decent amount of free usage, enough for me to consider it for my next application.


The Azure benefits for BizSpark are quite decent, with a smattering of all their offerings chucked in which would easily be enough to power a nascent start-up’s site through the initial idea verification stage. That’s exactly what I’ve been using it for and, as longtime readers will tell you, my experiences have been fairly positive with most of the issues arising from my misappropriation of different technologies. The limits, as I found out recently, are hard ones and running up against them causes all sorts of undesirable behaviour, especially for the compute and storage quotas. I hit the former due to a misunderstanding of how a preview technology was billed but I hadn’t hit the latter until last week.

The BizSpark benefits are pretty generous for SQL storage, giving you access to a couple of 5GB databases (or a larger number of smaller 1GB ones) gratis. That sounds like a lot, and indeed it should be sufficient for pretty much any burgeoning application, however mine is based around gathering data from another site and then performing some analytics on it, so the amount of data I have is actually quite large. In the beginning this wasn’t much of a problem as I had a lot of headroom, however after I made a lot of performance improvements I started gathering data at a much faster rate and the 5GB limit loomed over me. In the space of a couple of weeks I managed to fill it completely and had to shut it down lest my inbox fill up with "Database has reached its quota" errors.

Looking over the database in the Azure management studio (strangely one of the few parts of Azure that still uses Silverlight) showed that one particular table was consuming the majority of the space. Taking a quick look at the rows it was pretty obvious why: I had a couple of columns containing lengthy URLs and, over the 6 million or so records I had, this amounted to a huge amount of space being used. No worries, I thought, SQL has to have some kind of built-in compression to deal with this, and so off I went looking for an easy solution.

As it turns out SQL Server does, and its implementation would’ve provided the benefits I was looking for without much work on my end. However Azure SQL doesn’t support it, and the current workaround is to implement row based compression inside your application. If you’re straight up dumping large XML files or giant wads of text into SQL rows then this might be of use to you, however if you’re trying to compress data at the page level then you’re out of luck, unless you want to code up an extravagant solution (like creating a compression dictionary table in the same database, which is borderline psychotic if you ask me).
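If you do go down the compress-it-yourself road, the idea is simple enough: gzip the text on the way in, store the bytes in a varbinary(max) column and reverse the process on the way out. A minimal sketch of the compression half (the class name is mine and the choice of column type is an assumption on your schema):

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

static class RowCompression
{
    // Compress a string (e.g. a lengthy URL or a wad of XML) before writing it
    // to a varbinary(max) column.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(raw, 0, raw.Length);
            return output.ToArray();
        }
    }

    // Reverse the process after reading the column back out.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
            return reader.ReadToEnd();
    }
}
```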

The solution for me was to move said problem table into its own database and, during the migration, trim out all the fat contained within the data. There were multiple columns I never ended up using, the URL fields were all very similar, and the largest column, the one most likely chewing through so much space, was no longer needed now that I could query the data properly rather than having to work around Azure Table Storage’s limitations. Page compression would’ve been an easy quick fix but it would’ve only been a matter of time before I found myself in the same situation, struggling to find space wherever I could get it.

For me this experience aptly demonstrated why it’s good to work within strict constraints, as left unchecked these issues would’ve hit me much harder later on. Sure it can feel like I’m spinning my wheels when hitting issues like this is a monthly occurrence, but I’m still in the learning stage of this whole thing and lessons learned now are far better than ones learned after I finally move this thing into production.

 

Azure Transient Fault Handling and Entity Framework Tracking Issues.

If you’ve ever worked in a multi-tenant environment with shared resources you’ll know of the many pains that can come along with it. Resource sharing always ends up leading to contention and some of the time this will mean that you won’t be able to get access to the resources you want. For cloud services this is par for the course since you’re always accessing shared services, so any application you build on these kinds of platforms has to take this into consideration lest it spend an eternity crashing from random connection drop-outs. Thankfully Microsoft has provided a few frameworks which will handle these situations for you, especially in the case of Azure SQL.

The Transient Fault Handling Application Block (or Topaz, a far better name in my view) gives you access to a number of classes which take a lot of the pain out of dealing with the transient errors you get when using Azure services. Of those the most useful one I’ve found is the RetryPolicy which, when instantiated with the SqlAzureTransientErrorDetectionStrategy, allows you to simply wrap your database transactions with a little bit of code in order to make them resistant to the pitfalls of Microsoft’s cloud SQL service. For the most part it works well: prior to using it I’d get literally hundreds of unhandled exception messages per day. It doesn’t catch everything however, so you will still need to handle some connection errors yourself, but it does a good job of eliminating the majority of them.

Currently however there’s no native support for it in Entity Framework (Microsoft’s data persistence framework), which means you have to do a little wrangling in order to get it to work. This StackOverflow question outlines the problem and there are a couple of solutions on there which all work, however I went for the simple route of instantiating a RetryPolicy and then just wrapping all my queries with ExecuteAction. As far as I can tell this all works fine and is the supported way of using EF with Topaz, at least until 1.6 comes out which will have built-in support for connection resiliency.
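In practice the wrapping looks something like the sketch below. It assumes the EntityFramework and EnterpriseLibrary.TransientFaultHandling.Data NuGet packages; the Post and BlogContext types are made up stand-ins for whatever your application uses, and exact namespaces can differ a little between Topaz releases:

```csharp
using System;
using System.Data.Entity;
using System.Linq;
using Microsoft.Practices.EnterpriseLibrary.TransientFaultHandling;

// Hypothetical model and context, standing in for your own.
public class Post
{
    public int Id { get; set; }
    public string Title { get; set; }
    public bool Published { get; set; }
}

public class BlogContext : DbContext
{
    public DbSet<Post> Posts { get; set; }
}

class Program
{
    static void Main()
    {
        // Retry up to 5 times, backing off a little further on each attempt.
        var strategy = new Incremental(5, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(2));
        var retryPolicy = new RetryPolicy<SqlAzureTransientErrorDetectionStrategy>(strategy);

        using (var context = new BlogContext())
        {
            // Wrap the query in ExecuteAction so transient Azure SQL errors get retried.
            var published = retryPolicy.ExecuteAction(() =>
                context.Posts.Where(p => p.Published).ToList());

            Console.WriteLine("{0} published posts", published.Count);
        }
    }
}
```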

However when using Topaz in this way it seems to muck with entity tracking, causing returned objects to not be tracked in the normal way. I discovered this after I noticed many records not getting updated even though manually working through the data showed that they should be showing different values. As far as I can tell, if you wrap an EF query with a RetryPolicy the entity ends up not being tracked and you will need to .Attach() to it prior to making any changes. If you’ve used EF before then you’ll see why this is strange, as you usually don’t have to do that unless you’ve deliberately detached the entity or recreated the context. So as far as I can see there must be something in Topaz that causes the entity to become detached, requiring you to reattach it if you want to persist your changes using Context.SaveChanges().
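As a concrete example of the workaround (reusing the made-up BlogContext, Post and retryPolicy from the sketch above), re-attaching before making the edit is all it takes:

```csharp
static void PublishPost(BlogContext context, RetryPolicy retryPolicy, int id)
{
    var post = retryPolicy.ExecuteAction(() => context.Posts.First(p => p.Id == id));

    // The entity comes back untracked, so without this Attach() the change below
    // never makes it into the database when SaveChanges() runs.
    context.Posts.Attach(post);
    post.Published = true;

    context.SaveChanges();
}
```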

I haven’t tested any of the other methods of using Topaz with EF so it’s entirely possible there’s a way to get the entity tracked properly without having to attach to it after performing the query. Whether they work or not is an exercise left for the reader as I’m not particularly interested in testing it, at least not just after I got everything working again. By the looks of it an RC version of EF 6 might not be too far away, so this issue probably won’t remain one for long.

3 Tips on Improving Azure Table Storage Performance and Reliability.

If you’re a developer like me you’ve likely got a set of expectations about the way you handle data. Most likely they all have their roots in the object-oriented/relational paradigm, meaning you’d expect to be able to get some insight into your data by running a few queries against it or just looking at the table, possibly sorting it to find something out. The day you decide to try out something like Azure Table Storage however you’ll find that these tools simply aren’t available to you any more due to the nature of the service. It’s at this point where, if you’re like me, you’ll get a little nervous as your data can end up feeling like something of a black box.

A while back I posted about how I was over-thinking the scalability of my Azure application and how I was about to make the move to Azure SQL. That’s been my task for the past 3 weeks or so and what started out as the relatively simple job of moving data from one storage mechanism to another has turned into a herculean task that has seen me dive deeper into both Azure Tables and SQL than I ever have previously. Along the way I’ve found out a few things that, whilst not changing my mind about the migration away from Azure Tables, certainly would have made my life a whole bunch easier had I known about them.

1. If you need to query all the records in an Azure table, do it partition by partition.

The not-so-fun thing about Azure Tables is that unless you’re keeping track of your data in your application there are no real metrics you can dredge up to give you some idea of what you’ve actually got. For me this meant that I had one table whose count I knew (due to some background processing I do using that table), however there are 2 others for which I have absolutely no idea how much data they actually contain. Estimates using my development database led me to believe there was an order of magnitude more data in there than I thought, which in turn led me to the conclusion that using .AsTableServiceQuery() to return the whole table was doomed from the start.

However Azure Tables isn’t too bad at returning an entire partition’s worth of data, even if the records number in the tens or hundreds of thousands. Sure, the query time goes up linearly depending on how many records you’ve got (as Azure Tables will only return a maximum of 1000 records at a time) but if they’re all within the same partition you avoid the troublesome table scan which dramatically affects the performance of the query, sometimes to the point of it getting cancelled, something which isn’t handled by the default RetryPolicy framework. If you need all the data in the entire table you can query each partition in turn, dump the results into a list inside your application and then run your query over that, as in the sketch below.
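A rough sketch of what that looks like with the 1.x storage client (the TableServiceContext/AsTableServiceQuery API this post is written against); the entity type, table name and the assumption that you know your partition keys up front are all placeholders of mine:

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.WindowsAzure.StorageClient;

public class CrawledPage : TableServiceEntity
{
    public string Url { get; set; } // hypothetical payload column
}

public static class PartitionReader
{
    public static List<CrawledPage> ReadWholeTable(
        CloudTableClient client, IEnumerable<string> partitionKeys)
    {
        var context = client.GetDataServiceContext();
        var results = new List<CrawledPage>();

        foreach (var key in partitionKeys)
        {
            // Filtering on PartitionKey keeps each query inside one partition and
            // avoids the table scan; AsTableServiceQuery() transparently follows
            // continuation tokens past the 1000-record-per-response limit.
            var query = context.CreateQuery<CrawledPage>("CrawledPages")
                               .Where(p => p.PartitionKey == key)
                               .AsTableServiceQuery();
            results.AddRange(query.Execute());
        }

        return results;
    }
}
```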

2. Optimize your context for querying or updating/inserting records.

Unbeknownst to me, the TableServiceContext class has quite a few configuration options available that allow you to change the way the context behaves. The vast majority of errors I was experiencing came from my background processor, which primarily dealt with reading data without making any modifications to the records. If your application fits that pattern then it’s best to set the Context.MergeOption to MergeOption.NoTracking, as this means the context won’t attempt to track the entities.

If you have multiple threads running or queries that return large numbers of records this can lead to a rather large improvement in performance, as the context doesn’t have to track any changes to them and the garbage collector can free up these objects even if you use the context for another query. Of course this means that if you do need to make any changes you’ll have to switch back to a tracking context and then attach to the entity in question, but you’re probably doing that already. Or at least you should be. A sketch of both modes follows.
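Here’s roughly what the two sides look like, reusing the hypothetical CrawledPage entity from tip 1 (the table name and method names are mine, not anything prescribed by the storage client):

```csharp
using System.Data.Services.Client;
using Microsoft.WindowsAzure.StorageClient;

public static class ContextSetup
{
    // For read-heavy work: don't track returned entities at all.
    public static TableServiceContext CreateReadOnlyContext(CloudTableClient client)
    {
        var context = client.GetDataServiceContext();
        context.MergeOption = MergeOption.NoTracking; // cheaper queries, results can be GC'd
        return context;
    }

    // When something does need updating, use a separate tracking context and
    // re-attach the entity to it before saving.
    public static void UpdateUrl(CloudTableClient client, CrawledPage page, string newUrl)
    {
        var writeContext = client.GetDataServiceContext();
        writeContext.AttachTo("CrawledPages", page, "*"); // "*" = overwrite regardless of ETag
        page.Url = newUrl;
        writeContext.UpdateObject(page);
        writeContext.SaveChangesWithRetries();
    }
}
```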

3. Modify your web.config or app.config file to dramatically improve performance and reliability.

For some unknown reason the default number of HTTP connections that a Windows Azure application can make to a given host (although I get the feeling this affects all applications built on the .NET Framework) is set to 2. Yes, just 2. This then manifests itself as all sorts of crazy errors that don’t make a whole lot of sense, like "the underlying connection was closed", when you try to make more than 2 requests at any one time (which includes queries to Azure Tables). The maximum number of connections you should specify depends on the size of the instance you’re using, but Microsoft has a helpful guide on how to set this and other settings in order to make the most of it.
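The fix is a one-liner in web.config’s system.net section, but the same settings can also be applied in code at startup. A sketch of the code-side version (the value of 48 is just an example, pick one based on Microsoft’s guidance for your instance size, and the two extra tweaks are ones commonly recommended for Azure Table traffic rather than anything specific to my app):

```csharp
using System.Net;

public static class HttpTuning
{
    // Call once at startup, e.g. from the role's OnStart() or Application_Start().
    public static void Apply()
    {
        // Default is 2 outbound connections per host, hence the weird errors under load.
        ServicePointManager.DefaultConnectionLimit = 48; // example value only

        // Two companion tweaks commonly recommended for Azure Table traffic:
        ServicePointManager.UseNagleAlgorithm = false;  // avoid Nagle delays on small payloads
        ServicePointManager.Expect100Continue = false;  // skip the 100-Continue round trip
    }
}
```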

Additionally some of the guys at Microsoft have collected a bunch of tips for improving the performance of Azure Tables in various circumstances. I’ve cherry-picked the ones that I’ve confirmed have worked wonders for me, however there are a fair few more in there that might be of use to you, especially if you’re looking for every performance edge you can get. Many of them are circumstantial and some require you to plan out your storage architecture in advance (so not something that can be easily retrofitted into an existing app) but since the others have worked I hazard a guess that these would too.

I might not be making use of some of these tips now that my application is moving to SQL and Topaz, but if I can save anyone the trouble I went through trying to sort through all those esoteric errors I can at least say it was worth it. Some of these tips are just good to know regardless of the platform you’re on (like the default HTTP connection limit) and should be incorporated into your application as soon as it’s feasible. I’ve yet to get all my data into production as it’s still migrating, but I get the feeling I might go on another path of discovery with Azure SQL in the not too distant future and I’ll be sure to share my tips for it then.

Transitioning From an IT Admin to a Cloud Admin.

I’ve gone on record saying that whilst the cloud won’t kill the IT admin there is a very real (and highly likely) possibility that the skills required to be a general IT administrator will change significantly over the next decade. Realistically this is no different from any other 10 year span in technology, as you’d struggle to find many skills that are as relevant today as they were 10 years ago. Still the cloud does represent some fairly unique paradigm shifts and challenges to regular IT admins, some of which will require significant investment in re-skilling in order to stay relevant in a cloud-augmented future.

The most important skill that IT admins will need to develop is programming. Now most IT admins have some level of experience with this already, usually with automation scripts written in VBScript, PowerShell or even (shudder) batch. Whilst these provide some of the necessary foundations for working in a cloud future they’re not the greatest for developing (or customizing) production level programs that will be used on a daily basis. The best option then is to learn a formal programming language, preferably one that has libraries for all the major cloud platforms. My personal bias would be towards C# (and should be yours too if your platform is Microsoft) as it’s a great language and you get the world’s best development environment to work in: Visual Studio.

IT admins should also look to gaining a deep understanding of virtualization concepts, principles and implementations, as these are what underpin nearly all cloud services today. Failing to understand these concepts means that you won’t be able to take advantage of many of the benefits a cloud platform can provide, as such platforms function very differently to the traditional 3 tier application model.

The best way to explain this is to use Microsoft’s Azure platform as an example. Whilst you can still get the 3 tier paradigm working in the Azure environment (using a Web Role, Worker Role and SQL Azure) this negates the benefits of using things like Azure Table Storage, Blob Storage and Azure Cache. The difference comes down to having to manually scale an application as you normally would instead of enabling the application to scale itself in response to demand. In essence there’s another level of autonomy you can take advantage of, one that makes capacity planning a thing of the past¹.

It’s also worth your time to develop a lot of product knowledge in the area of cloud services. As I mentioned in my previous blog cloud services are extremely good at some things and wildly inappropriate for others. However in my experience most cloud initiatives attempt to be too ambitious, looking to migrate as many services into the cloud as possible whether there are benefits to be had or not. It’s your job then to advise management as to where cloud services will be most appropriate and you can’t do this without a deep knowledge of the products on offer. A good rule of thumb is that cloud services are great at replacing commodity services (email, ERP, CRM etc.) but aren’t so great at replacing custom systems or commodity systems that have had heavy modifications to them. Still it’s worth researching the options out there to ensure you know how the cloud provider’s capabilities match up with your requirements, hopefully prior to attempting to implement them.

This is by no means an exhaustive list and realistically your strategy will have to be tailored to your company and your potential career path. However I do believe that investing in the skills I mentioned above will give you a good footing for transitioning from a regular IT admin to a cloud admin. For me it’s exciting: whilst I don’t believe the cloud will overtake anything and everything in the corporate IT environment, it will provide us with some amazing new capabilities.

¹ Well, technically it just moves the problem from you to the cloud service provider. There’s still some capacity planning to be done on your end, although it comes down to financial rather than computational planning, so that’s usually left to the finance department of your organisation. They’re traditionally much better at financial planning than IT admins are at capacity planning.

Many thanks to Derek Singleton of Software Advice for inspiring this post with his blog on Cloud Career Plans.

Sortilio Update: It’s Just Better All Over.

So like most products that a developer creates with one purpose in mind, my first iteration of Sortilio was pretty bare bones. Sure, if you had a small media collection that was named semi-coherently it worked fine (like it did for my test data) but past that it started to fall apart rather rapidly. Case in point: I let it loose on my own media collection, you know, for the purposes of eating my own dog food. It didn’t take long for it to fall flat on its face, querying The TVDB’s API so rapidly that the rate limiter kicked in almost instantaneously. There was also the issue of not being able to massage the data once the automated matching portion was done, as even the best automated tools can still make mistakes. With that in mind I set about improving Sortilio and put the finishing touches on it yesterday.

Now the first update you’ll notice is the slightly changed main screen, with a new Options tab and two extra buttons down in the right hand corner. They all function pretty much as you’d expect: the options tab has a few options for you to configure (only one of them, the extensions one, works currently), save will export the current selection to a file for later use and load will import said file back into Sortilio. The save/load functionality is quite handy if you’d like to manually go in there and sort out the data yourself, as it’s all plain XML that I’m sure anyone with half a coding mind about them would be able to figure out. I put it in mostly for debugging purposes (re-running the identification process is rather slow, more on that in a bit) but I can see it being quite useful, especially with larger collections.

As I mentioned earlier, whilst the automated matching does a pretty good job of getting things right there are times when it either doesn’t find anything or it’s got it completely wrong. To alleviate this I added the ability to double click a row to bring up the following screen:

Shown in this dialog is the series drop down, which allows you to select from the list of series that Sortilio has already downloaded information for. The list is populated by the cache that Sortilio creates from its queries to The TVDB, so if it managed to match one file in the series correctly it will already have it cached and you can just select it and hit update. Sortilio will then identify other files that had the same search term and ask if you’d like to update them as well (since it will have probably got them wrong too). Should the series you’re looking for not be available you can hit the search button, which brings up this dialog:

From here you can enter whatever term you want and hit search. This will query The TVDB and display the results in a list for you. Select the most appropriate one, hit OK and you’ll have the new series assigned to that file.

Under the hood things have gotten quite a bit better as well. The season string matching algorithm has been improved a bit so that it identifies seasons better than it previously did. For instance if you had a file named something like battlestar.galactica.2003.s01e20.avi, Sortilio would (wrongly) identify that as season 20 because of the 2003 before the series/episode identifier. It now prefers the right kind of identifiers (see the sketch below) and is a little better overall at getting it right, although I still think the way I’m going about it is slightly ass backwards. Chalk that up to still figuring out the best way to do string splitting based on a regex.
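For illustration, preferring the right kind of identifiers boils down to something along these lines (this is not Sortilio’s actual code, just a sketch of the idea): match an explicit sNNeNN token first and ignore bare runs of digits like the year.

```csharp
using System;
using System.Text.RegularExpressions;

class EpisodeParser
{
    // Look for an explicit sNNeNN marker rather than any stray run of digits.
    static readonly Regex SeasonEpisode = new Regex(
        @"[sS](?<season>\d{1,2})[eE](?<episode>\d{1,3})",
        RegexOptions.Compiled);

    static void Main()
    {
        var match = SeasonEpisode.Match("battlestar.galactica.2003.s01e20.avi");
        if (match.Success)
        {
            // Prints "Season 1, Episode 20" instead of mistaking 2003 for season 20.
            Console.WriteLine("Season {0}, Episode {1}",
                int.Parse(match.Groups["season"].Value),
                int.Parse(match.Groups["episode"].Value));
        }
    }
}
```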

Now on the surface, if you were to compare this version to the previous one it would appear to run quite a bit slower. There’s a good reason for this and it all comes down to the rate limit on The TVDB API. After playing around with various values I found the sweet spot was somewhere around a 2 second delay between searches. Without any series cached this means that every request incurs a 2 second penalty, significantly increasing the amount of time required to get the initial sort done. I’ve alleviated this somewhat by having Sortilio search its local cache first before attempting to head out to the API, but it’s still noticeably slower than it was originally. I’ve reached out to the guys behind The TVDB in the hope that I can get an excerpt of their database to include within Sortilio, which would make the process lightning fast, but I’ve yet to hear back from them.
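The cache-first lookup is conceptually very simple. A sketch of the idea (the Series and TvdbClient types here are stand-ins of mine, not Sortilio’s real classes):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class Series
{
    public string Name { get; set; }
}

static class TvdbClient
{
    // Stand-in for the real call out to The TVDB's API.
    public static Series Search(string term)
    {
        return new Series { Name = term };
    }
}

class ThrottledLookup
{
    private readonly Dictionary<string, Series> _cache =
        new Dictionary<string, Series>(StringComparer.OrdinalIgnoreCase);

    // Roughly the sweet spot found for The TVDB's rate limiter.
    private static readonly TimeSpan Delay = TimeSpan.FromSeconds(2);

    public Series Find(string searchTerm)
    {
        Series series;
        if (_cache.TryGetValue(searchTerm, out series))
            return series; // cache hit: no API call, no 2 second penalty

        Thread.Sleep(Delay); // throttle before going out to the API
        series = TvdbClient.Search(searchTerm);
        _cache[searchTerm] = series;
        return series;
    }
}
```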

So as always feel free to grab it, have a play and then send me any feedback you have regarding it. I’ve already got a list of improvements to make on this version but I’d definitely call this usable and to prove a point I have indeed used it on my own media collection. It gets about 90% of the way there with the last 10% needing manual intervention, either within Sortilio or outside cleaning up after it has done its job. If you’ve used it and encountered problems please save the sort file and the debug log and send them to me at [email protected].

You can grab the latest version here.

[NOTE: There is no link currently because Gmail barfed at the file attachment I sent myself to upload this morning. Follow me on Twitter to be notified when it comes out!]

I’m More Agile Than I Thought.

It was a beautifully warm night in late December down in Jervis Bay. My friends and I had rented a holiday house for the New Year’s weekend and we had spent most of the day drinking and soaking in the sun, our professional lives a million miles away. We had been enjoying all manner of activities, from annoying our significant others with 4 hour bouts of Magic: The Gathering to spending hours talking on whatever subject crossed our minds. Late into the evening, in a booze fuelled conversation (only on my end, mind you), we got onto the subject of Agile development methodologies and almost instantly I jumped on the negative bandwagon, something that drew the ire of one of my good mates.

You see I’m a fan of more traditional development methodologies, or at least what I thought were traditional ideas. I began my software development career back in 2004, 3 years after the Agile Manifesto was brought into existence. Many of the programming methodologies I was taught back then centred around iterative and incremental methods and using them in our projects seemed to fit the bill quite well. There was some talk of the newer agile methods but most of it was written off as experimentation, and my professional experience as a software developer mirrored this.

My viewpoint in that boozy conversation was that Agile methodologies were a house of cards: beautiful and seemingly robust so long as no external forces act on them. This hinged heavily on the idea that some of the core Agile practices (scrum, pair programming and their inherent inability to document well) are detrimental in an environment with skilled programmers. The research done seems to support this, however it also shows that there are significant benefits for average programmers, who you are more likely to encounter. I do consider myself to be a decently skilled programmer (how anyone can be called a programmer and still fail FizzBuzz is beyond me) which is probably why I saw Agile as being more of a detriment to my ability to write good code.

After taking a step back however, I realised I was more agile than I previously thought.

There are two methodologies I use when programming that fall under the Agile banner. The first of these is Extreme Programming, which underpins the idea of "release early, release often", something I consider necessary to produce good working code. Even though I don’t announce it on this blog, every time I get a new feature working I push it straight into production to see how it fares in the real world. I also carry with me a copy of the latest iPhone client for testing in the wild, to make sure the program will function as expected once it’s deployed. Whilst I don’t yet have any customers other than myself to get feedback from, it still keeps me in the mindset of making sure whatever I produce is workable and viable.

The second is Feature Driven Development, a methodology I believe goes hand in hand with Extreme Programming. I usually have a notepad filled with feature ideas sitting in front of me and when I get time to code I’ll pick one of them and set about getting it implemented. This keeps me from being distracted by pursuing too many goals at once, making sure they can all be completed within a certain time frame. Since I’m often coding on the weekends I’ll usually aim to get at least one feature implemented per weekend, accelerating towards multiple features per weekend as the project approaches maturity.

Whilst I haven’t yet formally conceded to my friend that Agile has its merits (you can take this post as such, Eamon ;)) after doing my research into what actually constitutes an Agile methodology I was surprised to find how much I had in common with it. Since I’m only programming on my own at the moment many of the methods don’t apply, but I can’t honestly say now that when I go about forming a team I won’t consider using the whole Agile Manifesto to start off with. I still have my reservations about it for large scale solutions, but for startups and small teams I’m beginning to think it could be quite valuable. Heck, it might even be the only viable approach for small scale enterprises.

Man I hate being wrong sometimes.

The Killer Proprietary Cloud.

So I might be coming around to the whole cloud computing idea, despite my history of being sceptical of it. Whilst I’ve yet to go any further than researching the possibilities, the potential of the cloud to eliminate a lot of the problems encountered when scaling would mean a lot less time spent on those issues and more on developing a better product. However, whilst the benefits of the cloud are potentially quite large, there’s also the glaring issue of vendor dependency and lock-in, as no two cloud providers are the same nor are there any real standards around the whole cloud computing idea. This presents a very large problem for both cloud native services and those looking to migrate to a cloud platform, as once you’ve made your choice you’re pretty much locked in unless you’re willing to pay for significant rework.

Right now my platform of choice is looking to be Windows Azure. Primarily this is because of platform familiarity: whilst the cloud might be a whole new computing paradigm, services built on the traditional Microsoft platform won’t have a hard time being migrated across. Additionally they’ve got a fantastic offer for MSDN subscribers, giving them a small instance and a whole swath of other goodies to get them off the ground. This is good news for aspiring entrepreneurs like myself as Microsoft offers a free MSDN Premium subscription through its BizSpark program to start-ups that have been around for less than 3 years and have less than $1 million in revenue. However as I was comparing this to the other cloud providers out there I noticed that no two were alike, in fact they were all at odds with each other.

Take the biggest cloud provider out there, Amazon’s EC2. Whilst the compute instances are pretty comparable, since they’re just the operating system, the other services (file storage, databases) are worlds away from each other, and it’s not just the API calls that are different either. Amazon’s cloud offering is architecturally different from Microsoft’s, so much so that any code written for Azure would have to be wholly rewritten in order to function on their cloud. This means that when it comes time to move your service into the cloud you’ll have to make sure you trust the provider you’re going with. You could also double the entire budget and keep half of it in reserve should it ever come to that, but I doubt many people have that luxury.

Like the format wars that have raged for the past century, such incompatibility between cloud providers only serves to harm the consumers of those services. Whilst I’m not suggesting there be one and only one cloud provider (corporations really can’t be trusted with monopolies and I’d hate to think what a government sanctioned cloud would look like), what’s missing from the current cloud environment is the level of interoperability that we’re so used to seeing in this Internet enabled age. The good news is that I’m not the only one to notice issues like this and there are several movements working towards an open set of standards for cloud operators to adopt. This would not only provide the level of interoperability that’s currently lacking in the cloud world but would also give customers more confidence when working with smaller cloud operators, knowing that they wouldn’t be left high and dry should those operators fail.

Like any technology in its infancy, cloud computing still has a ways to go before it can be counted amongst its more mature predecessors. Still the idea has proven itself to be viable, usable and above all capable of delivering on some of the wild promises it made back when it was still called SaaS. With D-day fast approaching for Lobaco I’ll soon be wandering into the cloud myself and I’m sure I’ll have much more to write about it when the time comes. For now though I’m happy to say that my previous outlook on the cloud was wrong, despite the problems I’ve cited here today.