Sunday, December 8, 2013

Dynamic LINQ Sorting

In the rebuild of my company's intranet platform, I was moving towards an actual API - based platform, not just a system that got JSON from .aspx requests. I also wanted this API to be more RESTful in the way that it handled GET requests - instead of taking a POST request of parameters and returning JSON, I wanted the application to receive all parameters from the querystring.

No big deal, a little more work in standardization perhaps. Using an Order object as an example, these are the types of parameters we can expect in most requests:
  • Filters - what the user is looking for (Orders for specific customer, in a certain date range, etc)
  • Sort - how the user wants the results ordered (first by date range, then by size)
These are two different entities entirely, and must be treated as such. The majority of filter parameters will be sent to the Model and sent to the database query - but sorting (at least complex sorting) needs to be handled not at the data level, but at the application level.

Instinctively, going to the database for sorting makes sense. It's obviously the best equipped to handle sorting of records - and it works great if you're building dynamic SQL within your applications (what the fuck are you thinking). But if you're using a stored procedure-based approach, it's problematic

You may be able to get away with doing an CASE statement sort based on a parameter passed to the stored procedure, but it's awkward and bloats your SP code - and that's if you can even accomplish your sort within a CASE block. More than two and your SP gets increasingly impossible to understand.

So, that sucks. I really enjoy the SQL ORDER BY syntax and how easy it is to write & understand. I wanted something that would do the same thing against my objects.

Since learning how LINQ (for objects) works, I've used it in all my applications. I think it's totally brilliant if you have a reasonable number of objects / server power ratio. Before moving on, let me just say I don't use LINQ to SQL. I'm too much of an asshole control freak to do that. I write my own T-SQL code & optimize it the way I see fit. End rant. LINQ does provide methods to sort a list of objects, but it is difficult to do dynamically - especially with different types of objects. Out of the box, you need to know at compile time what options exist for sorting and apply them in order. For example:

var sortedOrdersList = orders.OrderBy(o => o.OrderID).ThenBy(o => o.OrderDate);

The sortedOrdersList object would contain elements first ordered by order id, then by order date. Ok, whatever, that'll work if that's the only way it would/could be sorted, but what if the user should be able to decide how it's sorted? That's a problem. You'd have to code all possible options and then select which one based off of some user-provided variable:
            switch (sortoption)
            {
                case "order":
                    sortedOrdersList = orders.OrderBy(o => o.OrderID);
                    break;
                case "date":
                    sortedOrdersList = orders.OrderBy(o => o.OrderDate);
                    break;
                case "dateandorder":
                    sortedOrdersList = orders.OrderBy(o => o.OrderDate).ThenBy(o => o.OrderID);
                    break;
            }

I've implemented something like the above in a production application. It seemed like the best idea at the time, but it really bloats your code and looks quite amateur... this won't fly with the "NO BULLSHIT CODE" motto for this application rebuild.

Requirements so far:

  • Sorts done within the application, not the database server
  • Uses SQL ORDER BY syntax, including chained statements and DESC option
  • Dynamic - can sort by any property of the enumerated object
Ok. Well, obviously we'll have to use some Reflection to get the properties of whatever object we're sorting against (in this case Order). We'll have to parse the SQL-Like order-by statement as well to make sure we're only sorting on properties that exist.

After hours of trying to make this work myself, I realized that using the built in functionality of LINQ alone, this was impossible (I've seen a few posts about using some lambda expressions with even more reflection chain default OrderBy() and ThenBy() methods, but that looks like more work than it's worth). I looked around for some NuGet package that would help, and stumbled across the Dynamic Linq library. This allows you to pass multiple sort arguments in one string. I admitted defeat and installed it.

However, I never trust the user's input. So I built some helper methods that utilize the dynamic LINQ library and sanitize the input (make sure the sort fields actually exist in the target object). The expected string is something like this (if we wanted to sort on Order Date, then by Order Size descending): "orderdate, ordersize desc".

These are the helper methods:
public static Dictionary<PropertyInfo,bool>GenerateSortsFromString(string sortList,Type objType)
        {
            Dictionary <PropertyInfo,bool> fields = new Dictionary<PropertyInfo,bool>();
            //key=object's property, bool=is descending or not
            if (sortList == null)
                return fields;
            //parse string list
            foreach (string s in sortList.Split(new string[] { "," }, StringSplitOptions.RemoveEmptyEntries))
            {
                bool desc = false;
                string name = "";
                string[] parts = s.Split(' ');
                switch(parts.Length)
                {
                    case 0:
                        continue; //invalid sort specification
                    case 1:
                        name = parts[0].Trim(); //only got a field name
                        break;
                    case 2:
                        name = parts[0].Trim(); //field name
                        desc = parts[1].Trim().Equals("desc",StringComparison.CurrentCultureIgnoreCase) ||
                             parts[1].Trim().Equals("descending",StringComparison.CurrentCultureIgnoreCase); //decending
                        break;
                }
                //make sure the property exists in the object
                PropertyInfo prop = objType.GetProperties().First(
                    p => p.Name.Equals(name, StringComparison.CurrentCultureIgnoreCase));
                if (prop == null)
                    continue; //property didn't exist
                fields[prop] = desc; //add/update the property
            }
            return fields;
        }
public static IEnumerable<T> DynamicSort<T>(string sortList, IEnumerable<T> list)
        {
            StringBuilder sb = new StringBuilder(); // for generating final sort string
            Dictionary<PropertyInfo,bool> sortedProps = GenerateSortsFromString(sortList, typeof(T)); //validate fields
            foreach (PropertyInfo prop in sortedProps.Keys)
            {
                if (sb.Length > 0)
                    sb.Append(",");
                sb.Append(prop.Name + " " + (sortedProps[prop] == false ? "" : "descending")); //add descending tag if it is descending
            }
            return list.AsQueryable<T>().OrderBy(sb.ToString());
}

How to use: if I had a list of orders and wanted to generate a new list, sorted by Order Date then Order Size descending, I would generate the following code:

List<Order> orders = db.GetOrders(); //get orders from source
orders = DynamicSort("orderdate, ordersize desc", orders).ToList();

Down, set, go

This post will outline the background of the application on which the forthcoming posts will all reference. 
STUDY UP, NUBS.

Up until a couple years ago, the application landscape of the company I work for was a nightmare. For whatever reason, the company maintained countless tiny applications that were built to accomplish one task. Not necessarily a bad thing, but problematic when updates were made system-wide, particularly data-driven applications. When a database schema changed, each application's data model would have to be updated as well, and this caused numerous problems in both usability & development costs.

I was a culprit as well - I had continued on creating applications in this way - single serving, one-off, quickly-built programs that did one thing and quit. The only difference between my new applications and the old ones were that mine were based around web technologies rather than Windows forms & Microsoft Access.

The more I built, the more I realized how awful this was becoming. It was unbearable to have to change application in the company because I added a field to a database table or added an optimization parameter to a stored procedure. What I needed was a platform that provided all the business logic in API form that I could build individual applications on top of. I needed a goddamn client-platform agnostic way of delivering information (managing .Net runtime versions on client machines is AWFUL - and we had Macs). But this was a rather large undertaking, and without a working prototype, difficult to justify on timesheets.

Then, a couple years ago I was approached with the task of creating another one of these single-use applications. Not that it matters yet, but it was an application to monitor & notify customers of backordered items with the times they would arrive at our warehouse. Highly useful to the people who needed it ('killer app' potential). This application was simple and would require a small amount of time - but instead of quoting the amount of time it would take to actually accomplish, I (somewhat deceptively) bloated the development time to about four times what it would have been - this application would be the start of something new.

Now, in internal business environments, creating something new does not imply that it will be better. In fact, it usually doesn't. Users are accustomed to a certain way of doing things and by creating a new system, you are in fact uprooting the way they perform their jobs.

In a corporate setting, users don't give a shit how 'siq' the new app is. 

This lesson took me a long time to learn, mostly because in the consumer space, the 'new' thing is the only thing. What you have to consider in the corporate space is that you are directly impacting a user's performance by implementing a new system - and in some cases, their paycheck. So, whatever system I'd create for users to run apps against did not just need to work - it needed to work well - and decidedly better than the previous system.

So what would be the new 'platform'? I knew Windows forms & the Windows client APIs well. Very well. I had made a few extremely large Windows forms applications that people used daily with almost no problems. However, considering the cross-platform requirement (and the fact that I'm a one-man shop so unmanaged code [c, c++] is out of the question), I decided to go to a web platform. This would also get my web dev chops up to par.

I was a noob to full-stack web development. I knew most pieces well (HTML, CSS, code-behind, SQL), but I knew very little about turning that knowledge into a cohesive hosted application. I didn't know anything else but the MS Web Forms architecture existed. I sucked at javascript and had never heard of JSON. I knew I needed to learn these technologies not just for this project, but for my career - so I pushed through it. I made some truly horrible, ignorant design decisions in the process. But really, there is no better way of learning anything you care about. 

Knowing that I wouldn't have time to build a whole application platform out of the gate, I decided to iteratively develop the platform as new applications needed new features (which is one of the best things about web development - there is no need to deploy new versions of code - you just publish to a URL and everyone is up to date). I built the APIs that were needed for the first application that would live on the platform (the backordered items app) - Order, Purchase Order & Customer models & get/setters. I created some really awful HTML templates to display this data. I mean, really awful. Version 1 of this application uses a javascript file named "displayControllers.js" that contains the templates for every different entity type the application uses. Each template takes an object instance and makes decisions based on the instance on how to display it, and returns one huge HTML string back to the caller. This js file is hundreds of KB in size. Bad, bad call. That's just one example of many.

In this setup, the platform would provide & manage services like authentication & authorization. A user would log into the platform before being allowed to do anything.

The platform's applications would have a lifecycle like this:
  1. Page request made
  2. Authentication & authorization checks performed
  3. Initial HTML returned from web form (including page behaviors in javascript)
  4. User interacts with javascript (which interacts with server-side APIs)
Once "complete", I had a working prototype of both the platform and requested application. When deployed, users of the application would be unknowing users of the platform. And this is why you need a 'killer app' in these scenarios - if you have a system that is new, probably bug-ridden and different than the previous system, users have a lot more patience when the system still performs a great task. This backorder system was the 'killer app.' Users would use this application, and it would work (most of the time). When it didn't, they were more likely to report issues & bugs because they "enjoyed" the app, because they had an interest in the thing succeeding.

And the rest is history. I built a large amount of applications into this platform, as well as extended it's use into things I never could have imagined. Today, the platform is trembling under its own weight, which is why (now armed with superior skills & knowledge), I am rebuilding it using solid design methodologies and the latest client-side plugins & technologies. The subsequent posts here will include my interactions with .NET MVC architecture, knockout.js MVVM design & SQL Server. 

Thursday, November 21, 2013

Three Evils

After a great deal of procrastination & lollygagging, I've finally decided to commit to a blogging platform. I've chosen blogger, mostly because I'm cheap and already have a knack for CSS & web design - and Google allows for custom domains without having to pay extra.

I'm planning on only covering development topics here. I'll be avoiding stirring in political discussion just because dealing with comments becomes so tiresome so quickly.

The development topics will cover mostly real-world solutions to problems I find to be interesting. There will be discussion on almost all of the Microsoft dev stack, including Visual Studio, SQL Server & IIS. There will also be considerable emphasis on front-end technologies, such as HTML, CSS & javascript/jQuery.

STAY TUNED, LOSERS.

-Mike