Thursday, April 2, 2009

Issues with current trend

In my last post I introduced the code classification used by the Helsinki declaration (as opposed to MVC used by JEE):
  • User Interface (UI) code: all code that creates UI and responds to events in the UI, the same as JEE's View and Control
  • Data Logic (DL) code: all code that maintains data integrity constraints, a well defined subset of JEE's Model
  • Business Logic (BL) code: all other code, the remaining part of JEE's Model, query and/or transaction composing and executing code
I would like to point out before continuing with this post that these three code classes may appear to you as somewhat 'conceptual'. But every WoD application somehow performs these functions. Together they are what a WoD application is all about. So you may have a WoD application that does not reflect these code classes at all, but the point that I make here is that somehow your code in your WoD application is doing these same three things: execute UI, BL and DL code. Show me a line of your code, and I can tell you in what Helsinki code class it falls.

I concluded my previous post by showing you how the three Helsinki code classes interact with each other.


Using this picture I can visualize the current trend of building WoD applications as follows:


We do not use the DBMS anymore. All code, in particular all BL and DL code, is implemented within a du-jour technology (XYZ above) that lives outside of the DBMS.

In this post I'd like to explain how this current trend of not using the DBMS (our first observation), combined with the Yafet technology explosion (our third observation), leads to what I consider two rather serious issues in today's WoD applications.

Let's start with the first issue.

Scalability and performance issues

Implementing BL and DL code outside the DBMS leads to chatty applications: these applications call the DBMS many times. Let's demonstrate this using that same picture again. I have drawn a vertical yellow line in it visualizing the current trend: everything to the left of this line is implemented outside the DBMS.


Say the end user has an order-entry page. He/she enters a few orderlines, and then presses the Save button. This generates one context switch (one call) from UI-code to BL-code. The BL-code then starts processing these orderlines, which generates an order of magnitude more (say ten) context switches from BL-code to the DBMS. For each orderline, however, a couple of constraints are likely to be involved. The majority of these constraints require DL-code that queries the current state of the DBMS. So we have another order of magnitude more calls to the DBMS.

So one event in the top of the WoD code-stack generates two orders of magnitude more events lower down in the WoD code-stack. Mind you, this is not WoD application software specific, but a general phenomenon in all software. The same is also true for instance in an operating system, which has many layers of code classes too.
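The fan-out described above can be put into numbers with a small Python sketch. The function name and the figures (ten orderlines, ten constraint queries per orderline) are illustrative assumptions chosen to match the "order of magnitude" steps in the text; they are not measurements.

```python
def dbms_calls_per_save(orderlines: int, constraint_queries_per_line: int) -> int:
    """Count DBMS calls for one Save click when BL and DL code run outside the DBMS.

    Assumed model (hypothetical): the BL code issues one call per orderline,
    and the DL code issues one validation query per constraint per orderline.
    """
    bl_calls = orderlines
    dl_calls = orderlines * constraint_queries_per_line
    return bl_calls + dl_calls


# One UI event (the Save click) at the top of the stack ...
ui_events = 1
# ... fans out into this many calls at the bottom of the stack.
calls = dbms_calls_per_save(orderlines=10, constraint_queries_per_line=10)
print(f"{ui_events} UI event -> {calls} DBMS calls")
```

Under these assumptions a single click costs 110 calls, of which 100 are constraint-validation queries: the second order of magnitude.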

In today's WoD applications it is often worse than just the two orders of magnitude described above:
  • If you put BL in the middle tier, you’ll call the DBMS an order of magnitude more
  • If you also put DL in the middle tier, you’ll call the DBMS two orders of magnitude more
  • But, if you then also don’t use SQL effectively, you’ll call the DBMS three orders of magnitude more
I have given you an rBL (read-Business Logic) example of not using SQL effectively at the end of this post. Let me share another experience in this area. A few years ago I was asked to help analyze a performance issue in a very simple user-enters-search-criteria, user-hits-search-button, WoD-app-displays-rows scenario. This involved a 'search customers' page where the user would enter a leading string of the customer name. It took almost a full minute for the application to come up with the first set of twenty rows to be displayed. This WoD app obviously was built in the current trendy manner. So what do you do? You try to find out where time is spent. I performed a sqltrace of this action. And guess what: the trace revealed that the application was sending on the order of 70,000 SQL statements to the DBMS.

70,000...

Yes folks. 70,000 SQL select statements to come up with the first twenty customers. I am not kidding you. Unbelievable. Obviously the DBMS is not the problem here: it is servicing 70,000 queries in just less than 60 seconds! The problem is the architecture of the WoD application.

This particular scenario could be done with just three calls to the DBMS which would take a fraction of a second to execute on top of a good relational database design.
  1. Open cursor
  2. Perform array fetch of first twenty rows
  3. Close cursor
(actually with the current SQLNet protocol this might even be less than three roundtrips)
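As a rough illustration of the difference, here is a minimal Python sketch that uses the standard-library `sqlite3` module as a stand-in for the DBMS. The `customers` table, the data and both function names are hypothetical. The chatty variant is just one way an application can end up issuing a query per row; it is not a reconstruction of the traced application. Since SQLite runs in-process, the statement count serves here only as a proxy for network round trips.

```python
import sqlite3


def make_db(n_customers: int = 1000) -> sqlite3.Connection:
    """Build an in-memory customers table (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(i, f"{'Smith' if i % 10 == 0 else 'Jones'} {i:04d}") for i in range(n_customers)],
    )
    conn.execute("CREATE INDEX customers_name ON customers (name)")
    return conn


def search_set_based(conn: sqlite3.Connection, prefix: str, page_size: int = 20):
    """Let the DBMS filter and sort; fetch only the first page."""
    cur = conn.cursor()                      # 1. open cursor
    cur.execute(
        "SELECT id, name FROM customers WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    )
    rows = cur.fetchmany(page_size)          # 2. array fetch of the first rows
    cur.close()                              # 3. close cursor
    return rows, 1                           # one SQL statement issued


def search_chatty(conn: sqlite3.Connection, prefix: str, page_size: int = 20):
    """Fetch all keys, then query row by row and filter and sort in the application."""
    statements = 1
    ids = [row[0] for row in conn.execute("SELECT id FROM customers ORDER BY id")]
    matches = []
    for customer_id in ids:
        statements += 1                      # one extra statement per customer
        (name,) = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        if name.startswith(prefix):
            matches.append((customer_id, name))
    matches.sort(key=lambda row: row[1])
    return matches[:page_size], statements


conn = make_db()
fast_rows, fast_statements = search_set_based(conn, "Smith")
slow_rows, slow_statements = search_chatty(conn, "Smith")
print(fast_rows == slow_rows, fast_statements, slow_statements)
```

Both functions return the same twenty rows. The set-based one issues a single statement; the chatty one issues one statement per customer plus one, so its cost grows with the size of the table rather than with the size of the page.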

But using a 'black box' that is designed to instantaneously provide the data that is to be displayed, would be so uncool. No, ... executing 70,000 queries and doing lots of application logic yourself, is way more cool.

This chatty application behaviour leads to:
  • more latency hits: every time you go back and forth to the DBMS you will be hit with some latency. If this happens a lot, it starts impacting the performance.
  • more 'system' cpu: context switches are accompanied by 'fixed costs' incurred by the network, OS and DBMS software layers that are constantly creating and destroying contexts. System cpu is pure overhead. It's not adding any value to the business using the WoD application.
  • more data transfer out: since complex SQL processing (that is any SQL involving more than one table) is effectively performed outside the DBMS, you are sending more data out of the DBMS to that place where the SQL execution plan is now effectively implemented.
This all impacts the performance and scalability of the WoD application.
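A back-of-the-envelope calculation shows how these fixed per-call costs add up. The sketch below uses only the two figures from the anecdote above (70,000 statements in just under 60 seconds); the function names are illustrative, and the per-statement figure is an average that lumps together latency, system cpu and actual query work.

```python
def per_statement_ms(elapsed_seconds: float, statements: int) -> float:
    """Average wall-clock time per statement, in milliseconds."""
    return elapsed_seconds * 1000.0 / statements


def elapsed_for(statements: int, cost_ms: float) -> float:
    """Total elapsed seconds for a number of statements at a fixed average cost."""
    return statements * cost_ms / 1000.0


avg_ms = per_statement_ms(elapsed_seconds=60.0, statements=70_000)
print(f"average per statement: {avg_ms:.3f} ms")
print(f"three calls at that cost: {elapsed_for(3, avg_ms) * 1000:.2f} ms")
```

Each statement costs well under a millisecond on average, so no single call looks slow in isolation. The minute comes purely from the number of calls, which is why reducing that number helps far more than tuning any individual query.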

What this chatty application behaviour also implies is that the iron power required to run these applications is absurd. I have seen deployments of not too big WoD applications that require farms of application servers, and n-way RAC configurations on top of expensive mid-end hardware configurations. And when I investigate the load coming from the end-user population I start thinking: had this application been architected differently, my bet is it would run smoothly on a single server with, let's say, two quad-core Intel CPUs that can be bought for about 10K Euro.

Let's move on to the second issue caused by the current trend of not using the DBMS and the ongoing Yafet technology explosion.

WoD application TCO (Total Cost of Ownership)

The second issue is about how the current trend of building WoD applications is impacting the TCO of such an application. Given the ongoing Yafet technology explosion, if you implement all BL and DL code in the volatile XYZ technology, then:
  • your application is legacy (technology-wise) within a year
  • it will be hard to find XYZ knowledgeable people
  • it will be very, very hard to migrate to the next sexy Yafet since it involves migrating BL and DL code (often the only option is to throw away, and code again)
I know of WoD applications built in the mid-nineties in a database-centric way that today, almost 15 years later, still run the majority of the BL and DL code unchanged inside the (current version of the) PLSQL virtual machine. I have yet to see the first Java-centric WoD application that runs Java code unchanged in the current versions of the frameworks used to implement the M, the V and the C parts of that application. Often the MVC frameworks used 10 years ago are just not there anymore. So the WoD applications built 10 years ago often still run on those now long-desupported versions of the framework. There is no easy migration possible. It always involves a major effort impacting the TCO of that application.

So in conclusion:
  • Today's WoD applications suffer from performance issues. This is caused by not pushing down work that could have been taken care of lower in the technology stack, which in turn causes orders of magnitude more events (context switches) lower down in the technology stack.
  • Today's WoD applications have a high cost of ownership. This is due to the high technology change rate at the upper end of the technology stack, where all application logic is implemented (which itself changes a whole lot less fast). Businesses are faced with either having to rebuild major parts of the application using the newest Yafet, or having to pay high rates for scarce programmers still willing to work in the old-fashioned Yafet.

The opposite of DBMS independence is what we need!

The ugly state of affairs with regard to today's WoD applications is caused not least by that other popular belief: the one that dictates applications should be built in a database-independent manner. In the Helsinki declaration I cannot but conclude that quite the opposite is what our customers need. They do not need DBMS independence, far from it. They need Yafet independence. Yafets change all the time, they come and go. The DBMS is probably the most stable factor within the IT landscape of a business. So we need to architect their WoD applications in such a way that they (the applications) are as immune as possible to this ongoing technology change outside the DBMS.


So what this all leads to is that BL and DL code should be implemented inside the DBMS.
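To give a flavour of what DL code inside the DBMS can look like, here is a minimal sketch that again uses Python's standard-library `sqlite3` module as a stand-in DBMS. The `orders` and `orderlines` tables, the three rules and the `try_insert` helper are all hypothetical. SQLite has no stored procedures, so the rules are expressed as declarative constraints plus one trigger; this illustrates the idea and is not the PLSQL approach itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE orders (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL CHECK (status IN ('OPEN', 'CLOSED'))
);
CREATE TABLE orderlines (
    order_id INTEGER NOT NULL REFERENCES orders (id),
    line_no  INTEGER NOT NULL,
    quantity INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, line_no)
);
-- A rule that needs to query current database state lives in a trigger.
CREATE TRIGGER orderlines_closed_order
BEFORE INSERT ON orderlines
BEGIN
    SELECT RAISE(ABORT, 'cannot add lines to a closed order')
    WHERE (SELECT status FROM orders WHERE id = NEW.order_id) = 'CLOSED';
END;
""")

with conn:
    conn.executemany(
        "INSERT INTO orders (id, status) VALUES (?, ?)",
        [(1, "OPEN"), (2, "CLOSED")],
    )


def try_insert(order_id: int, line_no: int, quantity: int) -> bool:
    """Attempt one insert; the DBMS, not the caller, decides whether it is valid."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orderlines (order_id, line_no, quantity) VALUES (?, ?, ?)",
                (order_id, line_no, quantity),
            )
        return True
    except sqlite3.DatabaseError:
        return False


print(try_insert(1, 1, 5))    # valid line on an open order
print(try_insert(1, 2, 0))    # violates the quantity check
print(try_insert(2, 1, 5))    # order is closed
print(try_insert(99, 1, 5))   # order does not exist
```

The calling code contains no validation logic at all. Whatever technology replaces it later still gets the same rules enforced, and each rule is checked without an extra call from the application to the DBMS.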

Before I talk more about that, the next post will first map the Helsinki code classes across the client, middle and data tiers (like I did with the MVC code classes here) and revisit the seven thin/fat-thin/fat-thin/fat alternatives again.

10 comments:

  1. Please add some article to convince me that PL/SQL is indeed..."robust", especially in terms of business logic maintainability.

    I've been doing PL/SQL for 12 years and have a highly contrasting opinion.

  2. This is the question I always get: how do we prevent that our PLSQL code becomes one big bucket of unmaintainable PLSQL spaghetti?

    Please be patient, I will address your concerns in a future post.

  3. To anonymous,

    PLSQL is as robust as any other third-generation language. Most of the problems I have encountered with PLSQL come from not treating it as a third-generation language, but more as a scripting language without a real need for source code control. Also wait for the next instalment of this blog.

  4. PL/SQL problems would usually stem from two programming errors (in my experience):

    1. Record by record processing (far too much reliance on the 'Procedural' aspect of PL/SQL)
    2. WHEN OTHERS THEN NULL (and generally incorrect use of EXCEPTION handling)

  5. I have been coding for the last 12 years, and though PL/SQL gives faster performance, admit it: it is still a scripting language. Doing modular, OOP-style programming in PL/SQL is very difficult. Error handling is limited.
    Your issue about an app being 'chatty' can be reduced greatly by good app design. There is no substitute for good design. Anyone can write a pl/sql proc as bad as the incident you mentioned in your issue #1.
    What I have found ideal is to have bulk processing of data to be delegated to DB (using stored proc/functions) and main BL to external code. Sometimes the lines are blurred and BL creeps into stored proc. That works too.

  6. FWIW, The second "Anonymous" is not me, .. the first "Anonymous"

    ;)

  7. Toon,

    I absolutely agree with what you said here and with your Helsinki Conclusion.
    I have seen similar examples like you (not 70.000 SQLs but near ...) over and over.
    I still have no clue, WHY DO WE SEE THIS TREND ?

    Applications come, applications go, what stays is the data. Is that so hard to see ?
    I don't know why most developers don't see this simple point.
    Have you an answer ?

    Best regards
    rogel-at-web.de

  8. > WHY DO WE SEE THIS TREND ?

    I still need to investigate this...
    Sometime before the millennium change, SUN invented this thing called Java and J2EE. And for some reason it took off in a huge way. Why?
    I don't know why (yet).

    Thoughts that I have on this are twofold:

    1) the IT industry had been talking about OO since the early-nineties. But it never did anything. It was all proprietary stuff. Until Java happened... SUN made it available to the masses. And the masses, I think, embraced it because of its WORA property (Write Once, Run Anywhere, http://en.wikipedia.org/wiki/Write_once,_run_anywhere).

    Without stopping to think about: does this help us build WoD applications?

    2) The big corporations continually need to create "shareholders value". They *need* to embrace new "paradigms" every x years in order to achieve this. So when they saw this Java thing happening, BINGO, that was their new paradigm to jump onto. And they all did...

    What disappoints me, is that after the big corporations jumped onto it, then too, the academic world jumped onto the Java bandwagon.

    They, of all people, ought to know better...

  9. love this stuff toon! its great that you are putting together such an argument that can only be found sporadically on the internet.

    some observations that i have about the posts here:
    1) pl/sql is not a scripting language! it is a procedural language! like C, pascal etc... billions of lines of procedural code exist out there. since when did procedural languages become viewed as some kind of plague? lol

    2) i too find it fantastical that software ‘engineers’ have now forgotten an axiom of computing science… that data centric applications are modeled using ER methods and that the resulting model IS GOD to the application. that all code must be simply pass through to the model if the application is to achieve any degree of ELEGANCE. but, of course, software ELEGANCE is also a concept that seems to have disappeared from computing circles.

    in case its not covered in one of toons blogs I have not read yet, the “Vietnam of Computer Science” (http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx) should be enough for ‘software engineers’ to abandon object relational modeling of data centric applications (modeling twice)... but it appears not to be.

    toon, I think that one has to reach back into history to see the story of the rise of the fat middle tier unfold… it kinda goes like this I believe:
    1) java is developed with a major premise being that C++ was too hard (I still love C++… hybrid baby!).
    2) a major language is needed to compete with microsoft which owns many of the languages that it builds tools for and thus controls
    3) everyone jumps on the java bandwagon because of 2
    4) the advent of j2ee signals to vendors a whole new revenue opportunity that didn’t exist before
    note that none of these events have anything to do with developer productivity!

  10. Charlie,

    Thanks for your contribution. I wanted to do just that: put together the whole argument. The more people can contribute to it, the better (hopefully in the end).

    Never heard of the Vietnam of CS. Will read up on it.

    Toon
