Friday, April 17, 2009

Helsinki code layers in the DBMS

Ok, let's continue with the second part of "The Helsinki Declaration". That would be the part where I zoom in on the DBMS and show you how best to do this database centric thing.

We have seen that the DBMS is the most stable component in everybody's technology landscape. We have also concluded that the DBMS has been designed to handle WoD application BL-code and DL-code. And current DBMSs are really good at this, provided the underlying database design is a sound relational design and a good fit for the requirements of the WoD application. The latter is up to the qualities of the person who designs the database. The former is all about education.

That's why I repeat here, that in order for the Helsinki approach to succeed, you'll require:
  • Educated database designers
    I specifically emphasize educated. You need to educate yourself to prevent FUD when designing databases. Database design is all about set theory and predicate logic. Again, this is what Ted Codd gave us. There are plenty of books by Chris Date to read and so to educate yourself in this area. I can of course also recommend that you study Lex's and my book.
  • Experienced PL/SQL developers
    I specifically emphasize experienced, since I see that many PL/SQL developers are still stuck in the mid-nineties. They have not evolved their knowledge of the DBMS, which in the case of Oracle is now at its 11th major release. Compared to Oracle7 (of the mid-nineties) a lot has been added, and a lot has been improved. Many then-myths are no longer myths now.
With these two prerequisites in place, here's how you implement a WoD application the Helsinki way: you introduce layers of code inside the DBMS.



In this layered approach we materialize the WoD code classification that I have introduced: UI-code (often outside the DBMS), BL-code, and DL-code. But we also introduce a separate layer, called UI API objects, between the UI and BL code layers, and a separate layer between the BL code and the tables in the database design (database design API objects).


Layer 1: UI API objects

In the Helsinki declaration you need to introduce API objects inside the DBMS whose specific purpose is to be called by UI-code. These, and only these, objects are the entry points, or the API if you like, through which UI-code communicates with the DBMS. You design these API objects per WoD page: every object in this layer is specific to some page of the WoD application. This is a very important layer. Why? Because these (and only these) objects might require re-engineering in case you switch to a new yafet. This layer gives you agility and enables you to easily take on a new yafet every so many years. I'll give examples of how to design this layer in future posts.
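To make this a bit more concrete, here is a minimal sketch of what such a per-page UI API object could look like as a PL/SQL package. All names (the page, the package, the collection type) are hypothetical, invented purely for illustration:

```sql
-- Hypothetical UI API for an "order entry" WoD page.
-- One package per page; UI-code calls only these entry points.
create or replace package p_order_entry_page as

  -- rBL entry point: returns exactly the rows the page displays.
  function get_orders(p_customer_id in number) return sys_refcursor;

  -- wBL entry point: accepts the page's "Save" payload in one call.
  -- orderline_tab_t is an assumed collection type holding the
  -- orderlines entered by the end user.
  procedure save_orderlines(p_order_id   in number,
                            p_orderlines in orderline_tab_t);

end p_order_entry_page;
```

The point of the sketch is the shape, not the names: one coarse-grained call per page event, so the yafet never talks to the tables or the BL-code directly.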

Layer 2: Business logic

Nothing new here. This is the materialization of BL-code.

Layer 3: Database design API objects

In this layer you implement "data services" that are shared by multiple BL-code modules. Whenever you are writing BL and you catch yourself copy-pasting or rewriting a piece of rBL or wBL code that you have written before, you need to introduce a database design API object for it. Write it once, and reuse it in different BL-code sections. This layer is somewhat less important, but it is often found inside your software architecture.
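A shared data service can be as small as a single query wrapped in a function. A hedged sketch, with hypothetical table and column names, of one such reusable object:

```sql
-- Hypothetical database design API object: a query written once,
-- reused by every BL-code module that needs a customer's remaining credit.
create or replace package p_customer_services as
  function credit_limit_left(p_customer_id in number) return number;
end p_customer_services;
/
create or replace package body p_customer_services as

  function credit_limit_left(p_customer_id in number) return number is
    l_left number;
  begin
    -- Assumed design: customers(id, credit_limit) and
    -- orders(customer_id, total_amount).
    select c.credit_limit - nvl(sum(o.total_amount), 0)
      into l_left
      from customers c
      left join orders o on o.customer_id = c.id
     where c.id = p_customer_id
     group by c.credit_limit;
    return l_left;
  end credit_limit_left;

end p_customer_services;
```

Any BL-code section that needs this figure now calls `p_customer_services.credit_limit_left` instead of re-coding (and re-debugging) the query.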

Layer 4: Database design

This layer is sort of pro-forma. It only holds the tables (and indexes) that are part of the database design for the WoD application.

Layer 5: Data logic

This is the materialization of DL-code. Also a very important layer. The reason why data logic is visualized beneath the database design (layer 4), has to do with how Helsinki implements data logic. We'll use database triggers to implement data logic.
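To show the flavour of DL-code in a trigger, here is a deliberately simplified sketch. It validates a made-up constraint ("a department must not employ more than ten clerks") and ignores the concurrency and efficiency issues that a real implementation must solve, as discussed in the intermezzo that follows:

```sql
-- Simplified DL-code sketch on an assumed emp(deptno, job, ...) table.
-- A real implementation must also handle concurrent transactions and
-- should only validate when the DML can actually violate the constraint
-- (note: deletes can never violate a "no more than" constraint, so the
-- trigger fires on insert and on update of the relevant columns only).
create or replace trigger emp_dl_max_clerks
after insert or update of job, deptno on emp
declare
  l_violations number;
begin
  select count(*)
    into l_violations
    from (select deptno
            from emp
           where job = 'CLERK'
           group by deptno
          having count(*) > 10);
  if l_violations > 0 then
    raise_application_error(-20001,
      'A department employs more than 10 clerks.');
  end if;
end;
```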

And now you say: "Oh no! Triggers? Toon, are you serious? Tom Kyte says all triggers are evil, therefore I cannot and will not use them."

Let's have a brief intermezzo on this topic.

The reason why Oracle rockstar Kyte is against triggers is (from what I understand) because they:
  1. make things happen automagically, and
  2. are rarely implemented correctly.
I understand his point, but would like to introduce some nuance around this ubiquitous opinion on the use of triggers. And I can give you the nuance because of the clear code classification that was introduced for WoD applications.

Listen.

Tom's reason 1 is due to people implementing triggers that hold wBL (write-BL) code. And I fully agree on this point: when you put wBL code in your table triggers, and you then perform DML against the table, things happen automagically. The wBL-code will in its turn perform more DML that you, the issuer of the DML against the table, might not have realized. And if there are many tables with triggers of this kind, a cascading effect takes place, making even more magic happen (which is worse). wBL code should never be executed via triggers.

<related observation>
This automagic behaviour is also why I never use cascade-delete on foreign keys, nor the default clause on columns of tables, nor the fixed-length CHAR datatype. They introduce magic when parent rows are deleted, when a (partial) row is inserted, or when a value shorter than the fixed length is provided.
</related observation>

Tom's reason 2 is due to people implementing DL-code in triggers and thinking they have done it right. Implementing DL-code is an inherently complex matter. It is complex due to the concurrency issues that need to be solved, and due to the efficiency levels you want to achieve (DL-code shouldn't validate a constraint when the transaction at hand doesn't necessitate it). In chapter 11 (of the book), I fully explain the issues that need to be solved when implementing DL-code. Also, a few years ago, I demonstrated an example constraint implementation using triggers on Tom's blog. You can find it here. The point I want to make is that just because implementing DL-code via triggers is complex doesn't mean triggers are evil and it shouldn't be done. To me it means that you need to educate yourself so that you can use triggers correctly. By the way: implementing DL-code not via triggers, but embedded inside wBL-code sections, is complex too. Maybe slightly less complex, but still complex. Triggers have the big advantage that they offer a once-and-for-all type of implementation. Having to always go through a wBL-code section (TAPIs, VAPIs, or whatever you would like to call them) is just not good enough for me.

<related observation>
I'd like to give you one closing thought here. Suppose, just suppose, Oracle had supported SQL's CREATE ASSERTION statement, thereby providing you with a means to implement all constraints declaratively, just like FKs and PKs are declarative means of implementing a constraint. You would not have to write a single line of DL-code: Oracle accepts your assertion and deduces (computes) when and how the constraint is best validated. Would you use assertions? I think yes. You're using PKs and FKs now too, aren't you? And the way declarative constraints are validated "under the hood" is conceptually the same as you building triggers that hold DL-code: at the appropriate moments, per row or per statement, Oracle fires the necessary DL-code that's part of the DBMS to validate the constraint.
</related observation>
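For readers who have never seen one: the SQL standard's assertion syntax would let you state the example constraint from earlier (hypothetical emp table, "no department employs more than ten clerks") roughly like this. To be clear, Oracle does not support this statement; the sketch shows what we are missing:

```sql
-- SQL standard CREATE ASSERTION (not supported by Oracle):
-- the constraint stated once, declaratively; the DBMS would
-- deduce when and how to validate it.
create assertion max_ten_clerks_per_dept
  check (not exists
          (select deptno
             from emp
            where job = 'CLERK'
            group by deptno
           having count(*) > 10));
```

Compare this with the trigger-based DL-code needed to validate the very same predicate: the assertion is the specification; the triggers are a hand-built implementation of it.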

End of intermezzo.

I want to emphasize that layer 1 (UI API objects) and layer 5 (DL code) need to be separated from the BL and DB-design API layers. Again this all boils down to BL being "the rest". Implementing DL separately, and introducing the design-by-contract UI API layer, will ensure that "the rest" becomes maintainable (thus addressing the Helsinki adversaries' main argument).



Preferably I would use a separate schema inside the database to hold the UI-API objects: I often refer to this schema as the FE (FrontEnd) schema. The yafet would connect to the database with yet another schema: one that has only the CREATE SESSION system privilege and the necessary object privileges on the UI-API objects inside the dedicated FE schema.
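In Oracle terms this setup boils down to a few statements. Schema names and the package name are hypothetical; the point is the minimal privilege set of the connect schema:

```sql
-- Hypothetical connect schema for the yafet: it can log in,
-- and it can call UI-API objects in the FE schema. Nothing else.
create user fe_login identified by some_password;  -- choose a real password
grant create session to fe_login;

-- Repeated per UI-API object residing in the FE schema:
grant execute on fe.p_order_entry_page to fe_login;
```

With this in place the yafet physically cannot bypass the UI-API layer: it sees no tables, no BL-code packages, only the per-page entry points.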

In the next post I will demonstrate the Helsinki code layers with an example database design, some constraints, and an example Window-on-Data page on top of that design.

Hopefully that will be posted in less than two weeks this time, but I do have a real life and a real job next to this blogging too :-).

Sunday, April 12, 2009

People ask the wrong question

People who know me, know that I am enthusiastic about Apex. But I am certainly not an Apex expert. By far not. The DBMS is where my knowledge is. But because they know of my enthusiasm, I often get the question whether Apex is mature enough for building a critical or large-scale WoD application.

I then (sigh and) reply by saying: "You are asking the wrong question."

Pay attention please.

In the Helsinki approach of building a WoD application the technology choice for UI-code is almost irrelevant. Why? Because you will use the yafet to implement UI only. All BL and DL will go into the DBMS. The development of a WoD application in the Helsinki way, will first and foremost be a DBMS project.


The majority of the time spent developing a WoD application in the Helsinki way will be spent on designing and developing the PL/SQL and SQL inside the DBMS that implements all necessary BL and DL code.

So the question you should have asked, is:

"Is the DBMS mature enough to support this critical and large-scale WoD application project?"

To which the answer of course would be: 100% affirmative. This was the point of my first observation. The DBMS is mature.

So you see, the yafet that you use to implement UI-code is a second-order question. It is still a relevant question of course, but for different reasons than usually intended by the one who poses the wrong question. Their intention usually is about the yafet's capabilities to do large-scale BL and DL development. Which in Helsinki is irrelevant. The yafet only needs to be well capable of rendering UI. Most yafets (even the sexy new ones like Apex) can do that, no problem. If you intend to also do BL and DL inside the yafet, you're on your own. And you need to go ask a different person (I told you I'm not an Apex expert).

The reason why the question is not totally irrelevant has to do with the capabilities of the people involved in building the WoD application. Since choosing the yafet is of second order importance, you might as well choose one that those people are comfortable with.



Remember observation four, the investment required for developers to become productive? Helsinki helps you here: you only need to learn how to use the UI features of the yafet. And you can forget about its BL and DL capabilities.

Adversaries of the Helsinki approach always bring up the issue of how to prevent all the DBMS-hosted PL/SQL and SQL code (for the BL and DL part of the WoD application) from becoming one big unmaintainable bucket of spaghetti. The answer to that question is: you need to materialize the code classification and create layers of code inside the DBMS. I will start blogging about that in my next post.

By now I have reached the end of part 1 of my 2-hour presentation and take a break, leaving my audience with the following closing guideline.


A beautiful example of practicing this exact guideline is Oracle's Exadata storage server. Without going into the details, what Oracle does here is push bits of a SQL execution plan further down into the technology stack, which speeds up SQL execution by an order of magnitude.

I rest my case.

Tuesday, April 7, 2009

The Helsinki approaches to WoD application development

[continuing from my previous post]

In a very similar way as I did here for MVC, the Helsinki UI/BL/DL code classes can be mapped across the client, middle and data tiers too:


What I do differently here compared to the earlier display of the MVC mapping across the tiers is that whenever the M is distributed across two tiers, I divide the M into BL and DL. The guideline for how to split up the M is now made explicit: we divide it into BL and DL. By the way, this only happens in option 5.
  • 1 TFT: UI, BL and DL all go into the middle tier.
  • 2 FFT: UI goes into the client tier, BL+DL go into the middle tier.
  • 3 FTT: All go into the client tier.
  • 4 FTF: BL+DL go into the data tier. UI into the client tier.
  • 5 TFF: DL goes into the data tier. Rest in the middle tier.
  • 6 TTF: BL+DL go into the data tier, and UI in the middle tier.
  • 7 FFF: BL and DL in every tier.
  • 8 TxF: Lo and behold! A new option. With Apex (Oracle Application Express) we can push UI into the DBMS. So everything goes into the DBMS. The lowercase x denotes the absent middle tier in this scenario.
[I will talk about Apex in a separate post at some time, just thought it was worth mentioning here now]

In the Helsinki declaration we only allow the options that have BL and DL in the data tier: option 4 (if you require a fat, responsive client program), option 6 (html-frontend + you really want to do some Java) and 8 (html-frontend + all you have is PL/SQL knowledge).


All other options have some object-oriented design as the foundation of the WoD application. Which is just plain wrong, and therefore prohibited by the Helsinki declaration. I discussed that here: BL and DL just do not benefit from object orientation; WoD applications work best when they sit on top of a relational database design.

Going back to the picture I've shown you before:


What this means is that we map the DBMS like this:


During the presentation I then mention that inside the Java realm only two classes should be allowed: a Row class and a TableOfRows class. In Helsinki, Java is used only to generate UI: it receives the row or table-of-rows to be displayed from the DBMS. All it needs to do is display the data in a user-friendly way. Needless to say, this approach completely and utterly disarms the Java developer(s). Your relationship with them turns into a love/hate relationship:

  • They'll love you because you make their job infinitely easier.
  • They'll hate you because their billable-hours go down the drain.
Of course, if you use Apex, there is no relationship at all...

Advantages of the Helsinki approach

It should not surprise you that this approach completely turns around the two main disadvantages I mentioned earlier of building WoD applications in a mid-tier centric way. Designing and programming a WoD application in the Helsinki way will greatly reduce the number of roundtrips to the DBMS.


I cannot repeat this enough: if the database design fits your WoD application needs, the Helsinki approach will always outperform any other architecture. Adversaries often bring up the argument that by pushing 'application logic' (which they never clearly define) into the DBMS layer, that layer will immediately become the bottleneck. I have two responses to this argument.

  1. You will actually put more stress on the DBMS in a mid-tier centric approach. I gave you the example of 70,000 calls to display twenty customers on a page. I know it was an extreme case. But decrease that to 700 calls: I'm willing to bet a monthly salary that this still causes more CPU cycles to be spent inside the data tier than a Helsinki approach would. Even the other example I gave (GrandParent, Parent, Child report), which calls the DBMS in the order of 50 times, causes more activity inside the DBMS than would have been necessary in a Helsinki approach.
  2. By far the majority of the WoD applications built today will run smoothly on today's powerful SMP servers. The percentage of (Helsinki) WoD application deployments that will cause the DBMS to become the bottleneck is very, very small. And, being an OakTable member, I probably should not say this, but one can always consider clustering the DBMS tier (Oracle RAC) to provide more CPU power in that tier.
The second disadvantage was about TCO. Building a WoD application the Helsinki way creates the opportunity to easily migrate to the next sexy du-jour UI technology.


I have experienced this first-hand. One of my customers had built a WoD application around 2002 in a database-centric Helsinki manner. They used the then-hot JDeveloper release with its accompanying MVC frameworks: UIX, BC4J and Struts. Two years ago they were faced with a long-overdue upgrade of the JEE technology stack: upgrade to the newest release of the application server, and the newest release of JDeveloper with the current versions of the accompanying MVC frameworks. After having spent six months trying to get this done, it was decided to just rebuild all pages in Application Express (the now sexy du-jour UI technology). Since no DL and BL logic sat outside the data tier, this was achieved in just over two months by four people, without any change to the BL and DL code that sat inside the DBMS.

I think that could be called agile, not?

A colleague of mine back then coined the phrase: UI-code is throw-away code. I could not agree more. It is the top of the technology stack where all the volatility sits and continuous change happens. Keep it thin and you get agility.

Thursday, April 2, 2009

Issues with current trend

In my last post I have introduced the code classification used by the Helsinki declaration (as opposed to MVC used by JEE):
  • User Interface (UI) code: all code that creates UI and responds to events in the UI, the same as JEE's View and Control
  • Data Logic (DL) code: all code that maintains data integrity constraints, a well defined subset of JEE's Model
  • Business Logic (BL) code: all other code, the remaining part of JEE's Model, query and/or transaction composing and executing code
I would like to point out before continuing with this post that these three code classes may appear to you as somewhat 'conceptual'. But every WoD application somehow performs these functions; together they are what a WoD application is all about. Your WoD application may not reflect these code classes at all, but the point I make here is that somehow its code is doing these same three things: executing UI, BL and DL code. Show me a line of your code, and I can tell you into which Helsinki code class it falls.

I concluded my previous post by showing you how the three Helsinki code classes interact with each other.


Using this picture I can visualize the current trend of building WoD applications as follows:


We do not use the DBMS anymore. All code, in particular all BL and DL code, is implemented within a du-jour technology (XYZ above) that lives outside of the DBMS.

In this post I'd like to explain how this current trend of not using the DBMS (our first observation), combined with the Yafet technology explosion (our third observation), leads to what I consider two rather serious issues in today's WoD applications.

Let's start with the first issue.

Scalability and performance issues

Implementing BL and DL code outside the DBMS leads to chatty applications: these applications call the DBMS many times. Let's demonstrate this using that same picture again. I have drawn a vertical yellow line in it visualizing the current trend: everything to the left of this line is implemented outside the DBMS.


Say the end user has an order-entry page. He/she enters a few orderlines and then presses the Save button. This generates one context switch (one call) from UI-code to BL-code. The BL-code then starts processing these orderlines. This causes an order of magnitude more context switches (say, ten) from BL-code to the DBMS. Per orderline, however, a couple of constraints are likely to be involved. The majority of these constraints require DL-code that queries the current state of the DBMS. So we have yet another order of magnitude more calls to the DBMS.

So one event at the top of the WoD code-stack generates two orders of magnitude more events lower down in the WoD code-stack. Mind you, this is not specific to WoD application software, but a general phenomenon in all software. The same is true, for instance, in an operating system, which also has many layers of code classes.

In today's WoD applications it is often worse than just the two orders of magnitude described above:
  • If you put BL in the middle tier, you'll call the DBMS an order of magnitude more often
  • If you also put DL in the middle tier, you'll call the DBMS two orders of magnitude more often
  • But if you then also don't use SQL effectively, you'll call the DBMS three orders of magnitude more often
I have given you an rBL (read-Business Logic) example of not using SQL effectively at the end of this post. Let me share another experience in this area. A few years ago I was asked to help analyze a performance issue in a very simple user-enters-search-criteria, user-hits-search-button, WoD-app-displays-rows scenario. This involved a 'search customers' page where the user would enter a leading string of the customer name. It took almost a full minute for the application to come up with the first set of twenty rows to be displayed. This WoD app obviously was built in the current trendy manner. So what do you do? You try to find out where the time is spent. I performed a SQL trace of this action. And guess what: the trace revealed that the application was sending in the order of 70,000 SQL statements to the DBMS.

70,000...

Yes folks. 70,000 SQL select statements to come up with the first twenty customers. I am not kidding you. Unbelievable. Obviously the DBMS is not the problem here: it is servicing 70,000 queries in just less than 60 seconds! The problem is the architecture of the WoD application.

This particular scenario could be done with just three calls to the DBMS, which would take a fraction of a second to execute on top of a good relational database design.
  1. Open cursor
  2. Perform array fetch of first twenty rows
  3. Close cursor
(actually, with the current SQL*Net protocol this might even be fewer than three roundtrips)
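The three steps above amount to letting the DBMS do the work in a single SQL statement. A sketch, with hypothetical table and bind-variable names, of the one statement the cursor would be opened on (using the classic rownum pagination idiom):

```sql
-- The whole 'search customers' scenario as one statement:
-- the DBMS itself produces exactly the first twenty matching rows.
-- :search_string is the leading string the end user typed.
select *
  from (select c.*
          from customers c
         where c.name like :search_string || '%'
         order by c.name)
 where rownum <= 20;
```

Open a cursor on this, array-fetch twenty rows, close the cursor. Three calls, not 70,000.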

But using a 'black box' that is designed to instantaneously provide the data that is to be displayed would be so uncool. No, ... executing 70,000 queries and doing lots of application logic yourself is way more cool.

This chatty application behaviour leads to:
  • more latency hits: every time you go back and forth to the DBMS you will be hit with some latency. If this happens a lot, it starts impacting the performance.
  • more 'system' cpu: context switches are accompanied by 'fixed costs' incurred by the network, OS and DBMS software layers that are constantly creating and destroying contexts. System cpu is pure overhead. It's not adding any value to the business using the WoD application.
  • more data transfer out: since complex SQL processing (that is, any SQL involving more than one table) is effectively performed outside the DBMS, you are sending more data out of the DBMS to the place where the SQL execution plan is now effectively implemented.
This all impacts the performance and scalability of the WoD application.

What this chatty application behaviour also implies is that the iron power required to run these applications is absurd. I have seen deployments of not-too-big WoD applications that require farms of application servers and n-way RAC configurations on top of expensive mid-end hardware. And when I investigate the load coming from the end-user population I start thinking: had this application been architected differently, my bet is it would run smoothly on a single, let's say, dual quad-core Intel server that can be bought for about 10K euro.

Let's move on to the second issue caused by the current trend of not using the DBMS and the ongoing Yafet technology explosion.

WoD application TCO (Total Cost of Ownership)

The second issue is about how the current trend of building WoD applications impacts the TCO of such an application. Given the ongoing Yafet technology explosion, if you implement all BL and DL code in the volatile XYZ technology, then:
  • your application is legacy (technology-wise) within a year
  • it will be hard to find XYZ-knowledgeable people
  • it will be very, very hard to migrate to the next sexy Yafet, since that involves migrating BL and DL code (often the only option is to throw it away and code it again)
I know of WoD applications built in the mid-nineties in a database-centric way that today, almost 15 years later, still run the majority of their BL and DL code unchanged inside the (current version of the) PL/SQL virtual machine. I have yet to see the first Java-centric WoD application that runs Java code unchanged in the current versions of the frameworks used to implement the M, the V and the C parts of that application. Often the MVC frameworks used 10 years ago are just not there anymore. So WoD applications built 10 years ago often still run on now long-desupported versions of those frameworks. There is no easy migration possible. It always involves a major effort, impacting the TCO of the application.

So in conclusion:
  • Today's WoD applications suffer from performance issues. This is caused by not pushing down work that could have been taken care of lower in the technology stack, which in turn causes orders of magnitude more events (context switches) lower down in the technology stack.
  • Today's WoD applications have a high cost of ownership. This is due to the high technology change rate at the upper end of the technology stack, where all application logic (which itself changes a whole lot less fast) is implemented. Businesses are faced with either having to rebuild major parts of the application using the newest Yafet, or having to pay high rates for the scarce programmers still willing to work with the old-fashioned Yafet.
The opposite of DBMS independence is what we need!

The ugly state of affairs with regard to today's WoD applications is in no small part also caused by that other popular belief: the one that dictates applications should be built in a database-independent manner. In the Helsinki declaration I cannot but conclude that quite the opposite is what our customers need. They do not need DBMS independence, far from it. They need Yafet independence. Yafets change all the time; they come and go. The DBMS is probably the most stable factor within the IT landscape of a business. So we need to architect WoD applications in such a way that they are as immune as possible to this ongoing technology change outside the DBMS.


So what this all leads to is that BL and DL code should be implemented inside the DBMS.

Before I talk more about that, the next post will first map the Helsinki code classes across the client, middle and data tiers (like I did with the MVC code classes here) and revisit the seven thin/fat-thin/fat-thin/fat alternatives again.