Originally posted in Greek here.
I've built a smart client application for managing a database table, with the classic smart client architecture: the data access layer gets data from the database, wraps it up in a dataset and pushes it through a web service to the client application. The client caches the data between sessions, refreshes them when they change in the database, sends updates back when requested, and so on.
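To make the setup concrete, here is a minimal sketch of what the service side could look like; the class, table and column names are placeholders of mine, not the actual implementation:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Web.Services;

// Hypothetical ASMX-style service that wraps the contacts table in a DataSet.
public class AddressBookService : WebService
{
    private const string ConnectionString = "...";   // placeholder

    [WebMethod]
    public DataSet GetContacts()
    {
        DataSet contacts = new DataSet("AddressBook");
        using (SqlConnection connection = new SqlConnection(ConnectionString))
        using (SqlDataAdapter adapter = new SqlDataAdapter(
            "SELECT Id, Name, Email, Phone, Category FROM Contacts", connection))
        {
            // Fill opens the connection, loads the table and closes it again.
            adapter.Fill(contacts, "Contacts");
        }
        return contacts;
    }
}
```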
This application is a very simple address book (consisting of only one table, to keep the example simple). The authorization scheme is flat and simple: all users have access to all the data, and the application is going to be released commercially. That means that the use of the application will vary among different users: one will use the address book to hold the 100 contacts he used to manage in Outlook, another will use it for the 1,000 contacts of his small company, and a third, a big multinational company, will use it to hold the contact info of its 1,000,000 clients.
From reading the Microsoft examples and anything else I could possibly find on the web, I understand that they deal with totally simplistic scenarios, expecting that the users of the smart clients will manage quite a small amount of data. TaskVision, for example, downloads all the data that interests the user in one piece. In my example this is possible only in the cases of 100 or 1,000 contacts. When it comes to the 1,000,000 contacts we face quite a problem, because you can't transfer them in reasonable time, nor can you handle them efficiently on the client.
So I realize that, although the application stays exactly the same, the volume of data in the table is the factor that determines the caching and synchronization policy I must follow. I'm thinking that the application must have pre-implemented scenarios which will be activated either by the user, or automatically based on the size of the data table.
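A rough sketch of what that automatic selection could look like follows; the thresholds, the enum and the Contacts table name are assumptions for illustration, not measured values:

```csharp
using System.Data.SqlClient;

// Possible caching/synchronization policies, picked from the size of the table.
public enum SyncPolicy { CacheEverything, CachePartially, FetchOnDemand }

public static class PolicySelector
{
    public static SyncPolicy Choose(string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand count = new SqlCommand("SELECT COUNT(*) FROM Contacts", connection))
        {
            connection.Open();
            int rows = (int)count.ExecuteScalar();

            if (rows <= 1000)
                return SyncPolicy.CacheEverything;   // the 100 / 1,000 contacts case
            if (rows <= 100000)
                return SyncPolicy.CachePartially;
            return SyncPolicy.FetchOnDemand;         // the 1,000,000 contacts case
        }
    }
}
```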
I'll try to examine possible scenarios based on the volume of data transferred in each request:
- All: The Microsoft way. A perfect solution for a small amount of data.
- One: If I fetch only the data row the client requested each time, I get a delay on every request. Furthermore, the added latency of the round trip through the web service will give an awful experience to the users. This is the worst-case scenario.
- X: How much is X? Here are some ideas:
- Relevant data. We have to dig into our data and into the business logic behind them. Maybe I should fetch all contacts from the same category (e.g. clients). Maybe in some cases the user will benefit from this prediction. But what about all the other cases? What about a usage scenario that doesn't use categories to organize contacts?
- Commonly used data. I annotate each data row with usage statistics, so when the client requests a row, the web service returns it along with other commonly requested rows that the client doesn't have yet (a sketch of this idea follows the list).
- What else could exist? You have to swim in the usage stats to figure out a way to predict the user's next move.
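To show what the "commonly used data" idea might look like in practice, here is a sketch; the UsageCount column, the method and the parameter names are all assumptions:

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Web.Services;

// Returns the requested contact plus the most frequently used contacts
// the client does not have cached yet.
public class AddressBookService : WebService
{
    private const string ConnectionString = "...";   // placeholder

    [WebMethod]
    public DataSet GetContactWithPopular(int contactId, int[] cachedIds, int extraRows)
    {
        // The ids are integers, so joining them straight into the SQL text is acceptable here.
        string excluded = cachedIds.Length == 0
            ? "0"
            : string.Join(",", Array.ConvertAll(cachedIds, i => i.ToString()));

        string sql =
            "SELECT TOP (@take) * FROM Contacts " +
            "WHERE Id = @id OR Id NOT IN (" + excluded + ") " +
            "ORDER BY CASE WHEN Id = @id THEN 0 ELSE 1 END, UsageCount DESC";

        DataSet result = new DataSet("AddressBook");
        using (SqlConnection connection = new SqlConnection(ConnectionString))
        using (SqlCommand command = new SqlCommand(sql, connection))
        using (SqlDataAdapter adapter = new SqlDataAdapter(command))
        {
            command.Parameters.AddWithValue("@id", contactId);
            command.Parameters.AddWithValue("@take", extraRows + 1);
            adapter.Fill(result, "Contacts");
        }
        return result;
    }
}
```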
So, examining this point of view, I reach the conclusion that smart client tricks can be effective only when I have a small amount of data.
Let's examine it from another point of view: How much and what data will I cache on the client?
- All: For a small amount of data, this is perfect. I can also write my dataset out to an XML file on the user's hard disk and everybody is happy.
- None: I lose any performance benefit.
- Y: How much is Y? Let me rack my brain one more time:
- Commonly used data. Yes, maybe in some cases it will help. But in others...
- Last Y requested. Maybe this, in combination with the previous idea, is a little better (see the sketch after this list). But I can still think of plenty of common uses that don't benefit from tricks like that.
- What else can you think of?
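As a sketch of this direction, here is a client-side cache that keeps only the last Y rows used and persists them to disk between sessions with DataSet.WriteXml/ReadXml; the LastUsed column and the size limit are assumptions:

```csharp
using System.Data;
using System.IO;

// Client-side cache: keeps at most maxRows contacts, evicting the least recently
// used ones, and survives between sessions as an XML file on disk.
public class ContactCache
{
    private readonly DataSet _cache = new DataSet("AddressBook");
    private readonly string _path;
    private readonly int _maxRows;

    public ContactCache(string path, int maxRows)
    {
        _path = path;
        _maxRows = maxRows;
        if (File.Exists(_path))
            _cache.ReadXml(_path);                         // restore the previous session's cache
        else
            _cache.Tables.Add(new DataTable("Contacts"));
    }

    public DataTable Contacts
    {
        get { return _cache.Tables["Contacts"]; }
    }

    // Merge freshly downloaded rows and evict the oldest ones if the cache grows too big.
    public void Merge(DataSet downloaded)
    {
        _cache.Merge(downloaded);
        while (Contacts.Rows.Count > _maxRows)
        {
            // Assumes a LastUsed DateTime column maintained by the client on every access.
            DataRow[] oldestFirst = Contacts.Select("", "LastUsed ASC");
            Contacts.Rows.Remove(oldestFirst[0]);
        }
    }

    public void Save()
    {
        _cache.WriteXml(_path, XmlWriteMode.WriteSchema);  // schema is needed to read typed columns back
    }
}
```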
From what I can see, this point of view does not help with a large amount of data either.
To sum up, this is my conclusion:
- When I have a low-data scenario, I transfer all the data to the client and cache it all locally.
- When I have a high-data scenario, I do the one or two tricks I can think of, and leave everything else to the mercy of the user's bandwidth.
As for the point that separates "little" data from "much" data, it might follow from what I just wrote: measure the client's download speed and decide the limit at run time.
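A possible way to make that decision is sketched below: time a small sample download, estimate how long the full download would take, and compare it with an acceptable wait. The service proxy, its methods and the sample size are all assumptions:

```csharp
using System;
using System.Data;
using System.Diagnostics;

// Decide whether to download and cache everything, based on a measured sample.
public static class CachingDecision
{
    // AddressBookService is a hypothetical web service proxy with a row-count
    // method and a paged fetch method; neither exists in the original application.
    public static bool ShouldCacheEverything(AddressBookService service, TimeSpan acceptableWait)
    {
        int totalRows = service.GetContactCount();

        Stopwatch watch = Stopwatch.StartNew();
        DataSet sample = service.GetContactPage(0, 100);   // download a 100-row sample
        watch.Stop();

        int sampleRows = sample.Tables["Contacts"].Rows.Count;
        if (sampleRows == 0)
            return true;                                   // nothing to download anyway

        double secondsPerRow = watch.Elapsed.TotalSeconds / sampleRows;
        double estimatedFullDownload = secondsPerRow * totalRows;

        return estimatedFullDownload <= acceptableWait.TotalSeconds;
    }
}
```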
My real-world scenarios are two:
- A web-based content management system with categories, articles, products, users, files and images. It's already on the market and quite different usage patterns have been observed: one customer uses it to handle 10 categories, 30 products and 2 users, while another uses it to handle 20 categories, 20,000 products and 10 users.
- The second is a portfolio application for an insurance company. One user handles his 150 clients and their contracts, another handles 5,000 contracts.
As far as I can tell, the "smart client" slogan is very catchy, but in real-world scenarios it is quite inapplicable. Even with Microsoft's TaskVision, if you sell it to a client who uses it to manage 10,000 tasks, you'll have a lot of problems. On the other hand, if you try to develop a solution that handles those cases, you end up with questions like mine.
Maybe I haven't understood something well? Can anybody come up with more and better ideas?
Please comment...