So, to find the perfect web technology we need a typical web application. In this post, I’m trying to figure out the minimal or common requirements of a typical web application.
As I said, I think most developers never consider all the gotchas of building an application with web technologies. The following requirements may look like a bit much for “just a web application”, but I think this is the real minimum for any web application, no matter the ‘size’.
- Data processing and storage
- Everything is UTF-8: all textual data (including source code) must be stored, processed and displayed in UTF-8 encoding. The reason is that Unicode is the only sane way of handling data in non-English languages, and of all the Unicode encodings, UTF-8 is the clear winner in terms of usage and acceptance. This simple, innocent requirement immediately rules out commonly used software, like Windows XP and PHP. Isn’t this amusing?
- Validation: it should be natural, or at least easy, to attach data validation rules to program logic. Examples are: non-null fields, numbers-only or dates-only fields, or creating a new validation rule like the Chilean “dígito verificador del RUT” (the check digit of the national ID number). It doesn’t need to be a framework: if the language provides a natural solution (e.g. a closure/decorator in Python), that’s sufficient. Validation should happen on both the client and the server side; kudos if the rules are written only once.
- Development
- A joy to develop: editing HTML files is so 90s… Besides, we are not building pages anymore, but full desktop-like applications. So, a modern web technology should encapsulate all the details and talk in terms of reusable components. Examples of this are Google Web Toolkit, some IDEs with JSF support and some RIA frameworks.
- Fast during development: I’m kind of tired of the waiting in many Java EE IDEs like Eclipse. Everything should be easy to test/run/change during development.
- Debugging: it should be a joy to debug, and I mean real debugging: pause the application, look around traces and variables, set breakpoints… you know what I’m talking about, right?
- Multi-tier architecture: the web application should be a thin presentation layer for services that do the real work. Technically, the business logic runs on remote servers, reached via a remote procedure call protocol of your choice (RMI, CORBA, Hessian, JSON-RPC, XML-RPC, Web Services, whatever).
- Data exporters: the technology must make it easy to export data to formats other than HTML within the same application. Examples are: PDF, office documents, chart reports, and other common formats.
- User experience
- Fast and reliable: the application should load in no time, and it must not crash. The user must never see a 500 error page, nor get a trace of what went wrong… no excuses!
- Browser independent: it should work and look almost the same in different browsers and operating systems. To put names: Internet Explorer, Firefox, Chrome and Safari; Windows, MacOS and Linux.
- Desktop class: the application shouldn’t look like a page: it should look and behave like a real desktop application does. Bonus if the application can manipulate data offline!
- Security and integrity
- Authentication and authorization: the technology must provide a sane and safe way to authenticate users. Authorization of specific functionality should be available too, and it should be role-based.
- Auditing: it should be easy to generate audit logs, ideally through a declarative approach. The log should have: the action, the logged-in user, and the associated data.
- Transactional: the web application should not be transaction aware, nor should it transfer transactional state to the business layer. Every service invoked on the remote business servers is required to be transactional: if the remote operation succeeds, the transaction is committed; if the remote operation fails, the transaction is rolled back. Moreover, this means there will be no long-running transactions spanning one or more web page requests.
- OWASP: the technology used should include countermeasures against, or help to minimize, all the common security flaws listed in the 2007 OWASP Top Ten vulnerabilities report, which are:
- Cross-site scripting (XSS, sometimes written CSS) attacks.
- Injection attacks, like SQL injection.
- Malicious file execution attacks.
- Insecure direct object reference.
- CSRF (or XSRF) attacks.
- Information leakage and improper error handling.
- Broken authentication and session management.
- Insecure cryptographic storage.
- Insecure communications.
- Failure to restrict URL access.
- Error handling: the user should be informed of any error, without revealing anything about the inner workings of the application. Bonus: a ticket should be assigned, so the user’s problem can be associated with the related traces and logs.
- Logging: must not include sensitive or very private data, like passwords.
- Performance and operational
- Scalable: the web layer should be prepared to scale. Given that all business logic is on remote servers, this should not be a real problem. However, there is data that belongs only to the web layer, and it is common for this data to get in the way of scaling; session data is an example.
- Vertical/Horizontal: it should be relatively easy to scale vertically (adding CPU cores to the server) as well as horizontally (adding more servers).
- Logging: given the multi-tier architecture, logging will be necessary, both in development and production. Ideally, all messages should be redirected to a central location like syslog.
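To give an idea of what I mean by the validation requirement above, here is a minimal Python sketch of rules attached to logic through a decorator. The names `validate`, `not_null` and `register_customer` are made up for illustration and do not come from any framework; the check digit itself uses the usual modulo 11 calculation. It only covers the server side, and it assumes the decorated function is called with keyword arguments.

```python
import functools


def rut_check_digit(number: int) -> str:
    """Compute the RUT check digit with the modulo 11 algorithm."""
    total = 0
    multiplier = 2
    while number > 0:
        # Digits are weighted right to left by 2, 3, 4, 5, 6, 7, 2, 3, ...
        total += (number % 10) * multiplier
        number //= 10
        multiplier = 2 if multiplier == 7 else multiplier + 1
    result = 11 - (total % 11)
    if result == 11:
        return '0'
    if result == 10:
        return 'K'
    return str(result)


def is_valid_rut(value) -> bool:
    """Accept strings such as '12.345.678-5' or '12345678-5'."""
    if not isinstance(value, str):
        return False
    cleaned = value.replace('.', '').replace('-', '').upper()
    body, digit = cleaned[:-1], cleaned[-1:]
    if not (body.isascii() and body.isdigit()):
        return False
    return rut_check_digit(int(body)) == digit


def not_null(value) -> bool:
    return value is not None


def validate(**rules):
    """Hypothetical decorator: one rule (a predicate) per keyword argument."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            for name, rule in rules.items():
                if not rule(kwargs.get(name)):
                    raise ValueError(f"invalid value for {name!r}")
            return func(**kwargs)
        return wrapper
    return decorator


@validate(name=not_null, rut=is_valid_rut)
def register_customer(name, rut):
    # Only reached when every rule passed.
    return {'name': name, 'rut': rut}
```

Writing a new rule is just writing a new predicate, which is the kind of "natural solution" I have in mind. Sharing the same rules with the client side is a separate problem that this sketch does not address.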
Summary
Well, I have laid out some basic requirements that any application must/should follow. Your technology of choice may excel at some points, but it is probably really bad at others, and some of them are not even considered.
In the next post, I will define the test application I’m going to create in each language, along with its requirements. Stay tuned!
Congratulations! Very promising article. Looking forward to further reading
I also have some thoughts regarding the “Multi-tier architecture” list item. Don’t you think that being able to move some work to the client side could be beneficial in some situations?
For example, doing some calculations on previously fetched data on the client side could save a lot of CPU time and bandwidth in a web application with thousands of logged-in users.
Lombardi, in their Blueprint product, go even further and benchmark in real time to decide which approach to choose, client or server side.
Hi Tomasz, thanks for participating!
My approach goes a little further than what you say: I mean that we shouldn’t just move some work, we must move every piece of work to the tier where it belongs, be it the client browser, the web servers or the remote services.
To explain my point, imagine a bank application and a remote service named ‘accountBalanceEntries’: it returns the account balance and a list of ‘unix timestamp, description, amount (positive or negative)’. For example:
(500, [(1234123, 'deposit', 200), (12343234, 'check payed', -450)])
Now, this is not the same as what is displayed on the user interface, and it shouldn’t be, because the requirements are usually different. For the web part, we need a table with a real date (not a unix timestamp!), a description column, a debit column, a credit column and a balance column. So, the transformation of the above should return something like this:
[('2009-10-01 15:00:30', 'deposit', 0, 200, 950), ('2009-10-05 10:00:23', 'check payed', 450, 0, 500)]
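To make the transformation concrete, here is a minimal Python sketch of it. The function name `to_statement_rows` is made up for illustration, and I’m assuming that the entries arrive in chronological order, that the balance returned by the service is the balance after the last entry, and that timestamps are rendered in UTC:

```python
from datetime import datetime, timezone


def to_statement_rows(balance, entries):
    """Turn (final balance, [(unix_ts, description, amount), ...]) into
    rows of (date, description, debit, credit, running balance)."""
    rows = []
    running = balance
    # The service only reports the final balance, so walk the entries
    # backwards and undo each amount to recover the earlier balances.
    for timestamp, description, amount in reversed(entries):
        date = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        debit = -amount if amount < 0 else 0
        credit = amount if amount > 0 else 0
        rows.append((date.strftime('%Y-%m-%d %H:%M:%S'),
                     description, debit, credit, running))
        running -= amount
    rows.reverse()  # back to chronological order
    return rows
```

Fed the service response above, this gives the same debit, credit and balance columns as the rows shown; the dates depend on the actual timestamps, since the ones in my example are just placeholders.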
If you have a RIA web framework you can do all of this in the client browser, including sorting by date and a live search over the descriptions. As for whether the above list should be created on the web server or on the client, I think it should be on the web server. You should not expose the inner workings of your core services or applications; besides, we can reuse the code if, for example, exporting to XLS can only be done on the web server. In either case, the client browser never directly contacts the accountBalanceEntries service.
If you don’t have a RIA framework, it must be done on the web servers, as in pre-2.0 apps. I hope to clarify more on this in the next post, with the test application requirements.