Living in a Cluster
The Broadchoice Collaboration Platform (BCP) is deployed on a cluster of servers and this has a number of interesting implications for the design of the application as well as the actual deployment process and file system structure used.
I'll be posting several entries on clustering considerations but I wanted to start with something that surfaced with the recent launch of this blog. BlogCFC is part of our standard SVN repository (I'll also be blogging about our source code control processes) and so it is also deployed to the same cluster as the BCP. Each server runs the same code from its own local file system. BlogCFC allows authors to upload images and files that are used as part of the blog entries. If you simply deploy BlogCFC onto each server in the cluster and then create a blog entry and upload an image, that image will be stored directly on the file system of the server on which your request is processed (in fact, the server on which your entire session is processed, since we use "sticky sessions"). The other servers in the cluster won't know about the image. For user requests that come in to that first server, the image will be served correctly. For user requests that come in to the other servers, the image will be missing.
The BCP has the same issue - it allows authors to upload CSS, images and documents - but we have to ensure that all these uploaded assets are available to all servers automatically and immediately. Our approach was to design the application in such a way that shared assets are stored in specific directory trees that contain nothing but shared assets. We have a NAS - Network Attached Storage - which is mounted to every server as /var/www/html. That contains a documents directory and a custom assets directory - into which all uploaded content is placed. Symbolic links are used to "map" those shared directories to the appropriate place in the deployment directory tree:
/var/www/production/lib -> /var/www/html/lib
/var/www/production/wwwportal/custom -> /var/www/html/custom
We deploy our source tree to /var/www/production (direct from SVN) with lib and custom ignored by SVN. Each server then shares the same uploaded content, and the file paths are the same as they would be on a non-clustered server.
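As a rough illustration of the symbolic link setup described above, here is a minimal Python sketch. The function name link_shared_dirs and its mapping argument are made up for this example and are not part of our deployment tooling. It assumes the shared storage is already mounted on the server.

```python
import os
from pathlib import Path


def link_shared_dirs(deploy_root, shared_root, mapping):
    """Create symlinks in the deployment tree that point at shared directories.

    mapping is {path relative to deploy_root: directory name under shared_root}.
    Returns the list of links that were created or repointed.
    (Hypothetical helper, for illustration only.)
    """
    created = []
    for rel_link, shared_name in mapping.items():
        target = Path(shared_root) / shared_name
        link = Path(deploy_root) / rel_link
        target.mkdir(parents=True, exist_ok=True)
        link.parent.mkdir(parents=True, exist_ok=True)
        if link.is_symlink():
            if Path(os.readlink(link)) == target:
                continue  # already points at the shared directory
            link.unlink()  # points somewhere else, so repoint it
        elif link.exists():
            # Refuse to clobber a real directory that may hold uploads.
            raise FileExistsError(f"{link} exists and is not a symlink")
        link.symlink_to(target, target_is_directory=True)
        created.append(link)
    return created


# Example call mirroring the two links listed above (not run here):
# link_shared_dirs("/var/www/production", "/var/www/html",
#                  {"lib": "lib", "wwwportal/custom": "custom"})
```

Running something like this on every server after each deployment keeps the links in place even if the deployment tree is recreated. The function does nothing when the links already exist.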
Tonight, I applied the same fix to BlogCFC, adding blog_images and blog_enclosures directories on the NAS and adding symbolic links back into the BlogCFC deployment tree (as wwwblog/images and wwwblog/enclosures respectively). We have not yet dealt with making blog.init.cfm cluster-safe or the XML file generated by the pod manager. For both of those, we actually keep the files under SVN and handle changes as part of a managed process (i.e., by tickets in Trac).
This is just one of many things that need to be considered when designing applications for clustered environments. I'll be blogging about other clustering issues over the next few weeks.


That being the case, file size doesn't have too much effect on storage and retrieval. What you have to worry about is the speed of transmission between the DB and web server. However, your applications could do some local caching to help speed things up. In effect, you would treat files the same way CF treats CLIENT vars. If BlogCFC needs an image it doesn't have on a particular node, then it pulls it down from the DB and caches it locally for the next time. The nice benefit of keeping everything in one place (the DB) is that you only need a single backup and replication strategy.
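The pull-and-cache idea in the comment above could look roughly like the following Python sketch. It uses SQLite as a stand-in for the shared database. The CachedFileStore class, the files table, and the flat file names are all assumptions made for this example, not anything BlogCFC or ColdFusion provides.

```python
import sqlite3
from pathlib import Path


class CachedFileStore:
    """Files live in a shared database; each node keeps a local disk cache.

    Hypothetical sketch: assumes flat file names (no subdirectories).
    """

    def __init__(self, conn, cache_dir):
        self.conn = conn
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(name TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )

    def put(self, name, data):
        """Store the file in the database, the single source of truth."""
        self.conn.execute(
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
            (name, data),
        )
        self.conn.commit()
        # Drop any stale copy on this node. Other nodes are NOT notified,
        # so a real system would need versioning or a cache expiry policy.
        (self.cache_dir / name).unlink(missing_ok=True)

    def get_path(self, name):
        """Return a local path for the file, fetching it on a cache miss."""
        local = self.cache_dir / name
        if local.exists():
            return local  # cache hit: no database round trip
        row = self.conn.execute(
            "SELECT data FROM files WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(name)
        local.write_bytes(row[0])
        return local
```

Each node would create its own CachedFileStore with the shared database connection and a node-local cache directory. The first request for an image on a node pays the database transfer cost, and later requests are served from local disk.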
For user uploads (of which there are very few that need to be mirrored) we always have the process FTP to both. So if a user is adding a PDF to the site's library, the upload process will, regardless of the server your session is connected to, FTP the file to the same location in both places. We have error handling that will whack the file from the "A" server if the upload to "B" fails for any reason, so there is less chance of orphaned files living out there.
This works well for our two-machine cluster (though we're going to be expanding in the near future, I think), but it probably won't scale terribly well with larger clusters. But it's been fabulous for us thus far (absolutely zero issues in almost a year of production use).
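The write-to-both-then-roll-back behaviour described in this comment can be sketched as follows. This is only an illustration in Python. Local directories stand in for the FTP destinations, and DirTarget and mirror_upload are hypothetical names, not the commenter's actual code.

```python
from pathlib import Path


class DirTarget:
    """Stand-in for one server's upload location (a real one would use FTP)."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, name, data):
        (self.root / name).write_bytes(data)

    def delete(self, name):
        (self.root / name).unlink(missing_ok=True)


def mirror_upload(targets, name, data):
    """Write the file to every target, undoing earlier writes on failure."""
    done = []
    try:
        for target in targets:
            target.put(name, data)
            done.append(target)
    except Exception:
        # Remove the copies that did succeed so no orphaned file is left
        # on "A" when the upload to "B" fails, then report the error.
        for target in done:
            target.delete(name)
        raise
```

Because every upload writes to every target in turn, the cost grows with the number of servers, which is the scaling concern for larger clusters.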
Keeping files outside of the database keeps the database small and very efficient: the indexes are small and queries load faster because we only need to load metadata. It also makes it much easier to manage file grouping, sharing, access permissions, etc. We also use a Google Search Appliance (GSA) to crawl the content for search purposes. Having files grouped by account etc. in the file system makes GSA configuration easier (what to crawl, what not to crawl, etc.). With all file content in database tables, it would be very hard to achieve all of this.
Granted, there have been some issues with S3 in the past (it was down for about half a day last month), but overall it's great. We currently have upwards of 50GB of user photos stored on S3 and would not want to have to deal with managing and backing that up ourselves.
If I'm reading the note correctly, it sounds like you use SESSION in your apps rather than CLIENT. Do you run into any issues with sessions across the cluster and/or with AOL users? Years ago, once we were up to 3 or 4 servers in our production CF cluster, we went to all CLIENT vars for browser persistence, simply because a 'down' on one server would smoothly handle transition of a user from one server to the next with no loss of data or work. If I open a form on A, spend 10 minutes filling it out during which A locks up, and then submit the form, the hardware load balancer will automatically drop my post request to B (or C or D ...). At that point the app tries to validate, and, since the CLIENT structs are in the database and validated against the browser tokens, the form still posts fine on B and the user never knows that the server they were just on went off-line. No need to log back in and figure out where you were. I bring up AOL users simply because we had a number of public, non-tech sites we hosted and AOL used to (maybe still does) reserve the right to change a user's IP address during a session, which would totally kill things like SESSION-based shopping carts, etc. Again, use of CLIENT vars completely solved that one as well.
Thanks
Small world - I think I may have worked with you in the past on projects at Marketron. Now you are working with all my ColdFusion heroes :)
It's good to see several ex-Marketron folks finally come to appreciate the technology that upper management wanted to migrate away from...
This is using Sun Cluster global storage, attaching LUNs to a group of computers, which is a similar approach to NAS.
The one thing I find lacking in your post is a definition of the word "cluster". Above I gave one example of a cluster solution, and of course there is the ColdFusion approach to clustering. "Cluster" is to scalable computing what Kleenex is to the consumer - no longer a brand name but a general term for an object. But when it comes down to it, "cluster" needs a little more definition sometimes. :)