Network Infrastructure v.1.0 - Building Redundancy

Around the end of 2005, we redesigned our minimalistic single-server setup to host some 5-6 websites, a team CMS server, an email server, an SVN code server and a home-brew Perl application we use for order processing. The "as-is" situation is shown below.

Important aspects of this configuration
  1. Web services are split across a couple of machines, so if one machine goes down, not all sites go offline.
  2. Non-mission-critical sites (e.g. the team collaboration site) run on a separate machine, so the production servers are fully available for public sites.
  3. The mail server runs on its own machine. It runs spam and anti-virus filters, which are memory hungry, so any load peaks affect only mail services.
  4. Machines placed behind FW2 are fairly well protected because A) a firewall allows access only on the specific ports where services are running, and B) services are reachable on non-standard ports, since port forwarding is set up on FW2.
  5. Servers were configured for local backups only (usually a combination of tar/gzip and a timestamp).
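
The local backup scheme in item 5 can be sketched roughly as follows. This is a minimal illustration in Python, not the scripts actually used on these servers. The function name make_local_backup, the archive naming pattern and the directory layout are all assumptions made for the example.

```python
import tarfile
import time
from pathlib import Path


def make_local_backup(source_dir, backup_dir, now=None):
    """Write a gzip-compressed tar archive of source_dir into backup_dir.

    The archive name carries a timestamp, e.g. site-20051231-235959.tar.gz.
    The function name and the naming pattern are hypothetical.
    """
    source = Path(source_dir).resolve()
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Use the caller's time if given, otherwise the current local time.
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%Y%m%d-%H%M%S", now)
    archive = target_dir / f"{source.name}-{stamp}.tar.gz"

    # Mode "w:gz" creates the tar archive and gzips it in one pass.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    return archive
```

An archive written this way lives on the same machine as the data it protects, so it helps against accidental deletion but not against losing the disk.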

Advantages of v.1.0 Architecture
  1. We got a decent level of "horizontal redundancy", as services were well spread across different machines. If one of the machines went down, the other sites and services stayed up.
Disadvantages or things still lacking in v.1.0
  1. If a machine went down, all its services were out. In the worst case the disks fail and all data is lost. The local backups we are doing would be useless for recovery, since they sit on the same machine.
  2. Services like mail, web and database were installed multiple times, once on each root server. This limits scalability, because the maintenance effort grows with every machine added.
Issues to be Addressed in v.2.0

Two majorly damaging but avoidable scenarios remain unaddressed:
  • Total loss of services if a data center, or specifically the root server, goes down
  • Total loss of data if there is hardware damage on the root servers
So we need to look at improvements.

The improvements aimed for in v.2.0 are:
  1. Remote backups on geographically separated machines. It is desirable to have a protocol or policy (including a schedule and checklist) for performing backups.
  2. Is it a good idea to build vertical redundancy? This would mean running each type of service on a separate machine and providing a fallback for it. So all websites would be hosted on a single server, backed by an exact replica/mirror (consider the implications for version management of production-release code). Similarly for database, email, SVN and so on.
  3. What are the cost implications of full vertical redundancy? Would a hybrid of horizontal and vertical redundancy be more economical?
  4. Can static content like images be moved to a separate server?
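
To make the cost question in items 2 and 3 a little more concrete, here is a rough sketch that counts machines under full vertical redundancy versus a hybrid. The service types are the ones named in item 2. Everything else is an illustrative assumption rather than a costing of our actual setup: that a mirrored service needs exactly one primary and one replica, that the remaining services can share a single machine, and that web and database are the critical ones.

```python
def machines_full_vertical(services):
    """Full vertical redundancy: one primary plus one mirror per service type."""
    return 2 * len(set(services))


def machines_hybrid(services, critical, shared_hosts=1):
    """Hybrid: mirror only the critical services, pack the rest onto shared hosts."""
    unknown = set(critical) - set(services)
    if unknown:
        raise ValueError(f"unknown services: {sorted(unknown)}")
    others = set(services) - set(critical)
    # Shared hosts are only needed if some services are left unmirrored.
    return 2 * len(set(critical)) + (shared_hosts if others else 0)


services = ["web", "database", "email", "svn"]
critical = ["web", "database"]  # assumption: which services get a mirror

print(machines_full_vertical(services))     # 8
print(machines_hybrid(services, critical))  # 5
```

Under these assumptions the hybrid needs 5 machines instead of 8, at the price of leaving email and SVN without a fallback. Real costs would also depend on machine sizing and on the effort of keeping mirrors in sync.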