IIT Madras “Server” Uptimes
April 13, 2008
Several years ago, my idea of a ‘server’ was a computer with amazing hardware (that of a fairly affordable workstation 4 years from now) that would open up a black and white terminal with some completely incomprehensible prompt. By my 11th standard, I could ‘understand’ what that prompt meant, after sufficient experience with the pseudo-ttys in Linux. But I was still under the misconception that a server was a computer with amazing hardware. Soon I realised that my system could become a server! All it needed to do was run a daemon that did some job and could accept connections over the network. So a ‘server’ was some system with enough resources to cater to the demands of all its clients, running a daemon, with amazingly tight security and huge uptimes of the order of years.
However, my entry into IIT Madras required a further redefinition of ‘server’. A ‘server’ is now defined as follows:
A server is a machine with amazing hardware resources which, no matter how amazing, are exploited in all possible ways by the student community; with the most well-known security holes left unpatched (because it is never updated :-P); and with an uptime lower than that of the average workstation in the hostel zone.
What do the uptimes of these ‘servers’ look like? Let’s see a few examples:
- The Central Web Server: 11 days
- The “Updated” Central Web Server: 22 days - This one is called ‘updated’ because it has PHP5, while the central web server is still on some ancient version of PHP4. It runs CentOS kernel 2.6.18-8.el5 (a version well known to be susceptible to the vmsplice local root exploit, unless it has been patched).
- The Institute Mail Server (smail.iitm.ac.in - so called because it runs SquirrelMail): Probably about 3 days. This server handles the institute’s email. Important email, including mail from the Dean of Students, IIT Madras to the students, is sent to accounts on this server. All the students of the institute use this server for institute-related email as well. Interestingly, the hard disk of ftp.iitm.ac.in, the local Debian / Ubuntu / Kernel mirror, is on this server - so down goes the repository, used India-wide, every 3 days.
- The Students’ Server (students.iitm.ac.in) - This one has a whopping uptime of 450+ days. And why? It is student managed!!
- The Physics Dept. DCF server: 142 days now, but not long ago this one had been up for about a year. And why? No CC-interference. Managed by the Physics Dept. - and that too by brilliant research scholars and professors. Runs CentOS release 4.3, kernel 2.6.9-34: looks safe. At least here, your mail to the professor about why the server should be updated, if it needs to be, will be well received and not marked as spam.
- Random Workstation in the DCF: 8 days now, was 46 days once upon a time. This one’s one of my favourite comps there. Has TWM, runs Fedora and NOT Ubuntu :-D. This is a workstation, so I don’t care if it is not updated.
- My Computer: 15 days. Better than the Central Web ‘Server’. I run a web server with a lot of shared stuff, so I like to keep this thing on. I try to save power by switching off my monitor, but that does not save much. I have just been too lazy to set it to auto-hibernate at 1:00 AM and auto-wake at 7:00 AM. Apparently, the institute spends a whole fortune on power for the hostels, just because of comps like mine. Worse, some people leave their monitors on. This one runs Debian (testing), but is still on a vulnerable, outdated kernel. (Too lazy to update, despite friends threatening to try the local root exploit - because only good friends have logins.)
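Numbers like these come from logging in and running `uptime` on each box. For collecting them in a script instead, here is a minimal Python sketch. It assumes a Linux machine, where the first field of `/proc/uptime` is the number of seconds since boot; the function names are made up for illustration.

```python
from datetime import timedelta


def parse_proc_uptime(text: str) -> float:
    """Return seconds since boot from the contents of /proc/uptime.

    The file holds two numbers: seconds since boot, then idle seconds
    summed over all CPUs. Only the first is needed here.
    """
    return float(text.split()[0])


def format_uptime(seconds: float) -> str:
    """Render a duration as 'N days, H:MM', roughly like uptime(1)."""
    delta = timedelta(seconds=int(seconds))
    hours, remainder = divmod(delta.seconds, 3600)
    minutes = remainder // 60
    return f"{delta.days} days, {hours}:{minutes:02d}"


def read_uptime(path: str = "/proc/uptime") -> str:
    """Read and format the uptime of the local machine (Linux only)."""
    with open(path) as f:
        return format_uptime(parse_proc_uptime(f.read()))
```

On a Linux box, `print(read_uptime())` gives the local figure; the two helper functions work on any platform, since they only handle text and numbers.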
Interestingly, I find ‘corporate’ web ‘servers’ with similar uptimes: 6 days, 11 hours on one such server that I have access to. It runs Debian 3.1, but it is overwhelmed, with load averages above 9. I wonder how the processor hasn’t been fried yet. Kernel = 2.4.32 - hopefully safe.
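A load average is easier to judge against the number of cores. It counts tasks that are running or waiting to run (on Linux, also tasks stuck in uninterruptible I/O), so 9 on a single core means a long queue, while 9 on sixteen cores is light. The post does not say how many cores that server has, so the sketch below takes the count as a parameter. It is only an illustration: the function names are hypothetical, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os


def load_per_core(load_average: float, cpu_count: int) -> float:
    """Return the load average divided by the number of cores.

    Values well above 1.0 suggest more runnable work than the
    machine can service at once.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def current_load_per_core() -> float:
    """Per-core 1-minute load average of the local machine (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    # os.cpu_count() may return None if the count cannot be determined.
    return load_per_core(one_minute, os.cpu_count() or 1)
```

For example, `load_per_core(9.0, 1)` is 9.0, while `load_per_core(9.0, 4)` is 2.25.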
April 13, 2008 at 12:57 am
So, you actually did some searching on the thing we were farting about. I still can’t believe our insti-managed stuff is this bad. I mean, 11 days is too low. Are you sure about that?
April 13, 2008 at 1:09 am
Logged in and checked.
April 17, 2008 at 9:09 pm
Kumar forwarded a mail from Prof. P. Sriram.
Some servers, including the SMail server, were experiencing hardware failures very frequently - and he and Mr. Sreekumar from the CC could not trace the cause - the cooling was fine and so was the power. They have relocated the servers now, and the servers are up again. I might’ve been unfairly harsh on the CC in the post, and that harshness was not deserved - despite his busy schedule, Prof. Sriram replies to our email.
April 23, 2008 at 8:33 am
Dear Akarsh,
Thanks for your kind comments on the Physics DCF server. We did make an effort to be independent of the CC, though there are things which still remain beyond our control (nameservers, campus networking). I am personally not big on huge uptimes on machines - my experience has been that, if and when a machine crashes, there tend to be serious (usually hardware) problems. The research scholars are thrilled to get uptimes of the order of a year or so for the servers, while I recommend that we reboot, run file system checks etc., at the very least, once in three months.
Suresh
April 23, 2008 at 9:14 am
Dear Sir,
Any particular reason for rebooting? Probably we could schedule the filesystem checkers using cron, to run once every day or something of that sort. What kind of hardware problems are normally encountered - i.e., are they disk failures, motherboard failures etc? How does long uptime cause this?