I was just reading some good presentations about scalability, redundancy and availability in this post, and I had to say something about them.
What I’ll say here is based on my experience running a cluster of about 30 servers that served about 1.5 million email accounts and handled around 12 million emails daily. I have no idea whether that counts as big, but I can say that our system was written in Perl and that we had only crappy servers, except for our NAS storage (NetApp), which was a big hero back then.
Most of what is in those presentations is obvious, but they give a good overview to anyone who has never had a dragon of a system demanding I/O and processing power on their hands.
The biggest issue when you have to scale your system is, without a doubt, money. The second is how good your framework is, along with the application built on it.
Actually I’d say both are big issues, but with money you can use those big monsters that have a lot of processing power, memory and good I/O.
What I can say is that cheap stuff isn’t worth it. As much as you can, avoid dealing with things you’re not familiar with. In other words, let the people who know hardware do their work, and change your application to use the power of the hardware they provide.
For example, I’d never use DAS (direct-attached storage) again, because you have to take care of everything yourself, while a good NAS server like a NetApp can do the whole job for you.
I would never use LVM again, especially with XFS. I lost 2 years of my life to this junk software. It’s only for things you can lose without breaking your company.
What I think is useful and could help when scaling without much money to spend:
- MySQL replication with a small cluster set up
- LVS (Linux Virtual Server), which does a great job at load balancing
- An application that can use slave databases for reads and write only to the master server
- A mix of static content that looks dynamic
- A good monitoring and provisioning system to warn you before the load rises too much
- A hot/cold caching system, where old and rarely accessed data can be stored on cheaper storage
- A good database design (“index everything” is a big lie; index what needs to be indexed)
- Don’t use ORDER BY RAND() on MySQL (learnt that a few days ago)
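To illustrate the ORDER BY RAND() item: that query has to generate a random value for every row and sort the whole table just to return one row. A common workaround is to pick a random id in the application and let the primary-key index find the nearest row. The sketch below is only an illustration under assumptions: the `accounts` table, its columns and the `random_account` helper are made up, and it uses Python's built-in sqlite3 as a stand-in for MySQL so it can run anywhere.

```python
import random
import sqlite3


def random_account(conn):
    """Return one pseudo-random row without sorting the whole table."""
    # Hypothetical table: accounts(id INTEGER PRIMARY KEY, email TEXT)
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM accounts").fetchone()
    if lo is None:  # empty table
        return None
    probe = random.randint(lo, hi)
    # The primary-key index locates the first id >= probe directly,
    # so only one row is read instead of the full table being sorted.
    return conn.execute(
        "SELECT id, email FROM accounts WHERE id >= ? ORDER BY id LIMIT 1",
        (probe,),
    ).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO accounts (id, email) VALUES (?, ?)",
        [(1, "a@example.com"), (2, "b@example.com"), (7, "c@example.com")],
    )
    print(random_account(conn))
```

The trade-off is that gaps in the id sequence skew the result: in the sample data the row with id 7 is returned more often than the others, because every probe from 3 to 7 lands on it. For many uses that is acceptable; if it is not, a dense lookup column is one option.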
This is a small list of what I can remember right now. The point is that you’ll reach a moment where changing your application doesn’t fix performance issues anymore; that’s when money and good hardware are needed.
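As an example of the kind of application change meant in the list above, here is a rough sketch of sending reads to slave databases and writes to the master. It is not taken from the system described here: the `ReplicaRouter` class, its method names and the host names are all hypothetical, and it only decides which server a statement should go to, leaving the actual connection handling out.

```python
import itertools


class ReplicaRouter:
    """Choose a server for each SQL statement (hypothetical helper)."""

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves round-robin; fall back to the master if none
        self._slaves = itertools.cycle(slaves) if slaves else None

    def pick(self, sql, force_master=False):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        # Only plain SELECTs go to a slave; everything else is treated as a write
        if verb == "SELECT" and self._slaves is not None and not force_master:
            return next(self._slaves)
        return self.master


if __name__ == "__main__":
    router = ReplicaRouter("db-master", ["db-slave1", "db-slave2"])
    print(router.pick("SELECT * FROM accounts WHERE id = 1"))
    print(router.pick("UPDATE accounts SET email = 'x' WHERE id = 1"))
```

The `force_master` flag is there because replication is asynchronous: a read that must see a write made a moment ago (or a `SELECT ... FOR UPDATE`) should still go to the master.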