- Scale horizontally (scale out): Separate system tiers into different environments (database, Solr, memcached, push, web/app servers). Separation lets you scale each tier up or down independently.
- Use a cloud Platform as a Service from the start; it makes scaling out, up, and down easier (Google App Engine, Heroku, AWS Elastic Beanstalk, …)
- Monitor usage on each tier so you can scale up or down at the right time (e.g. New Relic)
- Use push instead of having clients poll the server (e.g. Pusher)
- Don’t use the local filesystem for storage; use a distributed store that every server can reach (e.g. AWS S3)
- Don’t involve your app server in long requests/responses. Slow clients may block your server and cause longer request queues (depends on implementation).
- To receive an upload, have the client send it directly to S3 rather than through your app server; this takes some work on the client side.
- To send a huge response, either stream it using your app server’s streaming capability, or generate it in a background job that stores the result on S3 and then sends the user a direct link, through the app using push or by email.
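
As a rough illustration of the streaming option in the last point, here is a minimal sketch using only Python's standard WSGI conventions. The names `generate_rows` and `app`, and the CSV payload, are hypothetical and chosen for illustration; your framework will likely offer its own streaming helper. The idea is that the body is produced by a generator, so the full response is never held in memory at once.

```python
from typing import Callable, Iterable, Iterator


def generate_rows(n: int) -> Iterator[bytes]:
    """Yield a large CSV one line at a time instead of building it in memory.

    Hypothetical payload: an id column and its square.
    """
    yield b"id,value\n"
    for i in range(n):
        yield f"{i},{i * i}\n".encode("utf-8")


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI application that returns a lazy iterable as the response body."""
    # No Content-Length header: the total size is not known in advance.
    start_response("200 OK", [("Content-Type", "text/csv; charset=utf-8")])
    # The server pulls chunks from this generator as it writes to the client.
    return generate_rows(1_000_000)
```

Any WSGI server can serve `app`, for example the standard library's `wsgiref.simple_server.make_server`. Whether chunks actually reach the client as they are produced depends on the server and on any buffering proxy in front of it, and a slow client can still tie up a worker for the duration of the stream, so the background-job route may be the safer choice for very large outputs.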