This is my first blog post, and I am still learning how to write one, but I am giving it a try. So fasten your seat belt and get ready for a roller-coaster ride through some of the questions that pop into our minds from time to time.
Today's question is “What database does Facebook actually use?”
A billion people are using Facebook. Users express themselves and interact with their peers and friends through wall posts, photo uploads, event details and other meaningful information, and for that reason Facebook needs a large, scalable database.
I imagine that is why this question is such a popular Google search. 🙂
I have searched a lot on this topic and come to the conclusion that Facebook uses several database technologies. The challenge for Facebook's engineers has been to keep the site up and running smoothly while handling close to a billion active users.
This article takes a look at some of the software and techniques they use to accomplish their mission.
Facebook primarily uses MySQL to store structured data such as wall posts, user information and the timeline. This data is replicated between its various data centers.
It is also important to note that Facebook makes heavy use of Memcached, a memory caching system that speeds up dynamic, database-driven websites by caching data and objects in RAM to reduce read time. Memcached is Facebook's primary form of caching and greatly reduces the database load. Having a caching system allows Facebook to be as fast as it is at recalling your data.
If your data is already in the cache, Facebook does not have to go to the database at all; it just fetches the data from the cache based on your user ID.
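To make the "check the cache first, go to the database only on a miss" idea concrete, here is a minimal sketch in Python. It is not Facebook's code: a plain dict stands in for Memcached, another dict stands in for MySQL, and the names `UserStore`, `get_user` and the `user:<id>` key format are made up for illustration.

```python
class UserStore:
    """Toy cache-in-front-of-database lookup (hypothetical names throughout)."""

    def __init__(self, database):
        self.database = database  # stand-in for MySQL: user_id -> row
        self.cache = {}           # stand-in for Memcached: key -> row
        self.db_reads = 0         # counts how often we had to hit the "database"

    def get_user(self, user_id):
        key = f"user:{user_id}"   # assumed key format, keyed on the user ID
        if key in self.cache:
            # Cache hit: no database access needed.
            return self.cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        row = self.database.get(user_id)
        if row is not None:
            self.cache[key] = row
        return row


store = UserStore({42: {"name": "Alice"}})
store.get_user(42)  # first call misses the cache and reads the database
store.get_user(42)  # second call is served from the cache
print(store.db_reads)
```

The second lookup never touches the database, which is exactly why a cache like this takes so much load off MySQL. A real setup would also need to expire or invalidate cached entries when the underlying data changes, which this sketch leaves out.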
The Photos application is one of Facebook's most popular features. To date, users have uploaded over 15 billion photos, which makes Facebook the biggest photo-sharing website. For each uploaded photo, Facebook generates and stores four images of different sizes, which translates to a total of 60 billion images and 1.5PB of storage. The current growth rate is 220 million new photos per week, which translates to 25TB of additional storage consumed weekly.
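The numbers above can be checked with a few lines of Python. This is only back-of-the-envelope arithmetic on the figures quoted in this post, and it assumes decimal units (1PB = 10^15 bytes, 1TB = 10^12 bytes, 1KB = 1000 bytes); the variable names are my own.

```python
PHOTOS_UPLOADED = 15_000_000_000        # 15 billion photos
SIZES_PER_PHOTO = 4                     # four stored images per photo
TOTAL_STORAGE_BYTES = 1_500_000_000_000_000   # 1.5PB, decimal units assumed

total_images = PHOTOS_UPLOADED * SIZES_PER_PHOTO
avg_image_kb = TOTAL_STORAGE_BYTES / total_images / 1000

NEW_PHOTOS_PER_WEEK = 220_000_000       # 220 million new photos per week
WEEKLY_STORAGE_BYTES = 25_000_000_000_000     # 25TB, decimal units assumed

weekly_images = NEW_PHOTOS_PER_WEEK * SIZES_PER_PHOTO
weekly_avg_image_kb = WEEKLY_STORAGE_BYTES / weekly_images / 1000

print(f"total stored images: {total_images:,}")
print(f"average stored image size: {avg_image_kb:.1f} KB")
print(f"new stored images per week: {weekly_images:,}")
print(f"average size of a new stored image: {weekly_avg_image_kb:.1f} KB")
```

So 15 billion photos times four sizes really does give 60 billion images, and both the total and the weekly figures work out to an average of roughly 25 to 28KB per stored image, which means the quoted numbers are consistent with each other.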
To serve these photos, Facebook implements an HTTP-based photo server that stores photos in a generic object store called Haystack.
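If "generic object store" sounds abstract, the sketch below shows the general idea in Python: many blobs are appended to one big file, and a small in-memory index remembers where each one lives. This is only an illustration of the concept under my own assumptions, not Haystack's actual on-disk format or API; `TinyObjectStore`, `put` and `get` are hypothetical names, and an in-memory `io.BytesIO` stands in for the large file on disk.

```python
import io


class TinyObjectStore:
    """Toy object store: one append-only volume plus an in-memory index."""

    def __init__(self):
        self._volume = io.BytesIO()  # stand-in for one large file on disk
        self._index = {}             # key -> (offset, size) inside the volume

    def put(self, key, data):
        # Always append at the end of the volume and record where the blob starts.
        self._volume.seek(0, io.SEEK_END)
        offset = self._volume.tell()
        self._volume.write(data)
        self._index[key] = (offset, len(data))

    def get(self, key):
        # One index lookup, then one seek and one read; raises KeyError if missing.
        offset, size = self._index[key]
        self._volume.seek(offset)
        return self._volume.read(size)


photos = TinyObjectStore()
photos.put("photo-1-small", b"\x89fake-image-bytes")
photos.put("photo-1-large", b"\x89more-fake-image-bytes")
print(photos.get("photo-1-small"))
```

The point of packing many photos into one file is that reading a photo does not require a separate file-system lookup per image, which matters when you are storing tens of billions of them.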
Apache Cassandra is a distributed database built for scalability and high availability without compromising performance. Facebook uses it for its Inbox search.
Scribe is a flexible logging system that Facebook uses for a multitude of purposes internally. It has been built to handle logging at the scale of Facebook, and it automatically handles new logging categories as they show up.
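The "new categories as they show up" part is easy to picture with a toy example. The Python sketch below is not Scribe and does not use its API; it only shows the idea that a log entry is a category plus a message, and that a category nobody configured in advance simply gets its own bucket the first time it appears. `TinyLogCollector` and its methods are names I invented for this illustration.

```python
from collections import defaultdict


class TinyLogCollector:
    """Toy log collector that creates categories on first use."""

    def __init__(self):
        # defaultdict creates an empty list the first time a category is seen.
        self._by_category = defaultdict(list)

    def log(self, category, message):
        self._by_category[category].append(message)

    def categories(self):
        return sorted(self._by_category)

    def messages(self, category):
        # Use .get so that asking about an unknown category does not create it.
        return list(self._by_category.get(category, []))


collector = TinyLogCollector()
collector.log("web_errors", "timeout while loading profile")
collector.log("photo_uploads", "photo stored")  # brand-new category, no setup needed
collector.log("web_errors", "cache miss storm")
print(collector.categories())
```

Nothing had to be declared before the first "photo_uploads" message arrived, which is the property that makes this style of logging convenient when many teams keep adding new kinds of logs.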
Varnish is an HTTP accelerator that can act as a load balancer and also cache content, which can then be served lightning-fast. Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day.
HIPHOP FOR PHP ::
HipHop for PHP is a set of PHP execution engines. HipHop was developed by Facebook and was released as open source in early 2010. To date, Facebook has achieved more than a 6x reduction in CPU utilization for the site using HipHop as compared with Apache and Zend PHP. Facebook is able to move fast and maintain a high number of engineers who are able to work across the entire codebase.
So, while “What database does Facebook use?” seems like a simple question, you can see that Facebook's developers have added a variety of other systems around the database to make the site truly web-scale for close to a billion users.