Earlier this week I published a data visualization on the Facebook Engineering blog which, to my surprise, has received a lot of media coverage.
I’ve received a lot of comments about the image, many asking for more details on how I created it. When I tell people I used R, the reaction I get is roughly what I would expect if I told them I made it with Microsoft Paint and a bottle of Jägermeister. Some people even questioned whether it was actually done in R. The truth is, aside from the addition of the logo and date text, the image was produced entirely with about 150 lines of R code with no external dependencies. In the process I learned a few things about creating nice-looking graphs in R.
Transparency and Faking It
My first attempt at plotting the data involved plotting very transparent lines. Unfortunately there was just too much data to get a meaningful plot — even at very low opacity, there were enough lines to make the entire image just a bright blob. When I increased the transparency more, the opacity was rounded down to zero by my graphics device and the result was that nothing was drawn.
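To illustrate the rounding problem, here is a minimal sketch. It assumes colors are stored with an 8-bit alpha channel, which is how base R encodes them; the post only says the graphics device did the rounding, so treat this as an analogy rather than the exact mechanism. The opacity values are made up for the example.

```r
# Encode a blue line color at several opacities (hypothetical values)
# and read back the alpha channel that R actually stores.
alphas <- c(0.5, 0.05, 0.005, 0.0001)
cols <- rgb(0, 0, 1, alpha = alphas)

# col2rgb(..., alpha = TRUE) returns a matrix with rows red, green, blue, alpha,
# each on a 0-255 integer scale.
stored_alpha <- unname(col2rgb(cols, alpha = TRUE)["alpha", ])

alpha_table <- data.frame(requested = alphas, hex = cols, stored_alpha = stored_alpha)
print(alpha_table)
```

With only 256 alpha levels available, a small enough requested opacity is stored as 0, and a line drawn in that color leaves no mark at all.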
The solution was to manipulate the drawing order of the lines. I used a simple loop over my data to draw the lines, so it was easy to control which lines were drawn first using order(). I created an ordering based on the length of the lines, so that longer lines were drawn “behind” the shorter, more local lines. Then I used colorRampPalette() to generate a color palette from black to blue to white, and colored the lines according to the order in which they were drawn.
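The ordering trick can be sketched roughly as follows. This is not the original code: the toy data frame segs, its column names, and the helper draw_segments() are all hypothetical, and straight segments stand in for the real arcs.

```r
# Hypothetical toy data: each row is one line in projected (x, y) coordinates.
set.seed(1)
n <- 200
segs <- data.frame(
  x0 = runif(n, -180, 180), y0 = runif(n, -60, 70),
  x1 = runif(n, -180, 180), y1 = runif(n, -60, 70)
)

# Length of each line on the projection (plain Euclidean distance).
segs$len <- sqrt((segs$x1 - segs$x0)^2 + (segs$y1 - segs$y0)^2)

# Longest first, so that shorter lines are painted on top of longer ones.
draw_order <- order(segs$len, decreasing = TRUE)

# One color per line: black for the first line drawn, through blue,
# to white for the last line drawn.
pal <- colorRampPalette(c("black", "blue", "white"))(n)

# Draw the lines one at a time in the chosen order (hypothetical helper).
draw_segments <- function(segs, draw_order, pal) {
  par(bg = "black", mar = c(0, 0, 0, 0))
  plot(NA, xlim = c(-180, 180), ylim = c(-90, 90),
       axes = FALSE, xlab = "", ylab = "")
  for (k in seq_along(draw_order)) {
    i <- draw_order[k]
    lines(c(segs$x0[i], segs$x1[i]), c(segs$y0[i], segs$y1[i]), col = pal[k])
  }
}
```

Calling draw_segments(segs, draw_order, pal) renders the toy data. Because the color index follows the drawing order, the long lines end up dark and underneath while the short, local lines end up bright and on top, with no transparency needed.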
Great Circles
I wrote my own code to draw the great circle arcs, although I later found a CRAN package called geosphere that would have done it for me (albeit with rougher lines near the poles). I drew the great circle arcs in a way that was easy to derive but slow to compute. I bisected the lines recursively, finding their great circle midpoint, until they were short enough to resemble an arc. To find the great circle midpoint, I converted from spherical coordinates to Cartesian, found the midpoint, then converted back to spherical coordinates and extended the radius.
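Here is a rough sketch of that recursive bisection, not the original code. All function names are hypothetical, points are c(longitude, latitude) in degrees, and the stopping threshold max_deg and recursion cap max_depth are assumptions. Exactly antipodal endpoints have no unique midpoint, so the sketch stops with an error in that case.

```r
# Convert longitude/latitude in degrees to a unit vector (x, y, z).
to_cartesian <- function(p) {
  lon <- p[1] * pi / 180
  lat <- p[2] * pi / 180
  unname(c(cos(lat) * cos(lon), cos(lat) * sin(lon), sin(lat)))
}

# Convert a 3D vector back to longitude/latitude in degrees.
# Dividing by the vector's length pushes the chord midpoint, which lies
# inside the sphere, back out to the surface ("extending the radius").
to_lonlat <- function(v) {
  len <- sqrt(sum(v^2))
  if (len < 1e-12) stop("antipodal points have no unique midpoint")
  v <- v / len
  c(atan2(v[2], v[1]) * 180 / pi, asin(v[3]) * 180 / pi)
}

great_circle_midpoint <- function(p, q) {
  to_lonlat((to_cartesian(p) + to_cartesian(q)) / 2)
}

# Angle between two points as seen from the sphere's center, in degrees.
angular_sep <- function(p, q) {
  d <- sum(to_cartesian(p) * to_cartesian(q))
  acos(min(1, max(-1, d))) * 180 / pi
}

# Bisect recursively until each piece spans at most max_deg degrees.
# Returns a two-column matrix (lon, lat) of points along the arc.
great_circle_arc <- function(p, q, max_deg = 2, depth = 0, max_depth = 12) {
  if (depth >= max_depth || angular_sep(p, q) <= max_deg) {
    return(unname(rbind(p, q)))
  }
  m <- great_circle_midpoint(p, q)
  left <- great_circle_arc(p, m, max_deg, depth + 1, max_depth)
  right <- great_circle_arc(m, q, max_deg, depth + 1, max_depth)
  rbind(left, right[-1, , drop = FALSE])  # drop the duplicated midpoint
}

# Example: a quarter of the equator.
arc <- great_circle_arc(c(0, 0), c(90, 0))
```

The resulting matrix can be passed straight to lines(). Arcs that cross the antimeridian would still need to be split before plotting on a flat map, which this sketch does not attempt.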
Euclidean Distance
Several observant commenters called me out on using Euclidean distance on the projection for the ordering function. Having the ordering function depend on the distance on the projection seems counterintuitive, as Euclidean distance is wildly distorted near the poles. I accepted this drawback because the exact drawing order wasn’t important, as long as very long lines were drawn below very short ones.
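To get a feel for the distortion, here is a small sketch comparing the two distances. It assumes a plain longitude/latitude plot as the projection (the post does not say which projection the map used) and a mean Earth radius of 6371 km. The function names and the two example pairs are hypothetical.

```r
# Great-circle distance in km via the haversine formula
# (mean Earth radius of 6371 km is an assumption).
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Euclidean distance, in degrees, on a plain lon/lat plot.
projected_dist <- function(lon1, lat1, lon2, lat2) {
  sqrt((lon2 - lon1)^2 + (lat2 - lat1)^2)
}

# Two pairs with the same 40-degree longitude gap:
# one on the equator, one at 80 degrees north.
pairs <- data.frame(
  lon1 = c(0, 0), lat1 = c(0, 80),
  lon2 = c(40, 40), lat2 = c(0, 80)
)
pairs$projected <- with(pairs, projected_dist(lon1, lat1, lon2, lat2))
pairs$great_circle_km <- with(pairs, haversine_km(lon1, lat1, lon2, lat2))
print(pairs)
```

Both pairs look equally long on the flat map, but the high-latitude pair is several times shorter on the globe. Ordering by projected length therefore misplaces some polar lines, which matters little as long as the very long lines still land beneath the very short ones.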


Paul, your social networking graph proves that meaningful information can also carry emotional content. Thanks.
For everybody I have shared your graph with, the first reaction has been emotional. Keep up the good work. I will wait for your next one.
GREAT work you’ve done with the visualization. At Wosju we also work with the Social Graph, but across Social Networks. I would love to have a chat with you about how we together could visualize even more complex data sets. If You’re interested please feel free to contact me.
Happy Christmas, and all the best
Hans Henrik
Well done! your Facebook map is fantastic!!!!
Cheers,
Fernando
This is awesome. Congratulations!
I’m showing it to all my stats friends who aren’t yet enamored with R.
I love this picture. I will probably use it as an example of great R graphics in my teaching. :)
Do Facebook have similar data from throughout its history? It would be pretty cool if it was possible to make a series of such maps (one for each month since Facebook opened, say) and to use them to visualize how the friendship network developed over time (much like Gapminder’s visuals).
Very nice graph, where did you get the data to graph this?
Where did you get the data?
This is great.
Any chance we could talk you into posting the code? And are you interested in turning this into a Facebook app so people can plot a map of their own friend pairs?
Beautiful. I just shared this on Facebook. Well done!!
Hi Paul, this picture is really famous in China recently :)
This is a fabulous representation that combines raw data with mathematical interpretations.
I just read your visualization article via Facebook. OMG! AMAZING!
Terrific map! Can you approximate what percent of connections are between people in different countries? Or (I think equivalently) what fraction of the lines on your map (weighted by line color) cross national borders? Thanks!
Hi! Congratulations for the amazing data processing and geovisualization job you did to get this beautiful map.
I have a question about the weight you associate to your lines. I understand that it depends on both factors : the number of “friendships” between two places and the distance between the places. It seems to me that the message of the map is ambiguous in some way : in the places where there are many people using Facebook, you cannot discern any line. We have just an indicator of Facebook member density. And the only lines that appear thanks to the weight are the long ones that connect very few peoples and could be statistically not significant (you seem to suggest in your article that some of these lines can connect a person to another). Could you give some information about the way you adjust this weight in order to obtain your result.
I thought you could be interested in the sort of “deconstruction” analysis I did of your image-map . As you learn French, you may read it here, if you are interested : http://mondegeonumerique.wordpress.com/2011/01/10/500-millions-damis-la-carte-de-facebook-1-deconstruction. There is also a Google translation page on the right in the case.
I will be very interested in your comment.
Unbelievable. I was amazed when I read in your post that you did this with just R, and in only 150 lines of code. Great work, really commendable.
Mr. Butler: Very nice visualization. Can you generate a similar graph with brightness determined by median income in the smallest area that data is available? I suspect the result would be somewhat similar. It would be interesting to see the differences between the two.
MEC
Hey Paul, came across your blog from a random googling of ‘data visualization’. Glad to see UW students shining in the limelight again :)
Hope you’re enjoying your term at Facebook. Keep up the good work.
Zack
What’s the licensing on the results? Am a college professor, and would like to use this in course content, including potentially an open (CC-BY-SA probably) mini-textbook of sorts. Is that OK?
Paul, this is very nice. Looking at it reminds me of pictures of the planet during night time. I even saw a paper which found correlations between those data and economic development. Unfortunately, China (FB forbidden) and most of Russia (why?) seem to be dark in your picture.
With regards to your experiment, I think it does not show that much about geographical relations when put all together unfortunately. I think it would be exciting to be able to pick a country and visualize its own relations with the rest of the world.
Most of Russia is dark; because no people live there.
I’ve skimmed this article a few times, but may have missed the link. Where’s the code you wrote to generate this data?