Cognitive Cities: City analytics




Presented at Cognitive Cities in Berlin, February 26th 2011.




  1. 1. City Analytics Matt Biddulph, Nokia
  2. 2. understanding systems by making models We have always tried to understand systems by creating models of them. We create rules that match reality just closely enough that we can study reality by studying the model. MONIAC is one such example, created at the London School of Economics in 1949 by Bill Phillips. It uses fluid dynamics to model an economy, with the flow between water tanks standing in for the monetary flow between the Treasury, Education and so forth.
  3. 3. “The more we learn about biology, the further we find ourselves from a model that can explain it.” Chris Anderson. “All models are wrong, but some are useful.” George Box, statistician, quoted in http:// As our knowledge advances in a field like biology, our inaccurate models give us diminishing returns. In “The End Of Theory”, Chris Anderson argues that the future of science is transitioning to analysing empirical data gathered from observation of the world. He calls this The Petabyte Age, pioneered by companies such as Google, which created techniques for large-scale analysis of data out of the necessity to analyse the whole internet.
  4. 4. people are city biology We can try to study cities with models. But human behaviour, the biology of the city, makes cities too complex to model.
  5. 5. Recent visualisations of the movement of hire-bikes through London emphasise for me the organic, biological nature of human city-data.
  6. 6. “We can’t see how the street is immersed in a twitching, pulsing cloud of data.” Dan Hill: Dan Hill continues, “This is over and above the well-established electromagnetic radiation, crackles of static, radio waves conveying radio and television broadcasts in digital and analogue forms, police voice traffic.  This is a new kind of data, collective and individual, aggregated and discrete, open and closed, constantly logging impossibly detailed patterns of behaviour. The behaviour of the street.” The data that flows through modern cities is not even visible to the human eye. We can’t gather this data with interviews, surveys and clipboards.
  7. 7. city samplers So at Nokia, we’ve been asking the question, can the phone be the entire source of data that allows us to know our cities?
  8. 8. This is plausible because so many people carry a phone with them 24 hours a day, wherever they go in the city. It’s also because the modern mobile phone is packed with sensors. Early phones had a microphone and a radio. Phones today know which way up they are, where they are in the world, can record images and video, and can sense the presence of many other devices, networks and signals.
  9. 9. This brings the city into the Petabyte Age. What allows us to process the data is a technique developed by Google and popularised as open source by the Hadoop project. Map-Reduce is a system for specifying a data-processing algorithm so that the work can be split up and distributed across a network of computers, each solving one piece. It maps raw input data to processed output data, then reduces that output into final results.
  10. 10. With map-reduce, we can run an algorithm on a rack of servers...
  11. 11. ... or a corridor full of racks of servers ...
  12. 12. ... or a data centre full of corridors full of racks of servers. We can start small and scale up our processing capability to keep pace with the scale of our data. It sidesteps the limit we hit with traditional single-machine analytics, when we can no longer process 24 hours of data in 24 hours of CPU time.
  13. 13. learning from search My first example shows what we can learn by looking at what people search for on a map, and where they are when they search.
  14. 14. Ikea Spandau, Ikea Schoenefeld, Ikea Tempelhof. This map of Berlin (made by Nokia’s Josh Devins) aggregates searches made over the last four months for the word “Ikea”. It clearly shows that people all over Berlin look for Ikea, but that there are obvious clusters near the 3 Berlin Ikea stores. [Slide text, Thursday, January 27, 2011] Ikea geo-searches bounded to Berlin. Can we make any assumptions about what the actual locations are? Kind of, but there is not much data here. Clearly there is a Tempelhof cluster, but the others are not very evident. It certainly shows the relative popularity of all the locations. Ikea Lichtenberg was not open yet during this time frame.
  15. 15. Prenzl Berg Yuppies, Ikea Spandau, Ikea Schoenefeld, Ikea Tempelhof. The fourth obvious cluster is a demographic - the young middle-class families who tend to live in the Prenzlauer Berg district of Berlin. Incidentally, the data also shows that people don’t search for Ikea on a Sunday as much as the rest of the week. This is because Germany still has Sunday-closing laws and even Ikea is not open on Sundays.
  16. 16. learning from maps We can learn plenty about a city just from looking at its maps, and the places on the map.
  17. 17. The “Starbucks Index”, invented by designer Tom Coates, is calculated from the number of Starbucks cafes per square kilometre of the city. By analysing Nokia’s places registry, we can show the difference between different cities, or different parts of a city, by looking at what companies choose to base themselves there. We could equally well calculate a McDonalds index, or an Italian food index, or a public parks index.
  18. 18. Searches are goal-driven user behaviour - someone typed something into a search box on a phone. But we can even learn from activity that isn’t so explicit. When someone views a Nokia Ovi map on the web or phone, the visuals for the map are served up in square “tiles” from our servers. We can analyse the number of requests made for each tile and take it as a measure of interest or attention in that part of the world.
  20. 20. LA attention heatmap This is the attention map of Los Angeles, California. We can clearly see several important hotspots such as Downtown, Hollywood and LAX airport.
  21. 21. LA driving heatmap If we turn to the navigation logs, we get another map of Los Angeles. This data is recorded whenever someone requests a car route from one place to another. You can clearly see the roads, and it heavily emphasises major roads because that’s what is favoured by route-planning algorithms. It’s also a map made by people who don’t know where they’re going - if they knew exactly what route to take, they wouldn’t be using navigation on their phones.
  22. 22. business perspective City data also reflects business activity. In Berlin our local coffee shop owner uses pen and paper to record every sale he makes. He uses this to optimise his pricing and the kinds of coffee he sells. We can do some of the same analysis on a larger scale.
  23. 23. business context Looking at the check-in and search patterns around coffee shops, we made this map of the San Francisco Dolores Park area. Red circles are coffee shops, and blue circles are other businesses. The larger the circle, the more popular the location is to visit.
  24. 24. usage patterns We discovered we could deduce more than just business information from this data. When we looked at one specific venue, Dolores Park itself, we could tell that San Francisco is cold at night. No matter the time of year, check-ins at the park are much lower in the evening and night than in daytime. When we looked at the day of the week that people visit the park, we thought we had a bug in our data collection. Why would Thursday be different from other days for popularity of parks? When we cross-referenced the data with weather records, we realised that this particular Thursday was wet and cold. Like many other examples in this presentation, we were excited by the fact that we can find verifiable real-world information in pure data, without any human guidance.
  25. 25. “Information is quickly becoming a material to design with.” Mike Kuniavsky: In his recent book “Smart Things”, Mike Kuniavsky compares information to traditional materials such as wood and rubber. It has now become a material that we can build with in the real world, to connect the physical and the digital worlds together.
  26. 26. [nod to Matt Jones, for many conversations we had about cities while working together at Dopplr]
  27. 27. Thank you. Matt Biddulph @mattb. After the talk, there were questions from the audience...
  28. 28. Audience question What about individual privacy, and the ethics of profiting from individual user data? 1. We only ever analyse the aggregate, anonymised set of all users’ data. We didn’t track any individuals in any part of this work. 2. I believe that it could only be unethical to profit from analysing user data if you don’t give users some value back by making them a useful, desirable product.
  29. 29. Audience question I’m not uncomfortable with services analysing my data, but I am unhappy if I feel like I don’t own my personal data. In my personal opinion, individual data belongs to the individual. Putting your data into a large service gives you access to economies of scale, allowing it to do useful analysis of the aggregate data that you couldn’t achieve with your data alone. You benefit from this when their service gets better the more you use it. A company you deposit data with should act like a bank: hold it in trust, generate some benefit, give it back when you ask.
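
As a rough illustration of the map-reduce idea from slide 9 applied to the map-tile request counts from slide 18, here is a minimal single-process Python sketch. The log format (one `(zoom, x, y)` tuple per tile request), the function names and the sample data are all assumptions made for illustration; they are not taken from Nokia's systems or from Hadoop's API.

```python
from collections import Counter
from functools import reduce
from typing import Iterable, List, Tuple

# Hypothetical log record: one (zoom, x, y) tuple per map-tile request.
TileKey = Tuple[int, int, int]


def map_chunk(requests: Iterable[TileKey]) -> Counter:
    """Map step: count requests per tile within one chunk of the log."""
    return Counter(requests)


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts into one."""
    return left + right


def tile_attention(chunks: List[List[TileKey]]) -> Counter:
    """Apply the map step to every chunk, then fold the partial counts together."""
    # In a real cluster each chunk could be processed on a different machine;
    # here the chunks are simply processed one after another.
    partials = map(map_chunk, chunks)
    return reduce(reduce_counts, partials, Counter())


# Made-up sample data: two log chunks with five tile requests in total.
chunks = [
    [(12, 2200, 1343), (12, 2200, 1343), (12, 2201, 1343)],
    [(12, 2200, 1343), (12, 2199, 1344)],
]
attention = tile_attention(chunks)
```

Because each chunk is counted independently and the partial counts are merged afterwards, adding more chunks (or more machines) does not change the logic, which is the property the talk relies on when scaling from a rack to a data centre.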