I gave this talk at a PrestaShop board meeting at Stanford University and a second time at While42 San Francisco #3.
I talk about coding culture, our interview process, Lean and Agile methods, how we handle outages, and the DevOps philosophy.
2. What about me?
•Operations Engineer @SlideShare since 2011
•Fan of system automation
•Living in San Francisco since 2009
•@SylvainKalache on Twitter
3. Plan
1. What is SlideShare?
2. Development culture
3. Interview process
4. Lean culture
5. Outage management
6. DevOps philosophy
What is SlideShare?
SlideShare is the world’s largest community for sharing documents.
The main kind of document uploaded to SlideShare is the presentation; we support PowerPoint and Keynote.
We also support common document formats such as PDF, Word, and OpenOffice, as well as video.
We have a variety of tools to promote, share, and get value out of your content by driving traffic, gathering information, and reaching the users who look at your content.
We have more than 60M unique visitors per month, and the number is growing fast.
That is about 3 billion slide views per month.
We are in the top 150 most visited websites in the world, and soon in the top 100.
We are in the top 50 in South America.
We have one office in New Delhi and another in San Francisco.
These numbers only represent the technical employees.
There is a 12-hour time difference between the two locations.
Development
Simple code: we want simple code that anyone can read. Complexity is not a synonym for quality.
Comments: we want every class and function commented. Uncommented code is hard and slow to understand or debug.
Syntax: consistent syntax also matters a lot for readability. Indentation is a big part of it.
At SlideShare we track how our features are used; if one is not being used, we simply disable it.
Unused features clutter the UI and confuse the user.

Even if refactoring can look like a waste of time, it is sometimes necessary when the code is outdated or poorly written.
Taking the time to refactor is a long-term win for the stability of the code.
Code review is not about blaming people.
It is about educating developers by having them help each other.
For reviewers, it is a way to consolidate their knowledge.
This is how we build a code culture in the company, following the points I spoke about previously.
It is a long-term win for bug-free, optimized code.
Sometimes a task is complicated and requires two or more brains.
Pair programming is about getting more than one opinion on a problem; sometimes a second person will think of a way to solve the issue that the first would not have thought of.
It increases the probability that the issue is solved in the best way, since developers can debate their ideas.
Sometimes it is also about context: developers may have different levels of knowledge of different parts of the code.
Interviewing is a critical process for any company.
After all, a company is made of its workers, so the quality of the service or product depends on this process.
We are looking for candidates who know how to code, not just how to use libraries.
Our interview process is mostly made up of basic problems that can be solved in any language; our goal is to see how the candidate solves a “simple” problem in a few hours.
We are looking for smart learners; if the candidate is one, it is easy to train them to be the “ideal” SlideShare employee.
For example, Ruby is quite an easy language to learn; a fast learner can be comfortable with Ruby within a couple of months.
The last important point about the interview process is that any employee can take part in it. It is not only management or HR: several employees will come and work on an exercise with the candidate to get different points of view.
Foosball is part of the SlideShare culture and therefore of the interview process: the candidate must score at least 5 goals to... just kidding.
We want to put the candidate in front of another type of challenge, to see how they react, what their personality is like, and whether they are a team player.
We still use the Lean method. SlideShare is a startup, and the founders have been highly inspired by Eric Ries’s working methods.
I recommend his book The Lean Startup, a must-read for any manager, entrepreneur, or person interested in finding more efficient ways to drive projects.
The daily scrum is a stand-up meeting with a maximum of 10 people.
We do this every morning; it should take at most 2 minutes per person.
We talk about what we are working on.
If needed, we ask for help and make quick decisions.
It is a way to know what everybody is working on.
Continuous deployment means that developers should deploy as often as they can.
It should be an easy and fast process: a single command that does the whole thing.
Once deployed, a modification can be easily measured, or a bug easily fixed, since not much new code is involved.
The key to continuous deployment can be summed up in a 4-step process:
-Build code in small iterations, adding features or modifications granularly
-Deploy these modifications and measure them (performance, stability, UX) with New Relic, Nagios, logs...
-Learn from these modifications: do they make the site slower? Do customers like the new feature?
(Small iterations can happen at different scales: for a set of performance modifications, this could mean an iteration every hour, since performance can be measured right away; for a feature, it might be on the scale of a day, since you need to measure user usage...)
-Then start over!
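To make the measure and learn steps concrete, here is a minimal Python sketch. It is only an illustration, not our actual tooling: the function name `evaluate_iteration`, the 5% tolerance, and the response times are all made up, and in practice the numbers would come from a monitoring tool.

```python
from statistics import mean


def evaluate_iteration(before_ms, after_ms, tolerance=0.05):
    """Decide what to do with a small deploy based on one measured metric.

    Hypothetical rule: keep the change unless the mean response time
    got more than `tolerance` (5% by default) slower than before.
    """
    baseline = mean(before_ms)
    current = mean(after_ms)
    if current <= baseline * (1 + tolerance):
        return "keep"
    return "revert"


# Made-up response times in milliseconds, sampled before and after a deploy.
before = [120, 130, 125, 118]
after_fast = [110, 115, 112, 109]
after_slow = [150, 160, 155, 149]

print(evaluate_iteration(before, after_fast))
print(evaluate_iteration(before, after_slow))
```

Because each iteration is small, a bad measurement points at very little code, which is what makes the decision cheap.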
EOD (end of day) is a daily email that you send summarizing what you did during the day and any issues you faced. It is a good way to stay in touch with the work being done on the other side of the globe. It is also a way for employees to keep track of their own progress on their tasks.
We also hold a weekly phone meeting for every team; sometimes emails are not enough.
Even with the 12-hour difference between New Delhi and San Francisco, some code needs to be reviewed by both teams, as it can have a strong impact.
How we manage our outages and make sure that we learn from them.
A mistake is only a mistake if we don’t learn from it.
The 5 Whys technique was created by Sakichi Toyoda and used in Toyota factories.
It is a tool for exploring cause and effect to find the root cause.
This process should take place no more than 24 hours after the incident, while people still have a fresh memory of what happened.
Everyone involved in the outage should be there; any information is valuable. The point is not to blame people but to understand why it happened.
Every outage needs to be documented:
-When it happened
-Who discovered it
-How it was discovered
-What the impact on the service was
-How the incident was diagnosed and how we found the issue

The second step is to make sure that this incident won’t happen again; using the 5 Whys result, it is possible to find what needs to change.
To be sure that the root cause gets fixed, tasks need to be created and assigned. We use Pivotal Tracker at SlideShare.
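The outage checklist above maps naturally onto a small record. Here is a hedged Python sketch of what such a report could look like; the class `OutageReport`, its field names, and the example incident are hypothetical and do not describe our real template or a real outage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class OutageReport:
    """Hypothetical outage record; one field per item to document."""
    happened_at: datetime
    discovered_by: str
    discovered_how: str
    impact: str
    diagnosis: str
    whys: List[str] = field(default_factory=list)   # answers to each "why?"
    tasks: List[str] = field(default_factory=list)  # follow-up tasks to assign

    def review_deadline(self) -> datetime:
        # The 5 Whys session should happen within 24h of the incident.
        return self.happened_at + timedelta(hours=24)

    def root_cause(self) -> Optional[str]:
        # The last "why" in the chain is treated as the root cause.
        return self.whys[-1] if self.whys else None


# Entirely made-up example incident.
report = OutageReport(
    happened_at=datetime(2013, 1, 15, 3, 0),
    discovered_by="on-call ops engineer",
    discovered_how="monitoring alert",
    impact="uploads failed for 40 minutes",
    diagnosis="disk full on the conversion servers",
    whys=[
        "Uploads failed because the conversion servers stopped.",
        "They stopped because the disk was full.",
        "The disk was full because temporary files were never deleted.",
        "They were never deleted because no cleanup job existed.",
        "No cleanup job existed because the server setup was not automated.",
    ],
    tasks=["Add a cleanup job to the server configuration"],
)

print(report.root_cause())
print(report.review_deadline())
```

Keeping the whys and the follow-up tasks in the same record makes it easy to check that every root cause ends up with an assigned task.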
The DevOps philosophy
DevOps is the answer to a problem: developers and operations guys can’t get along. Why is that?
-Developers create new features and change the code, which can potentially bring security, stability, and load issues.
-Operations is in charge of keeping the site up and running smoothly, without security holes.
We can see the conflict of interest here.

The goal is to have both teams collaborate: ops people work with developers on their projects and know what is being created or changed.
This can happen before development starts, by talking about it: ops can give advice on how to develop so that the result fits the infrastructure. It can also happen after development is done, by reviewing the code.

This creates a link between developers and operations guys, which is also a good thing when the latter get woken up at 3am and have to fix something related to code.

Finally, we also have developers work on small sysadmin tasks, which automation makes possible...
Our infrastructure is code. Every aspect of our system configuration is represented by code; we use Puppet.
Developers can modify or add elements in a manifest and deploy them very easily; plus, the syntax is similar to Ruby.

All our manifests and configuration files are in a version control system; thanks to this, we know why, when, and by whom a modification was made. We can also revert very easily if a modification goes wrong.

This keeps our infrastructure secure, because we know that the same secure configuration is present on every node of our system.
We can also scale our infrastructure easily and fast. Managing 1 or 100 servers requires the same amount of work.
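The idea behind "1 or 100 servers is the same work" can be shown with a toy Python sketch. This is not Puppet and not our manifests: the `DESIRED` settings, the `apply` function, and the nodes are hypothetical, and they only illustrate describing the desired state once and applying it to every node.

```python
# Desired state, written once. The keys and values are made up.
DESIRED = {"ssh_root_login": "no", "ntp_server": "pool.ntp.org"}


def apply(desired, node):
    """Bring one node's settings in line with the desired state.

    Returns the list of settings that had to change. Running it again
    on an already compliant node changes nothing.
    """
    changed = [key for key, value in desired.items() if node.get(key) != value]
    for key in changed:
        node[key] = desired[key]
    return changed


# 100 pretend servers that all start with an insecure setting.
nodes = [{"ssh_root_login": "yes"} for _ in range(100)]

first_run = [apply(DESIRED, node) for node in nodes]
second_run = [apply(DESIRED, node) for node in nodes]

print(first_run[0])   # settings fixed on the first node
print(second_run[0])  # nothing left to change
```

The loop is the same whether the list holds one node or a hundred, and since the desired state is just a file, it can live in version control and be reverted like any other code.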