#14: Moving from PHP to Python 3 with Patreon Transcript

Recorded on Tuesday, Jun 2, 2015.

00:00 Today you'll learn how Patreon is helping people live their dreams building amazing creations for the rest of us. It's time for Talk Python. This is show number 14. Today, our guest is Albert Sheu. And this episode is recorded Tuesday, June 2nd, 2015.

00:00 [music]

00:41 Hello, and welcome to Talk Python To Me- a weekly podcast on Python, the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy; follow me on Twitter where I am at @mkennedy, and keep up with the show and listen to the past episodes at talkpythontome.com.

00:41 This episode we will be talking with Albert Sheu, from Patreon, about migrating to Python.

01:04 I'm really fortunate to have Codeship and Hired sponsoring the show. Let me take just a few seconds to tell you about them. Codeship is a platform for Continuous Integration and Continuous Delivery as a service. They encourage you to 'always keep shipping'. Please take a moment to check them out at codeship.com or follow them on Twitter where they are @codeship.

01:04 Hired wants to help you find your dream job! Hired is built specifically for developers looking for new opportunities. Check them out and get a very special offer at hired.com/talkpythontome. You'll find them on Twitter where they're @hired_hq

01:41 Let me tell you how we got to this show.

01:41 It all started with me whining about bandwidth charges on Twitter. I was saying "It is so expensive to run a podcast, look at this thing grow!" And then Justin Spain from Chattanooga, Tennessee, who is @jwsmusic on Twitter, said "Why don't you start a Patreon campaign?" Of course, when I read that I thought "What's a Patreon campaign?" It turns out that Justin was right, Patreon is actually perfect for things like podcasts, blogs, frequent videos and even open source projects. Listeners, readers and so on pledge a small amount for each release- in my case each episode- and when one of these episodes ships, a small amount of money is contributed back to me to help keep the podcast going. If you want to learn more about my campaign, you can find it at patreon.com/mkennedy.

02:29 So, fast forward a month, the guys at Patreon noticed my campaign, and happened to be making a companywide transition to Python 3 from PHP, so they reached out to me on Twitter, and we thought it would be cool to talk about the port of their product to Python. And we talk about how that's going. In case you don't make it all the way through the whole show, it's going great.

02:50 I also want to take a moment and say a special thank you to the 30+ patrons who are already supporting the show. Right now your contributions are covering a little more than the monthly bandwidth expenses for the show. This allows me to use my sponsorship money for extra benefits such as creating searchable transcripts, and things like that. But more importantly, your support has given me the confidence to push harder in making this podcast all that it can be. And of course, there is still a long way to go there, so thank you for the help.

03:17 You might also be interested to hear that Patreon is hiring Python developers and data scientists. Check out Patreon.com/careers if you want to be part of the sweet team in San Francisco who are building out a business and putting a dent in the world using Python.

03:35 At some point during the show I make a statement that PHP is one of the most dreaded languages and technologies of 2015. I was quoting from the Stack Overflow 2015 developer survey; you'll find the links in the show notes. And I have a bit of a correction there: sorry, it was Perl that was part of the most dreaded list, not PHP. PHP didn't make the top 9, however I suspect maybe it's lingering just offstage in place 10 or 11 or 12.

03:35 And in case you are wondering what is the most dreaded technology on the list- it's salesforce.

04:10 All right, finally, the Talk Python To Me T-shirt kickstarter project is still running, it will be open for 9 more days so be sure to visit bitly/pythonshirt to reserve yours. It's a cool and comfortable cotton shirt that will tell the world about your unabashed love for Python.

04:10 Now, let's get to the conversation with Albert and Patreon.

04:30 Albert, welcome to the show.

04:32 Hi, thanks for having me.

04:34 Yeah, I'm really glad that you guys contacted me, and I am looking forward to the conversation. We haven't really had much talk on moving from one technology into Python and that's our main subject today. So, thanks for being here.

04:50 Thanks for reaching out. And we always- Patreon is a platform for creators and we always like to support creators any way we can.

04:57 Yeah, so we'll get to Patreon in just a second and sort of how this show came into existence. But I always like to start off the show asking my guests who have achieved a lot in their careers: how did you get started in programming and how did you come to start using Python?

05:14 So I actually got started with programming pretty early; I actually borrowed my brother's computer science AB textbook when I was really young and just went through some of the exercises in C++. It wasn't the best way to learn things, but it got me introduced to like a lot of like really small things, like how to print to the screen, how to do for loops... And back then like I was pretty entertained just by doing really simple programs, like little small games where you put in your name and it would like output your name right back at you.

05:45 Yeah, I think we have all written those types of apps, they are super fun, just the feeling of satisfaction and accomplishment is surprising for either, right?

05:54 Yeah, and when you are really young your understanding is lower, but as I sort of like grew up, I started programming a lot more on like calculators, in BASIC. I decided to pursue programming as my major in college, and I was still pretty like gung-ho about C++. I thought it was like a hard core language, I thought that like all the hard core programmers would do C++. Until freshman year someone introduced me to Python. I thought it was like, you know, kind of a weak language because it didn't feel as hard core.

06:23 Yeah, I think that's a common feeling, or maybe misconception is a better way to say it, but a lot of people feel like the really tough guys do C++ and they are right down there in the memory in the middle. And all the other guys are just, you know, fooling around, orchestrating our code or whatever that we write in C++. But I'm not sure that is really true or fair, is it?

06:43 That was my impression at the time, and what really changed me is like when I actually had to do a lot of project courses in college, I realized I could actually do things correctly and quickly and like, in terms of performance or like the standards of the code, much better in Python than I could in C++. And I just continued using Python whenever I could all through college. I didn't get a chance to use it professionally until about a couple of years ago when I started at a company called "Quora".

07:14 Quora is the question, the Q&A site?

07:19 Yeah, right, Quora is the question/answer site. I started there pretty early and saw a lot of the really cool things that they would do with Python, a lot of stuff with generators and metaclasses and decorators, and it really turned me on to like Python as a really serious language. And part of what I wanted to do at Patreon is take a lot of those like sort of advanced Python techniques and bring them to our infrastructure.

07:42 That's really cool. But when you got to Patreon it wasn't originally in Python, was it?

07:47 No, it was in PHP. Started in PHP because that was the fastest way to bring the product to life in the early stages, but as the product grew and as the engineering team grew, it was starting to be pretty clear that PHP wasn't going to scale with the team-

08:04 Is that from like a performance perspective, or from adding new features and maintainability perspective...?

08:11 The number one thing was maintainability, and it's not 100% like PHP's fault, but rather PHP allows you to do certain things that make the code way messier than you want your code to be. A lot of things with PHP just allow you to do things very quickly, and when you are adding like more and more engineers, when you are going from one engineer to 2 to 3 to like 10, you don't necessarily want to do everything the fastest possible way. We did want to do things fast, in fact like the concept of shipping quickly is still important to our engineering culture, but we want to be able to ship quickly in a way that is maintainable and in a way that like uses the proper abstractions, in a way that like isn't copy/pasting, I guess.

08:56 Let's talk a little bit about what those differences are and stuff; but before we do, maybe we could tell everyone- what is Patreon?

09:03 Yeah. Patreon is subscription funding for creators, and the way it works is that patrons pledge a set amount per release; every time a creator publishes a new creation, like a web comic, a video, a blog post, a podcast, anything, funds are transferred from the patrons to the creator. We launched in May 2013, and as of summer 2014 we announced that patrons are sending over $1 million monthly to creators through Patreon.

09:33 A million dollars a month, that's amazing.

09:36 Yeah, and we are still growing, we just passed $2 million very recently and the company itself is about 24 people. We are headquartered in San Francisco and actively hiring.

09:46 Ok, and they are doing Python, I suspect you reach a lot of intrigued Python developers out there, so, it's a good message.

09:54 Yeah, that's the hope.

09:54 Yeah. Excellent. So, I'm personally using Patreon and that's how we got to know each other.

09:54 [music]

09:54 Codeship is a hosted Continuous Delivery Service focusing on speed, security, and customizability. You can set up Continuous Integration in a matter of seconds and automatically deploy when your tests have passed. Codeship supports your GitHub and Bitbucket projects. You can get started with Codeship’s free plan today! Should you decide to go with a premium plan, Talk Python listeners can save 20% off of any plan for the next 3 months by using the code: TALKPYTHON

09:54 Check them out now at codeship.com, and tell them thanks for sponsoring the show on twitter where they are @codeship.

09:54 [music]

10:54 I was on Twitter, complaining like I do sometimes, about paying for bandwidth and stuff for my podcast, because it is going up, it's kind of like doubling each month, and it's not a huge number now, but doubling a medium size number becomes a problem really quickly. So I was saying "Oh this is getting pretty expensive!" And a guy, I believe his name is Justin, on Twitter, says "Hey why don't you use Patreon?" And that was the first I heard of you guys. So I'm like "Wow, let me go and create an account or a campaign on Patreon". I thought about the ways to kind of do this, like maybe Kickstarter might be an answer, but to me, Kickstarter is the wrong thing, because it is like you fund this thing once and there is like a big bang creation, and that doesn't really work for things like an ongoing podcast, or sort of stuff like web comics, or the XKCD type stuff, right?

11:46 Yeah, that's right. Our goal is that people who want to create things are able to make a living off these creations. And to do that requires having like ongoing support for your ability to create things. And I think it's a lot of the reason why I joined Patreon, because I believe in the mission, and a lot of the people who work here also believe in the mission.

12:15 I think it's a really great mission, and it definitely makes having my podcast simpler and I just want to say thanks to everyone out there who has contributed to my campaign so far, and I think what you guys are doing is really great. So I was super excited about when I first heard of it, it just- I hadn't heard of it before.

12:33 So let's talk about PHP a little bit. So it was originally written in PHP, and to be honest I don't have a ton of experience personally with PHP. But what are some of the pain points that you are solving with Python?

12:45 A lot of it is organization of code. With namespaces, I mean, it is a very simple feature, but being able to segment our code out into separate modules allows different teams to work on different aspects of the site. Secondly, with decorators we get sort of a single point of entry for the application. We are currently using Flask, and the way Flask does routing is through decorators. And because we are doing everything through this standardized way, if we want to change something, like if we want to say "start recording response times and then graphing that on a chart", we can just do that by modifying a decorator rather than trying to put that into each individual PHP script.

13:31 Right. The way that you do decorators- what is it app.route as a decorator on all the methods?

13:37 Yeah, that's right.
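As a rough illustration of the point about recording response times in one place, here is a minimal sketch using only the standard library. The `timed` decorator and the `response_times` store are hypothetical names made up for this example, not Patreon's code; in a Flask app such a decorator would typically sit directly under `@app.route(...)`.

```python
import functools
import time

# Hypothetical in-memory store; a real app would send these to a metrics system.
response_times = {}


def timed(view):
    """Record how long each call to `view` takes, keyed by the view's name."""
    @functools.wraps(view)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return view(*args, **kwargs)
        finally:
            # Runs whether the view returned normally or raised.
            elapsed = time.perf_counter() - start
            response_times.setdefault(view.__name__, []).append(elapsed)
    return wrapper


# In a Flask app this would look like:
#   @app.route("/about")
#   @timed
#   def about(): ...
@timed
def about():
    return "<h1>About</h1>"


body = about()
```

Because `functools.wraps` preserves the function name, stacking this under a routing decorator should not change how the view is registered.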

13:38 Yeah, that's really cool. We just had Armin on the show for the previous show so, people hear more about him, I'm assuming it's pretty cool but- yeah, I'm a big fan of Flask. And you guys said you are doing Python 3, right?

13:52 That's right, that was sort of one of the small risks that we took. Most of the other technologies that we use are pretty boring, or at least pretty standard, like Flask, or SQLAlchemy, or Jinja2, but we decided that we would go with Python 3 going forward because we wanted to make sure that our code base is going to be future proof.

14:11 I think that is a really nice choice, I see more and more people moving to Python 3 and there is still a lot of folks doing Python 2 but it just seems a little- wrong is not the word I am looking for, but it just seems like something is out of balance; where there is a lot of people working on Python 3 to push the future of Python, and then there is so many people who are actually not using it. Some of the biggest users of Python are not using kind of where the community is focusing its effort and so, I personally think any time you get a chance use Python 3 is kind of helping the community move along.

14:49 The way we went about it, actually, we just took a look at the Wall of Superpowers and just saw, like, what are the modules that we need to actually build our application. Most things that we needed are already implemented; the one thing that was missing was MySQLdb, and that was pretty easily replaceable with PyMySQL. And so it has been actually a really, really simple switch for us. The biggest sort of hump that we ran into was that like Stack Overflow or Google defaults to Python 2.7, in terms of its help, but it's been a dream for us to work with.

15:22 Yeah, that's really great. So, let's see: you are using Flask, you are using MySQL, it's the back-end, SQLAlchemy- SQLAlchemy is really fantastic, what else are you using? Any other cool parts of Python or libraries?
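For readers wondering what swapping the MySQL driver can look like when SQLAlchemy is in the stack: SQLAlchemy selects the driver from the `dialect+driver` part of the database URL, so a PyMySQL-backed connection uses a URL beginning with `mysql+pymysql://`. The sketch below only builds such a URL; the helper name, credentials, host and database are made-up placeholders, and whether Patreon configured it this way is an assumption.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy-style URL: mysql+<driver>://user:password@host/database."""
    # quote_plus escapes characters such as "/" or "@" that would break the URL.
    return "mysql+{}://{}:{}@{}/{}".format(
        driver, quote_plus(user), quote_plus(password), host, database
    )


# Placeholder values for illustration only.
url = mysql_url("app", "s3cret/pw", "db.internal", "example_db")
```

The resulting string is what would be handed to `sqlalchemy.create_engine`.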

15:37 Our starting point was just to mimic as much of PHP as possible, and so, in terms of complicated technologies, we actively chose against using any. A lot of really standard modules like Requests or Jinja2, but nothing to alter yet.

15:55 That makes perfect sense. Let's talk a little bit about the whole porting concept, like porting your code is basically a rewrite when it's this divergent of a technology, right? And so that is a pretty big risk, how did you guys decide that that was the path for it?

16:12 Yeah, it was a pretty big risk, and in fact like I've been burned by a lot of ports in the past, a previous project I've had was porting something from Python to Scala and I think that's a much more difficult process to go through because you are putting dynamic language into a static language and that just adds a lot of weight to the process. We decided that PHP wasn't going to be how we wanted to go forward with engineering. The decisions we had to make were like given the current PHP code base could we refactor it into something that is more workable using like a modern framework? Should we port it to Python or should we port it to like Scala or Java.

16:57 We didn't want to- like ecosystem of PHP we found much weaker than that in Python, I think a lot of the main consumers of PHP are mostly revolved around Facebook and a little bit around like Wikimedia foundation, but besides that, not a lot of support. Whereas Python has what feels like a more active developer community and more active ecosystem. It has lot more legitimacy for hiring engineers, it is more consistent to the language and it was just different enough from PHP but close enough that like we felt comfortable doing one to one translation from PHP to Python.

17:35 Right. It was different but it wasn't completely "alien" to make that transformation.

17:40 Yeah, the other option we had with Scala was just too different and like too many unknowns. And when it comes to ports I think there is usually a pretty high probability that a port will fail. I have seen a lot of port projects just like fail in the past, and in order to prevent that from happening, I think we as a company decided we want to take as few risks as possible.

18:02 Yeah that makes a lot of sense. You mentioned like sort of developer happiness. There is a really interesting yearly developer survey by Stackoverflow that comes out and they sort of rank technologies, what is growing, what are the jobs, what technologies people love and what technologies people strongly dislike, let's say. It's a concern that they are working with it, and I think Python was on the list of the beloved technologies. And PHP might have been on the list of the ones to kind of stay away from. So you think it is easier to hire people because you can say "Hey come work on this cool Python 3 Flask project" than it is to say "Come work on this PHP project"?

18:47 When it comes to PHP, PHP has been really easy to develop new things. But it makes it extremely difficult to maintain that code after it has been developed. And I think that the gains you get with PHP are also seen a lot in Python, but the maintainability of Python is so much greater for us. Engineers I think like rank maybe a little bit unfairly certain languages as more like hard core than others, I think like PHP ranks pretty low; I think like there is a lot of unfair like feelings towards PHP especially since it's been like developed and it's been like actively used by like Facebook-

19:29 Yeah, and things like Wordpress, there is some pretty amazing stuff out there.

19:34 Yeah. But like, there is a lot of parts core to language, lot of inconsistencies, a lot of the things like global name spacing, a lot of the things like worry about security or like magic quotes that worked well in the past, but do not really hold up against the modern language.

19:52 Sure. That makes sense. So you choose Python over the other options that you listed and obviously the ones that you did list as well like over all the other possibilities. Does that mean that a lot of the people there had lots of Python experience, and that was kind of what they wanted to move to, or how did you as a group decide?

19:52 [music]

19:52 This episode is brought to you by Hired. Hired is a two-sided, curated marketplace that connects the world's knowledge workers to the best opportunities.

19:52 Each offer you receive has salary and equity presented right up front and you can view the offers to accept or reject them before you even talk to the company.

19:52 Typical candidates receive 5 or more offers in just the first week and there are no obligations ever.

19:52 Sounds awesome, doesn't it? Well did I mention the signing bonus? Everyone who accepts a job from Hired gets a $2,000 signing bonus. And, as Talk Python listeners, it gets way sweeter! Use the link hired.com/talkpythontome and Hired will double the signing bonus to $4,000!

19:52 Opportunity is knocking, visit hired.com/talkpythontome and answer the call.

19:52 [music]

21:19 Yeah. As a group, we actually have a lot of really different skills, some in Java, some in Ruby, some in JavaScript, and some like only in PHP. There was no language that everyone was going to really favor. And as a result we just had to make a decision. We chose Python because we knew we would be pretty safe, and we knew that like Python is a really easy language to just pick up over a weekend, like the tutorial on Python, or the way that most people learn Python, is usually over a weekend. Getting the very basics of it is pretty easy, and getting into depth is like a gradual learning curve.

22:01 I think it is a super easy language to learn, where the real work is learning all the standard libraries and all the popular packages, right, like really mastering things like SQLAlchemy and Flask- that is the real learning curve. But, it's kind of unavoidable.

22:15 Yeah, that's totally right.

22:17 Nice. So you guys sort of chose Python because it would more or less make everybody happy and it was a really good safe choice. I think it is a pretty fair characterization. I have heard somewhere, I can't remember if it was from one of the guys at PayPal or if I heard it from LinkedIn, but people said that they have chosen Python because it was everybody's second favorite language.

22:42 Like a lot of people really, really like Ruby. I've also worked with Ruby professionally, I like it as a language, but sometimes like-yeah, we had to pick something that was good for everyone, not necessarily great.

22:55 Right, somebody's got their pet language that they are super happy about but other people dislike it, so... Yeah, I think Python fits really well in those situations.

23:06 What really helped us with Python also, with making that decision, was that the support in terms of data science, or like with like SciPy and NumPy, has been like much greater than a lot of other languages, and we knew that if we wanted to do sort of a data driven approach to our projects, getting our stuff in Python would allow us to hire a lot more data science people.

23:37 Yeah, and cross over as well so you might know the whole spectrum even better. It was really interesting PyCon presentation from Fernando Perez the guy who started IPython at PyCon 2014 and he talked about how the adoption of Python and IPython in data science is so taking off. Maybe I'll link to that video in the show notes, but basically he said that if you look at the technologies used in the big data conference in Utah, and the data he had was basically showing that people used to be doing things like R and to large degree that has been replaced by IPython, NumPy, SciPy, matplotlib, all those types of things. It's really amazing.

24:19 So, are you guys thinking of using some of those technologies, you said NumPy. For sure.

24:25 Yeah, definitely. At this stage, we are just collecting a lot of data like page views, but we are pretty comfortable with that, like it's a well trodden path of integrating NumPy or like matplotlib into our data once we have someone who is like working full time on data science. It's a language that I think has sort of like broader appeal outside of engineering, and that is I think really important to us.

24:53 And I think having the data science component there really is powerful for you guys. I mean, you've got all these people who are creators and you've got all these patrons that are supporting them, and just helping those people align better will just help your business. And it sounds like big data type stuff would be perfect for that.

25:12 Yeah, totally.

25:13 Ok, so you are doing this port to Python, on Flask and Python 3, that's cool; when did you guys start on it?

25:18 We started in late November 2014.

25:26 Ok, so at the time of this recording, it's June 2015, so that sounds like 7, 8 months.

25:31 Yeah, that's right. The first thing that we did was just getting to the first page. I think in our porting process, we wanted to have like immediate deliverables and like immediately put Python in front of users as soon as possible. So in terms of like the project we just picked one page and fairly low risk page was our “about” page, and just made our only one goal just to see how fast we can get that page rendering in Python to production.

26:01 That's a really interesting way to do it, so you kind of taking your vertical slice instead of trying to horizontally go into this, you can get some piece of functionality ship straight away; that seems like that would help a lot with mitigating the risk of like will this port ever be done, when it goes live will it ever work? Things like that, right?

26:18 Yeah, this is totally right. We were really afraid that if we took a horizontal approach, like we build the infrastructure around the web framework, we build all the infrastructure around the models, build the infrastructure on the controllers, at some point we would just have to flip a switch, and that was really scary because that was a lot of risk that we can't necessarily detect with like just unit testing or integration testing. Sometimes we just want things to be as close to production as possible. And as close to production as possible is sometimes just production.

26:53 Yes, that is pretty close to production. So, you had a pretty interesting approach so you started with the something simple like let's get this whole thing out there- how did that go, you had to get the app written which is not a really big deal but then you had to do kind of DevOps database stuff- what was the process, how smoothly did this go?

27:16 Right. The complicated part here was getting our existing site, which was still running on PHP and Apache, to work well with the Python Flask infrastructure. And the way we handled that was in Apache we used ProxyPass on particular routes; in this case, for the about page, we used ProxyPass from /about to our Python back end-

27:38 Ok, and so you could just sort of bypass your main app and send it straight over to whatever WSGI server you got running Flask, right?

27:44 Yeah, that's right. We started with an entirely different tier using nginx, Gunicorn, and Flask, but pointing to the same data sources. And we were able to flip the switch on and off of whether or not we wanted a particular page to go to PHP or to Python.
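The per-page switch described here lived in Apache's proxy configuration. As a language-neutral way to picture the decision it makes, here is a small Python sketch of prefix-based routing with an on/off flag per route. The table contents and the `backend_for` function are hypothetical and only model the idea; they are not how Apache is configured.

```python
# Hypothetical routing table: path prefix -> whether the Python tier serves it.
PYTHON_ROUTES = {"/about": True, "/search": False}


def backend_for(path, routes=PYTHON_ROUTES):
    """Return "python" if an enabled prefix matches `path`, otherwise "php"."""
    for prefix, enabled in routes.items():
        # Match the prefix itself or anything below it, but not "/aboutus".
        if enabled and (path == prefix or path.startswith(prefix + "/")):
            return "python"
    # Anything not explicitly ported keeps going to the existing PHP site.
    return "php"
```

Flipping a page back to PHP is then a one-value change in the table, which mirrors the low-risk rollback described in the conversation.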

28:01 Wow, this is really cool. So I guess you just flipped that switch at the proxy pass level and it was still living in the original PHP site and then you started adding features to the Python version. And would you like flip it on and off to see like flip it on for a little bit, see if it is ok and then flip it back, or what was that like?

28:19 We tried to just flip it on and just keep it there, although in terms of like mitigating risk it is really comforting to know that you can always flip back and forth between a correct version and like the Python version. We also did a lot of things with having the separate architecture, and so one way we actually did testing was- and this is our comprehensive test- we opened the site on PHP, we opened the site directly by bypassing Apache completely and going directly into the Python infrastructure, and just tabbed back and forth between the two, and the difference is really obvious.

28:57 Just kind of a visual dif if you will, ha?

29:00 Yeah, exactly.

29:01 That's cool. So you went through a whole process of like choosing the next page and so on, we talked a little bit about that before the show, can you maybe talk about that? You started with the about page and then where did you go from there?

29:12 Yeah, so, the process we tried to go by was to start with a low traffic, low complexity page. And this just let us make sure that our web framework was the way that we wanted it to be, that we wanted like you know, we could register the correctness of the site without a lot of risk and that was like what the "about page" was about.

29:35 Right, that almost tests your infrastructure more than anything, right, that everything is hanging together.

29:39 Yeah, exactly. We did session handling correctly, we just wanted to start with like, you know, rigorously testing like one small thing at a time while we were rolling this out.

29:51 Ok, so after "about"?

29:54 So after "about", like once we were comfortable with, for example, like log out and log in traffic being correct, we started going and just porting the high traffic and low complexity pages, so stuff like the index page is pretty low complexity, mostly just featured artists, our feature pages also, like featured creators, then our search page... This let us make sure that our scalability was in order. The complexity of these pages, like- we didn't have to write a lot of additional code to make those pages correct, but we did have to like make sure, like, how many servers do we need in order to support a certain amount of traffic; how much do we need to like mirror the performance that the PHP site was having in Python.

30:38 So, performance, that's an interesting question. Have you noticed the difference in performance?

30:43 It's actually slightly faster in Python because we are doing certain things a lot more efficiently. Like, we have one entry point into our DB as opposed to doing like mysql_query or a lot of the mysql_ functions in PHP. That goes along with also cleaning up the code base, but certain things, like with making things correct in Python, just had the side effect of making things faster.
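To make the idea of a single database entry point concrete, here is a minimal sketch. It uses the standard library's `sqlite3` purely as a stand-in for MySQL so the example runs anywhere; the `query` helper and the table are hypothetical, and the conversation does not say how Patreon's actual entry point is written.

```python
import sqlite3

# sqlite3 in-memory database as a stand-in for the real MySQL connection.
_conn = sqlite3.connect(":memory:")
_conn.row_factory = sqlite3.Row  # rows can be read by column name


def query(sql, params=()):
    """Single entry point: every statement is parameterized and goes through here."""
    with _conn:  # commits on success, rolls back if an exception is raised
        return _conn.execute(sql, params).fetchall()


query("CREATE TABLE creators (id INTEGER PRIMARY KEY, name TEXT)")
query("INSERT INTO creators (name) VALUES (?)", ("example",))
rows = query("SELECT name FROM creators WHERE id = ?", (1,))
```

With one function in the path of every query, things like logging, timing, or connection reuse can be added in a single place instead of at each call site.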

31:10 Yeah, that is really interesting, because it is a simpler language. It might be easier to write it correctly or more optimally.

31:18 Yeah. One thing that we did run into was that the default state of PHP is just to stream all the data as it comes out, like from standard out directly to the client's browser, and so on the PHP pages you would see like stuff coming in right away. With Python we used Jinja2, and so we needed the entire template to render before we sent it down to the user, so we had to do a little bit of things to make performance feel as fast in Python as it is in PHP, using generators instead of like sending down one large HTML block-

31:54 That's really the problem you experienced with almost- you need HTML template, you are not streaming directly to the browser, you need to execute then send it down. It's a perception thing, right, like it feels slower even if in the end the same amount of stuff is on the page in the same time.

32:12 Yeah. Like, for example, our creation pages in PHP took 6 seconds to render, but you didn't feel it because you would still see like the embed or you would see like the becoming a patreon flow right away, and it ended up like the slow part of the page was like rendering the comments and those also do like number of inefficient things we were doing with the data fetches, but when we ported that straight up from PHP to Python we just wouldn't see a page load for 6 seconds.

32:43 And that's too long. It's definitely too long on the web.

32:45 Yeah, and so we just like we implemented our own version of lazy load and like made sure the comments would coming after all the really relevant parts of the page came in and we also made some of the like I mean after we got that first part through we started making some efficiency like optimization for the performance, but we decided that finishing the thing was first, like finishing the correctness of the- finishing the correct page was the first thing we go for and optimize the data calls later.
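The generator approach mentioned above can be sketched with plain Python: instead of building one large HTML string, the page is yielded in chunks so the fast parts can go out before the slow comment fetch runs. Everything here (`render_page`, `slow_comments`, the markup) is a hypothetical illustration, not Patreon's implementation; frameworks such as Flask can accept a generator like this as a response body.

```python
import html
import time


def slow_comments():
    """Stand-in for the slow data fetch behind the comments section."""
    time.sleep(0.01)
    return ["Great work!", "Love this."]


def render_page(title):
    """Yield the page in chunks so the fast parts can be sent first."""
    yield "<html><body><h1>{}</h1>".format(html.escape(title))
    yield "<div id='creation'>...</div>"
    # The expensive fetch only runs when the consumer asks for this chunk.
    items = "".join("<li>{}</li>".format(html.escape(c)) for c in slow_comments())
    yield "<ul>{}</ul>".format(items)
    yield "</body></html>"


chunks = render_page("My creation")
first = next(chunks)     # available immediately, before comments are fetched
rest = "".join(chunks)   # the remaining chunks, including the slow one
```

The total work is the same as rendering everything at once; what changes is that the first bytes are ready before the slowest part has been computed, which is the perceived-speed effect discussed in the conversation.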

33:15 Right. OK, maybe we could come back to some of the optimizations. But one of the questions that is coming to mind is: are you guys done?

33:21 So we are 98% of the way through, and that's not necessarily the ideal situation, but about halfway through the project we realized that we could actually start implementing new things on top of the base that we had produced. After we implemented the user page, the creation page, and some of the other higher-complexity pages with higher traffic, we had most of the models in place, so we could start actually building new things. The first thing we implemented was a really simple feature for migrating Subbable users from Subbable onto Patreon, and we have been developing an API service layer for the web client and the 34:07 to be unblocked and allowed to make some progress. Where we are stuck right now in terms of the Python port is the long tail of really low-traffic, high-complexity pages, so stuff like the settings page, or stuff like Patreon manager.

34:29 Yeah, are there a bunch of like internal pages that you guys have that are like complex dashboards that from the outside we don't really see but have to be moved over eventually?

34:37 Yeah. We have a lot of internal dashboards, and a lot of them are of varying complexity. Those are lower priority because we wanted the production pages to be out first, but eventually those will need to be ported to Python as well.

34:49 Yeah, so at some point you want to turn off the PHP site, if possible, and only manage one thing, right?

34:55 Yeah, our number one dream with the Python port is to be able to completely delete the PHP entirely. And that just means that number one like the PHP is not running in production obviously, but secondly it means that like the Python reference is what we are going with going forward.

35:13 So you are already doing new features on top of Python, so it is already serving you pretty well.

35:19 One of the hard things about doing a port is that while the port is going on, you can't do a lot of ongoing feature development, and the product itself will seem to stagnate for a while. It has been sort of understood by everyone that that is the way it has to happen. We can't do development in PHP and then backport that into Python later; it's going to slow down development of both systems.

35:44 Yeah, and you guys came out, you said, when? 2013?

35:49 Yeah, May 2013.

35:49 Yeah, and if you are going to do this in 2014 you have got a year-old product and you say we are not going to change this for 6 months. That is kind of crazy in web time, so you had to do something, right?

36:00 Yes, it was kind of painful trying to sell that at first, but the idea is that we are going for long-term value. And the long-term value depends on engineering feeling like they are working with a language that makes them happy.

36:13 Yeah, and that's worth a lot, right? I mean, that means keeping the good developers, people are excited about the project; it's really a big deal. Probably easier to fix bugs as well.

36:23 Yeah, definitely. We use an external service called Rollbar. Errors that happen in production immediately go to that service, and it produces a stack trace and as much information as we can get about the request and the headers, and it just lets us debug really quickly.

36:43 What other infrastructure have you put in place? Do you have automated tests and stuff like that?

36:47 Yeah, we do the simple things like unit testing, and we have a staging environment. One of the more creative things we do is we have this concept for 36:56 test, where for every route that we implement on the site, we have a bunch of users of different profiles, like a dummy patron account or a dummy creator account, and we just have them load the page and make sure that the page loads with a 200. It's a really lightweight way of testing whether every code path, or every possible viewer of the page, will actually render. But it is a really high-leverage test.

37:28 For the amount of effort, that's surprisingly effective, because if something goes wrong, a lot of times it's a 500.

37:33 Yeah, exactly. And there are something like 40 lines of code that actually implement that, but it catches a surprising amount of bugs.
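
A minimal sketch of what such a "load every route as every kind of user" check could look like. Everything here is an assumption for illustration: the route list, the profile names, and the `fetch` callable (which in a real suite might wrap a test client logged in as a dummy account) are hypothetical and not taken from Patreon's code base.

```python
from itertools import product

# Hypothetical routes and viewer profiles; a real suite would list every
# route on the site and a dummy account for each kind of user.
ROUTES = ["/user", "/creation", "/settings"]
PROFILES = ["anonymous", "dummy_patron", "dummy_creator"]


def smoke_test(routes, profiles, fetch):
    """Load every route as every profile; return the loads that were not 200."""
    failures = []
    for route, profile in product(routes, profiles):
        try:
            status = fetch(route, profile)
        except Exception as exc:  # an unhandled error would surface as a 500
            status = "error: " + repr(exc)
        if status != 200:
            failures.append((route, profile, status))
    return failures


def fake_fetch(route, profile):
    """Stand-in for a test client; fails one combination on purpose."""
    if route == "/settings" and profile == "anonymous":
        return 500
    return 200
```

Calling `smoke_test(ROUTES, PROFILES, fake_fetch)` returns the single failing combination. The leverage comes from the cross product: adding one route or one profile to the lists adds a whole row or column of checks.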

37:39 Yeah. You don't want people to see 500s on your site; that doesn't encourage confidence.

37:46 The second thing that we did was this thing called dark loading. While the PHP website was still running, we would take all the requests for the pages that we were porting to Python; so if we were porting over the /user page, we would record all the requests that were coming in to /user. And then we would replay all that traffic onto a detached Python instance. So before we flipped the switch, before we flipped the proxy pass in Apache to point /user at the actual Python backend, we would run that traffic through our Python code base with a read-only database user. We would take real traffic and make sure, at least, that it all rendered.

38:36 Wow, that's pretty cool. So, another super lightweight way to do testing: you just take all the traffic and you feed it over there, and again it had better not return 500 and things like that, right? These are the actual types of query string parameters we are seeing, this is the route data that is coming in, that kind of stuff, right?

38:54 Yeah, exactly: this is the cookie, this is the logged-in user, these are the headers... For a lot of these things we were able to leverage our existing system. If we didn't have an existing user base, we would have had a harder time with that, but because we did have an actual live product, it gave us a lot of confidence that when we shipped something live it would comfortably stand up to the traffic.
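
The dark loading idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual tooling: `RecordedRequest`, `replay`, and `fake_python_backend` are hypothetical names, and in practice `send` would issue a real HTTP request to the detached Python instance, which would be connected with a read-only database user.

```python
from dataclasses import dataclass, field


@dataclass
class RecordedRequest:
    """One production request captured for replay (fields are illustrative)."""
    method: str
    path: str
    query: str = ""
    headers: dict = field(default_factory=dict)
    cookies: dict = field(default_factory=dict)


def replay(recorded, send, prefix="/user"):
    """Replay recorded requests for one ported route; collect server errors."""
    errors = []
    for req in recorded:
        if not req.path.startswith(prefix):
            continue  # only replay traffic for the page being ported
        status = send(req)
        if status >= 500:
            errors.append((req, status))
    return errors


def fake_python_backend(req):
    """Stand-in for the detached Python instance; returns a status code."""
    return 500 if "crash=1" in req.query else 200
```

An empty result from `replay` is the signal that real traffic at least renders, which is the bar described above before pointing the proxy at the new backend.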

39:17 Nice, did you have to do anything special for scaling?

39:22 Surprisingly, no. We found that scalability could be handled nowadays just by vertically scaling our MySQL instance. It was a little bit disappointing for me because I really like performance tuning. But we are on AWS and we are using RDS for our database.

39:47 Right. And I haven't used RDS on Amazon too much, but that is relational database as a service, right? So you can go over there and just say "I need more database" and you get it?

39:59 Yeah, if you just want to upgrade something basically one switch does it.

40:06 Yeah the cloud is awesome, isn't it?

40:07 Yeah. Do you guys have like Geo replicated anywhere or is it in one particular data center.

40:15 We have replicated across availability zones but not across regions right now.

40:20 Sure. OK, that makes sense. Are there any special techniques or things you have learned porting PHP to Python, any tips and things you can share with listeners?

40:31 When we started out, we didn't have a lot of Python experience in the company. So the first thing that I did was try to get people really excited about Python. When I started programming in Python, the thing that turned me on to it the most was Peter Norvig's "How to Write a Spelling Corrector" in 21 lines. I think it is a really good intro to the advanced features of Python, or to what sort of canonical Python looks like. There are a lot of tricks in it that are not necessarily the most maintainable, but it is a really cool demo of the language and of all its features.

41:20 Yeah, that's really cool, and I will put that in the show links. The norvig.com, yeah, I'll put that in the notes.

41:30 In terms of what tricks we used for porting from PHP to Python: because people were still picking up Python as they were porting the different pages, we built a lot of different things to sort of mimic the behavior of PHP in Python, but in a more correct way. So for example, we used the context globals in Flask to mimic the sort of globals that you would see in PHP. For ternary operators, we just wrote a separate function that mimics the structure of the ternary operator in PHP, but using a function in Python. Whatever we could do to lower the cognitive load of going from PHP to Python made things go a lot simpler.
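
A helper in the spirit of the ternary function mentioned above might look like the sketch below. The name `ternary` and the example `label` function are hypothetical; the transcript does not say what the real helper was called or how it was written.

```python
def ternary(condition, if_true, if_false):
    """Function form of PHP's `condition ? if_true : if_false`."""
    # Both branches are evaluated before the call, unlike PHP's operator
    # and unlike Python's own `if_true if condition else if_false`.
    return if_true if condition else if_false


def label(pledge_count):
    # PHP equivalent: $label = $count == 1 ? "patron" : "patrons";
    return ternary(pledge_count == 1, "patron", "patrons")
```

One design caveat with a helper like this: because arguments are evaluated eagerly, it is only a drop-in replacement when neither branch has side effects or is expensive to compute. Truthiness rules also differ between the two languages (for example, the string "0" is falsy in PHP but truthy in Python), so ported conditions are safer when written as explicit comparisons.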

42:16 Yeah, that makes sense. That seems like a really good idea. Did you do anything like PyFlakes or any of the sort of PEP 8 checking tools to see how people are doing there?

42:29 We use PyCharm inside the company, and that does PEP 8 checking for us. I actually really like PyCharm; I think it does a pretty good job of things.

42:47 I can't agree more, I use PyCharm all the time, I love PyCharm, and I used to use emacs but it's all about PyCharm for me these days you know, if it takes an extra second to load there is a whole lot more goodness on the other side of that, right?

43:02 Yes, it really saves a lot of time.

43:04 Yeah. If it saves you one bug a week, it has saved you an immense amount of time. Actually, we are going to have the PyCharm guys on the show pretty soon, so... I am excited about that as well.

43:14 Really cool.

43:15 Yeah, they are doing good work and they have a free version. I don't know if everyone knows, but they've got a community version and a pro version, and it is pretty affordable, I think.

43:24 The pro version is definitely worth it.

43:26 Yeah, it is definitely worth it. But if you are not convinced, you know, the free version is there you can try that.

43:26 Nice. So, overall you feel like you have made a good choice? Looking back you've got 8 months of experience, you've got running site, you have lived with the thing for a while. What are your thoughts?

43:43 I think one thing that I do wish is that the site were 100% ported right now; it would make certain aspects of the operations and architecture much simpler. But I'm not unhappy. I am actually pretty happy with the place that the port is in right now. I think that the number one thing is always going to be developer happiness: if our developers are happy, then that just makes things a lot easier.

44:10 Yes, I think you are totally right, it's very hard to undervalue enthusiasm. Being excited about something can really affect the way that you work when you are doing creative things like programming.

44:22 And small things like onboarding new engineers onto Python are much easier because it is a much more consistent language. PHP has a bunch of really small quirks that are increasingly becoming very specialized knowledge, a lot of inconsistencies in the standard library. The small things are easily learned, but they all sort of add to the cognitive load of onboarding to a new language, and Python has been much better with that, especially Python 3.

44:50 Yes, Python 3 feels pretty cleaned up. I think that makes a lot of sense.

44:50 All right, I think that might cover it. Anything else you want to add on this whole adventure you've been on?

45:03 Not much.

45:04 I think we pretty well covered it. So I am sure there is a lot of people out there with PHP sites that have considered moving to Python and I am sure this conversation will be helpful.

45:16 Yes. It was a very serious choice for us and we spent a lot of time thinking about it, but once we made the decision, once we decided that that was the right place to go, we just got started right away.

45:29 Were people pretty excited once you kind of decided to do it and started going?

45:33 I think there was a lot of burn out from PHP in this thing and a lot of excitement around Python. So yes, definitely.

45:39 Yeah, very very cool.

45:39 All right, before I let you out of here there are a couple of questions, one of which you have actually answered already. Before the end I always ask my guests: what is your favorite PyPI package, or sort of thing out there, that you want to tell people about?

45:59 I would have to think about that. I don't want to say Requests because I think that is a pretty default answer.

46:06 Requests is pretty amazing. I learned from the show with Kenneth Reitz that Requests has been downloaded 40 million times. That's unbelievable to me.

46:14 Yeah. Oh, I think for us, Rollbar has been like amazingly good and like amazingly easy to set up.

46:22 Ok, awesome.

46:25 R-O-L-L-B-A-R. It is a startup with only a few people working on it, and it just saved us so much time in building error-detection infrastructure.

46:39 Excellent. And it's just always watching, it's always got your back, which seems like a really powerful thing when you are doing a port like this and you're probably a little nervous all the time in the beginning.

46:53 Yes, it is like very comprehensive in that, maybe a bit noisy but like really gives us a lot of confidence.

47:00 Yeah, that's fantastic. The other question I always ask is what is your favorite editor, but we already had our PyCharm conversation.

47:06 Yeah, definitely PyCharm, I highly recommend.

47:09 Yeah, so, I definitely recommend people go and check out Patreon. You guys have done a fantastic job, and if you are out there creating any sort of thing that has short, multiple release cycles, even if you are doing an open source package, it seems like that is something you can create a Patreon campaign about and put up there. If people want to support it, then every time you ship a new version, you know, here is $5 or something.

47:33 Yes, we have people supporting creators that do anything from making YouTube videos to making Dwarf Fortress, and so we welcome any creator who wants to use us.

47:42 That's great, yeah. It's been a great experience for me, and if this is the type of thing that would help with your project, then definitely check them out at Patreon.com.

47:42 Albert, thanks for being on the show, it's been great.

47:55 Yeah, thank you.

47:55 Yeah, bye.

47:55 This has been another episode of Talk Python To Me. Today's guest was Albert Sheu and this episode has been sponsored by Codeship and Hired.

47:55 Thank you guys for supporting the show!

47:55 Check out Codeship at codeship.com and thank them on twitter via @codeship.

47:55 Don't forget the discount code for listeners, it's easy: TALKPYTHON.

47:55 Hired may help you find your next big thing. Visit hired.com/talkpythontome and get 5 or more offers with salary and equity right up front and a special listener's signing bonus of $4000. Also don't forget, awesome T-shirts await you at bitly/pythonshhirt or just visit the website and click on shirt in the footer.

47:55 You can find the links from the show at talkpythontome.com/episodes/show/14

47:55 And be sure to subscribe to the show. Open your favorite podcatcher and search for python. We should be right at the top, you can also find iTunes and direct rss link feeds in the footer of the website.

47:55 This is your host, Michael Kennedy.

47:55 Thanks for listening.

47:55 Smixx, take us out of here.

47:55 [music]
