It's still hard for beginners to get started with Python
Python is, for good reason, one of the easiest programming languages to get started with for people who are new to computational data analysis specifically, and new to programming in general.
But it’s still really hard to get started with. Allen Downey, who’s been teaching Python for years, recently wrote a great post about this. He says,
I have written several books that use Python to explain topics like Bayesian Statistics and Digital Signal Processing. Along with the books, I provide code that readers can download from GitHub. In order to work with this code, readers have to know some Python, but that’s not enough. They also need a computer with Python and its supporting libraries, they have to know how to download code from GitHub, and then they have to know how to run the code they downloaded.
The amount of cognitive overhead needed to be a developer today results in a problem I’ve seen with senior developers sometimes. They forget how much context people need to learn in order to become proficient at development. And they forget that they were beginners once, too.
There are two recent pieces that address exactly how much context is needed in order to program. One is Paul Ford’s beautifully voluminous “What is Code,” in which he tries to summarize the enormously complex world of computer programming. He starts, “A computer is a clock with benefits.” 38,000 words later, after moving through logic gates, RAM, Microsoft Word, algorithms, the Go gopher, and OOP, he ends with simply, “Hello, world.”
In a way, that’s how beginners work, too. They wrestle and wrangle with layers and layers of understanding, of local environments and the cloud, the difference between JSON, SQL, and Python, NAND gates, CPUs, megabytes versus gigabytes, and everything in between, until they finally, finally are able to install Python and print() that “Hello, World!” to the screen. (By the way, while changing print from a statement to a function in Python 3 is a really nice standardizing move appreciated by senior developers, it is a really confusing and annoying change for beginners.)
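To make that parenthetical concrete, here is a minimal sketch of what the change looks like in Python 3. The hello helper and the buffer variable are names made up for this illustration; they don’t come from any of the posts mentioned here.

```python
import io


def hello(stream=None):
    """Write the classic greeting using the Python 3 print function."""
    # In Python 2 this was a statement: print "Hello, World!"
    # In Python 3 that line is a SyntaxError; print has to be called.
    print("Hello, World!", file=stream)


# Because print is an ordinary function, it accepts keyword arguments
# such as sep, end, and file. Here the output goes to an in-memory buffer.
buffer = io.StringIO()
hello(buffer)
print("a", "b", sep="-", end="!\n", file=buffer)
captured = buffer.getvalue()

hello()  # with no stream given, the greeting goes to the screen
```

The keyword arguments are the payoff that senior developers appreciate. The parentheses are the part that trips up a beginner following an older tutorial.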
Think about how complicated computers are in general, and how many layers of things need to work together, and how you need to have at least some high-level understanding of how those layers of things work, and you’ll get a better understanding of what a junior developer is up against. One of my favorite posts on this is this one, which goes through what happens when you go to google.com in your browser, and winds up talking about thinly sliced wafers of highly purified single-crystal silicon ingot.
That is also why it’s so hard for technologists and non-technologists to communicate with each other: technologists know too much about too many layers and non-technologists know too little about too few layers to be able to establish effective direct communication.
Another, more recent, post that touches on this difference, the boundary between people just at the edge of technical discovery and experts, is “Building for muggles.” The author writes that Slack was able to become successful because they understood that most people don’t know how to access IRC, or even what the word client means, and were able to successfully translate the world of distributed chat protocols into something anyone could access.
Each person who gradually becomes more technical goes down this path, from not knowing anything to building serverless applications, by stumbling around and asking a lot of questions. I went down this path when I started my journey into data science (although it definitely didn’t feel like a journey at the time - more like a blindfolded freefall into the darkness of the command line).
I started installing Python in 2012 on a Windows machine:
And, instead of being able to immediately write code, I was angry a lot, even more than Twitter would have you believe. In fact, it was so frustrating that I channeled all of my energy into this post.
As someone with a lot of Python experience now, my stance is that it’s still incredibly hard to understand how to install Python for people new to both Python and development. The best way to get an idea of how hard it can be is to do a Google search.
If you’re just Googling for Python installation instructions, you’ll get this page, which tells you how to install 3.6 on Windows, as the first result.
It doesn’t tell you anything about why you should be installing Python 3.6 over anything else. The second link is the official Python download page, which has a helpful link to an overview of why you’d want to use Python 2 or 3, and says that the version you want to use depends on what you want to get done. But what if you don’t know what you want to get done? You can look around, and you’ll see some articles that tell you to start with 3 and some that say it doesn’t matter. As late as 2016, the author of Learn Python the Hard Way didn’t advocate switching, and even when he rewrote the course, there was controversy.
If you don’t have any experienced developers telling you that you should be using 3 (you should be starting with Python 3 😁), how are you supposed to understand what to do?
Fortunately, the third link on Google is Kenneth Reitz’s wonderful guide on Python for beginners. But, once you do come across the Hitchhiker’s Guide, there is lots to learn: What’s an interpreter? And, focusing on Mac only, why do I need Xcode? What’s Homebrew? What are environment variables? And, what is all this stuff? What’s venv? Pipenv? Which one do I need to just write “hello, world”?
These are some of the questions I could think of that beginners would ask, but most definitely not all of them. Because I’m not a beginner anymore. In fact, I’m so far away from being a beginner that I don’t understand what would be hard for a beginner anymore. This is not a brag; it’s actually a problem for senior developers who work on teams with less experienced people.
A senior developer is able to easily overcome issues that come up. They understand, for example, the benefits of Python 3, why print() should be a function, the issues of installing Python on different operating systems, virtual environments, how two installs of Python living on the same machine would work, and much more.
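One habit that helps with the “two installs on the same machine” problem is asking Python itself which interpreter is running. Here is a minimal sketch; describe_interpreter is a made-up helper name, and the virtual environment check assumes the environment was created with the standard venv module (other tools may behave differently).

```python
import sys


def describe_interpreter():
    """Report which Python is running this code (hypothetical helper)."""
    return {
        # Full path to the interpreter binary currently executing.
        "executable": sys.executable,
        # Version as major.minor.micro, for example 3.x.y.
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Inside a venv, sys.prefix points at the environment, while
        # sys.base_prefix still points at the underlying installation.
        "in_virtual_environment": sys.prefix != sys.base_prefix,
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

Running this with each python command on your machine shows whether they are really the same install or two different ones.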
For example, try asking a novice what this means:
What’s PEP? What’s a symlink? How does Homebrew work? (True story: In writing this blog post, I tried to create a new user on my computer that didn’t have Python so I could see what the experience was like for a beginner and ended up somehow uninstalling and reinstalling Homebrew because I overestimated my understanding of how it works across users on macOS.)
The main problem with the communication gap between beginners and experts is that junior developers have a whole hierarchy of things they don’t understand, and aren’t even aware of the right way to ask the question.
For example, after you install Python on a Mac, you have to set the Python path. The instructions in this particular tutorial tell you to,
Add PATH to ~/.bash_profile and ~/.zshrc
For someone unfamiliar with Unix systems, this sentence is like a hieroglyph to be deciphered.
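Part of the deciphering is knowing that PATH is just a list of directories that the shell searches, in order, when you type a command like python. Here is a minimal sketch that prints that list and shows which python3 would be found first; path_directories is a hypothetical helper name, and the output will differ on every machine.

```python
import os
import shutil


def path_directories(path_value=None):
    """Split a PATH-style string into its directories (hypothetical helper)."""
    if path_value is None:
        # Fall back to the PATH of the current process.
        path_value = os.environ.get("PATH", "")
    # os.pathsep is ":" on macOS and Linux and ";" on Windows.
    return [entry for entry in path_value.split(os.pathsep) if entry]


for directory in path_directories():
    print(directory)

# shutil.which returns the first match found on PATH, or None if there is none.
print(shutil.which("python3"))
```

Files like ~/.bash_profile and ~/.zshrc are simply where the shell reads its PATH setting from each time it starts, which is why the tutorial asks you to edit them.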
A senior developer’s mental model of a language and its environments looks something like the Unknown Unknowns model on the left, whereas for a junior, it looks something like the one on the right:
If you are a senior person and work with junior people, or interact with them on online forums, it’s a good idea to keep these things in mind. The people who have been most important to me in my career are the ones who were able to help me navigate through the maze of questions I had and reshape my own mental model to decrease the number of unknown unknowns I had.
The good news, though, is that we were all beginners once, and we have the power to remember how frustrating things can be, and make it easier for the next person coming up. We can imagine how it felt and say, “yeah, that sucks, here, let me explain this to you,” and be the person we needed when we were on our five millionth Stack Overflow search.