About the author
Albert Row is a senior software engineer from the San Francisco Bay Area with over 13 years of experience as an individual contributor, technical lead, architect, open-source contributor, and manager.
Albert has been a certified reviewer on PullRequest since December 2019.
Moving from Ruby to Python is an easy transition… most of the time. There are a number of behaviors in Python that differ enough from Ruby that they can cause the experienced Ruby programmer trouble. Whether your team has opted to start writing a new program in Python or you’ve inherited a codebase written in Python, knowing these things will give you a head start in understanding Python and what you’re working with. Here are the items I’ve run into most frequently:
Significant Whitespace
In Ruby, blocks of code are delimited by do...end or {}. Python blocks are defined by their indentation. This is the #1 item that bugs the heck out of programmers coming from most other languages. Significant whitespace trips newcomers up at an alarming rate, and it can make it slightly more challenging to identify the start and end of nested blocks. Be particularly careful to ensure that statements are under the right if condition in your first few Python programs.
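To make the indentation rule concrete, here is a minimal sketch. The function name classify and its labels are made up for illustration; the only thing that decides which if a statement belongs to is how far it is indented.

```python
def classify(n):
    """Return a list of labels; indentation alone decides what runs when."""
    labels = []
    if n > 0:
        labels.append("positive")
        if n % 2 == 0:
            # Nested one level deeper: only reached for positive even numbers.
            labels.append("even")
    # Dedented back to the function body: runs for every input.
    labels.append("checked")
    return labels
```

With this sketch, classify(4) yields all three labels, while classify(-2) yields only "checked". Indenting the last append one level further would silently move it under the outer if.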
No Implicit Returns
In Ruby, if there is no return keyword specified, the value of the last expression evaluated in the function or method is returned to the caller. In Python, there is no such behavior: a function that finishes without an explicit return hands back None. Many Ruby programmers rely heavily on implicit return; it’s highly idiomatic in the language to do so. Python requires explicit returns, and this will trip you up in your first few Python programs if you’re not careful.
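A small sketch of the difference, using two hypothetical functions. The first is what a Ruby habit tends to produce; the second is what Python actually needs.

```python
def add_implicit(a, b):
    # Ruby habit: the sum is computed and then thrown away.
    a + b


def add_explicit(a, b):
    # Python needs the return keyword to hand the value back.
    return a + b
```

Calling add_implicit(1, 2) gives None rather than 3, and nothing warns you about it. The bug usually shows up later as a confusing error about NoneType.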
A File is a Module
In Ruby, a module is defined by the module keyword and can span multiple files. In Python, each file is itself a module. In this way, the behavior is more similar to Node.js than Ruby. This can trip up programmers who are used to the one-class-per-file approach common to Ruby: in Python, it is frequently expected that each module will contain multiple classes, and those classes will share a single file. This can make it hard for Ruby programmers, accustomed to highly specific directory structures, to find the classes they’re looking for.
Circular Imports
As a Ruby programmer, if I have two classes in two different files, they can reference each other and there’s no problem. In Python, two classes referencing each other from different files is likely to result in a circular import. Import statements in Python effectively need to go one way: two files that import names from each other at load time will often raise an ImportError, because one module is still only partially initialized when the other asks for its contents. If two classes need to reference each other, they generally need to be in the same module (i.e., the same file). On the one hand this is good, since it encourages less coupling between classes; on the other hand it does result in some gnarly debugging sessions tracing the import hierarchy to figure out what’s causing the circular import errors.
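The failure described above can be reproduced in a self-contained way. This is only a sketch, assuming Python 3: the module names owner_mod and pet_mod are invented, and the two files are written to a temporary directory purely so the example fits in one block.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_circular_import():
    """Create two throwaway modules that import each other; return the error text."""
    with tempfile.TemporaryDirectory() as tmp:
        # owner_mod needs Pet at import time...
        Path(tmp, "owner_mod.py").write_text(
            "from pet_mod import Pet\n\nclass Owner:\n    pass\n"
        )
        # ...and pet_mod needs Owner at import time.
        Path(tmp, "pet_mod.py").write_text(
            "from owner_mod import Owner\n\nclass Pet:\n    pass\n"
        )
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("owner_mod")
        except ImportError as exc:
            # pet_mod asked for Owner before owner_mod had finished loading.
            return str(exc)
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("owner_mod", None)
            sys.modules.pop("pet_mod", None)
    return None
```

Calling demo_circular_import() returns the ImportError message, which names Owner as the thing that could not be imported. Moving both classes into one file makes the error go away, which is the approach suggested above.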
Everything is Public
In Ruby, the private and protected keywords perform a valuable service, defining which methods on an object can be considered part of a public interface, which can be called by classes inheriting from the current class, and which are private to the current class. Python has no such distinction: all methods are public. To compensate for this issue, the Pythonic approach is to prefix private methods with an _ (underscore). This does not actually make a method private, but it does offer a hint to other programmers that the method is not intended for public consumption.
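Here is a minimal sketch of the underscore convention. The Account class and its methods are hypothetical; the point is that _validate is private by agreement only.

```python
class Account:
    def __init__(self, balance):
        # Leading underscore: "internal, please don't touch" by convention.
        self._balance = balance

    def deposit(self, amount):
        """Public interface: validates, then updates the balance."""
        self._validate(amount)
        self._balance += amount
        return self._balance

    def _validate(self, amount):
        # Intended as a private helper, but Python will not stop outside callers.
        if amount <= 0:
            raise ValueError("amount must be positive")
```

Nothing prevents Account(10)._validate(5) from being called directly. The underscore is a signal to readers and tooling, not an access check.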
__init__.py
Every Python package is a folder. At its root you will find a file named __init__.py. Conversely, every folder that contains Python code files is a Python package; if you want everything to behave properly, an __init__.py file is required. The intended use of this file is to initialize the code in the package, but a common best practice is to leave this file blank, as there may be instances where code should be imported from the package without initializing the system. In some systems, you will find a massive amount of code hidden away in __init__.py files as well. Either way, gotcha!
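To see that __init__.py really does run on import, here is a sketch that builds a tiny package in a temporary directory. The names demo_pkg, shapes, Circle, Square, and INITIALIZED are all invented for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_package_init():
    """Build a throwaway package and report what importing a submodule triggers."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp, "demo_pkg")
        pkg.mkdir()
        # Any code in __init__.py runs the first time the package is imported.
        (pkg / "__init__.py").write_text("INITIALIZED = True\n")
        # One file is one module, and it can hold several classes.
        (pkg / "shapes.py").write_text(
            "class Circle:\n    pass\n\nclass Square:\n    pass\n"
        )
        sys.path.insert(0, tmp)
        try:
            shapes = importlib.import_module("demo_pkg.shapes")
            package = sys.modules["demo_pkg"]
            return (
                package.INITIALIZED,
                hasattr(shapes, "Circle"),
                hasattr(shapes, "Square"),
            )
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.shapes", None)
            sys.modules.pop("demo_pkg", None)
```

Importing demo_pkg.shapes executes demo_pkg/__init__.py first, even though only the submodule was requested. That is why heavy code in __init__.py files can surprise you.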
Method Arguments are Both Positional and Named
By default, every Python method argument is a named argument and also a positional argument. In this way the behavior of methods is similar to function definitions in R. This is a gotcha because Python methods can be called in two ways: either by passing arguments in the proper order, or by passing named arguments. If named arguments are used, the order becomes unimportant and arguments can be passed in any order. By contrast, in Ruby, arguments are either positional or named, not both at once. Calling a Python method with a mixture of positional and named arguments is also possible (the positional ones must come first), but I really wouldn’t recommend it if you value your own sanity and that of your team.
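The calling styles described above look like this in practice. The function describe and its parameters are hypothetical, chosen only to make the argument order visible.

```python
def describe(name, greeting="Hello", punctuation="!"):
    """Build a greeting; every parameter can be passed by position or by name."""
    return f"{greeting}, {name}{punctuation}"


# Purely positional: order matters.
by_position = describe("Ada", "Hi", "?")

# Purely named: order no longer matters.
by_name = describe(punctuation="?", name="Ada", greeting="Hi")

# Mixed: positional arguments first, then named ones.
mixed = describe("Ada", punctuation=".")
```

Here by_position and by_name are the same string, "Hi, Ada?", while mixed falls back to the default greeting and produces "Hello, Ada.".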
Argument Defaults are Mutable
The default given to a method argument is evaluated once, when the method is defined, and every call that relies on the default gets a reference to that same object. One might expect that every time a method is called with a default argument, the default value is initialized afresh. This is how Ruby behaves. Instead, the value is reused in Python. If the method appends to a list that comes from a default argument, the next time the method is called the list will still contain the appended value. This can cause very strange behaviors if one is not careful, so it’s best to copy the value passed in rather than mutating it directly.
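A sketch of the problem and one common way around it. Both function names are made up. The second version uses None as a sentinel and copies any list passed in, which is one widely used idiom rather than the only option.

```python
def append_shared(item, bucket=[]):
    # The [] is created once, at definition time, and shared by every call.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # None as the default; build a new list (or copy the caller's) on each call.
    bucket = [] if bucket is None else list(bucket)
    bucket.append(item)
    return bucket
```

Calling append_shared("a") and then append_shared("b") returns a list holding both items on the second call, because the default list persisted between calls. append_fresh returns a one-item list each time and leaves any list the caller passed in untouched.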
Instance Methods take self as the First Argument
If I had $1 for every time I forgot to add self as the first argument to an instance method in Python, or saw a candidate do it in a programming challenge during an interview, or saw it on my team while pair programming, I would be eating considerably nicer lunches. It happens multiple times a day. While in Ruby the self keyword is always available in scope and managed by the language, in Python self is not a keyword at all: it is just the conventional name for the first parameter of an instance method, and Python passes the instance into that parameter when the method is called. Leave it out and the call fails with a TypeError. Consider yourself warned!
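Here is a short sketch of what forgetting self looks like. Counter and BrokenCounter are hypothetical classes; the second one contains the mistake on purpose.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        # self receives the instance the method was called on.
        self.count += 1
        return self.count


class BrokenCounter:
    def __init__(self):
        self.count = 0

    def increment():  # Oops: no self parameter.
        return 1
```

Counter().increment() works as expected. BrokenCounter().increment() raises a TypeError, because Python still passes the instance as the first argument and the method has nowhere to put it.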