A lot of people don’t know this, but OpenRefine hides a secret: Python comes built in! Anywhere you can use a GREL expression, you can use Python instead by picking it (listed as “Python/Jython”) from the “Language” drop-down list:

Screenshot of OpenRefine showing an expression dialog with “Python/Jython” selected as the language
Our first Python script
Let’s try it now!
When we first open the “Transform...” option for a column, the default GREL expression is simply:
value
As a reminder: value refers to the current value of each cell in the column, so this expression produces that same value unchanged, which we can see in the preview.
When using Python, value still refers to the current value of each cell. However, if we use it on its own in a Python expression, it doesn’t quite work.
When writing Python in OpenRefine, we need to be more explicit about what the result of our computation is. We do this by writing return <our result> on a separate line:
return value
Why does this work?
The reason is that our Python “expression” is actually a Python script: a series of calculations, one per line, that Python executes one by one. Since each of these calculations potentially produces something that we could consider a “result”, we need to tell Python explicitly which result we actually want to use. Python returns this result to OpenRefine, and OpenRefine uses it in the transformed column.
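To make this concrete, here is a minimal sketch of a script with more than one calculation. So that it can be run outside OpenRefine, the script is wrapped in a function called transform; that name and the wrapper are assumptions made for illustration. In OpenRefine itself, only the indented lines would go in the expression box.

```python
# Hypothetical stand-in for OpenRefine calling our script on one cell.
# In the expression box we would type only the body of this function.
def transform(value):
    stripped = value.strip()   # first calculation: remove surrounding whitespace
    upper = stripped.upper()   # second calculation: convert to upper case
    return upper               # the one result we want handed back


# Try the script on a sample cell value.
print(transform("  hello "))
```

Both intermediate calculations produce something, but only the value named after return ends up in the column.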
By contrast, a GREL expression only ever consists of one calculation, so there is no ambiguity: however complex the calculation is, OpenRefine uses its result once it has finished.
You might ask why Python can’t do something simpler, like using the result of the last calculation/line. That’s a very reasonable idea but, as we’ll see later, it can be very useful to be more flexible about this, for example when having the program itself decide which result to return.
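As a small preview of that flexibility, the sketch below lets the script choose between two possible results. As before, the transform wrapper is only an assumption that makes the example runnable outside OpenRefine, and the “Dr ” prefix is made-up sample data.

```python
# Hypothetical stand-in for OpenRefine calling our script on one cell.
def transform(value):
    # If the cell starts with the title "Dr ", return the text after it.
    if value.startswith("Dr "):
        return value[3:]
    # Otherwise return the cell unchanged.
    return value


print(transform("Dr Smith"))
print(transform("Ms Jones"))
```

Here the last line of the script is not always the one that supplies the result, which is why an explicit return is useful.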