When to use PythonCaller (or RCaller or TclCaller) with FME
Tue 30 March 2021 by Kevin Weller
According to Wikipedia, "a domain-specific language (DSL) is a computer language specialized to a particular application domain." As a DSL (among other things), FME is incredibly productive when it's used to solve the key problems in its domain: computation and conversion of geospatial and other data features. In my experience, it's roughly 3 to 5 times faster to implement such a solution with FME than with a traditional general-purpose programming language...even a relatively productive one like Python. Its pipe and filter architecture is ideal for the flow of features, modifying, splitting and merging them from their sources to their destinations.
But often in the course of my work, I find myself facing a challenge that FME was just not designed to address. That's fine, as every tool has its uses and limitations. Any tool that tries to be all things to all people usually winds up doing nothing at all well. That's why FME's PythonCaller, RCaller, and TclCaller transformers exist: to allow a workspace author to dip into a full-blown programming language environment whenever the genuine need arises. With every release, FME becomes more powerful, handling more use cases that once required programming. Yet I still encounter plenty of situations where some programming is necessary, or at least is the better option. Here are three broad categories of such situations.
Attribute processing with dynamic schemas
I often use dynamic readers and writers in workspaces and transformers that need a lot of flexibility to handle frequently changing data schemas. However, to process each attribute individually using standard FME transformers, one has to split a feature into multiple copies (one per target attribute), apply the necessary transformers to that attribute, then merge the copies back together to produce the required output features. Exploding and reassembling lists only multiplies the madness. The result is usually an unintelligible Rube Goldberg contraption with terrible runtime performance.
This is where programming logic shines. The code in a PythonCaller can use all the power of the language's control structures to process the individual attributes in any fashion that is required, in place, cleanly and efficiently. No need to bust a feature into umpteen copies, work with the attributes, then put it all together again. Additionally, the logic can handle dependencies between different attributes much more cleanly than under the split-manipulate-and-merge scenario.
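To make the in-place approach concrete, here is a minimal sketch of what such PythonCaller code might look like. It assumes the usual PythonCaller class template (a class with `input` and `close` methods, with `pyoutput` supplied by FME at runtime) and the `getAllAttributeNames`, `getAttribute`, and `setAttribute` methods of an FME feature. The `StubFeature` class, the `tidy_attributes` helper, and the `closed_date` and `status` attribute names are hypothetical, included only so the logic can be run and checked outside FME.

```python
class StubFeature:
    """Hypothetical stand-in for an FME feature, for running outside FME."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAllAttributeNames(self):
        return list(self._attrs)

    def getAttribute(self, name):
        # Returns None when the attribute is missing.
        return self._attrs.get(name)

    def setAttribute(self, name, value):
        self._attrs[name] = value


def tidy_attributes(feature):
    """Clean every attribute in place, whatever the schema happens to be."""
    for name in feature.getAllAttributeNames():
        if name.startswith("fme_"):
            continue  # leave FME's own format attributes alone
        value = feature.getAttribute(name)
        if isinstance(value, str):
            feature.setAttribute(name, value.strip())

    # A dependency between two attributes (hypothetical names):
    # a feature with a closing date but no status is marked as closed.
    if feature.getAttribute("closed_date") and not feature.getAttribute("status"):
        feature.setAttribute("status", "closed")
    return feature


class FeatureProcessor(object):
    """Shape of the class a PythonCaller expects (assumed template)."""

    def input(self, feature):
        tidy_attributes(feature)
        self.pyoutput(feature)  # pyoutput is provided by FME at runtime

    def close(self):
        pass
```

One loop visits every attribute on the feature, so nothing has to be split into copies and merged back together, and the cross-attribute rule is a single `if` statement.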
Complex intra-feature manipulation logic needs
Beyond the challenges of dynamic schemas, the "control flow" structure of Python and similar languages lends itself very well to manipulating attributes, lists, and geometries with complex interdependencies. Trying to do the same with FME's pipe-and-filter "data flow" architecture would be an insane effort of mental acrobatics...not only to build it, but also to maintain it. Who wants to twist themselves into a pretzel merely to avoid a little programming?
This is a problem of mismatched control paradigms that signals the need to switch tools. In computer science, there is a similar concept called impedance mismatch, which normally refers to the challenges of taking the object-oriented data structures many programming languages use to hold data in memory, and "mapping" them onto the different paradigm relational databases use to store data on a persistent medium such as a disk. However, this term is often applied to other situations where there's a fault line between two different paradigms, requiring a "mapping" between them. Luckily, FME makes shifting between the "data flow" and the "control flow" paradigms relatively easy, thanks to the existence of the PythonCaller, RCaller, and TclCaller transformers.
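As an illustration of the kind of interdependent list logic this section has in mind, the sketch below walks an FME-style list attribute and writes a running total onto each element, where every value depends on the ones before it. It assumes list elements are addressed by names of the form `line{0}.qty`, and that `getAttribute` returns `None` for a missing attribute. The `StubFeature` class, the `add_running_totals` helper, and the `line`, `qty`, `price`, `running_total`, `line_count`, and `grand_total` names are all hypothetical.

```python
class StubFeature:
    """Hypothetical stand-in for an FME feature, for running outside FME."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def getAttribute(self, name):
        # Returns None when the attribute is missing.
        return self._attrs.get(name)

    def setAttribute(self, name, value):
        self._attrs[name] = value


def add_running_totals(feature, list_name="line"):
    """Add a running total to each list element, plus summary attributes."""
    total = 0.0
    index = 0
    while True:
        # Builds names such as "line{0}.qty", "line{1}.qty", ...
        qty = feature.getAttribute(f"{list_name}{{{index}}}.qty")
        if qty is None:
            break  # no more list elements
        price = feature.getAttribute(f"{list_name}{{{index}}}.price") or 0
        total += float(qty) * float(price)
        feature.setAttribute(f"{list_name}{{{index}}}.running_total", total)
        index += 1

    feature.setAttribute("line_count", index)
    feature.setAttribute("grand_total", total)
    return feature
```

In control-flow terms this is just a loop carrying one variable from element to element, which is exactly the kind of state that is awkward to thread through a chain of transformers.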
Anything FME flat-out can't do
Finally, if there's something FME just can't do, either because the capability hasn't yet been implemented by the good folks at Safe Software, or because it's something FME doesn't (and perhaps shouldn't) do by design, a PythonCaller or one of its brethren might just fit the bill. For example, I've recently had to interact with web services that don't follow standard REST principles. Python to the rescue! In fact, I've used third-party Software Development Kits in Python that make quick work of connecting to and using advanced APIs. I'm sure there are plenty of other possible examples, but this should be enough to get the point across.
The Upshot
This is why engaging an FME author with software engineering skills is so critical. And if you are an FME author and would like to acquire or expand your programming skills, you should strongly consider proceeding. That being said, it's still best to dip into traditional programming only when and where necessary, if only because FME is so very powerful and productive for the major portions of data solutions in its domain. But please don't turn an attempt to avoid all programming into a circus act!