# google-cloud
m
Hey folks! I’m trying to create a dataflow template as part of my pulumi deployment and then trigger a dataflow job using an airflow dag. The airflow part is easy enough, but I can’t seem to wrap my head around how to get the python script to run as part of the deployment. From the command line I can run
```sh
python my-dataflow-script.py --template_location 'gs://my-template-bucket/my-dataflow-template'
```
or I can hardcode the `template_location` in the script itself and just run `python my-dataflow-script.py`, which then packages the beam application as a template to be run in dataflow. I’ve tried calling the dataflow script from another python script using `exec(open('my-dataflow-script.py').read())`, which works, but trying that in pulumi’s `__main__.py` fails with:
```
TypeError: cannot pickle 'TaskStepMethWrapper' object
```
I guess `apache_beam` tries to pickle the whole pulumi program, or does something else that probably doesn’t make sense. Any experience with dataflow + pulumi and getting this working?
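For readers who haven’t staged a classic Dataflow template before: a script like the one described above is usually just an ordinary Beam pipeline, and passing `--runner=DataflowRunner` together with `--template_location` makes Beam stage it to GCS instead of executing it. The sketch below is illustrative only — the transforms, bucket paths and script name are assumptions, not the actual script from this thread.

```python
# my-dataflow-script.py -- minimal sketch of a template-staging Beam script
# (transforms and paths are made up; only the option handling matters here).
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run(argv=None):
    # With --runner=DataflowRunner and --template_location on the command line,
    # Beam stages the pipeline as a classic template in GCS instead of running it.
    options = PipelineOptions(argv)  # falls back to sys.argv when argv is None
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText("gs://my-template-bucket/input/*.txt")
            | "Count" >> beam.combiners.Count.Globally()
            | "Write" >> beam.io.WriteToText("gs://my-template-bucket/output/counts")
        )


if __name__ == "__main__":
    run()
```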
g
Are you trying to (A) call `my-dataflow-script.py` from your Pulumi application, or are you trying to (B) call your Pulumi application from `my-dataflow-script.py`? (A) is supported. The best way would be to create a Dynamic Provider to call your script at the right "event" (create, update, etc) - https://www.pulumi.com/docs/intro/concepts/resources/#dynamicproviders. (B) is not supported, as the `pulumi` cli (engine) must be the invocation point of your pulumi application.
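To make option (A) concrete, here is a rough sketch of such a dynamic provider, assuming the staging script sits next to `__main__.py` and is invoked via `subprocess` so that `apache_beam` only ever pickles its own pipeline, not the Pulumi program that declared the resource. The class names, script path and resource id are illustrative, not from this thread.

```python
# dataflow_template.py -- illustrative dynamic provider sketch, not a drop-in solution.
import subprocess

from pulumi.dynamic import CreateResult, Resource, ResourceProvider


class DataflowTemplateProvider(ResourceProvider):
    def create(self, props):
        # Run the Beam script in a separate process so apache_beam never sees
        # (or tries to pickle) any objects from the Pulumi program itself.
        subprocess.run(
            [
                "python",
                "my-dataflow-script.py",
                "--template_location",
                props["template_location"],
            ],
            check=True,
        )
        return CreateResult(id_=props["template_location"], outs=props)


class DataflowTemplate(Resource):
    def __init__(self, name, template_location, opts=None):
        super().__init__(
            DataflowTemplateProvider(),
            name,
            {"template_location": template_location},
            opts,
        )
```

The subprocess call is the detail that should sidestep the `TypeError: cannot pickle 'TaskStepMethWrapper' object` seen above, since the Pulumi program never ends up in Beam’s pickling scope.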
m
Thanks! (A) would be correct. I haven’t used dynamic providers before. Is there any tutorial/simple example of doing something like this?
m
Thank you! I’ll take a look!
Just to follow up, if anyone happens to stumble on this thread later on, I followed the example in https://www.pulumi.com/blog/deploying-mysql-schemas-using-dynamic-providers/ and managed to do what I wanted. Big thanks!
👍 1
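For completeness, wiring such a resource into the Pulumi program would look roughly like this (the module and resource names follow the sketch above and are equally hypothetical):

```python
# __main__.py (illustrative)
from dataflow_template import DataflowTemplate

template = DataflowTemplate(
    "my-dataflow-template",
    template_location="gs://my-template-bucket/my-dataflow-template",
)
```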