GCP Dataflow pipeline in Python with BigQuery as the source and a side input 
as per https://codelabs.developers.google.com/codelabs/cpb101-bigquery-dataflow-sideinputs/ 
and per https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/data_analysis/lab2/python/JavaProjectsThatNeedHelp.py 
and per https://www.udemy.com/gcp-data-engineer-and-cloud-architect/learn/v4/t/lecture/7598626?start=0 
\
resolve_package_help_score() assigns +1 if FIXME or TODO is found in the file content 
resolve_package_usage() assigns +1 if the package is mentioned (imported) 
calculate_composite_score() combines the two scores above; a sketch of these helpers follows \
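A minimal sketch of how those three helpers could look, assuming each BigQuery row carries the file content as a string and package names are plain strings; the bodies below are illustrative, not the lab's exact code:

```python
def resolve_package_help_score(content):
    """+1 if FIXME or TODO is found in the file content, else 0."""
    upper = content.upper()
    return 1 if ('FIXME' in upper or 'TODO' in upper) else 0


def resolve_package_usage(content, package):
    """+1 if the package is mentioned (e.g. imported) in the file, else 0."""
    return 1 if package in content else 0


def calculate_composite_score(help_score, usage_score):
    """Combine the two scores; a simple product is assumed here."""
    return help_score * usage_score
```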
defaults to deploying on Dataflow; pass --local to run with the DirectRunner, roughly as sketched below 
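A hedged sketch of that runner switch, assuming an argparse --local flag; the project, bucket, and region below are placeholders:

```python
import argparse

from apache_beam.options.pipeline_options import PipelineOptions

parser = argparse.ArgumentParser()
parser.add_argument('--local', action='store_true',
                    help='run with the DirectRunner instead of deploying to Dataflow')
args, beam_args = parser.parse_known_args()

options = PipelineOptions(
    beam_args,
    runner='DirectRunner' if args.local else 'DataflowRunner',
    project='my-project',                 # placeholder
    temp_location='gs://my-bucket/tmp',   # placeholder
    region='us-central1',
)
```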
pipeline logic resides in create_pipeline() 
the initial source is a BigQuery query 
that input is then passed to create_popularity_view() and create_help_view(), which yield two separate views 
Beam transformations are then applied to those PCollections separately 
create_popularity_view() resolves popularity with resolve_package_usage() 
create_help_view() resolves help score with resolve_package_help_score() 
hence the DAG is split by calculating the help score and usage separately 
the DAG is then merged by calculating the composite score from the above views 
the result sinks to Cloud Storage; a sketch of create_pipeline() follows \
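A minimal sketch of create_pipeline() along those lines, reusing the scoring helpers above; the query, field names, output path, and the use of an AsDict side input to merge the two branches are assumptions:

```python
import apache_beam as beam
from apache_beam import pvalue


def create_popularity_view(rows):
    # Branch 1: per-package usage score.
    return (rows
            | 'UsageScore' >> beam.Map(
                lambda row: (row['package'],
                             resolve_package_usage(row['content'], row['package'])))
            | 'SumUsage' >> beam.CombinePerKey(sum))


def create_help_view(rows):
    # Branch 2: per-package help score.
    return (rows
            | 'HelpScore' >> beam.Map(
                lambda row: (row['package'],
                             resolve_package_help_score(row['content'])))
            | 'SumHelp' >> beam.CombinePerKey(sum))


def create_pipeline(options, output_path='gs://my-bucket/output/scores'):
    p = beam.Pipeline(options=options)

    # Initial source: a BigQuery query (query and fields are placeholders).
    rows = (p
            | 'ReadFromBQ' >> beam.io.ReadFromBigQuery(
                query='SELECT package, content FROM `my-project.dataset.files`',
                use_standard_sql=True))

    # Split the DAG into the two views.
    popularity = create_popularity_view(rows)
    help_scores = create_help_view(rows)

    # Merge the DAG: the popularity view is passed as a dict side input.
    composite = (help_scores
                 | 'Composite' >> beam.Map(
                     lambda kv, pop: (kv[0],
                                      calculate_composite_score(kv[1], pop.get(kv[0], 0))),
                     pop=pvalue.AsDict(popularity)))

    # Sink to Cloud Storage.
    (composite
     | 'Format' >> beam.Map(lambda kv: '{},{}'.format(kv[0], kv[1]))
     | 'WriteToGCS' >> beam.io.WriteToText(output_path))

    return p
```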
autoscaling will try to increase the worker count to 11; with the default quota this fails, but the job proceeds with the workers it does get: 
Autoscaling: Unable to reach resize target in zone us-central1-b. QUOTA_EXCEEDED: Quota 'CPUS' exceeded.  Limit: 8.0 in region us-central1.
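One way to stay under that warning, assuming n1-standard-1 workers and the default 8-CPU regional quota, is to cap autoscaling with the standard max_num_workers pipeline option (an optional tweak, not part of the lab):

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Cap autoscaling below the CPU quota (8 CPUs ~= 8 n1-standard-1 workers).
options = PipelineOptions(
    runner='DataflowRunner',
    project='my-project',                 # placeholder
    temp_location='gs://my-bucket/tmp',   # placeholder
    region='us-central1',
    max_num_workers=8,
)
```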

