At the 15th Azureheads meetup we talked about how to integrate data with Azure Data Factory and how to call, from a Data Factory pipeline, an Azure HDInsight cluster that is created on demand. When the process ends, the HDInsight cluster is deleted automatically.
We also saw how to copy and transform data from almost any source to almost any destination.
Download the presentation: http://bit.ly/AH15Presentation
Download the demo video: http://bit.ly/AH15Video
Download the project files: http://bit.ly/AH15Projectfiles
In the demo, we created a data factory and, inside it, a pipeline that reads a Python script from a folder in an Azure Storage account. Azure Data Factory creates an on-demand HDInsight Spark cluster and runs the Python script on it. The script reads a text file and writes, as output, a text file with the word count of the input. Finally, the cluster is deleted automatically.
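
The actual script used in the demo is in the project files linked above. As a rough illustration only, a word count script of this kind could look like the sketch below. The function names `count_words` and `run_spark_job`, the application name, and the two command-line arguments (input path and output path) are assumptions made for this example, not the demo's code.

```python
import sys
from collections import Counter


def count_words(lines):
    """Plain-Python reference: count whitespace-separated words."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def run_spark_job(input_path, output_path):
    """Same word count on Spark (hypothetical entry point for the cluster)."""
    # Imported here so count_words can be tried without Spark installed.
    from operator import add
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()
    (spark.sparkContext.textFile(input_path)    # one record per line
        .flatMap(lambda line: line.split())     # split lines into words
        .map(lambda word: (word, 1))            # pair each word with 1
        .reduceByKey(add)                       # sum the 1s per word
        .saveAsTextFile(output_path))           # write (word, count) pairs
    spark.stop()


# Assumed usage: <script> <input path> <output path>
if __name__ == "__main__" and len(sys.argv) == 3:
    run_spark_job(sys.argv[1], sys.argv[2])
```

The `count_words` helper is there only to make the logic easy to check locally, for example `count_words(["to be or", "not to be"])`. On the cluster, the Spark version does the same counting in a distributed way.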