Getting-Started-With-SnowFlake
Today we will be discussing a new and interesting cloud database technology called SnowFlake.
SnowFlake is a cloud based data warehouse solution which can be hosted in AWS, Azure or Google Cloud platform.
Initial trials of SnowFlake get a $400 credit 30 day trial.
Snowflake breaks out charges into a credit system by compute and storage. You can sign up for a free trial of SnowFlake here.
For these posts I am going to assume that you have signed up and stood up your test instance of SnowFlake.
We are going to cover how SnowFlake handles compute and loading some data from S3 into SnowFlake.
Compute
Compute in SnowFlake is handled by warehouses, warehouses can be sized according to need. For example you may have a warehouse for reporting, and another warehouse for data science query processing. You are not limited by the number of warehouses, these warehouses will consume credits when active and will auto suspend themselves when not in use. All these options can be configured in the Warehouses tab :
Loading Data into SnowFlake
When you choose a source destination to load data from in S3, make sure it matches the region you choose when creating your SnowFlake instance.
Also make sure that you allow a list object, read bucket permissions and possibly get access in your CORS rules for that bucket.
In the above code we created a database, set the database context, created a table and created a stage from which to load a csv file hosted in a s3: bucket. And lastly using the stage copied that csv data into our SnowFlake table.
You can also load data without stages and flatten json formated data in SnowFlake quite easily, which I will cover in future topics.