SnowPro Core Certification: The Exam and Next Steps (7/8) [Final Post in the Series]

Thank you for making it to the final post of this series 💗💗💗💗💗💗💗
Likes, comments, and shares are welcome!

This series summarizes my study for the SnowPro Core Certification. It pulls together knowledge points scattered across the Snowflake documentation, and a large part of it is distilled from practice exam questions.

I hope these notes help with your studying and with the exam itself.


Please subscribe to the YouTube channel: Data Driven Wealth 数说财富 DDW - YouTube

Exam Registration and Platform

Two ways to take the exam
👉Online: Online Proctoring
Choose online or onsite when you register
👉Onsite: Onsite Testing Centers
Decide based on your own situation and the distance to a testing center

Practice Exams

Both Udemy and Snowflake offer practice exams, at different price points. Doing more practice questions does help build confidence. I won't make specific recommendations here.

Exam Day

👉I booked an online exam two weeks out, then focused on studying and taking practice exams
👉Test the exam system in advance (during my system test I noticed a problem with the mouse, but since it was only a test I didn't take it seriously; I had taken similar online exams before without any issues)
👉The proctor checks your ID and the exam environment (room, desk)
👉I hit one problem during the exam: the mouse barely responded, making it very hard to move it and click on the answer I wanted or to advance to the next question (at roughly one minute per question, most of that time went into aiming the mouse 👊➽). Restarting the exam system twice did not fix it. I had to push on with no time to go back and review, so I could only try to make the right choice on the first pass. Luckily, I finished everything within the allotted time.

Issues You Might Run Into

Also, having taken the exam, my feeling is that the difficulty has increased somewhat. It is not the case that 100% of the content comes straight from the documentation, as officially stated; some questions require judgment based on real understanding.

Of course, the official explanation is that some questions are included for statistical purposes, but exactly how the exam score is calculated behind the scenes is not transparent!

Certificate

Passing the exam earns you a certificate (passing score: 750, on a scaled scoring range of 0 - 1000)

✅Next Steps

👉Of course, this is an entry-level certification and a foundation. The advanced certifications come in several tracks:
  • SnowPro® Advanced Data Engineer
  • SnowPro® Advanced Data Analyst
  • SnowPro® Advanced Administrator
  • SnowPro® Advanced Architect
  • SnowPro® Advanced Data Scientist
👉Apply the best practices you have learned to real work projects

There is no 8/8, haha (parts 0 through 7 already add up to eight posts)

SnowPro Core Certification: Data Protection and Data Sharing (6/8)



Please subscribe to the YouTube channel: Data Driven Wealth 数说财富 DDW - YouTube

CHECK SNOWFLAKE DOCUMENTATION FOR THE LATEST RELEASE/FEATURE

Data Protection

👉Cloud services layer

Snowflake's cloud services layer is its brain and is a reliable, always-on service. Snowflake accounts are only accessible via cloud services. All requests to Snowflake, whether via the Snowflake web UI or SnowSQL, travel through this layer. 

👉Snowflake-provided clients

Snowflake provides several clients, including SnowSQL (a command-line interface for Linux, Windows, and macOS), connectors for Python and Spark, and drivers for Node.js, JDBC, ODBC, and more.

The connectors available for Snowflake are Python, Kafka, and Spark.
 
Snowflake also provides several drivers, such as ODBC, JDBC, Node.js, Go, .NET, and PHP PDO. 

The Snowflake SQL API is a REST API that you can use to access and update data in a Snowflake database.

The Snowflake SQL API supports OAuth and key pair authentication.

The Snowflake SQL API provides operations that can be used to:

  • Submit SQL statements for execution.

  • Check the status of the execution of a statement.

  • Cancel the execution of a statement.

  • Fetch query results concurrently. 


Snowflake has the following Cloud Partner Categories:

  • Data Integration

  • Business Intelligence (BI)

  • Machine Learning & Data Science

  • Security, Governance & Observability

  • SQL Development & Management, and

  • Native Programmatic Interfaces.


👉Database Storage

Snowflake's shared storage layer resides on low-cost object cloud storage. Snowflake currently supports AWS S3 storage, Azure Blob Storage, and Google Cloud Storage for data storage.

The benefits of The Data Cloud are Access, Governance, and Action (AGA).

Access means that organizations can easily discover data and share it internally or with third parties without regard to geographical location.

Governance is about setting policies and rules and protecting the data in a way that can unlock new value and collaboration while maintaining the highest levels of security and compliance.

Action means you can empower every part of your business with data to build better products, make faster decisions, create new revenue streams, and realize the value of your greatest untapped asset: your data.

👉Time Travel

Depending on the Snowflake edition, the Time Travel duration might range from 1 to 90 days. 

The Standard edition allows for one day of Time Travel. 

Time Travel is possible for up to 90 days in the Enterprise edition and above.

Transient and Temporary tables have a maximum of 1 day of Time Travel.

https://docs.snowflake.com/en/user-guide/data-time-travel#data-retention-period
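As a quick, hedged illustration (the table name and timestamps are hypothetical), Time Travel is used with the AT | BEFORE clause and the UNDROP command:

    -- Query a table as it existed 30 minutes ago (offset is in seconds)
    SELECT * FROM orders AT (OFFSET => -60*30);

    -- Query a table as of a specific point in time
    SELECT * FROM orders AT (TIMESTAMP => '2023-10-01 08:00:00'::TIMESTAMP_LTZ);

    -- Restore an accidentally dropped table within its retention period
    UNDROP TABLE orders;

    -- Adjust the retention period (up to 90 days on Enterprise edition and above)
    ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 30;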

👉Fail-safe


Fail-safe is supported in all Snowflake editions; therefore, the minimum edition with fail-safe support is the Standard edition.

Once the data is in fail-safe storage, only Snowflake support can help retrieve the data. The customer cannot access fail-safe storage. 

Fail-safe is not provided as a means for accessing historical data after the Time Travel retention period has ended. It is for use only by Snowflake to recover data that may have been lost or damaged due to extreme operational failures. 

Data recovery through Fail-safe may take from several hours to several days to complete.

https://docs.snowflake.com/en/user-guide/data-failsafe

Transient and temporary tables do not have a Fail-safe period; this reduces storage costs for temporary and transient data. 

https://docs.snowflake.com/en/user-guide/tables-temp-transient
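As a sketch (table names are hypothetical), transient and temporary tables are created explicitly, which is where the reduced Time Travel and absent Fail-safe come into play:

    -- Transient table: at most 1 day of Time Travel, no Fail-safe period
    CREATE TRANSIENT TABLE staging_events (id INT, payload VARIANT);

    -- Temporary table: exists only for the session, no Fail-safe period
    CREATE TEMPORARY TABLE tmp_events (id INT, payload VARIANT);

    -- Optionally disable Time Travel entirely to minimize storage costs
    ALTER TABLE staging_events SET DATA_RETENTION_TIME_IN_DAYS = 0;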

👉Replication and Failover

  • Database and share replication are available to all accounts.

  • Replication of other account objects & failover/failback require Business Critical Edition (or higher). To inquire about upgrading, please contact Snowflake Support.


This feature enables the replication of objects from a source account to one or more target accounts in the same organization. Replicated objects in each target account are referred to as secondary objects and are replicas of the primary objects in the source account. Replication is supported across regions and across cloud platforms.

https://docs.snowflake.com/en/user-guide/account-replication-intro
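A minimal sketch of database replication, assuming hypothetical organization, account, and database names (myorg, account1, account2, sales_db):

    -- On the source account: allow the database to be replicated to a target account
    ALTER DATABASE sales_db ENABLE REPLICATION TO ACCOUNTS myorg.account2;

    -- On the target account: create a secondary (replica) database
    CREATE DATABASE sales_db AS REPLICA OF myorg.account1.sales_db;

    -- Refresh the secondary database from the primary
    ALTER DATABASE sales_db REFRESH;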

👉Private connectivity

Private connectivity enables you to ensure that access to your Snowflake instance is via a secure connection and, potentially, to block internet-based access completely. Private connectivity to Snowflake requires the Business-Critical edition as a minimum.

👉Securable Object

A securable object is an entity to which access can be granted. Unless allowed by a grant, access is denied.
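For illustration (the database, schema, table, and role names are hypothetical), access to securable objects is granted explicitly:

    GRANT USAGE ON DATABASE sales_db TO ROLE analyst;
    GRANT USAGE ON SCHEMA sales_db.public TO ROLE analyst;
    GRANT SELECT ON TABLE sales_db.public.orders TO ROLE analyst;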


👉NETWORK POLICY

Only security administrators (i.e., users with the SECURITYADMIN role or higher) or a role with the global CREATE NETWORK POLICY privilege can create network policies, using Snowsight, the Classic Web Interface, or SQL.

The SHOW PARAMETERS command shows whether a network policy is set on the account or for a specific user.

For the account level: SHOW PARAMETERS LIKE 'network_policy' IN ACCOUNT;

For the user level: SHOW PARAMETERS LIKE 'network_policy' IN USER <username>; 

        Example - SHOW PARAMETERS LIKE 'network_policy' IN USER john;


Network policies currently support only Internet Protocol version 4 (i.e. IPv4) addresses.
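A hedged sketch of creating and applying a network policy (the policy name, user name, and IP range are hypothetical; run it with SECURITYADMIN or a role holding CREATE NETWORK POLICY):

    CREATE NETWORK POLICY corp_policy ALLOWED_IP_LIST = ('203.0.113.0/24');

    -- Apply at the account level
    ALTER ACCOUNT SET NETWORK_POLICY = corp_policy;

    -- Or apply to a single user
    ALTER USER john SET NETWORK_POLICY = corp_policy;

    -- Verify which policy is in effect
    SHOW PARAMETERS LIKE 'network_policy' IN ACCOUNT;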

👉Tri-Secret Secure 

Tri-Secret Secure refers to the combination of a Snowflake-managed key and a customer-managed key, which results in the creation of a composite master key to protect your data. Tri-Secret Secure requires the Business Critical edition as a minimum and can be activated by contacting Snowflake support. 

https://docs.snowflake.com/en/user-guide/security-encryption-manage

👉Encryption keys

By default, Snowflake manages encryption keys automatically, requiring no customer intervention:

  • Snowflake-managed keys are rotated regularly (at 30-day intervals), and

  • an annual rekeying process re-encrypts data with new keys.

The data encryption and key management processes are entirely transparent to users. Snowflake uses AES 256-bit encryption to encrypt data at rest.

https://docs.snowflake.com/en/user-guide/security-encryption-manage
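As a sketch, assuming the PERIODIC_DATA_REKEYING account parameter (Enterprise edition or higher), annual rekeying is switched on at the account level:

    -- Enable yearly rekeying of data protected by keys older than one year
    ALTER ACCOUNT SET PERIODIC_DATA_REKEYING = TRUE;

    -- Check the current setting
    SHOW PARAMETERS LIKE 'periodic_data_rekeying' IN ACCOUNT;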

Snowflake encrypts all data in transit using Transport Layer Security (TLS) 1.2. This applies to all Snowflake connections, including those made through the Snowflake Web interface, JDBC, ODBC, and the Python connector. 

👉Snowflake-managed keys

All Snowflake-managed keys are automatically rotated by Snowflake when they are more than 30 days old: active keys are retired, and new keys are created. When Snowflake determines a retired key is no longer needed, the key is automatically destroyed. While active, a key is used to encrypt data and is available for use by the customer. Once retired, the key is used solely to decrypt data, i.e., only for accessing the data it already protects.

https://docs.snowflake.com/en/user-guide/security-encryption-end-to-end

👉Multi-factor authentication

MFA is enabled by default for all Snowflake accounts and is available in all Snowflake editions; individual users must enroll themselves in MFA. 

All Snowflake client tools, including the web interface, SnowSQL, and the various connectors and drivers, support MFA. 

Snowpipe is a Snowflake-managed serverless service. A Snowflake user cannot log into it; therefore, it does not require MFA.

https://docs.snowflake.com/en/user-guide/security-mfa

Multi-factor authentication adds additional protection to the login process in Snowflake. Snowflake provides key pair authentication as a more secure alternative to the traditional username/password approach. Additionally, Snowflake offers federated authentication, enabling users to access their accounts via single sign-on (SSO): users authenticate through an external, SAML 2.0-compliant identity provider (IdP).

Snowflake strongly recommends that all users with the ACCOUNTADMIN role be required to use MFA.

Okta and Microsoft ADFS provide native Snowflake support for federated authentication and SSO.

After a specified period of time (defined by the IdP), a user’s session in the IdP automatically times out, but this does not affect their Snowflake sessions. Any Snowflake sessions that are active at the time remain open and do not require re-authentication. However, to initiate any new Snowflake sessions, the user must log into the IdP again.

The Snowflake SQL API supports OAuth and key pair authentication.
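A minimal sketch of assigning a public key to a user for key pair authentication (the user name is hypothetical and the key value is abbreviated):

    -- Assign an RSA public key to the user
    ALTER USER john SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

    -- A second key slot supports key rotation without downtime
    ALTER USER john SET RSA_PUBLIC_KEY_2 = 'MIIBIjANBgkqh...';

    -- Inspect the user's key fingerprints
    DESC USER john;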


👉External Tokenization

Snowflake supports masking policies that can be applied to columns and enforced at the column level to provide column-level security. Column-level security is achieved through Dynamic Data Masking or External Tokenization.

https://docs.snowflake.com/en/user-guide/security-column
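A hedged example of Dynamic Data Masking (the policy, table, column, and role names are hypothetical):

    -- Mask email addresses for every role except ANALYST
    CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('ANALYST') THEN val ELSE '*** MASKED ***' END;

    -- Attach the policy to a column
    ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;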

👉IP Addresses 

If you provide both an Allowed IP list and a Blocked IP list, Snowflake applies the Blocked list first, so take care not to include your own IP address in it, which would block your own access. Additionally, to block all IP addresses except a select list, you only need to add IP addresses to ALLOWED_IP_LIST; Snowflake automatically blocks all IP addresses not included in the allowed list.
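For instance (the IP ranges are hypothetical), a blocked address can be carved out of an allowed range, while an allowed list alone is enough to shut out everything else:

    -- Allow a range but block one address inside it (the blocked list is applied first)
    CREATE NETWORK POLICY mixed_policy
      ALLOWED_IP_LIST = ('192.168.1.0/24')
      BLOCKED_IP_LIST = ('192.168.1.99');

    -- Block all IP addresses except a select list: only ALLOWED_IP_LIST is needed
    CREATE NETWORK POLICY allow_only_policy
      ALLOWED_IP_LIST = ('203.0.113.10', '203.0.113.11');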

✅Data Sharing

Data Engineer :: Certifications (Lifelong Learning)

01 AWS Certified Developer - Associate


02 Data Vault 2.0 Practitioner



03 SnowPro Core Certification



ReadingList :: Data Engineer (Lifelong Learning)

Reading is like a martial artist cultivating inner strength: a good book can greatly boost your skills, as if you had apprenticed under a grandmaster;

Being able to learn face to face would be even better!

This is my reading list and progress for building up my Data Engineer skills. Let's go!

01 Building a Scalable Data Warehouse with Data Vault 2.0 [Paperback]


Reading Plan (to finish by 17/Dec)


Starting from 08/Oct/2023 ✅


15-Oct

  • c1 ✅  09/Oct
  • c2 ✅  10/Oct
  • c3 ✅  16/Oct

Notes: 

  1. Kimball: two-layer data warehouse
  2. Inmon: three-layer data warehouse model
  3. Data sources: ERP, CRM, Files etc
  4. TDQM, DWQ 


 

💪💪😄😄

22-Oct

  • c4 ✅  19/Oct
  • c5 ✅  22/Oct

Entity definitions (Hub, Link, Sat)

SAL (same-as link), various links, sats 

29-Oct

  • c6 ✅  24/Oct
  • c7 ✅  26/Oct

Slowly Changing Dimensions (SCD)

Star schemas

Snowflake design (indirect dimensions) 

5-Nov

  • c8 ✅  02/Nov
  • c9 ✅  05/Nov

MDM 

12-Nov

  • c10 ✅  27/Nov ✌

Metrics and Error Marts 

19-Nov

  • c11 ✅  12/Nov

Data Extraction (stage loading: historical/batch) 

26-Nov

  • c12  ✅  16/Nov

DV loading 

3-Dec

  • c13 ✅  28/Nov

Data Quality 

10-Dec

  • c14  ✅  01/Dec 😀😀😀

17-Dec

  • c15 ✅  17/Oct

Business users: Accessing Information Mart to build a multidimensional database

Merry Christmas!

Hmm, time to think about how to reward myself, since the task will be finished ahead of schedule! 😀

I'll read a few books just for fun over this holiday
[The Summer Job]





SnowPro Core Certification: Data Transformations (5/8)



Please subscribe to the YouTube channel: Data Driven Wealth 数说财富 DDW - YouTube

✅Transforming Data During a Load

https://docs.snowflake.com/en/user-guide/data-load-transform

Snowflake supports transforming data while loading it into a table using the COPY INTO <table> command, dramatically simplifying your ETL pipeline for basic transformations. This feature helps you avoid the use of temporary tables to store pre-transformed data when reordering columns during a data load. This feature applies to both bulk loading and Snowpipe.

The COPY command supports:

  • Column reordering, column omission, and casts using a SELECT statement. There is no requirement for your data files to have the same number and ordering of columns as your target table.

  • The ENFORCE_LENGTH | TRUNCATECOLUMNS option, which can truncate text strings that exceed the target column length.

When loading data into a table using the COPY command, Snowflake allows you to do simple transformations on the data as it is being loaded. During the load process, the COPY command allows for modifying the order of columns, omitting one or more columns, casting data into specified data types, and truncating values. 

Complex transformations such as joins, filters, aggregations, and the use of FLATTEN are not supported while the data is being loaded. Joining, filtering, and aggregating the data are supported ONLY after the data has been loaded into a table.

Table stages do not allow transformations during the COPY process; thus, transformations may only be performed while loading data from external stages, named internal stages, or user stages.
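A sketch of transforming data during a load (the stage, file, and table names are hypothetical):

    -- Reorder and omit columns, and cast values, while loading from a named stage
    COPY INTO home_sales (city, zip, sale_date, price)
      FROM (
        SELECT t.$1, t.$2, t.$6::DATE, t.$7::NUMBER(10,2)
        FROM @my_csv_stage/sales.csv.gz t
      )
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);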

👉Supported File Formats

The following file format types are supported for COPY transformations:

  • CSV

  • JSON

  • Avro

  • ORC

  • Parquet

  • XML

To parse a staged data file, it is necessary to describe its file format:

CSV

The default format is character-delimited UTF-8 text. The default field delimiter is the comma character (,). The default record delimiter is the newline character. If the source data is in another format, specify the file format type and options.

When querying staged data files, the ERROR_ON_COLUMN_COUNT_MISMATCH option is ignored. There is no requirement for your data files to have the same number and ordering of columns as your target table.

JSON

To transform JSON data during a load operation, you must structure the data files in NDJSON (“Newline delimited JSON”) standard format; otherwise, you might encounter the following error:

Error parsing JSON: more than one document in the input
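For example (the stage, file, and table names are hypothetical), with a file that holds one JSON object per line, columns can be pulled out of the variant during the load:

    -- events.ndjson contains one object per line, e.g. {"id": 1, "name": "a"}
    COPY INTO events (id, name)
      FROM (SELECT $1:id, $1:name FROM @my_json_stage/events.ndjson)
      FILE_FORMAT = (TYPE = JSON);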

All other file format types

Specify the format type and options that match your data files.

To explicitly specify file format options, set them in one of the following ways:

Querying staged data files using a SELECT statement:

  • As file format options specified for a named file format or stage object. The named file format/stage object can then be referenced in the SELECT statement.

Loading columns from staged data files using a COPY INTO <table> statement:

  • As file format options specified directly in the COPY INTO <table> statement.

  • As file format options specified for a named file format or stage object. The named file format/stage object can then be referenced in the COPY INTO <table> statement.

If no file format object or options are provided to either the stage or the COPY statement, the default behaviour is to try to interpret the contents of the stage as CSV with UTF-8 encoding.
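A sketch of specifying the file format explicitly (the format, stage, and file names are hypothetical), both for the stage and for querying staged files:

    -- Create a named file format and attach it to a stage
    CREATE FILE FORMAT my_csv_format TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1;
    CREATE STAGE my_csv_stage FILE_FORMAT = my_csv_format;

    -- Reference the file format when querying staged files with SELECT
    SELECT t.$1, t.$2 FROM @my_csv_stage/sales.csv.gz (FILE_FORMAT => 'my_csv_format') t;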

👉Parameters in copy into

ENFORCE_LENGTH:   
  • If TRUE, the COPY statement produces an error if a loaded string exceeds the target column length.

  • If FALSE, strings are automatically truncated to the target column length. 

TRUNCATECOLUMNS:

  • If TRUE, strings are automatically truncated to the target column length.

  • If FALSE, the COPY statement produces an error if a loaded string exceeds the target column length.
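For example (the stage and table names are hypothetical), to silently truncate oversized strings instead of failing the load:

    COPY INTO home_sales
      FROM @my_csv_stage/sales.csv.gz
      FILE_FORMAT = (TYPE = CSV)
      TRUNCATECOLUMNS = TRUE;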
