Docstrings Python
snowflake_ingestion.functions
config_logger()
Configure the global logger.
Reads LOGGER_LEVEL from the module environment and configures the root logging settings (level, format, date format). Intended to be called once at application start.
No return value.
Source code in snowflake_ingestion/functions.py
55 56 57 58 59 60 61 62 63 64 65 66 67 68 | |
connect_with_role(user, password, account, role)
Create a Snowflake connection using the specified credentials and role.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
user
|
str
|
Snowflake username. |
required |
password
|
str
|
Snowflake password. |
required |
account
|
str
|
Snowflake account identifier. |
required |
role
|
str
|
Snowflake role to assume for the session. |
required |
Returns:
| Type | Description |
|---|---|
SnowflakeConnection
|
snowflake.connector.connection.SnowflakeConnection: A Snowflake connection object with autocommit enabled. |
Notes
This function opens a network connection to Snowflake. The caller is responsible for closing the connection when it is no longer needed.
Source code in snowflake_ingestion/functions.py
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 | |
plural_suffix(count)
Return 's' if count is greater than or equal to 2, else return an empty string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
count
|
int
|
The number of items. |
required |
Returns:
| Name | Type | Description |
|---|---|---|
str |
str
|
's' if count >= 2, else ''. |
Examples:
>>> plural_suffix(0)
''
>>> plural_suffix(1)
''
>>> plural_suffix(2)
's'
>>> plural_suffix(3)
's'
Source code in snowflake_ingestion/functions.py
155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 | |
run_sql_file(cur, filepath)
Execute SQL statements from a file using VAR_PLACEHOLDER placeholders.
The function
- Reads the SQL file.
- Finds placeholders in the form VAR_PLACEHOLDER (captures VAR).
- Replaces each found placeholder with the value of the corresponding global variable named VAR (stringified).
- Masks variables containing "PASSWORD" in logger output.
- Splits the file by semicolons and executes non-empty statements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active cursor. |
required |
filepath
|
Path or str
|
Path to the SQL file. |
required |
Notes
Placeholders that do not match a global variable are replaced with
a string of the form <VAR_NOT_FOUND>.
Source code in snowflake_ingestion/functions.py
120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 | |
use_context(cur, WH_NAME, DW_NAME, RAW_SCHEMA)
Set the Snowflake session context: warehouse, database and schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
WH_NAME
|
str
|
Warehouse name to use. |
required |
DW_NAME
|
str
|
Database name to use. |
required |
RAW_SCHEMA
|
str
|
Schema name to use. |
required |
Raises:
| Type | Description |
|---|---|
SystemExit
|
Exits the process on any exception when setting the context. |
Source code in snowflake_ingestion/functions.py
98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 | |
snowflake_ingestion.init_infra_snowflake
create_roles_and_user(cur)
Create the DBT role and user in Snowflake.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/init_infra_snowflake.py
21 22 23 24 25 26 27 28 29 30 | |
grant_privileges(cur)
Grant required privileges to the TRANSFORMER role in Snowflake.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/init_infra_snowflake.py
32 33 34 35 36 37 38 39 40 41 | |
main()
Main initialization process for the Snowflake environment.
Establishes connections with appropriate roles (SYSADMIN, SECURITYADMIN, ACCOUNTADMIN) and executes setup steps in order.
Source code in snowflake_ingestion/init_infra_snowflake.py
59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | |
set_data_retention(cur)
Set the data retention period for the Snowflake account.
Checks if the account is Enterprise, then applies the retention time. Logs the result in days, with pluralization handled automatically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/init_infra_snowflake.py
43 44 45 46 47 48 49 50 51 52 53 54 55 56 | |
setup_data_warehouse(cur)
Create the data warehouse, database, and schemas in Snowflake.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/init_infra_snowflake.py
10 11 12 13 14 15 16 17 18 19 | |
snowflake_ingestion.scrape_links
get_parquet_links()
Scrape the NYC Taxi data page for Parquet file URLs. Sends an HTTP request to the NYC Taxi,parses the page HTML, and extracts links to Parquet files for the relevant years.
Returns:
| Type | Description |
|---|---|
List[str]
|
list[str]: List of Parquet file URLs. |
Source code in snowflake_ingestion/scrape_links.py
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 | |
get_scraping_year()
Determine the scraping year to use based on environment settings. Uses SCRAPING_YEAR if defined and valid, otherwise selects the previous year when current month β€ 3, or the current year otherwise.
Returns:
| Name | Type | Description |
|---|---|---|
int |
int
|
The year to scrape. |
Doctests: from functions import SCRAPING_YEAR
get_scraping_year() == (int(SCRAPING_YEAR) if SCRAPING_YEAR != '' else current_year) - int(current_month <= 3) True
Source code in snowflake_ingestion/scrape_links.py
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | |
get_xpath()
Build the XPath expression used to locate Parquet file links. The expression filters NYC Taxi data links by year, starting from the scraping year up to the current year.
Returns:
| Name | Type | Description |
|---|---|---|
str |
str
|
XPath query string. |
Source code in snowflake_ingestion/scrape_links.py
50 51 52 53 54 55 56 57 58 59 60 61 62 | |
main()
Main scraping and metadata update workflow. Connects to Snowflake using the transformer role, initializes context, checks or creates the metadata table, scrapes new file URLs, and updates the metadata accordingly.
Source code in snowflake_ingestion/scrape_links.py
96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 | |
setup_meta_table(cur)
Ensure the metadata table exists in Snowflake. Executes the SQL script responsible for creating or verifying the metadata table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/scrape_links.py
83 84 85 86 87 88 89 90 91 92 93 94 | |
snowflake_ingestion.upload_stage
download_and_upload_file(cur, file_url, filename)
Download a Parquet file from URL and upload it directly to Snowflake stage.
This function streams the file content directly to Snowflake without persisting it permanently on disk. It uses a temporary file that is automatically deleted after the upload completes, ensuring no residual files are left behind.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor used to execute the PUT command. |
required |
file_url
|
str
|
HTTPS URL of the Parquet file to download. |
required |
filename
|
str
|
Destination filename in the Snowflake stage. |
required |
Raises:
| Type | Description |
|---|---|
HTTPError
|
If the HTTP request fails (non-200 status code). |
Error
|
If the Snowflake PUT command fails. |
Source code in snowflake_ingestion/upload_stage.py
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | |
main()
Main staging process for Parquet files.
Connects to Snowflake, retrieves metadata for scraped files, downloads each file, uploads it to the stage, and updates the metadata table with the appropriate load status.
Source code in snowflake_ingestion/upload_stage.py
38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 | |
snowflake_ingestion.load_to_table
cleanup_stage_file(cur, filename)
Remove the processed file from the Snowflake stage. Args: cur (snowflake.connector.cursor.SnowflakeCursor): Active Snowflake cursor. filename (str): Name of the file to delete from the stage.
Source code in snowflake_ingestion/load_to_table.py
101 102 103 104 105 106 107 108 | |
copy_file_to_table_and_count(cur, filename, table_schema)
Load a Parquet file from stage into the RAW table and count inserted rows. Uses COPY INTO with transformation to generate TRIP_ID using sequence and maps Parquet columns using positional references.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
filename
|
str
|
Name of the staged file to load. |
required |
table_schema
|
list
|
Pre-detected schema from create_table function. |
required |
Returns:
| Name | Type | Description |
|---|---|---|
int |
int
|
Number of rows inserted into the RAW table. |
Source code in snowflake_ingestion/load_to_table.py
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 | |
create_table(cur)
Create or verify the RAW table dynamically based on staged file schema. Executes SQL to detect the file schema in the Snowflake stage, creates the RAW table if it does not exist, and adds the filename column if needed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Returns:
| Name | Type | Description |
|---|---|---|
list |
List[Tuple[str, str]]
|
The table schema detected from staged files |
Source code in snowflake_ingestion/load_to_table.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | |
handle_loading_error(cur, filename, error)
Handle errors occurring during file loading into the RAW table. Logs the error and updates the metadata table to mark the file as failed during the load step.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
filename
|
str
|
Name of the file that failed to load. |
required |
error
|
Exception
|
Exception raised during the loading process. |
required |
Source code in snowflake_ingestion/load_to_table.py
110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 | |
main()
Main process for loading staged Parquet files into the RAW table. Connects to Snowflake, ensures the RAW table exists, retrieves staged files, loads each into the RAW table, updates metadata, and cleans up stage files.
Source code in snowflake_ingestion/load_to_table.py
127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 | |
update_metadata(cur, filename, rows_loaded)
Update the metadata table after successful file loading. Args: cur (snowflake.connector.cursor.SnowflakeCursor): Active Snowflake cursor. filename (str): Name of the loaded file. rows_loaded (int): Number of rows successfully inserted.
Source code in snowflake_ingestion/load_to_table.py
84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 | |
snowflake_ingestion.backup_policy
create_and_set_backup(cur)
Creates the backup policies and backup sets for the data warehouse.
Executes the SQL script to create the monthly backup policies and link them to the target objects (full database, raw table, final schema).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cur
|
SnowflakeCursor
|
Active Snowflake cursor. |
required |
Source code in snowflake_ingestion/backup_policy.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | |
main()
Main initialization process for the Snowflake environment.
Establishes connections with appropriate roles (SYSADMIN, SECURITYADMIN, ACCOUNTADMIN) and executes setup steps in order.
Source code in snowflake_ingestion/backup_policy.py
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | |
snowflake_ingestion.tests.test_functions
test_connect_with_role_autocommit_enabled()
Unit test verifying that autocommit is always enabled. Verifies that the autocommit=True parameter is systematically passed to the Snowflake connection, regardless of other parameters.
Source code in snowflake_ingestion/tests/test_functions.py
24 25 26 27 28 29 30 31 32 33 | |
test_connect_with_role_parameters_passed_correctly()
Unit test verifying correct parameter forwarding. Verifies that all parameters (user, password, account, role) are correctly forwarded to snowflake.connector.connect without modification.
Source code in snowflake_ingestion/tests/test_functions.py
36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | |
test_connect_with_role_success()
Unit test for connect_with_role on success. Verifies that the function calls snowflake.connector.connect with the correct parameters, enables autocommit, and returns the connection object.
Source code in snowflake_ingestion/tests/test_functions.py
6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | |
test_run_sql_file()
Unit test for run_sql_file with variable substitution. Verifies that SQL placeholders are correctly replaced by the values of the corresponding global variables.
Source code in snowflake_ingestion/tests/test_functions.py
90 91 92 93 94 95 96 97 98 99 100 101 | |
test_run_sql_file_multiple_statements()
Unit test for run_sql_file with multiple statements. Verifies that statements separated by semicolons are correctly split and executed individually.
Source code in snowflake_ingestion/tests/test_functions.py
116 117 118 119 120 121 122 123 124 125 | |
test_run_sql_file_variable_not_found()
Unit test for run_sql_file with an unresolved variable.
Verifies that placeholders with no matching global variable
are replaced by the default value
Source code in snowflake_ingestion/tests/test_functions.py
104 105 106 107 108 109 110 111 112 113 | |
test_use_context(mocker)
Unit test for the use_context function. Verifies that the function executes the 3 expected SQL commands to configure the Snowflake context (warehouse, database, schema) in the correct order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
mocker
|
Mock
|
pytest fixture for mocking |
required |
Source code in snowflake_ingestion/tests/test_functions.py
58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 | |
test_use_context_exception()
Unit test for error handling in use_context. Verifies that the function raises a SystemExit exception when an error occurs during SQL command execution.
Source code in snowflake_ingestion/tests/test_functions.py
76 77 78 79 80 81 82 83 84 85 | |
snowflake_ingestion.tests.test_init_infra_snowflake
test_create_roles_and_user()
Unit test for the create_roles_and_user function. Tests the creation of Snowflake roles and users as part of infrastructure initialization. Verifies that the appropriate SQL script is executed and success/failure logs are properly recorded during the role and user creation process.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | |
test_grant_privileges()
Unit test for the grant_privileges function. Tests the granting of privileges to the created roles in Snowflake. Verifies that the correct SQL script is executed and appropriate log messages are recorded during the privilege granting process.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | |
test_main_exception()
Unit test for the main function when an exception occurs during initialization. Tests error handling by simulating a connection failure and verifying that the exception is properly caught and logged as an error.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
65 66 67 68 69 70 71 72 73 74 | |
test_main_success()
Unit test for the main function when the infrastructure initialization completes successfully. Tests the complete initialization flow including warehouse setup, role creation, and privilege granting. Verifies that all three connections are made with the appropriate roles (ACCOUNTADMIN, SYSADMIN, SECURITYADMIN) and that all initialization functions are called in sequence with successful logging.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 | |
test_set_data_retention()
Unit test for the set_data_retention function. Verifies that the function executes the correct SQL file with the specified retention time and logs appropriate messages when setting data retention policies.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
4 5 6 7 8 9 10 11 12 13 14 15 | |
test_setup_data_warehouse()
Unit test for the setup_data_warehouse function. Tests the creation of warehouse, database, and schemas infrastructure in Snowflake. Verifies that the correct SQL script is executed and appropriate log messages are recorded during the warehouse setup process.
Source code in snowflake_ingestion/tests/test_init_infra_snowflake.py
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | |
snowflake_ingestion.tests.test_scrape_links
test_get_parquet_links_success()
Test successful extraction of parquet file links from the NYC TLC website HTML content. Verifies that only links with title 'Yellow Taxi Trip Records' are extracted and returned.
Source code in snowflake_ingestion/tests/test_scrape_links.py
78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | |
test_get_scraping_year_with_empty_env_early_month()
Test get_scraping_year behavior when SCRAPING_YEAR is empty and current month is early (January to March). The function should default to the previous year since current year's data may not be fully available yet.
Source code in snowflake_ingestion/tests/test_scrape_links.py
18 19 20 21 22 23 24 25 26 27 | |
test_get_scraping_year_with_empty_env_late_month()
Test get_scraping_year behavior when SCRAPING_YEAR is empty and current month is late (April to December). The function should default to the current year for scraping operations.
Source code in snowflake_ingestion/tests/test_scrape_links.py
29 30 31 32 33 34 35 36 37 38 | |
test_get_scraping_year_with_invalid_env_early_month()
Test get_scraping_year behavior when SCRAPING_YEAR contains invalid non-numeric data in early months. The function should log an error and default to the previous year while handling the invalid input gracefully.
Source code in snowflake_ingestion/tests/test_scrape_links.py
40 41 42 43 44 45 46 47 48 49 50 51 | |
test_get_scraping_year_with_invalid_env_late_month()
Test get_scraping_year behavior when SCRAPING_YEAR contains invalid non-numeric data in late months. The function should log an error and default to the current year while handling the invalid input gracefully.
Source code in snowflake_ingestion/tests/test_scrape_links.py
53 54 55 56 57 58 59 60 61 62 63 64 65 | |
test_get_scraping_year_with_valid_env()
Test that get_scraping_year correctly parses and returns a valid integer value from the SCRAPING_YEAR environment variable. Verifies that when SCRAPING_YEAR is set to '2023', the function returns the integer 2023.
Source code in snowflake_ingestion/tests/test_scrape_links.py
9 10 11 12 13 14 15 16 | |
test_get_xpath()
Test that get_xpath generates the correct XPath expression for locating Yellow Taxi Trip Records links. The XPath should include both the scraping year and current year to capture relevant data files.
Source code in snowflake_ingestion/tests/test_scrape_links.py
67 68 69 70 71 72 73 74 75 76 | |
test_main_file_parsing()
Test that the main function correctly parses filename patterns to extract year and month components. Verifies that URLs like 'yellow_tripdata_2023-07.parquet' are correctly parsed into (2023, 7) tuples.
Source code in snowflake_ingestion/tests/test_scrape_links.py
177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 | |
test_main_with_new_files()
Test the main scraping workflow when new parquet files are detected that don't exist in the metadata table. Verifies that new files trigger INSERT operations into the metadata table with appropriate logging.
Source code in snowflake_ingestion/tests/test_scrape_links.py
126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 | |
test_main_without_new_files()
Test the main scraping workflow when all discovered parquet files already exist in the metadata table. Verifies that no INSERT operations occur and appropriate informational and warning logs are recorded.
Source code in snowflake_ingestion/tests/test_scrape_links.py
154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 | |
test_setup_meta_table()
Test that setup_meta_table correctly executes the SQL script for creating or verifying the metadata table. Verifies that the appropriate SQL file is executed and success/failure logs are recorded.
Source code in snowflake_ingestion/tests/test_scrape_links.py
113 114 115 116 117 118 119 120 121 122 123 124 | |
snowflake_ingestion.tests.test_upload_stage
test_download_and_upload_file_http_error()
Unit test for download_and_upload_file in case of HTTP error. Verifies that the function raises an exception when the HTTP download fails.
Source code in snowflake_ingestion/tests/test_upload_stage.py
40 41 42 43 44 45 46 47 48 49 50 51 52 | |
test_download_and_upload_file_snowflake_error()
Unit test for download_and_upload_file in case of Snowflake error. Verifies that the function raises an exception when the Snowflake upload fails.
Source code in snowflake_ingestion/tests/test_upload_stage.py
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 | |
test_download_and_upload_file_success()
Unit test for download_and_upload_file in case of success. Verifies that the function downloads the file from the URL, uploads it to Snowflake via PUT, and automatically cleans up the temporary file.
Source code in snowflake_ingestion/tests/test_upload_stage.py
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | |
test_download_and_upload_file_tempfile_error()
Unit test for download_and_upload_file in case of temporary file creation error. Verifies that the function raises an exception when temporary file creation fails.
Source code in snowflake_ingestion/tests/test_upload_stage.py
76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | |
test_main_file_processing_flow()
Unit test for the complete file processing flow. Verifies the order of operations: DB connection, metadata retrieval, download, upload, status update.
Source code in snowflake_ingestion/tests/test_upload_stage.py
141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 | |
test_main_with_files()
Unit test for the main function with files to upload. Verifies that the function retrieves the scraped files, downloads them, uploads them to Snowflake and updates the status in the metadata.
Source code in snowflake_ingestion/tests/test_upload_stage.py
92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 | |
test_main_with_upload_error()
Unit test for the main function with upload error. Verifies that the function correctly handles upload errors by updating the status to FAILED_STAGE.
Source code in snowflake_ingestion/tests/test_upload_stage.py
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 | |
snowflake_ingestion.tests.test_load_to_table
test_cleanup_stage_file()
Tests the removal of a processed file from the Snowflake stage. Verifies the correct REMOVE command is executed and a success log is recorded.
Source code in snowflake_ingestion/tests/test_load_to_table.py
133 134 135 136 137 138 139 140 141 142 | |
test_copy_file_to_table_and_count_copy_error()
Tests the handling of an exception raised during the COPY INTO execution. Verifies that the exception is propagated and not caught within the function.
Source code in snowflake_ingestion/tests/test_load_to_table.py
100 101 102 103 104 105 106 107 108 109 110 111 112 | |
test_copy_file_to_table_and_count_success()
Tests the successful execution of COPY INTO command. Verifies the command execution, correct parsing of the result, logging of success with row count, and the return of the correct number of loaded rows.
Source code in snowflake_ingestion/tests/test_load_to_table.py
69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | |
test_copy_file_to_table_and_count_zero_loaded()
Tests the COPY INTO command when no rows are processed. Verifies the function returns 0 when the execution result indicates no files were processed.
Source code in snowflake_ingestion/tests/test_load_to_table.py
86 87 88 89 90 91 92 93 94 95 96 97 98 | |
test_create_table_no_schema()
Tests the behavior when the stage contains no data. Verifies that only the schema detection script runs, a warning is logged, no table creation SQL is executed, and an empty schema list is returned.
Source code in snowflake_ingestion/tests/test_load_to_table.py
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 | |
test_create_table_success()
Tests the successful creation of a table with dynamic schema detection. Verifies that all SQL files are executed in the correct order, the correct CREATE TABLE statement is generated, appropriate logs are recorded, and the correct table schema is returned.
Source code in snowflake_ingestion/tests/test_load_to_table.py
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | |
test_handle_loading_error()
Tests the error handling flow when a file fails to load. Verifies that an error is logged, the metadata table is updated to 'FAILED_LOAD', and a debug log is recorded.
Source code in snowflake_ingestion/tests/test_load_to_table.py
144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 | |
test_main_complete_flow_with_counts()
Tests a complete successful flow with a single file, verifying precise function call sequence. Ensures update_metadata and cleanup_stage_file are called exactly once with the correct arguments.
Source code in snowflake_ingestion/tests/test_load_to_table.py
262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 | |
test_main_connection_error()
Tests the main flow when the initial database connection fails. Verifies that the connection exception propagates and stops the process.
Source code in snowflake_ingestion/tests/test_load_to_table.py
253 254 255 256 257 258 259 260 | |
test_main_exception_handling_in_loop()
Tests error handling within the file processing loop. Verifies that an error on one file triggers handle_loading_error for that file, while other files continue processing normally (update and cleanup are called for successful files).
Source code in snowflake_ingestion/tests/test_load_to_table.py
309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 | |
test_main_multiple_files_different_results()
Tests processing multiple files with different row counts. Verifies that update_metadata is called for each file with its respective row count.
Source code in snowflake_ingestion/tests/test_load_to_table.py
284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 | |
test_main_no_staged_files()
Tests the main flow when no files are found in the stage. Verifies that the stage analysis log occurs but no copy operations are attempted.
Source code in snowflake_ingestion/tests/test_load_to_table.py
219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 | |
test_main_success_flow()
Tests the complete successful main execution flow with two files. Verifies the full sequence: connection, context setting, table creation, fetching staged files, loading each file, updating metadata, and cleanup.
Source code in snowflake_ingestion/tests/test_load_to_table.py
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 | |
test_main_table_creation_error()
Tests the main flow when table creation fails. Verifies that the exception from create_table propagates and stops the main process.
Source code in snowflake_ingestion/tests/test_load_to_table.py
238 239 240 241 242 243 244 245 246 247 248 249 250 251 | |
test_main_with_loading_error()
Tests the main flow when one file fails to load. Verifies that the error handler is called for the failed file, while successful files continue processing normally.
Source code in snowflake_ingestion/tests/test_load_to_table.py
189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 | |
test_update_metadata()
Tests the successful update of the metadata table after a file load. Verifies the correct UPDATE SQL is executed with the proper parameters and that a debug log is recorded.
Source code in snowflake_ingestion/tests/test_load_to_table.py
114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 | |
snowflake_ingestion.tests.test_backup_policy
test_backup_configuration_values()
Test backup configuration constants are properly accessed and logged.
Ensures retention days for different backup policies are correctly formatted in log messages.
Source code in snowflake_ingestion/tests/test_backup_policy.py
104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 | |
test_create_and_set_backup()
Test the create_and_set_backup function execution.
Verifies that the function runs the correct SQL file and logs appropriate messages with retention period information for each backup policy.
Source code in snowflake_ingestion/tests/test_backup_policy.py
9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | |
test_main_exception()
Test error handling in main function when an exception occurs.
Verifies that exceptions are caught and logged as errors without crashing.
Source code in snowflake_ingestion/tests/test_backup_policy.py
67 68 69 70 71 72 73 74 75 76 77 78 79 | |
test_main_success()
Test the main function with successful backup setup workflow.
Ensures connection is made with SYSADMIN role, backup function is called, connection is closed properly, and success is logged.
Source code in snowflake_ingestion/tests/test_backup_policy.py
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | |
test_main_with_connection_context()
Test proper cursor context management in main function.
Verifies that cursor context manager protocols are followed correctly.
Source code in snowflake_ingestion/tests/test_backup_policy.py
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 | |