This repository contains code for processing delta data using Apache Spark.
To run this code, ensure you have the dependencies listed in requirements.txt installed (for example, with pip install -r requirements.txt).
Two decorators are included in the code:
- execution_time: measures the execution time of a function.
- error_log: handles errors and logs exceptions.
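A minimal sketch of what these two decorators could look like, assuming they use Python's standard logging module (the repository's versions may instead write to the MySQL log table described below):

```python
import functools
import logging
import time

def execution_time(func):
    """Measure and log how long the wrapped function takes to run."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        logging.info("%s finished in %.2f s", func.__name__, elapsed)
        return result
    return wrapper

def error_log(func):
    """Catch exceptions raised by the wrapped function and log them."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logging.exception("Error in %s", func.__name__)
            raise
    return wrapper
```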
Logging is configured to write messages to a MySQL database. You can create the log database and table with the create_logdb function.
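A hedged sketch of such a create_logdb function, assuming the mysql-connector-python package; the database name logdb, the table name process_log, and the column layout are illustrative assumptions, not the repository's actual schema:

```python
import mysql.connector  # provided by the mysql-connector-python package

def create_logdb(host="localhost", user="root", password=""):
    """Create the log database and table if they do not already exist."""
    conn = mysql.connector.connect(host=host, user=user, password=password)
    cursor = conn.cursor()
    # Assumed names: database "logdb", table "process_log".
    cursor.execute("CREATE DATABASE IF NOT EXISTS logdb")
    cursor.execute(
        """
        CREATE TABLE IF NOT EXISTS logdb.process_log (
            id INT AUTO_INCREMENT PRIMARY KEY,
            logged_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            table_name VARCHAR(255),
            message TEXT
        )
        """
    )
    conn.commit()
    cursor.close()
    conn.close()
```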
Several functions are defined for data processing:
- get_log: retrieves log data from a MySQL database.
- get_mirror: retrieves mirror data from a CSV file.
- get_delta: creates a delta between mirror and delta data (see the sketch after this list).
- merge_delta: manages the merging of delta files for a specified table.
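As referenced in the list, here is a sketch of how get_mirror and get_delta might look in PySpark. The signatures, the CSV read options, and the left anti join approach are assumptions for illustration rather than the repository's exact implementation:

```python
from pyspark.sql import DataFrame, SparkSession

spark = SparkSession.builder.appName("delta_processing").getOrCreate()

def get_mirror(path: str) -> DataFrame:
    """Read the current mirror snapshot from a CSV file."""
    return spark.read.csv(path, header=True, inferSchema=True)

def get_delta(mirror_df: DataFrame, delta_df: DataFrame, keys: list) -> DataFrame:
    """Return rows from delta_df whose key values are not yet present in mirror_df."""
    return delta_df.join(mirror_df.select(*keys), on=keys, how="left_anti")
```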
The merge_delta function orchestrates this process: it works through the delta files for a table one by one, creating the mirror if it does not yet exist and logging each action.
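Building on the hypothetical helpers sketched above, one possible shape for merge_delta; the directory layout, parameter names, and return value are assumptions, not the repository's actual signature:

```python
import os

@execution_time
@error_log
def merge_delta(table_name: str, keys: list, base_dir: str = "."):
    """Apply all pending delta files for table_name to its mirror, one file at a time."""
    # Assumed layout: <base_dir>/<table_name>/ holds the delta CSV files,
    # <base_dir>/<table_name>_mirror holds the current mirror snapshot.
    delta_dir = os.path.join(base_dir, table_name)
    mirror_path = os.path.join(base_dir, f"{table_name}_mirror")

    mirror_df = get_mirror(mirror_path) if os.path.exists(mirror_path) else None

    for file_name in sorted(os.listdir(delta_dir)):
        delta_df = spark.read.csv(os.path.join(delta_dir, file_name),
                                  header=True, inferSchema=True)
        if mirror_df is None:
            # No mirror yet: the first delta file becomes the initial mirror.
            mirror_df = delta_df
        else:
            # Keep only rows whose keys are not in the mirror, then append them.
            mirror_df = mirror_df.unionByName(get_delta(mirror_df, delta_df, keys))
        # The repository's version presumably records table_name and file_name
        # in the MySQL log table at this point.

    return mirror_df  # the real implementation presumably persists the updated mirror
```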
You can use the provided functions to process delta data efficiently using Apache Spark.
To merge delta files for a table named 'data_deltas' with keys ['account_rk'], you can call the merge_delta function with the appropriate parameters.
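For example, using the hypothetical signature from the sketch above (the actual parameter names in the repository may differ):

```python
merge_delta(table_name="data_deltas", keys=["account_rk"])
```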