Big Data 4. Denormalized Schema in Data Warehouses

1. Denormalized Schema A denormalized schema is a database design that reduces the number of joins by combining related data into fewer tables, often at the expense of some redundancy. Optimized for read-heavy analytical workloads Not ideal for OLTP systems (which benefit from normalization) In a typical data warehouse: Data is modeled to maximize query performance and simplify analytics. This contrasts with normalized schemas used in transactional databases, where the goal is data integrity and avoiding redundancy. 2. Star and Snowflake Schemas Two common denormalized designs are: ...

June 4, 2025 · map[name:Minjun Jeon]

Big Data 5. Modern Data Stack

As the cost of cloud storage and compute continues to fall and internet speeds rise, data engineers are rethinking how data is stored, processed, and analyzed. This shift has led to what we now call the modern data stack: a flexible, scalable, cloud-native approach that replaces traditional monolithic systems. In this post, we’ll explore key architectural changes that enabled the modern data stack, from the transition from SMP to MPP, to the decoupling of storage and compute, and the rise of columnar databases, ELT workflows, and the tooling ecosystem that glues it all together. ...

June 4, 2025 · map[name:Minjun Jeon]

Big Data 1. Data Maturity Model

1. Intro The Maslow’s hierarchy of needs models the basic needs of humans as a hierarchy of 5 levels from the bottom to higher-level needs at the top. In this model, one cannot proceed to the next level unless the basic needs of the current level is met. Silmiarly, one can think of data hierarchy of needs. The 5 levels are Data Collection, Data Wrangling, Data Integration, BI and Analytics, and Artificail Intellgence. ...

June 3, 2025 · map[name:Minjun Jeon]

Big Data 2. OLTP vs OLAP

When working with databases, it’s essential to distinguish between two fundamental system types: OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing). These systems serve very different purposes and are optimized accordingly. OLTP — Online Transaction Processing OLTP systems are designed to handle a large number of short, atomic transactions such as inserts, updates, and deletes. They are optimized for speed and reliability in transactional environments. Example Use Cases Banking systems E-commerce checkout Inventory management Booking/reservation systems Characteristics Feature OLTP Main Use Day-to-day operations (CRUD) Queries Simple and frequent Data Volume Current data (live, real-time) Users Many concurrent users Schema Highly normalized Transactions Short and fast Performance Goal High throughput and low latency Storage Type Primarily Row-based ACID Compliance Fully ACID-compliant What is ACID? ...

June 3, 2025 · map[name:Minjun Jeon]

SQL 9. Union and Subqueries

1. UNION/UNION ALL In contrast to JOINS, UNION combines rows instead of columns. a) Basic Syntax The syntax is as follows: SELECT column_name1, column_name2 FROM table_name1 UNION SELECT column_name3, column_name4 FROM table_name2 The number of columns from the first table should match the number of columns from the second table. Columns are matched by the order. For example column_name1 and column_name3 don’t need to be identitcal. The merged table will have column names of the first table, i.e. column_name1 and column_name2. ...

June 3, 2025 · map[name:Minjun Jeon]

1. Big O asymptotic analysis

1. What is good code? Readability : How easily humans can understand, modify, and maintain a piece of code. Scalability: How well the code can handle increased load and complexity. Big O, a shorthand for Big O asymptotic analysis, measures the scalability of code. Scalability consists of runtime and memory. There is usally a trade-off between runtime and memory. In this post, we introduce the Big O notation in terms of runtime and look at how to measure the space complexity of an algorithm at the end. ...

June 2, 2025 · map[name:Minjun Jeon]

SQL 8. Joins

JOIN is used to combine information from multiple tables in one query. There are four types of JOINs: INNER JOIN, OUTER JOIN, LEFT JOIN, and RIGHT JOIN. When combining two tables, you need one common column, so called reference column or join column. 1. INNER JOIN INNER JOIN combines two tables only for the rows where the value in the join column occurs in both tables. If there are m matching rows in the first table and n matching rows in the second table for the join column value, the result will contain m × n rows for that value. ...

June 2, 2025 · map[name:Minjun Jeon]

SQL 4. String Functions

In this post, we are going to look at basic string functions in SQL. 1. UPPER, LOWER, LENGTH UPPER changes all lowercase alphabets to uppercase. LOWER changes all uppercase letters to lowercase. Finally, LENGTH gives the number of characters. The example is as follows: SELECT UPPER(first_name) LOWER(last_name) FROM customer WHERE LENGTH(first_name) > 10 The above code outputs the first names in uppercase and last names in lowercase from the table customer where the length of the first name is more than 10. ...

June 1, 2025 · map[name:Minjun Jeon]

SQL 5. Date and Time Functions

In this post, we are going to look at basic date and time functions in SQL. Before we dive deeper into the date and time functions, let’s look at standard date/time types. 0. Data/Time types Types Description Examples date Just date without time ‘2025-06-01’ time (with/without time zone) Just time without date ‘01:02:03.678+02’ timestamp (with/without time zone) Date and time ‘2022-11-28 01:02:03.678+02’ intervals Time interval ‘3 days 01:02:03.678’ 1. EXTRACT EXTRACT is used to extract parts of timestamp/date. The syntax is as follows: ...

June 1, 2025 · map[name:Minjun Jeon]

SQL 6. Numerical Functions and Operators

1. Operators Operator Description Examples Result + addition 4+3 7 - subtraction 5-3 2 * multiplication 4*2 8 / division (integer division truncates the result) 8/4 2 % modulo (remainder) 10%4 2 ^ exponentiation 2^3 8 For example, 9/4 will give 2 while 9.0/4 will give 2.2500...00. 2. Functions Function Description Examples Result abs(x) absolute value abs(-2) 2 round(x, d) round x to d decimal places round(7.4632, 2) 7.46 ceiling(x) round up to integer ceiling(7.4632) 8 floor(x) round down to integer floor(7.4632) 7

June 1, 2025 · map[name:Minjun Jeon]