SQL 13. Grouping Sets, Rollups, and Self-Joins

Overview In SQL interviews, understanding advanced grouping and join techniques is crucial. This post dives into GROUPING SETS, ROLLUP, CUBE, and types of joins like self-joins, CROSS JOIN, and NATURAL JOIN. We also touch on how to interpret NULLs in grouping results and share interview tips along the way. 1. GROUPING SETS GROUPING SETS let you define multiple GROUP BY combinations in a single query—ideal for multi-dimensional reports. Problem You want to calculate: ...

June 11, 2025 · map[name:Minjun Jeon]

SQL 12. Window Functions

Window functions in PostgreSQL allow you to perform calculations across sets of rows related to the current row, without collapsing rows like GROUP BY does. They’re essential for solving ranking, running totals, moving averages, and row-by-row comparisons. This post reviews key window functions with practical, interview-friendly examples. 1. OVER (PARTITION BY ...) Window functions let you run aggregations without reducing the number of rows. A correlated subquery alternative: SELECT transaction_id, payment_type, customer_id, price_in_transaction, (SELECT SUM(price_in_transaction) FROM sales s2 WHERE s1.customer_id = s2.customer_id) AS total_by_customer FROM sales s1; This works but is inefficient. Instead, use: ...

June 9, 2025 · map[name:Minjun Jeon]

SQL 11. Views & Data Manipulation

If you’re prepping for a SQL or data engineering interview, it’s essential to be fluent not only in writing queries but also in understanding the impact of commands like UPDATE, DELETE, and views (both virtual and materialized). This post covers practical examples and deepens your understanding of data manipulation and view management in PostgreSQL. 1. UPDATE: Modifying Existing Rows To modify data in a table, use the UPDATE statement. Here’s the general syntax: ...

June 8, 2025 · map[name:Minjun Jeon]

SQL 10. Managing Tables & Databases

In previous posts, we focused on querying tables. In this post, we’ll explore data definition in SQL: how to create, modify, and delete database objects like tables, constraints, and databases themselves. Mastering these operations shows that you not only know how to query data, but also how to design and manage the structures that hold it. We’ll cover: Data types Constraints Primary and foreign keys Creating and altering tables and databases We’ll cover views, indexes, and data manipulation (INSERT, UPDATE, DELETE) in separate posts. ...

June 6, 2025 · map[name:Minjun Jeon]

1. Python: map

The map() function applies a given function to all items in an input list (or any other iterable) and returns a map object (an iterator). This is particularly useful for transforming data in a list comprehensively. An iterable is any python object: capable of returning its members one at a time, permitting it to be iterated over in a for-loop. implements __iter__ method, which returns an iterator, or __getitem__ method suitable for indexed lookup. The examples are list, tuple, string, dict, set. An iterator is an object that: ...

June 5, 2025 · map[name:Minjun Jeon]

2. Array/List

Arrays, more commonly known as lists in Python, are one of the most fundamental and widely used data structures. They store items in a sequential, ordered manner, and support a wide range of operations efficiently. This post is designed as a refresher for interviews or technical review, with code examples, time complexity analysis, and clarifications on memory behavior. 1. What is an Array (List)? In Python, the built-in list type is a dynamic array implementation. Arrays: ...

June 5, 2025 · map[name:Minjun Jeon]

2. Python: filter

The filter() function constructs an iterator from elements of an iterable for which a function returns true. It is used to filter out items from a list (or any other iterable) based on a condition. The syntax is similar to map: filter(func1, iterable) Examples: num_list = [i for i in range(1, 13)] # [1, ..., 12] print(list(filter(lambda x: x%2==0, num_list))) # output: [2, 4, 6, 8, 10, 12 ] You can filter with multiple conditions: num_list = [i for i in range(1,10)] #[1,2,...,9] even_and_greater_than_five = list(filter(lambda x: x> 5 and x%2==0, num_list)) print(even_and_greater_than_five) # output: [6, 8] You can perform filter operation on dictionaries: ...

June 5, 2025 · map[name:Minjun Jeon]

3. Custom Package and __init__.py

We can install packages from the Python Package Index (PyPI) through pip(PyPI Install Packages). Furthermore, we can import these through import. But … how can we create our custom package? Let’s walk through how to structure a custom Python package, add subpackages, the roles of __init__.py and how to publish the custom package to PyPI. Additionally, we look at setting up dependencies and version management using Poetry. 1. Basic Package Structure Let’s say we want to build a simple package named mypackage. Here is a minimal project structure: ...

June 5, 2025 · map[name:Minjun Jeon]

Big Data 6. Slowly Changing Dimensions (SCD)

Slowly Changing Dimensions (SCD) Slowly Changing Dimensions (SCDs) refer to attributes in a data warehouse dimension table that change infrequently—yet unpredictably—over time. When a source system updates a dimension attribute, we need a strategy to decide how that change will be reflected (or not) in the data warehouse. In some cases, storing historical values isn’t necessary; in others, retaining a full history is critical for reporting, auditing, or trend analysis. ...

June 5, 2025 · map[name:Minjun Jeon]

Big Data 3. Data Warehouses, Data Lakes, and Lake Houses

1. Data Warehouses A Data Warehouse (DW) is a centralized repository that aggregates data from multiple sources into a unified, consistent store optimized for analytics and business intelligence (BI) activities, such as reporting, forecasting, and data mining. Unlike transactional databases (OLTP), data warehouses are built for analytical workloads (OLAP): they support complex queries over large volumes of data, often using denormalized schemas, dimensions, and facts for performance. Data warehouses typically use: ...

June 4, 2025 · map[name:Minjun Jeon]