|

100 Days of Data Engineering Day 2: SQL

20240402- catching up with this one since I started a day behind. I understand all the concepts here, but should probably reinforce some of the syntax so I can do things without looking at the code


Here is a summary of the video SQL Basics Tutorial for Beginners – Part 1

This video is a part 1 of a series on SQL Basics Tutorial for Beginners. In this video, the instructor will teach you how to set up SQL Server Management Studio and create tables.

Here are the actionable steps on how to set up SQL Server Management Studio and create tables:

  • Download and install SQL Server Management Studio from Microsoft.
  • Download and install a free version of SQL Server Express.
  • Open SQL Server Management Studio and connect to the server.
  • Create a new database.
  • Create a new table by writing a T-SQL script.
  • Insert data into the table.

Timeline of the video:

  • Introduction (0:00)
  • Downloading and installing SQL Server Management Studio (1:00)
  • Downloading and installing SQL Server Express (2:00)
  • Connecting to the server (3:00)
  • Creating a new database (4:00)
  • Creating a new table (5:00)
  • Inserting data into the table (6:00)

Here is a summary of the video Intermediate SQL Joins

The speaker talks about different types of joins and gives use cases for each type.

The video starts with introducing what joins are and what they are used for. Then the speaker goes through different types of joins including inner joins, full outer joins, left outer joins, and right outer joins. For each type of join, the speaker explains what it is and how it works with an example.

Inner joins are used to combine rows from two tables based on a shared column. The output table will only include rows where the data in the join column matches in both tables.

Full outer joins are used to combine all rows from two tables, regardless of whether there is a match in the join column. If there is no match in one table, the corresponding columns in the output table will be filled with null values.

Left outer joins are used to combine all rows from the left table and matching rows from the right table. If there is no match in the right table, the corresponding columns in the output table will be filled with null values.

Right outer joins are used to combine all rows from the right table and matching rows from the left table. If there is no match in the left table, the corresponding columns in the output table will be filled with null values.

The speaker also talks about use cases for joins. For example, a manager can use a join to find out which employee makes the most money. Or an accountant can use a join to calculate the average salary for a particular job title.

Here are the actionable steps from the video:

  • Use an inner join to combine rows from two tables based on a shared column, and you only want to include rows where the data in the join column matches in both tables.
  • Use a full outer join to combine all rows from two tables, regardless of whether there is a match in the join column.
  • Use a left outer join to combine all rows from the left table and matching rows from the right table.
  • Use a right outer join to combine all rows from the right table and matching rows from the left table.

Here is a summary of the video Using Case Statements in SQL

The video is about using case statements in SQL. The speaker, Alex Freiburg, provides two examples of how to use case statements.

In the first example, Alex uses a case statement to categorize employees into different age groups. He creates a new column named “age_group” that labels employees as “old” if their age is greater than 30 and “young” if their age is less than or equal to 30.

In the second example, Alex uses a case statement to calculate salary raises for employees. He creates a new column named “salary_after_raise” that adds a raise amount to the employee’s current salary. The amount of the raise is determined by the employee’s job title. Salesmen get a 10% raise, accountants get a 5% raise, HR gets a 3% raise, and all other employees get a 3% raise.

Here are the actionable steps on how to create a case statement in SQL:

  1. Specify the columns you want to return in the SELECT clause.
  2. Use the CASE statement followed by WHEN to specify the condition you want to check.
  3. Then specify what value you want returned if the condition is met.
  4. You can add multiple WHEN statements to define different conditions and their corresponding return values.
  5. Use the ELSE statement to specify the return value if none of the WHEN conditions are met.
  6. Use END to close the CASE statement.

Timeline of the video:

  • Introduction (0:00)
  • What are case statements (1:00)
  • Example 1: Categorizing employees by age (2:00)
  • Example 2: Calculating salary raises (4:00)
  • How to create a case statement (6:00)
    • Specify columns to return (6:00)
    • Use CASE statement with WHEN (7:00)
    • Specify return value (8:00)
    • Add multiple WHEN statements (9:00)
    • Use ELSE statement (10:00)
    • Close the CASE statement (11:00)
  • Conclusion (12:00)

Self-JOINs and CROSS JOINs

There are two useful kinds of JOINs that we can use to solve queries of medium difficulty. These are called “self JOIN” and CROSS JOIN.

SELF JOINs

Self-joins are not really a new kind of JOIN. They are just a JOIN from a table to itself. However, self-joins tend to come up in certain medium-level questions.

Consider the following employees table:

idnamemanager_id
1AliceNULL
2Bob1
3Chuck1
4Drake2
5Eve2

This company’s structure is:

  • Alice
    • Bob
      • Drake
      • Eve
    • Chuck

In this case, if we wanted to know who the manager is for each employee, all the information we need is already in the employees table.

To show this information, we can join the employees table to itself:

SELECT e.name as employee, m.name as manager
    FROM employees as e LEFT JOIN employees as m
    ON e.manager_id = m.id

Note that we used a LEFT JOIN, since the CEO, Alice, has no manager.

Also note that we use different aliases for the “left” employees table (e) and the “right” employees table (m), so that SQL is not confused as to which “version” of the table we are referring to.

CROSS JOIN

CROSS JOIN is a clause that returns all row combinations within two tables.

For example, if we have the tables cars and colors:

cars

model
Ford
Chevrolet
Toyota

colors

color
Red
Blue
Black

Then the CROSS JOIN will return all of their combinations:

modelcolor
FordRed
FordBlue
FordBlack
ChevroletRed
ChevroletBlue
ChevroletBlack
ToyotaRed
ToyotaBlue
ToyotaBlack

CROSS JOINS tend to generate very large tables, so they are usually costly. However, they virtually allow us to iterate over two tables directly with SQL.

In general, we should try to avoid them, but it’s good to know they exist in case we need them.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *