100 Days of Data Engineering Day 3 SQL Interview
Here is a summary of the video Practicing SQL Interview Questions for Data Scientists and Data Engineers with actionable steps by the speaker:
The speaker recommends practicing on platforms like interview queries. Here are the key takeaways on how to approach these questions and some actionable steps mentioned in the video:
- When approaching a problem, try to solve it in multiple ways. This will help you go beyond memorizing solutions and develop a deeper understanding of the concepts.
- Break down complex problems into smaller steps. This will make the problem easier to solve and understand.
- Pay attention to how you write your queries. Clear and concise code is easier to understand and maintain.
- There are sometimes multiple ways to solve a problem. The best solution may depend on the specific context of the problem.
Here are some actionable steps for practicing SQL interview questions:
- Find a platform that offers SQL interview questions, such as interview queries.
- Set aside time to practice regularly.
- Try to solve the problems in multiple ways.
- Pay attention to the clarity and conciseness of your code.
- Review your solutions with others to get feedback.
Timeline of the video:
- Introduction (0:00)
- Importance of practicing SQL interview questions (1:00)
- Tips for approaching SQL interview questions (2:00)
- Solve in multiple ways (2:00)
- Break down complex problems (3:00)
- Write clear and concise code (4:00)
- Consider different solutions (5:00)
- Actionable steps for practicing (6:00)
- Find a practice platform (6:00)
- Set aside time to practice (7:00)
- Try multiple solutions (8:00)
- Pay attention to code clarity (9:00)
- Review with others (10:00)
- Conclusion (11:00)
9 SQL queries
Learn how to write SQL Queries(Practice Complex SQL Queries)
Learning SQL syntax is very easy but getting comfortable in writing SQL Queries, especially the complex SQL Queries can be tricky and will need a lot of practice.
In this blog, I have listed below 9 SQL Queries which should help you to practice intermediate to complex SQL queries.
You will find below SQL Questions along with the data and table structure required to solve each SQL question. The SQL Query to solve these questions will be attached to an .txt file. You can simple download the file for each question to get the solved SQL Queries.
* Note: Please note, I have used PostgreSQL database to executed all of these queries. I believe these queries would work just fine with any other major RDBMS such as Oracle, MySQL, Microsoft SQL Server. However, if you find any query not working in your RDBMS, then leave a comment below so I could help.**
- Write a SQL Query to fetch all the duplicate records in a table. Table Name: USERS Note: Record is considered duplicate if a user name is present more than once. Approach: Partition the data based on user name and then give a row number to each of the partitioned user name. If a user name exists more than once then it would have multiple row numbers. Using the row number which is other than 1, we can identify the duplicate records.
USERS
Expected Output
There are several ways to write this query. Such as either using the CTID field in PostgreSQL or by using the ROWID field in Oracle, MySQL, Microsoft SQL Server etc but a simpler way to write this query would be using window function.
Try to write this query yourself before looking the query I have written to solve it.
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 1:
Write a SQL query to fetch all the duplicate records from a table.
--Tables Structure:
drop table users;
create table users
(
user_id int primary key,
user_name varchar(30) not null,
email varchar(50));
insert into users values
(1, 'Sumit', 'sumit@gmail.com'),
(2, 'Reshma', 'reshma@gmail.com'),
(3, 'Farhana', 'farhana@gmail.com'),
(4, 'Robin', 'robin@gmail.com'),
(5, 'Robin', 'robin@gmail.com');
select * from users;
-- Solution 1:
-- Replace ctid with rowid for Oracle, MySQL and Microsoft SQLServer
select *
from users u
where u.ctid not in (
select min(ctid) as ctid
from users
group by user_name
order by ctid);
-- Solution 2: Using window function.
select user_id, user_name, email
from (
select *,
row_number() over (partition by user_name order by user_id) as rn
from users u
order by user_id) x
where x.rn <> 1;
2. Write a SQL query to fetch the second last record from employee table.
Table Name: EMPLOYEE
Approach: Using window function sort the data in descending order based on employee id. Provide a row number to each of the record and fetch the record having row number as 2.
EMPLOYEE
Expected Output
Again, there are several ways to write this query but this becomes very simple using a window function.
Try to write this query yourself before looking the query I have written to solve it.
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 2:
Write a SQL query to fetch the second last record from a employee table.
--Tables Structure:
drop table employee;
create table employee
( emp_ID int primary key
, emp_NAME varchar(50) not null
, DEPT_NAME varchar(50)
, SALARY int);
insert into employee values(101, 'Mohan', 'Admin', 4000);
insert into employee values(102, 'Rajkumar', 'HR', 3000);
insert into employee values(103, 'Akbar', 'IT', 4000);
insert into employee values(104, 'Dorvin', 'Finance', 6500);
insert into employee values(105, 'Rohit', 'HR', 3000);
insert into employee values(106, 'Rajesh', 'Finance', 5000);
insert into employee values(107, 'Preet', 'HR', 7000);
insert into employee values(108, 'Maryam', 'Admin', 4000);
insert into employee values(109, 'Sanjay', 'IT', 6500);
insert into employee values(110, 'Vasudha', 'IT', 7000);
insert into employee values(111, 'Melinda', 'IT', 8000);
insert into employee values(112, 'Komal', 'IT', 10000);
insert into employee values(113, 'Gautham', 'Admin', 2000);
insert into employee values(114, 'Manisha', 'HR', 3000);
insert into employee values(115, 'Chandni', 'IT', 4500);
insert into employee values(116, 'Satya', 'Finance', 6500);
insert into employee values(117, 'Adarsh', 'HR', 3500);
insert into employee values(118, 'Tejaswi', 'Finance', 5500);
insert into employee values(119, 'Cory', 'HR', 8000);
insert into employee values(120, 'Monica', 'Admin', 5000);
insert into employee values(121, 'Rosalin', 'IT', 6000);
insert into employee values(122, 'Ibrahim', 'IT', 8000);
insert into employee values(123, 'Vikram', 'IT', 8000);
insert into employee values(124, 'Dheeraj', 'IT', 11000);
select * from employee;
-- Solution:
select emp_id, emp_name, dept_name, salary
from (
select *,
row_number() over (order by emp_id desc) as rn
from employee e) x
where x.rn = 2;
3. Write a SQL query to display only the details of employees who either earn the highest salary or the lowest salary in each department from the employee table.
Table Name: EMPLOYEE
Approach: Write a sub query which will partition the data based on each department and then identify the record with maximum and minimum salary for each of the partitioned department. Finally, from the main query fetch only the data which matches the maximum and minimum salary returned from the sub query.
EMPLOYEE
Expected Output
Again there are many way to do this and also we can use a few window functions to achieve the same result. As an added challenge, try out solving this query using a different window function and then comment out your query.
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 3:
Write a SQL query to display only the details of employees who either earn the highest salary
or the lowest salary in each department from the employee table.
--Tables Structure:
drop table employee;
create table employee
( emp_ID int primary key
, emp_NAME varchar(50) not null
, DEPT_NAME varchar(50)
, SALARY int);
insert into employee values(101, 'Mohan', 'Admin', 4000);
insert into employee values(102, 'Rajkumar', 'HR', 3000);
insert into employee values(103, 'Akbar', 'IT', 4000);
insert into employee values(104, 'Dorvin', 'Finance', 6500);
insert into employee values(105, 'Rohit', 'HR', 3000);
insert into employee values(106, 'Rajesh', 'Finance', 5000);
insert into employee values(107, 'Preet', 'HR', 7000);
insert into employee values(108, 'Maryam', 'Admin', 4000);
insert into employee values(109, 'Sanjay', 'IT', 6500);
insert into employee values(110, 'Vasudha', 'IT', 7000);
insert into employee values(111, 'Melinda', 'IT', 8000);
insert into employee values(112, 'Komal', 'IT', 10000);
insert into employee values(113, 'Gautham', 'Admin', 2000);
insert into employee values(114, 'Manisha', 'HR', 3000);
insert into employee values(115, 'Chandni', 'IT', 4500);
insert into employee values(116, 'Satya', 'Finance', 6500);
insert into employee values(117, 'Adarsh', 'HR', 3500);
insert into employee values(118, 'Tejaswi', 'Finance', 5500);
insert into employee values(119, 'Cory', 'HR', 8000);
insert into employee values(120, 'Monica', 'Admin', 5000);
insert into employee values(121, 'Rosalin', 'IT', 6000);
insert into employee values(122, 'Ibrahim', 'IT', 8000);
insert into employee values(123, 'Vikram', 'IT', 8000);
insert into employee values(124, 'Dheeraj', 'IT', 11000);
select * from employee;
-- Solution:
select x.*
from employee e
join (select *,
max(salary) over (partition by dept_name) as max_salary,
min(salary) over (partition by dept_name) as min_salary
from employee) x
on e.emp_id = x.emp_id
and (e.salary = x.max_salary or e.salary = x.min_salary)
order by x.dept_name, x.salary;
4. From the doctors table, fetch the details of doctors who work in the same hospital but in different specialty.
Table Name: DOCTORS
Approach: Use self join to solve this problem. Self join is when you join a table to itself.
Additional Query: Write SQL query to fetch the doctors who work in same hospital irrespective of their specialty.
DOCTORS
Expected Output: Same hospital different speciality
Expected Output: Same hospital irrespective of speciality
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 4:
From the doctors table, fetch the details of doctors who work in the same hospital but in different speciality.
--Table Structure:
drop table doctors;
create table doctors
(
id int primary key,
name varchar(50) not null,
speciality varchar(100),
hospital varchar(50),
city varchar(50),
consultation_fee int
);
insert into doctors values
(1, 'Dr. Shashank', 'Ayurveda', 'Apollo Hospital', 'Bangalore', 2500),
(2, 'Dr. Abdul', 'Homeopathy', 'Fortis Hospital', 'Bangalore', 2000),
(3, 'Dr. Shwetha', 'Homeopathy', 'KMC Hospital', 'Manipal', 1000),
(4, 'Dr. Murphy', 'Dermatology', 'KMC Hospital', 'Manipal', 1500),
(5, 'Dr. Farhana', 'Physician', 'Gleneagles Hospital', 'Bangalore', 1700),
(6, 'Dr. Maryam', 'Physician', 'Gleneagles Hospital', 'Bangalore', 1500);
select * from doctors;
--Solution:
select d1.name, d1.speciality,d1.hospital
from doctors d1
join doctors d2
on d1.hospital = d2.hospital and d1.speciality <> d2.speciality
and d1.id <> d2.id;
--Sub Question:
Now find the doctors who work in same hospital irrespective of their speciality.
--Solution:
select d1.name, d1.speciality,d1.hospital
from doctors d1
join doctors d2
on d1.hospital = d2.hospital
and d1.id <> d2.id;
5. From the login_details table, fetch the users who logged in consecutively 3 or more times.
Table Name: LOGIN_DETAILS
Approach: We need to fetch users who have appeared 3 or more times consecutively in login details table. There is a window function which can be used to fetch data from the following record. Use that window function to compare the user name in current row with user name in the next row and in the row following the next row. If it matches then fetch those records.
LOGIN_DETAILS
Expected Output
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 5:
From the login_details table, fetch the users who logged in consecutively 3 or more times.
--Table Structure:
drop table login_details;
create table login_details(
login_id int primary key,
user_name varchar(50) not null,
login_date date);
delete from login_details;
insert into login_details values
(101, 'Michael', current_date),
(102, 'James', current_date),
(103, 'Stewart', current_date+1),
(104, 'Stewart', current_date+1),
(105, 'Stewart', current_date+1),
(106, 'Michael', current_date+2),
(107, 'Michael', current_date+2),
(108, 'Stewart', current_date+3),
(109, 'Stewart', current_date+3),
(110, 'James', current_date+4),
(111, 'James', current_date+4),
(112, 'James', current_date+5),
(113, 'James', current_date+6);
select * from login_details;
--Solution:
select distinct repeated_names
from (
select *,
case when user_name = lead(user_name) over(order by login_id)
and user_name = lead(user_name,2) over(order by login_id)
then user_name else null end as repeated_names
from login_details) x
where x.repeated_names is not null;
6. From the students table, write a SQL query to interchange the adjacent student names.
Note: If there are no adjacent student then the student name should stay the same.
Table Name: STUDENTS
Approach: Assuming id will be a sequential number always. If id is an odd number then fetch the student name from the following record. If id is an even number then fetch the student name from the preceding record. Try to figure out the window function which can be used to fetch the preceding the following record data.
If the last record is an odd number then it wont have any adjacent even number hence figure out a way to not interchange the last record data.
STUDENTS
Expected Output
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 6:
From the students table, write a SQL query to interchange the adjacent student names.
Note: If there are no adjacent student then the student name should stay the same.
--Table Structure:
drop table students;
create table students
(
id int primary key,
student_name varchar(50) not null
);
insert into students values
(1, 'James'),
(2, 'Michael'),
(3, 'George'),
(4, 'Stewart'),
(5, 'Robin');
select * from students;
--Solution:
select id,student_name,
case when id%2 <> 0 then lead(student_name,1,student_name) over(order by id)
when id%2 = 0 then lag(student_name) over(order by id) end as new_student_name
from students;
7. From the weather table, fetch all the records when London had extremely cold temperature for 3 consecutive days or more.
Note: Weather is considered to be extremely cold when its temperature is less than zero.
Table Name: WEATHER
Approach: First using a sub query identify all the records where the temperature was very cold and then use a main query to fetch only the records returned as very cold from the sub query. You will not only need to compare the records following the current row but also need to compare the records preceding the current row. And may also need to compare rows preceding and following the current row. Identify a window function which can do this comparison pretty easily.
WEATHER
Expected Output
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 7:
From the weather table, fetch all the records when London had extremely cold temperature for 3 consecutive days or more.
Note: Weather is considered to be extremely cold then its temperature is less than zero.
--Table Structure:
drop table weather;
create table weather
(
id int,
city varchar(50),
temperature int,
day date
);
delete from weather;
insert into weather values
(1, 'London', -1, to_date('2021-01-01','yyyy-mm-dd')),
(2, 'London', -2, to_date('2021-01-02','yyyy-mm-dd')),
(3, 'London', 4, to_date('2021-01-03','yyyy-mm-dd')),
(4, 'London', 1, to_date('2021-01-04','yyyy-mm-dd')),
(5, 'London', -2, to_date('2021-01-05','yyyy-mm-dd')),
(6, 'London', -5, to_date('2021-01-06','yyyy-mm-dd')),
(7, 'London', -7, to_date('2021-01-07','yyyy-mm-dd')),
(8, 'London', 5, to_date('2021-01-08','yyyy-mm-dd'));
select * from weather;
--Solution:
select id, city, temperature, day
from (
select *,
case when temperature < 0
and lead(temperature) over(order by day) < 0
and lead(temperature,2) over(order by day) < 0
then 'Y'
when temperature < 0
and lead(temperature) over(order by day) < 0
and lag(temperature) over(order by day) < 0
then 'Y'
when temperature < 0
and lag(temperature) over(order by day) < 0
and lag(temperature,2) over(order by day) < 0
then 'Y'
end as flag
from weather) x
where x.flag = 'Y';
8. From the following 3 tables (event_category, physician_speciality, patient_treatment), write a SQL query to get the histogram of specialties of the unique physicians who have done the procedures but never did prescribe anything.
Table Name: EVENT_CATEGORY, PHYSICIAN_SPECIALITY, PATIENT_TREATMENT
Approach: Using the patient treatment and event category table, identify all the physicians who have done “Prescription”. Have this recorded in a sub query.
Then in the main query join the patient treatment, event category and physician speciality table to identify all the physician who have done “Procedure”. From these physicians, remove those physicians you got from sub query to return the physicians who have done Procedure but never did Prescription.
PATIENT_TREATMENT
EVENT_CATEGORY
PHYSICIAN_SPECIALITY
Expected Output
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
-- Query 8:
From the following 3 tables (event_category, physician_speciality, patient_treatment),
write a SQL query to get the histogram of specialities of the unique physicians
who have done the procedures but never did prescribe anything.
--Table Structure:
drop table event_category;
create table event_category
(
event_name varchar(50),
category varchar(100)
);
drop table physician_speciality;
create table physician_speciality
(
physician_id int,
speciality varchar(50)
);
drop table patient_treatment;
create table patient_treatment
(
patient_id int,
event_name varchar(50),
physician_id int
);
insert into event_category values ('Chemotherapy','Procedure');
insert into event_category values ('Radiation','Procedure');
insert into event_category values ('Immunosuppressants','Prescription');
insert into event_category values ('BTKI','Prescription');
insert into event_category values ('Biopsy','Test');
insert into physician_speciality values (1000,'Radiologist');
insert into physician_speciality values (2000,'Oncologist');
insert into physician_speciality values (3000,'Hermatologist');
insert into physician_speciality values (4000,'Oncologist');
insert into physician_speciality values (5000,'Pathologist');
insert into physician_speciality values (6000,'Oncologist');
insert into patient_treatment values (1,'Radiation', 1000);
insert into patient_treatment values (2,'Chemotherapy', 2000);
insert into patient_treatment values (1,'Biopsy', 1000);
insert into patient_treatment values (3,'Immunosuppressants', 2000);
insert into patient_treatment values (4,'BTKI', 3000);
insert into patient_treatment values (5,'Radiation', 4000);
insert into patient_treatment values (4,'Chemotherapy', 2000);
insert into patient_treatment values (1,'Biopsy', 5000);
insert into patient_treatment values (6,'Chemotherapy', 6000);
select * from patient_treatment;
select * from event_category;
select * from physician_speciality;
-- Solution:
select ps.speciality, count(1) as speciality_count
from patient_treatment pt
join event_category ec on ec.event_name = pt.event_name
join physician_speciality ps on ps.physician_id = pt.physician_id
where ec.category = 'Procedure'
and pt.physician_id not in (select pt2.physician_id
from patient_treatment pt2
join event_category ec on ec.event_name = pt2.event_name
where ec.category in ('Prescription'))
group by ps.speciality;
9. Find the top 2 accounts with the maximum number of unique patients on a monthly basis.
Note: Prefer the account id with the least value in case of same number of unique patients
Table Name: PATIENT_LOGS
Approach: First convert the date to month format since we need the output specific to each month. Then group together all data based on each month and account id so you get the total no of patients belonging to each account per month basis.
Then rank this data as per no of patients in descending order and account id in ascending order so in case there are same no of patients present under multiple account if then the ranking will prefer the account if with lower value. Finally, choose upto 2 records only per month to arrive at the final output.
PATIENT_LOGS
Expected Output
Click on the download button below to download the .txt file which will have the table structure, table data and the solved SQL Query.
“`– Query 9:
Find the top 2 accounts with the maximum number of unique patients on a monthly basis.
Note: Prefer the account if with the least value in case of same number of unique patients
–Table Structure:
drop table patient_logs;
create table patient_logs
(
account_id int,
date date,
patient_id int
);
insert into patient_logs values (1, to_date(’02-01-2020′,’dd-mm-yyyy’), 100);
insert into patient_logs values (1, to_date(’27-01-2020′,’dd-mm-yyyy’), 200);
insert into patient_logs values (2, to_date(’01-01-2020′,’dd-mm-yyyy’), 300);
insert into patient_logs values (2, to_date(’21-01-2020′,’dd-mm-yyyy’), 400);
insert into patient_logs values (2, to_date(’21-01-2020′,’dd-mm-yyyy’), 300);
insert into patient_logs values (2, to_date(’01-01-2020′,’dd-mm-yyyy’), 500);
insert into patient_logs values (3, to_date(’20-01-2020′,’dd-mm-yyyy’), 400);
insert into patient_logs values (1, to_date(’04-03-2020′,’dd-mm-yyyy’), 500);
insert into patient_logs values (3, to_date(’20-01-2020′,’dd-mm-yyyy’), 450);
select * from patient_logs;
— Solution:
select a.month, a.account_id, a.no_of_unique_patients
from (
select x.month, x.account_id, no_of_unique_patients,
row_number() over (partition by x.month order by x.no_of_unique_patients desc) as rn
from (
select pl.month, pl.account_id, count(1) as no_of_unique_patients
from (select distinct to_char(date,’month’) as month, account_id, patient_id
from patient_logs) pl
group by pl.month, pl.account_id) x
) a
where a.rn < 3;
“`
10. SQL Query to fetch “N” consecutive records from a table based on a certain condition
Note: Write separate queries to satisfy following scenarios:
10a. when the table has a primary key
10b. When table does not have a primary key
10c. Query logic based on data field
10a. when the table has a primary key
Table Name: WEATHER
WEATHER table Data
10a: Expected Output
10b. When table does not have a primary key
Table Name: VW_WEATHER
VW_WEATHER table data
10b: Expected Output
10c. Query logic based on data field
Table Name: ORDERS
ORDERS table data
10c: Expected Output