607. Sales Person
Description
Pandas Schema
data = [[1, 'John', 100000, 6, '4/1/2006'], [2, 'Amy', 12000, 5, '5/1/2010'], [3, 'Mark', 65000, 12, '12/25/2008'], [4, 'Pam', 25000, 25, '1/1/2005'], [5, 'Alex', 5000, 10, '2/3/2007']]
sales_person = pd.DataFrame(data, columns=['sales_id', 'name', 'salary', 'commission_rate', 'hire_date']).astype({'sales_id':'Int64', 'name':'object', 'salary':'Int64', 'commission_rate':'Int64', 'hire_date':'datetime64[ns]'})
data = [[1, 'RED', 'Boston'], [2, 'ORANGE', 'New York'], [3, 'YELLOW', 'Boston'], [4, 'GREEN', 'Austin']]
company = pd.DataFrame(data, columns=['com_id', 'name', 'city']).astype({'com_id':'Int64', 'name':'object', 'city':'object'})
data = [[1, '1/1/2014', 3, 4, 10000], [2, '2/1/2014', 4, 5, 5000], [3, '3/1/2014', 1, 1, 50000], [4, '4/1/2014', 1, 4, 25000]]
orders = pd.DataFrame(data, columns=['order_id', 'order_date', 'com_id', 'sales_id', 'amount']).astype({'order_id':'Int64', 'order_date':'datetime64[ns]', 'com_id':'Int64', 'sales_id':'Int64', 'amount':'Int64'})
Table: SalesPerson
+-----------------+---------+ | Column Name | Type | +-----------------+---------+ | sales_id | int | | name | varchar | | salary | int | | commission_rate | int | | hire_date | date | +-----------------+---------+ sales_id is the primary key (column with unique values) for this table. Each row of this table indicates the name and the ID of a salesperson alongside their salary, commission rate, and hire date.
Table: Company
+-------------+---------+ | Column Name | Type | +-------------+---------+ | com_id | int | | name | varchar | | city | varchar | +-------------+---------+ com_id is the primary key (column with unique values) for this table. Each row of this table indicates the name and the ID of a company and the city in which the company is located.
Table: Orders
+-------------+------+ | Column Name | Type | +-------------+------+ | order_id | int | | order_date | date | | com_id | int | | sales_id | int | | amount | int | +-------------+------+ order_id is the primary key (column with unique values) for this table. com_id is a foreign key (reference column) to com_id from the Company table. sales_id is a foreign key (reference column) to sales_id from the SalesPerson table. Each row of this table contains information about one order. This includes the ID of the company, the ID of the salesperson, the date of the order, and the amount paid.
Write a solution to find the names of all the salespersons who did not have any orders related to the company with the name “RED”.
Return the result table in any order.
The result format is in the following example.
Example 1:
- Input:
SalesPerson table: +----------+------+--------+-----------------+------------+ | sales_id | name | salary | commission_rate | hire_date | +----------+------+--------+-----------------+------------+ | 1 | John | 100000 | 6 | 4/1/2006 | | 2 | Amy | 12000 | 5 | 5/1/2010 | | 3 | Mark | 65000 | 12 | 12/25/2008 | | 4 | Pam | 25000 | 25 | 1/1/2005 | | 5 | Alex | 5000 | 10 | 2/3/2007 | +----------+------+--------+-----------------+------------+ Company table: +--------+--------+----------+ | com_id | name | city | +--------+--------+----------+ | 1 | RED | Boston | | 2 | ORANGE | New York | | 3 | YELLOW | Boston | | 4 | GREEN | Austin | +--------+--------+----------+ Orders table: +----------+------------+--------+----------+--------+ | order_id | order_date | com_id | sales_id | amount | +----------+------------+--------+----------+--------+ | 1 | 1/1/2014 | 3 | 4 | 10000 | | 2 | 2/1/2014 | 4 | 5 | 5000 | | 3 | 3/1/2014 | 1 | 1 | 50000 | | 4 | 4/1/2014 | 1 | 4 | 25000 | +----------+------------+--------+----------+--------+
- Output:
+------+ | name | +------+ | Amy | | Mark | | Alex | +------+
- Explanation:
According to orders 3 and 4 in the Orders table, it is easy to tell that only salesperson John and Pam have sales to company RED, so we report all the other names in the table salesperson.
💡 Hint 1:
You need to query who sold to company 'RED' first, then output the sales person who is not in the first query result.
Submitted Code
import pandas as pd
def sales_person(sales_person: pd.DataFrame, company: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
# orders와 company 데이터프레임 merge
orders_with_company = pd.merge(orders, company, on='com_id', how='left')
# 회사 이름이 RED인 행만 필터링
orders_with_RED = orders_with_company.loc[orders_with_company.name == 'RED']['sales_id']
# sales_person 데이터프레임에서 sales_id가 orders_with_RED 있지 않은 행만 필터링
person_not_with_RED = sales_person[~(sales_person['sales_id'].isin(orders_with_RED))][['name']]
return person_not_with_RED
Runtime: 353 ms | Beats 83.28%
Memory: 68.98 MB | Beats 18.62%
~
연산자로 isin()
의 반대 조건을 지정했다.
Other Solutions
1st
def sales_person(sales_person: pd.DataFrame, company: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
return sales_person[
~sales_person['sales_id'].isin(
orders[orders['com_id'].isin(
company[company['name'] == 'RED']['com_id']
)]['sales_id']
)
][['name']]
데이터프레임끼리 합치지 않고 필터링하는 방법도 참고했다.
2nd
def sales_person(sales_person: pd.DataFrame, company: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
return sales_person[
~sales_person['sales_id'].isin(
pd.merge(
left=orders,
right=company[company['name'] == 'RED'],
on='com_id',
how='inner',
)['sales_id'].unique()
)
][['name']]
merge하기 전에 company 데이터프레임을 필터링해서 넣는 방법도 있다. join 방식은 inner로 지정해서 com_id가 한 쪽에만 있는 행은 제거하도록 했다.
3rd
SELECT name FROM Salesperson -- 3. 필터링한 sales_id에 없는 판매자 찾기
WHERE sales_id NOT IN
(SELECT sales_id FROM Orders -- 2. RED와 거래한 sales_id 찾기
WHERE com_id=(SELECT com_id FROM Company WHERE name='RED')); -- 1. 이름이 RED인 회사 com_id 찾기
MySQL로도 JOIN없이 조건을 만족하는 행만 찾을 수 있다.