596. Classes With at Least 5 Students
Description
Pandas Schema
data = [['A', 'Math'], ['B', 'English'], ['C', 'Math'], ['D', 'Biology'], ['E', 'Math'], ['F', 'Computer'], ['G', 'Math'], ['H', 'Math'], ['I', 'Math']]
courses = pd.DataFrame(data, columns=['student', 'class']).astype({'student':'object', 'class':'object'})
Table: Courses
+-------------+---------+ | Column Name | Type | +-------------+---------+ | student | varchar | | class | varchar | +-------------+---------+ (student, class) is the primary key (combination of columns with unique values) for this table. Each row of this table indicates the name of a student and the class in which they are enrolled.
Write a solution to find all the classes that have at least five students.
Return the result table in any order.
The result format is in the following example.
Example 1:
- Input:
Courses table: +---------+----------+ | student | class | +---------+----------+ | A | Math | | B | English | | C | Math | | D | Biology | | E | Math | | F | Computer | | G | Math | | H | Math | | I | Math | +---------+----------+
- Output:
+---------+ | class | +---------+ | Math | +---------+
- Explanation:
- Math has 6 students, so we include it.
- English has 1 student, so we do not include it.
- Biology has 1 student, so we do not include it.
- Computer has 1 student, so we do not include it.
Submitted Code
import pandas as pd
def find_classes(courses: pd.DataFrame) -> pd.DataFrame:
groups = courses.groupby('class', as_index=False).count()
return groups[groups['student'] >= 5][['class']]
Runtime: 255 ms | Beats 88.53%
Memory: 68.06 MB | Beats 62.98%
groupby()와 count()를 사용했다.
Other Solutions
1st
def find_classes(courses: pd.DataFrame) -> pd.DataFrame:
df = courses.groupby('class')['class'].size().reset_index(name='count')
return df[df['count']>=5][['class']]
count() 대신 size()로 그룹별 행 갯수를 셀 수도 있다.
2nd
def find_classes(courses: pd.DataFrame) -> pd.DataFrame:
return courses['class'].value_counts()[(courses['class'].value_counts()>4)].index.to_frame()
value_counts()로 class 열에서 각 클래스가 등장한 횟수를 셀 경우 클래스 이름은 인덱스로, 등장 횟수는 값으로 반환된다. 5 이상만 필터링한 후 인덱스만 가져와서 이를 DataFrame으로 변환했다. 참고로 courses['class'].value_counts()
까지를 변수에 저장하면 함수 중복 호출을 줄일 수 있다.
3rd
SELECT class
FROM Courses
GROUP BY class
HAVING COUNT(student) >= 5;
MySQL에서는 GROUP BY와 HAVING을 사용한다.