Understanding the Structure and Statistics of a Whole DataFrame
1. Inspecting DataFrame contents and structure
df.info() # index & data types
dfh = df.head(k) # first k rows
dft = df.tail(k) # last k rows
dfs = df.describe() # summary stats per column
top_left_corner_df = df.iloc[:4, :4]
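To try the inspection calls above on something concrete, here is a minimal sketch. The sample frame and its column names (name, score, age) are made up for illustration and are not from the original cheat sheet.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the calls above
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "score": [90.0, 85.5, 70.0, 60.5, 99.0],
    "age": [21, 35, 28, 42, 19],
})

df.info()                 # prints the index range, column dtypes and memory usage
dfh = df.head(2)          # first 2 rows
dft = df.tail(2)          # last 2 rows
dfs = df.describe()       # summary stats; for a mixed frame only numeric columns appear
corner = df.iloc[:4, :2]  # first 4 rows and first 2 columns by position
```

Note that info() prints its report and returns None, so there is nothing useful to assign from it.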
2. DataFrame non-indexing attributes
df = df.T # transpose rows and cols
l = df.axes # list of row and col indexes
(r_idx, c_idx) = df.axes # unpack the same list
s = df.dtypes # Series column data types
b = df.empty # True if the DataFrame is empty
i = df.ndim # number of axes (it is 2)
t = df.shape # (row-count, column-count)
i = df.size # row-count * column-count
a = df.values # numpy array for df (df.to_numpy() is preferred)
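The attributes in this section only read information from the frame; none of them changes it. A small sketch, assuming a made-up 3x2 frame with columns x and y:

```python
import pandas as pd

# Hypothetical 3-row, 2-column frame
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

row_idx, col_idx = df.axes   # [row index, column index]
is_empty = df.empty          # False here, because the frame holds data
n_axes = df.ndim             # 2 for any DataFrame
shape = df.shape             # (rows, columns)
n_cells = df.size            # rows * columns
arr = df.to_numpy()          # 2-D NumPy array of the values
transposed = df.T            # columns become rows
```

df.empty is a boolean check, not a constructor; an empty frame would come from pd.DataFrame().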
3. DataFrame utility methods
df = df.copy() # copy a DataFrame
df = df.rank() # rank each col (default)
df = df.sort_values(by=col)
df = df.sort_values(by=[col1, col2])
df = df.sort_index()
df = df.astype(dtype) # type conversion
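A short sketch of the utility methods above. The city/temp data is hypothetical, and the index is deliberately out of order so the effect of sort_index() is visible.

```python
import pandas as pd

# Hypothetical data with a shuffled integer index
df = pd.DataFrame(
    {"city": ["B", "A", "C"], "temp": [20, 25, 20]},
    index=[2, 0, 1],
)

backup = df.copy()                             # independent copy of the data
by_temp = df.sort_values(by=["temp", "city"])  # sort by temp, break ties by city
by_index = df.sort_index()                     # sort by row labels
ranks = df["temp"].rank()                      # tied values share the average rank
as_float = df.astype({"temp": "float64"})      # convert a single column's dtype
```

All of these return new objects, which is why the cheat sheet assigns the result back to df.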
4. Iterating over a DataFrame
df.items() # (col-index, Series) pairs; iteritems() was removed in pandas 2.0
df.iterrows() # (row-index, Series) pairs
for (name, series) in df.items():
    print('\nCol name: ' + str(name))
    print('1st value: ' + str(series.iat[0]))
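The two iteration styles can be compared side by side on a tiny frame. This sketch uses df.items(), the current name for column iteration (iteritems() no longer exists in pandas 2.0 and later); the two-column frame is made up.

```python
import pandas as pd

# Hypothetical 2x2 frame
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Column-wise: each step yields (column label, column as a Series)
col_names = [name for name, series in df.items()]

# Row-wise: each step yields (row label, row as a Series)
row_sums = {idx: int(row.sum()) for idx, row in df.iterrows()}
```

Row iteration builds a new Series per row, so for large frames a vectorized call such as df.sum(axis=1) is usually the faster choice.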
5. Math and statistics over a whole DataFrame
df = df.abs() # absolute values
df = df.add(o) # add df, Series or value
s = df.count() # non NA/null values
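df = df.cummax() # cumulative max down each col (axis=0 default)
df = df.cummin() # cumulative min down each col (axis=0 default)
df = df.cumsum() # cumulative sum down each col (axis=0 default)
df = df.diff() # 1st difference down each col (axis=0 default)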
df = df.div(o) # div by df, Series, value
df = df.dot(o) # matrix dot product
s = df.max() # max of axis (col def)
s = df.mean() # mean (col default axis)
s = df.median() # median (col default)
s = df.min() # min of axis (col def)
df = df.mul(o) # mul by df Series val
s = df.sum() # sum axis (cols default)
df = df.where(df > 0.5, other=np.nan) # keep values > 0.5, NaN elsewhere; needs import numpy as np
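Most of the reductions above work per column by default and skip missing values. A minimal sketch, assuming a made-up frame with one NaN so that the skipping is visible:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a negative value and one missing value
df = pd.DataFrame({"a": [1.0, -2.0, 3.0], "b": [0.2, 0.6, np.nan]})

abs_df = df.abs()            # element-wise absolute values
counts = df.count()          # non-missing values per column
col_sum = df.sum()           # per-column sum, NaN skipped
row_sum = df.sum(axis=1)     # per-row sum instead
running = df.cumsum()        # running total down each column
masked = df.where(df > 0.5)  # keep values > 0.5; 'other' defaults to NaN
```

Passing axis=1 switches a reduction from per-column to per-row, as row_sum shows.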
6. Selecting/filtering rows and columns by index label values
df = df.filter(items=['a', 'b']) # by col
df = df.filter(items=[5], axis=0) # by row
df = df.filter(like='x') # keep cols whose label contains 'x'
df = df.filter(regex='x') # keep cols whose label matches the regex
df = df.loc[lambda d: d.index % 5 == 0] # every 5th row; df.select() was removed in pandas 1.0
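filter() selects by label, never by cell value. The sketch below uses hypothetical column names (ax, bx, c) and a default integer index. Because df.select() is gone from pandas 1.0 onward, the "every 5th row" case is written here with a boolean mask on the index, which is one possible replacement rather than the only one.

```python
import pandas as pd

# Hypothetical frame: 10 rows with the default 0..9 integer index
df = pd.DataFrame(
    {"ax": range(10), "bx": range(10, 20), "c": range(20, 30)}
)

cols_ac = df.filter(items=["ax", "c"])  # columns with exactly these labels
row_5 = df.filter(items=[5], axis=0)    # row whose label is 5
has_x = df.filter(like="x")             # column labels containing "x"
starts_a = df.filter(regex="^a")        # column labels matching the regex
every_5th = df.loc[df.index % 5 == 0]   # rows whose integer label is a multiple of 5
```

The mask approach assumes an integer index; for other index types, positional slicing such as df.iloc[::5] would be the closer fit.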
Reference: https://www.geeksforgeeks.org/pandas-cheat-sheet/