
[23.06.14] Python Seaborn - 09(2)

gmwoo 2023. 6. 14. 17:45

In [6]:

import pandas as pd

In [8]:

pd.read_excel("../data/teenage_mental.xls")

Out[8]:

	기간	구분	스트레스 인지율	스트레스 인지율.1	스트레스 인지율.2	우울감 경험률	우울감 경험률.1	우울감 경험률.2	자살 생각률	자살 생각률.1	자살 생각률.2
0	기간	구분	전체	남학생	여학생	전체	남학생	여학생	전체	남학생	여학생
1	2018	구분	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2

In [9]:

pd.read_excel("../data/teenage_mental.xls",header=1)

Out[9]:

	기간	구분	전체	남학생	여학생	전체.1	남학생.1	여학생.1	전체.2	남학생.2	여학생.2
0	2018	구분	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2

In [10]:

pd.read_excel("../data/teenage_mental.xls",header=1, usecols="C:K")

Out[10]:

	전체	남학생	여학생	전체.1	남학생.1	여학생.1	전체.2	남학생.2	여학생.2
0	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2

In [11]:

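# Short column names, in file order:
# stress (all / boys / girls), depression (all / boys / girls),
# suicidal thoughts (all / boys / girls)
col_names = ['스트레스', '스트레스남학생', '스트레스여학생', '우울감경험률',
             '우울남학생', '우울여학생', '자살생각율', '자살남학생', '자살여학생']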

pd.read_excel("../data/teenage_mental.xls",header=1, usecols="C:K", names=col_names)

Out[11]:

	스트레스	스트레스남학생	스트레스여학생	우울감경험률	우울남학생	우울여학생	자살생각율	자살남학생	자살여학생
0	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2
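
The header and usecols options used above can be tried without the spreadsheet. The sketch below is only an illustration under assumptions: it swaps in pd.read_csv on an in-memory string (the sample text and the English column names are made up for the demo, with values copied from the output above), selects columns by position instead of the Excel range "C:K", and renames the columns after loading rather than through names=.

```python
import io
import pandas as pd

# Hypothetical stand-in for the spreadsheet: two header rows, one data row.
raw_text = (
    "stress,stress,stress,depression,depression,depression\n"
    "all,boys,girls,all,boys,girls\n"
    "42.7,34.5,51.5,29.6,24.2,35.4\n"
)

# header=1 uses the second row as column names and drops the row above it.
# Duplicate names are made unique by appending .1, .2, ...
demo = pd.read_csv(io.StringIO(raw_text), header=1)
print(list(demo.columns))

# usecols keeps a subset of columns (by position here); assigning to
# .columns then replaces the mangled names with clearer ones.
subset = pd.read_csv(io.StringIO(raw_text), header=1, usecols=[0, 1, 2])
subset.columns = ["stress_all", "stress_boys", "stress_girls"]
print(subset)
```

This also shows where names such as 전체.1 and 전체.2 in Out[9] come from: pandas appends a numeric suffix whenever a header repeats.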

In [12]:

raw_data = pd.read_excel("../data/teenage_mental.xls",
                          header=1,
                          usecols="C:K",
                          names=col_names)

raw_data

Out[12]:

	스트레스	스트레스남학생	스트레스여학생	우울감경험률	우울남학생	우울여학생	자살생각율	자살남학생	자살여학생
0	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2

In [13]:

# Row 0 holds the share that answered "yes"; add row 1 as its complement ("no")
raw_data.loc[1] = 100. - raw_data.loc[0]
raw_data

Out[13]:

	스트레스	스트레스남학생	스트레스여학생	우울감경험률	우울남학생	우울여학생	자살생각율	자살남학생	자살여학생
0	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2
1	57.3	65.5	48.5	70.4	75.8	64.6	84.6	88.2	80.8

In [14]:

# '응답' = response; '그렇다' = yes, '아니다' = no
raw_data['응답'] = ['그렇다', '아니다']
raw_data

Out[14]:

	스트레스	스트레스남학생	스트레스여학생	우울감경험률	우울남학생	우울여학생	자살생각율	자살남학생	자살여학생	응답
0	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2	그렇다
1	57.3	65.5	48.5	70.4	75.8	64.6	84.6	88.2	80.8	아니다

In [16]:

raw_data.set_index('응답', inplace=True)
raw_data

Out[16]:

	스트레스	스트레스남학생	스트레스여학생	우울감경험률	우울남학생	우울여학생	자살생각율	자살남학생	자살여학생
응답
그렇다	42.7	34.5	51.5	29.6	24.2	35.4	15.4	11.8	19.2
아니다	57.3	65.5	48.5	70.4	75.8	64.6	84.6	88.2	80.8
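
The steps from In [13] to In [16] can also be written as one short chain. This is a minimal sketch, assuming a one-row frame shaped like raw_data; the English column names and the yes/no labels are placeholders chosen for the demo, not names from the original file.

```python
import pandas as pd

# Hypothetical one-row frame standing in for raw_data (values from the post).
rates = pd.DataFrame({"stress": [42.7], "depression": [29.6]})

# Stack the original row on top of its complement (100 - rate), then
# label the two rows through a named index in a single step.
answers = pd.concat([rates, 100 - rates], ignore_index=True)
answers.index = pd.Index(["yes", "no"], name="answer")

print(answers.round(1))
```

Building the index directly avoids the temporary label column and the inplace=True call, which some readers find harder to follow.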

In [19]:

# Use a Korean-capable font (Malgun Gothic on Windows) so Hangul labels render in plots
from matplotlib import font_manager, rc
f_path = "C:/Windows/Fonts/malgun.ttf"
font_name = font_manager.FontProperties(fname=f_path).get_name()
rc('font', family=font_name)
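
The font path above only exists on Windows. As a hedged alternative, the sketch below looks through the fonts matplotlib has already registered and picks the first match from a candidate list. The helper name pick_korean_font and the candidate names are assumptions for illustration; which fonts are actually installed depends on the machine.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib import font_manager


def pick_korean_font(candidates=("Malgun Gothic", "AppleGothic", "NanumGothic")):
    """Return the first candidate font that matplotlib knows about, else None."""
    installed = {font.name for font in font_manager.fontManager.ttflist}
    for name in candidates:
        if name in installed:
            return name
    return None


font_name = pick_korean_font()
if font_name is not None:
    plt.rcParams["font.family"] = font_name

# Keep minus signs readable when a CJK font is active.
plt.rcParams["axes.unicode_minus"] = False
print(font_name)
```

If the function returns None, no Korean font was found and Hangul labels will likely show up as empty boxes, so installing one of the candidates is the next step.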

In [20]:

raw_data['스트레스'].plot.pie()

Out[20]:

<Axes: ylabel='스트레스'>

In [22]:

raw_data['스트레스'].plot.pie(explode=[0,0.02])

Out[22]:

<Axes: ylabel='스트레스'>

In [24]:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

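matplotlib.pyplot.subplots returns a figure object and an array of axes objects.
Unpacking the result as f, ax lets you set the properties of the figure and of each axes one by one.
With three subplots, ax[0], ax[1], and ax[2] can each be configured separately.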
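
A small sketch of what that call hands back, assuming a 1 x 3 layout; the panel titles are placeholders for the demo:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# One row, three columns: axes comes back as a 1-D array of three Axes.
fig, axes = plt.subplots(1, 3, figsize=(9, 3))
print(type(fig).__name__, axes.shape)

# Each element is an independent Axes with its own title, labels, and data.
for i, axis in enumerate(axes):
    axis.set_title(f"panel {i}")

# Figure-level properties are set on fig rather than on a single Axes.
fig.suptitle("three panels")
plt.close(fig)
```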

In [26]:

f, ax = plt.subplots(1,3, figsize=(16,8))

# Title: "Have felt stressed"
raw_data['스트레스'].plot.pie(explode=[0,0.02], ax=ax[0], autopct='%1.1f%%')
ax[0].set_title('스트레스를 받은적 있다')
ax[0].set_ylabel('')  # hide the default y-label (the column name)

# Title: "Have experienced depression"
raw_data['우울감경험률'].plot.pie(explode=[0,0.02], ax=ax[1], autopct='%1.1f%%')
ax[1].set_title('우울증을 경험한적 있다')
ax[1].set_ylabel('')

# Title: "Have considered suicide"
raw_data['자살생각율'].plot.pie(explode=[0,0.02], ax=ax[2], autopct='%1.1f%%')
ax[2].set_title('자살을 고민한적 있다')
ax[2].set_ylabel('')

plt.show()
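
The three pie calls differ only in the column and the title, so they can be driven by a loop. A minimal sketch, assuming a two-row yes/no table like raw_data; the frame below is a stand-in with English labels made up for the demo:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for raw_data (values from the post, English labels).
answers = pd.DataFrame(
    {
        "stress": [42.7, 57.3],
        "depression": [29.6, 70.4],
        "suicidal thoughts": [15.4, 84.6],
    },
    index=["yes", "no"],
)

fig, axes = plt.subplots(1, 3, figsize=(16, 8))

# Pair each Axes with one column and draw the same kind of pie on it.
for axis, column in zip(axes, answers.columns):
    answers[column].plot.pie(explode=[0, 0.02], ax=axis, autopct="%1.1f%%")
    axis.set_title(column)
    axis.set_ylabel("")

plt.close(fig)
```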

Exercise) Statistics by age group

Statistics by age group

In [57]:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

In [2]:

from matplotlib import font_manager, rc
f_path = "C:/Windows/Fonts/malgun.ttf"
font_name = font_manager.FontProperties(fname=f_path).get_name()
rc('font', family=font_name)

In [6]:

df = pd.read_excel("../../data/seoul.xls")
df.head()

Out[6]:

	기간	대분류	분류	합계	TV 또는 비디오시청	여행야외나들이	컴퓨터게임인터넷검색 등	휴식	사회활동	운동	문화예술관람	창작적 취미활동	운동경기관람	종교활동	기타
0	2016	서울시	서울시	100	43.0	13.2	6.9	6.5	1.2	6.1	7.5	4.0	2.9	8.5	0.1
1	2016	성별	남자	100	43.7	13.1	7.6	6.0	0.7	8.5	6.6	3.5	3.5	6.5	0.1
2	2016	성별	여자	100	42.4	13.2	6.3	7.0	1.6	3.8	8.5	4.5	2.2	10.4	0.1
3	2016	연령별	10대	100	52.5	12.8	2.1	5.6	1.0	7.9	3.0	2.6	2.6	9.9	0
4	2016	연령별	20대	100	50.1	15.1	3.5	5.3	1.3	7.2	3.1	2.5	2.4	9.4	0.1

In [4]:

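# Keep only the rows broken down by age group ('연령별'); copy so later edits do not touch df
data = df.loc[df['대분류'] == '연령별'].copy()
# Label the rows by age group: teens, 20s, 30s, 40s, 50s, 60 and over
data.index = ['10대', '20대', '30대', '40대', '50대', '60대 이상']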
data

Out[4]:

	기간	대분류	분류	합계	TV 또는 비디오시청	여행야외나들이	컴퓨터게임인터넷검색 등	휴식	사회활동	운동	문화예술관람	창작적 취미활동	운동경기관람	종교활동	기타
10대	2016	연령별	10대	100	52.5	12.8	2.1	5.6	1.0	7.9	3.0	2.6	2.6	9.9	0
20대	2016	연령별	20대	100	50.1	15.1	3.5	5.3	1.3	7.2	3.1	2.5	2.4	9.4	0.1
30대	2016	연령별	30대	100	42.4	15.6	5.3	5.3	1.9	7.6	6.2	4.4	2.8	8.4	0.1
40대	2016	연령별	40대	100	38.3	16.0	9.8	6.7	1.4	5.2	6.3	5.2	3.5	7.6	0.1
50대	2016	연령별	50대	100	44.7	10.7	8.6	7.8	0.8	4.3	7.4	3.1	2.5	10.1	0.1
60대 이상	2016	연령별	60대 이상	100	38.6	9.5	8.3	7.4	0.8	5.8	14.3	5.0	3.1	7.1	0.1
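
The index labels typed by hand in In [4] already exist in the 분류 column, so they can be reused instead of retyped. A hedged sketch on a made-up miniature of the table (the English column names and values layout are assumptions for the demo):

```python
import pandas as pd

# Hypothetical miniature of the survey table (English names for the demo).
survey = pd.DataFrame(
    {
        "category": ["city", "sex", "sex", "age", "age"],
        "group": ["Seoul", "male", "female", "10s", "20s"],
        "tv": [43.0, 43.7, 42.4, 52.5, 50.1],
    }
)

# Filter the age rows, then reuse the group column as the index.
by_age = survey.loc[survey["category"] == "age"].set_index("group")
print(by_age)
```

This keeps working if the file gains or loses an age group, whereas a hard-coded list of six labels would raise a length mismatch error.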

In [5]:

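f, ax = plt.subplots(1,3, figsize=(16,8))
explode=[0.02,0.02,0.02,0.02,0.02,0.02]  # small gap around each of the six wedges

# TV or video watching by age group.
# Note the two spaces after 'TV': the key must match the column header in the
# file exactly, so adjust it if df.columns shows a single space.
data['TV  또는 비디오시청'].plot.pie(explode=explode, ax=ax[0], autopct='%1.1f%%')
ax[0].set_title('연령별 TV 또는 비디오시청 통계')
ax[0].set_ylabel('')

# Religious activities by age group
data['종교활동'].plot.pie(explode=explode, ax=ax[1], autopct='%1.1f%%')
ax[1].set_title('연령별 종교활동 통계')
ax[1].set_ylabel('')

# Attending cultural and arts events by age group
data['문화예술관람'].plot.pie(explode=explode, ax=ax[2], autopct='%1.1f%%')
ax[2].set_title('연령별 문화예술관람 통계')
ax[2].set_ylabel('')

plt.show()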
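
One thing to watch in these charts: autopct prints each wedge's share of the column total, and the six age-group values do not add up to 100. The TV column sums to 266.6, so the teens wedge is labelled about 19.7% rather than 52.5%. If the original survey values should appear on the chart, autopct also accepts a function. The sketch below is one way to do that; the helper name raw_value_label and the English age labels are assumptions for the demo.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# TV-watching rates by age group, copied from the table above.
tv = pd.Series(
    [52.5, 50.1, 42.4, 38.3, 44.7, 38.6],
    index=["10s", "20s", "30s", "40s", "50s", "60+"],
)


def raw_value_label(values):
    """Build an autopct callback that prints the original value, not the wedge share."""
    total = values.sum()

    def label(pct):
        # pct is the wedge's share of the total, in percent; undo that scaling.
        return f"{pct * total / 100:.1f}%"

    return label


fig, axis = plt.subplots(figsize=(6, 6))
tv.plot.pie(ax=axis, autopct=raw_value_label(tv))
axis.set_ylabel("")
plt.close(fig)
```

A grouped bar chart may be a better fit for this kind of data, since the values are rates within each age group rather than parts of one whole.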
