Clone-coding the pandas code I studied a while back! Part 2
Let's quickly run through it again as if I had known it all along 😉
In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
# Widen the notebook so it uploads cleanly to Tistory :-)
🍒 Clone-coding "Data Science for Everyone" (모두를 위한 데이터사이언스) - 2 🍒
Loading the libraries
In [5]:
import pandas as pd
import seaborn as sns
In [6]:
pd.__version__
Out[6]:
'1.3.5'
In [7]:
sns.__version__
Out[7]:
'0.11.2'
Loading the dataset
In [8]:
df = sns.load_dataset("mpg")
In [9]:
df
Out[9]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
---|---|---|---|---|---|---|---|---|---|
0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | usa | chevrolet chevelle malibu |
1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | usa | buick skylark 320 |
2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | usa | plymouth satellite |
3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | usa | amc rebel sst |
4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | usa | ford torino |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
393 | 27.0 | 4 | 140.0 | 86.0 | 2790 | 15.6 | 82 | usa | ford mustang gl |
394 | 44.0 | 4 | 97.0 | 52.0 | 2130 | 24.6 | 82 | europe | vw pickup |
395 | 32.0 | 4 | 135.0 | 84.0 | 2295 | 11.6 | 82 | usa | dodge rampage |
396 | 28.0 | 4 | 120.0 | 79.0 | 2625 | 18.6 | 82 | usa | ford ranger |
397 | 31.0 | 4 | 119.0 | 82.0 | 2720 | 19.4 | 82 | usa | chevy s-10 |
398 rows × 9 columns
Viewing part of the dataset
In [10]:
df.head(3)
Out[10]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
---|---|---|---|---|---|---|---|---|---|
0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | usa | chevrolet chevelle malibu |
1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | usa | buick skylark 320 |
2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | usa | plymouth satellite |
In [11]:
df.tail(2)
Out[11]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
---|---|---|---|---|---|---|---|---|---|
396 | 28.0 | 4 | 120.0 | 79.0 | 2625 | 18.6 | 82 | usa | ford ranger |
397 | 31.0 | 4 | 119.0 | 82.0 | 2720 | 19.4 | 82 | usa | chevy s-10 |
In [12]:
df.sample(3)
Out[12]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
---|---|---|---|---|---|---|---|---|---|
94 | 13.0 | 8 | 440.0 | 215.0 | 4735 | 11.0 | 73 | usa | chrysler new yorker brougham |
389 | 22.0 | 6 | 232.0 | 112.0 | 2835 | 14.7 | 82 | usa | ford granada l |
176 | 19.0 | 6 | 232.0 | 90.0 | 3211 | 17.0 | 75 | usa | amc pacer |
In [14]:
df.sample(2,random_state=42)
Out[14]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name |
---|---|---|---|---|---|---|---|---|---|
198 | 33.0 | 4 | 91.0 | 53.0 | 1795 | 17.4 | 76 | japan | honda civic |
396 | 28.0 | 4 | 120.0 | 79.0 | 2625 | 18.6 | 82 | usa | ford ranger |
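The two sample() calls above differ only in random_state. Here is a minimal sketch of what that argument does, using a small made-up frame (toy is a stand-in for illustration, not the real mpg data):

```python
import pandas as pd

# Hypothetical toy frame standing in for the mpg dataset
toy = pd.DataFrame({
    "mpg": [18.0, 15.0, 33.0, 28.0, 31.0],
    "origin": ["usa", "usa", "japan", "usa", "usa"],
})

# With the same seed, sample() picks the same rows every time
first = toy.sample(2, random_state=42)
second = toy.sample(2, random_state=42)
print(first.equals(second))

# frac draws a proportion of the rows instead of a fixed count
two_fifths = toy.sample(frac=0.4, random_state=42)
print(len(two_fifths))
```

Without random_state the rows change on every run, which is why In [12] is not reproducible while In [14] is.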
Checking descriptive statistics
In [15]:
df.describe()
Out[15]:
 | mpg | cylinders | displacement | horsepower | weight | acceleration | model_year |
---|---|---|---|---|---|---|---|
count | 398.000000 | 398.000000 | 398.000000 | 392.000000 | 398.000000 | 398.000000 | 398.000000 |
mean | 23.514573 | 5.454774 | 193.425879 | 104.469388 | 2970.424623 | 15.568090 | 76.010050 |
std | 7.815984 | 1.701004 | 104.269838 | 38.491160 | 846.841774 | 2.757689 | 3.697627 |
min | 9.000000 | 3.000000 | 68.000000 | 46.000000 | 1613.000000 | 8.000000 | 70.000000 |
25% | 17.500000 | 4.000000 | 104.250000 | 75.000000 | 2223.750000 | 13.825000 | 73.000000 |
50% | 23.000000 | 4.000000 | 148.500000 | 93.500000 | 2803.500000 | 15.500000 | 76.000000 |
75% | 29.000000 | 8.000000 | 262.000000 | 126.000000 | 3608.000000 | 17.175000 | 79.000000 |
max | 46.600000 | 8.000000 | 455.000000 | 230.000000 | 5140.000000 | 24.800000 | 82.000000 |
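In the output above, the count for horsepower is 392 while every other column shows 398, which suggests six missing values in that column. describe() also summarizes only numeric columns by default, so origin and name do not appear. A rough sketch of how to check both points, again on a made-up frame (toy and its values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing horsepower value
toy = pd.DataFrame({
    "mpg": [18.0, 15.0, 33.0, 28.0],
    "horsepower": [130.0, np.nan, 53.0, 79.0],
    "origin": ["usa", "usa", "japan", "usa"],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Default describe(): numeric columns only, NaN excluded from count
numeric_summary = toy.describe()
print(numeric_summary.loc["count"])

# include="all" adds the non-numeric columns (unique, top, freq)
full_summary = toy.describe(include="all")
print(full_summary)
```

On the real data, df.isna().sum() should report the gap in horsepower directly.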
pandas-profiling
In [2]:
!pip install pandas-profiling
Collecting pandas-profiling
Downloading pandas_profiling-3.1.0-py2.py3-none-any.whl (261 kB)
Requirement already satisfied: PyYAML>=5.0.0 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (6.0)
Requirement already satisfied: markupsafe~=2.0.1 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (2.0.1)
Requirement already satisfied: tqdm>=4.48.2 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (4.62.3)
Requirement already satisfied: jinja2>=2.11.1 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (2.11.3)
Requirement already satisfied: seaborn>=0.10.1 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (0.11.2)
Requirement already satisfied: scipy>=1.4.1 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (1.7.1)
Requirement already satisfied: pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (1.3.5)
Collecting joblib~=1.0.1
Downloading joblib-1.0.1-py3-none-any.whl (303 kB)
Requirement already satisfied: numpy>=1.16.0 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (1.20.3)
Requirement already satisfied: matplotlib>=3.2.0 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (3.4.3)
Collecting tangled-up-in-unicode==0.1.0
Downloading tangled_up_in_unicode-0.1.0-py3-none-any.whl (3.1 MB)
Collecting missingno>=0.4.2
Downloading missingno-0.5.0-py3-none-any.whl (8.8 kB)
Collecting multimethod>=1.4
Downloading multimethod-1.6-py3-none-any.whl (9.4 kB)
Collecting htmlmin>=0.1.12
Downloading htmlmin-0.1.12.tar.gz (19 kB)
Requirement already satisfied: requests>=2.24.0 in c:\users\admin\anaconda3\lib\site-packages (from pandas-profiling) (2.26.0)
Collecting visions[type_image_path]==0.7.4
Downloading visions-0.7.4-py3-none-any.whl (102 kB)
Collecting phik>=0.11.1
Downloading phik-0.12.0-cp39-cp39-win_amd64.whl (659 kB)
Collecting pydantic>=1.8.1
Downloading pydantic-1.9.0-cp39-cp39-win_amd64.whl (2.1 MB)
Requirement already satisfied: attrs>=19.3.0 in c:\users\admin\anaconda3\lib\site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (21.2.0)
Requirement already satisfied: networkx>=2.4 in c:\users\admin\anaconda3\lib\site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (2.6.3)
Requirement already satisfied: Pillow in c:\users\admin\anaconda3\lib\site-packages (from visions[type_image_path]==0.7.4->pandas-profiling) (8.4.0)
Collecting imagehash
Downloading ImageHash-4.2.1.tar.gz (812 kB)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.2.0->pandas-profiling) (3.0.4)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.2.0->pandas-profiling) (1.3.1)
Requirement already satisfied: cycler>=0.10 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.2.0->pandas-profiling) (0.10.0)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.2.0->pandas-profiling) (2.8.2)
Requirement already satisfied: six in c:\users\admin\anaconda3\lib\site-packages (from cycler>=0.10->matplotlib>=3.2.0->pandas-profiling) (1.16.0)
Requirement already satisfied: pytz>=2017.3 in c:\users\admin\anaconda3\lib\site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3->pandas-profiling) (2021.3)
Requirement already satisfied: typing-extensions>=3.7.4.3 in c:\users\admin\anaconda3\lib\site-packages (from pydantic>=1.8.1->pandas-profiling) (3.10.0.2)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\admin\anaconda3\lib\site-packages (from requests>=2.24.0->pandas-profiling) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in c:\users\admin\anaconda3\lib\site-packages (from requests>=2.24.0->pandas-profiling) (2.0.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\admin\anaconda3\lib\site-packages (from requests>=2.24.0->pandas-profiling) (1.26.7)
Requirement already satisfied: idna<4,>=2.5 in c:\users\admin\anaconda3\lib\site-packages (from requests>=2.24.0->pandas-profiling) (3.2)
Requirement already satisfied: colorama in c:\users\admin\anaconda3\lib\site-packages (from tqdm>=4.48.2->pandas-profiling) (0.4.4)
Requirement already satisfied: PyWavelets in c:\users\admin\anaconda3\lib\site-packages (from imagehash->visions[type_image_path]==0.7.4->pandas-profiling) (1.1.1)
Building wheels for collected packages: htmlmin, imagehash
Building wheel for htmlmin (setup.py): started
Building wheel for htmlmin (setup.py): finished with status 'done'
Created wheel for htmlmin: filename=htmlmin-0.1.12-py3-none-any.whl size=27098 sha256=68a631fa86cb0a7594dac5aa43a92abe9856b67e169a9d1c8f1877f1597a364d
Stored in directory: c:\users\admin\appdata\local\pip\cache\wheels\1d\05\04\c6d7d3b66539d9e659ac6dfe81e2d0fd4c1a8316cc5a403300
Building wheel for imagehash (setup.py): started
Building wheel for imagehash (setup.py): finished with status 'done'
Created wheel for imagehash: filename=ImageHash-4.2.1-py2.py3-none-any.whl size=295207 sha256=9fcfeb9a8d726c7ea83166fa0d5f6ab68436f8b4c503692789277dc04ea28d3b
Stored in directory: c:\users\admin\appdata\local\pip\cache\wheels\51\f9\a5\740af2fdb0ad1edf79aabdc41531be0b6f0b2e2be684c388cf
Successfully built htmlmin imagehash
Installing collected packages: tangled-up-in-unicode, multimethod, visions, joblib, imagehash, pydantic, phik, missingno, htmlmin, pandas-profiling
Attempting uninstall: joblib
Found existing installation: joblib 1.1.0
Uninstalling joblib-1.1.0:
Successfully uninstalled joblib-1.1.0
Successfully installed htmlmin-0.1.12 imagehash-4.2.1 joblib-1.0.1 missingno-0.5.0 multimethod-1.6 pandas-profiling-3.1.0 phik-0.12.0 pydantic-1.9.0 tangled-up-in-unicode-0.1.0 visions-0.7.4
In [4]:
from pandas_profiling import ProfileReport
In [16]:
profile = ProfileReport(df, title="Pandas Profiling Report")
In [17]:
profile
Out[17]:
In [18]:
# Generate the report as an HTML file
profile.to_file("pandas_profile_report.html")
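A note for anyone re-running this later: as far as I know, pandas-profiling has since been renamed ydata-profiling, and the import name changed to ydata_profiling. The sketch below tries the newer name first and falls back to the one used in this post. Both helper functions are hypothetical names of my own, and minimal=True is used here only to keep the report light.

```python
def load_profile_report_class():
    """Return ProfileReport from whichever package is installed, else None."""
    try:
        from ydata_profiling import ProfileReport  # newer package name
    except ImportError:
        try:
            from pandas_profiling import ProfileReport  # name used in this post
        except ImportError:
            return None
    return ProfileReport


def write_profile(df, path="pandas_profile_report.html"):
    """Write a reduced report to `path`; return False if neither package is installed."""
    report_cls = load_profile_report_class()
    if report_cls is None:
        return False
    # minimal=True skips the more expensive computations
    report_cls(df, title="Pandas Profiling Report", minimal=True).to_file(path)
    return True
```

Calling write_profile(df) with the mpg frame from above would then produce the same kind of HTML file as In [18].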
sweetviz
In [19]:
!pip install sweetviz
Collecting sweetviz
Downloading sweetviz-2.1.3-py3-none-any.whl (15.1 MB)
Requirement already satisfied: numpy>=1.16.0 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (1.20.3)
Requirement already satisfied: scipy>=1.3.2 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (1.7.1)
Requirement already satisfied: pandas!=1.0.0,!=1.0.1,!=1.0.2,>=0.25.3 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (1.3.5)
Requirement already satisfied: matplotlib>=3.1.3 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (3.4.3)
Requirement already satisfied: jinja2>=2.11.1 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (2.11.3)
Collecting importlib-resources>=1.2.0
Downloading importlib_resources-5.4.0-py3-none-any.whl (28 kB)
Requirement already satisfied: tqdm>=4.43.0 in c:\users\admin\anaconda3\lib\site-packages (from sweetviz) (4.62.3)
Requirement already satisfied: zipp>=3.1.0 in c:\users\admin\anaconda3\lib\site-packages (from importlib-resources>=1.2.0->sweetviz) (3.6.0)
Requirement already satisfied: MarkupSafe>=0.23 in c:\users\admin\anaconda3\lib\site-packages (from jinja2>=2.11.1->sweetviz) (2.0.1)
Requirement already satisfied: pillow>=6.2.0 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.1.3->sweetviz) (8.4.0)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.1.3->sweetviz) (2.8.2)
Requirement already satisfied: cycler>=0.10 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.1.3->sweetviz) (0.10.0)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.1.3->sweetviz) (3.0.4)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\admin\anaconda3\lib\site-packages (from matplotlib>=3.1.3->sweetviz) (1.3.1)
Requirement already satisfied: six in c:\users\admin\anaconda3\lib\site-packages (from cycler>=0.10->matplotlib>=3.1.3->sweetviz) (1.16.0)
Requirement already satisfied: pytz>=2017.3 in c:\users\admin\anaconda3\lib\site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,>=0.25.3->sweetviz) (2021.3)
Requirement already satisfied: colorama in c:\users\admin\anaconda3\lib\site-packages (from tqdm>=4.43.0->sweetviz) (0.4.4)
Installing collected packages: importlib-resources, sweetviz
Successfully installed importlib-resources-5.4.0 sweetviz-2.1.3
In [20]:
import sweetviz as sv
In [21]:
my_report = sv.analyze(df)
In [23]:
# Show the analysis statistics as an HTML page
my_report.show_html()
Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.
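As the message says, the browser may not open from a notebook. A small sketch of a wrapper that names the output file and skips the browser, assuming sweetviz is installed (save_sweetviz_report is a hypothetical helper of my own, not part of sweetviz):

```python
def save_sweetviz_report(df, path="SWEETVIZ_REPORT.html", target=None):
    """Analyze `df` and save the report to `path` without opening a browser."""
    # Imported here so the helper can be defined even where sweetviz is missing
    import sweetviz as sv

    # target_feat optionally marks one column as the target of the analysis
    report = sv.analyze(df, target_feat=target)
    report.show_html(filepath=path, open_browser=False)
    return path
```

For example, save_sweetviz_report(df, target="mpg") would relate the other columns to mpg in the report. Inside a notebook, my_report.show_notebook() is another option for showing the report inline.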