#graphviz
Question:
I want to visualize a decision tree using graphviz.
I found some example code (https://gist.github.com/WillKoehrsen/ff77f5f308362819805a3defd9495ffd ):
from sklearn.datasets import load_iris
iris = load_iris()
# Model (can also use single decision tree)
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=10)
# Train
model.fit(iris.data, iris.target)
# Extract single tree
estimator = model.estimators_[5]
from sklearn.tree import export_graphviz
# Export as dot file
export_graphviz(estimator, out_file='tree.dot',
feature_names = iris.feature_names,
class_names = iris.target_names,
rounded = True, proportion = False,
precision = 2, filled = True)
# Convert to png using system command (requires Graphviz)
from subprocess import call
call(['dot', '-Tpng', 'tree.dot', '-o', 'tree.png', '-Gdpi=600'])
# Display in jupyter notebook
from IPython.display import Image
Image(filename = 'tree.png')
However, when I run this in a Jupyter notebook, I get a FileNotFoundError:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-3-6d9aafea91ef> in <module>()
21 # Convert to png using system command (requires Graphviz)
22 from subprocess import call
---> 23 call(['dot', '-Tpng', 'tree.dot', '-o', 'tree.png', '-Gdpi=600'])
24
25 # Display in jupyter notebook
C:\ProgramData\Anaconda3\lib\subprocess.py in call(timeout, *popenargs, **kwargs)
302 retcode = call(["ls", "-l"])
303 """
--> 304 with Popen(*popenargs, **kwargs) as p:
305 try:
306 return p.wait(timeout=timeout)
C:\ProgramData\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
754 c2pread, c2pwrite,
755 errread, errwrite,
--> 756 restore_signals, start_new_session)
757 except:
758 # Cleanup if the child failed starting.
C:\ProgramData\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
1153 env,
1154 os.fspath(cwd) if cwd is not None else None,
-> 1155 startupinfo)
1156 finally:
1157 # Child is launched. Close the parent's copy of those pipe
FileNotFoundError: [WinError 2] Das System kann die angegebene Datei nicht finden
(The last message says that the system could not find the specified file.)
In an Anaconda Prompt opened as administrator, I ran conda install -c anaconda graphviz
and it finished without errors:
Collecting package metadata: done
Solving environment:
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:
- defaults/win-64::anaconda==5.3.1=py37_0
- defaults/win-64::astropy==3.0.4=py37hfa6e2cd_0
- defaults/win-64::bkcharts==0.2=py37_0
- defaults/win-64::blaze==0.11.3=py37_0
- defaults/win-64::bokeh==0.13.0=py37_0
- defaults/win-64::bottleneck==1.2.1=py37h452e1ab_1
- defaults/win-64::dask==0.19.1=py37_0
- defaults/win-64::datashape==0.5.4=py37_1
- defaults/win-64::h5py==2.8.0=py37h3bdd7fb_2
- defaults/win-64::imageio==2.4.1=py37_0
- defaults/win-64::matplotlib==2.2.3=py37hd159220_0
- defaults/win-64::mkl-service==1.1.2=py37hb217b18_5
- defaults/win-64::mkl_fft==1.0.4=py37h1e22a9b_1
- defaults/win-64::mkl_random==1.0.1=py37h77b88f5_1
- defaults/win-64::numba==0.39.0=py37h830ac7b_0
- defaults/win-64::numexpr==2.6.8=py37h9ef55f4_0
- defaults/win-64::numpy==1.15.1=py37ha559c80_0
- defaults/win-64::numpy-base==1.15.1=py37h8128ebf_0
- defaults/win-64::odo==0.5.1=py37_0
- defaults/win-64::pandas==0.23.4=py37h830ac7b_0
- defaults/win-64::patsy==0.5.0=py37_0
- defaults/win-64::pytables==3.4.4=py37he6f6034_0
- defaults/win-64::pytest-arraydiff==0.2=py37h39e3cac_0
- defaults/win-64::pytest-astropy==0.4.0=py37_0
- defaults/win-64::pytest-doctestplus==0.1.3=py37_0
- defaults/win-64::pywavelets==1.0.0=py37h452e1ab_0
- defaults/win-64::scikit-image==0.14.0=py37h6538335_1
- defaults/win-64::scikit-learn==0.19.2=py37heebcf9a_0
- defaults/win-64::scipy==1.1.0=py37h4f6bf74_1
- defaults/win-64::seaborn==0.9.0=py37_0
- defaults/win-64::statsmodels==0.9.0=py37h452e1ab_0
done
## Package Plan ##
environment location: C:\ProgramData\Anaconda3
added / updated specs:
- graphviz
The following packages will be downloaded:
package | build
---------------------------|-----------------
ca-certificates-2019.1.23 | 0 158 KB anaconda
certifi-2019.3.9 | py37_0 155 KB anaconda
conda-4.6.12 | py37_1 2.1 MB anaconda
graphviz-2.38.0 | 4 37.7 MB anaconda
openssl-1.1.1 | he774522_0 5.7 MB anaconda
vc-14.1 | h21ff451_3 5 KB anaconda
vs2015_runtime-15.5.2 | 3 2.2 MB anaconda
------------------------------------------------------------
Total: 48.1 MB
The following packages will be UPDATED:
openssl conda-forge::openssl-1.1.1b-hfa6e2cd_2 --> anaconda::openssl-1.1.1-he774522_0
vs2015_runtime pkgs/main::vs2015_runtime-14.15.26706~ --> anaconda::vs2015_runtime-15.5.2-3
The following packages will be SUPERSEDED by a higher-priority channel:
ca-certificates conda-forge::ca-certificates-2019.3.9~ --> anaconda::ca-certificates-2019.1.23-0
certifi conda-forge --> anaconda
conda conda-forge::conda-4.6.12-py37_2 --> anaconda::conda-4.6.12-py37_1
graphviz conda-forge::graphviz-2.38.0-h6538335~ --> anaconda::graphviz-2.38.0-4
vc pkgs/main::vc-14.1-h0510ff6_4 --> anaconda::vc-14.1-h21ff451_3
Proceed ([y]/n)? y
Downloading and Extracting Packages
certifi-2019.3.9 | 155 KB | ############################################################################ | 100%
conda-4.6.12 | 2.1 MB | ############################################################################ | 100%
graphviz-2.38.0 | 37.7 MB | ############################################################################ | 100%
openssl-1.1.1 | 5.7 MB | ############################################################################ | 100%
vc-14.1 | 5 KB | ############################################################################ | 100%
ca-certificates-2019 | 158 KB | ############################################################################ | 100%
vs2015_runtime-15.5. | 2.2 MB | ############################################################################ | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Surprisingly, the same code worked just fine when I ran it on kaggle.com.
Any ideas what the problem might be and how to solve it?
Answer 1:
Just solved it.
The trick was to add the Graphviz path to the Windows environment variables; here is a good description: https://bobswift.atlassian.net/wiki/spaces/GVIZ/pages/20971549/How+to+install+Graphviz+software
After that I restarted my computer and voilà, it worked.
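For anyone who wants to check this from inside the notebook: the FileNotFoundError in the traceback is raised when Windows cannot locate the dot executable itself, not tree.dot. Below is a minimal sketch that looks for dot on PATH and, if it is missing, prepends a directory that contains it. The helper name ensure_on_path and both directories in CANDIDATE_DIRS are assumptions for illustration, not something taken from the question; replace them with the folder where dot.exe actually lives on your machine.

```python
import os
import shutil


def ensure_on_path(executable, candidate_dirs):
    """Return the full path of `executable`, or None if it cannot be found.

    If the executable is not on PATH but exists in one of `candidate_dirs`,
    that directory is prepended to PATH for the current process.
    """
    found = shutil.which(executable)
    if found:
        return found
    for directory in candidate_dirs:
        # Search only this one directory
        found = shutil.which(executable, path=directory)
        if found:
            # Changes PATH for this Python process and its children only
            os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")
            return found
    return None


# Hypothetical install locations (assumptions); adjust to your system.
CANDIDATE_DIRS = [
    r"C:\ProgramData\Anaconda3\Library\bin\graphviz",
    r"C:\Program Files (x86)\Graphviz2.38\bin",
]

dot_path = ensure_on_path("dot", CANDIDATE_DIRS)
print(dot_path or "dot not found; check the Graphviz installation")
```

If this prints a path, the call(['dot', ...]) line from the question should be able to start dot in the same notebook session. The change only lasts for the running kernel, so setting the Windows environment variable as described above is still the permanent fix.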