This topic describes how to reference a third-party package in a PyODPS node.



  1. Download the packages that are listed in the following table.
    Package         File                                                     Resource file
    six             six-1.11.0.tar.gz                                        six.tar.gz
    pandas          pandas-0.20.2-cp27-cp27m-manylinux1_x86_64.whl
    scipy           scipy-0.19.0-cp27-cp27m-manylinux1_x86_64.whl
    scikit-learn    scikit_learn-0.18.1-cp27-cp27m-manylinux1_x86_64.whl
    Note: You must manually compress the files of the pandas, scipy, and scikit-learn packages into archives that use the same format as the names in the Resource file column.
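The manual compression in step 1 can also be scripted. The following is a minimal sketch, assuming that the target resource is a .zip archive and that copying the wheel's contents unchanged is sufficient. A .whl file is itself a zip archive, so the sketch only rewrites its entries into a new file. The function name repack_wheel_as_zip and the output file name are hypothetical and are not part of DataWorks or PyODPS.

```python
import zipfile


def repack_wheel_as_zip(wheel_path, zip_path):
    """Copy every file in a .whl (itself a zip archive) into a new .zip archive."""
    with zipfile.ZipFile(wheel_path) as wheel, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for info in wheel.infolist():
            # Skip directory entries; only file contents are copied.
            if info.filename.endswith("/"):
                continue
            archive.writestr(info.filename, wheel.read(info.filename))
    return zip_path
```

For example, repack_wheel_as_zip('scipy-0.19.0-cp27-cp27m-manylinux1_x86_64.whl', 'my_scipy_resource.zip') writes a zip archive that can then be uploaded as an Archive resource. The output name here is a placeholder; use the name from the Resource file column.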
  2. Log on to the DataWorks console.
  3. Create a workflow.
    1. On the DataStudio page, right-click Business Flow and select Create Workflow.
    2. In the Create Workflow dialog box, specify Workflow Name and click Create.
  4. Create and commit resources.
    1. On the DataStudio page, move the pointer over the Create icon and choose MaxCompute > Resource > Archive.
      Alternatively, you can unfold Business Flow, right-click a workflow, and choose Create > MaxCompute > Resource > Archive.
    2. In the Create Resource dialog box, click Upload and select the file.
    3. Enter the resource name in the Resource Name field and click OK.
    4. Click the Commit icon to complete the upload.
    5. Repeat the preceding steps to create and commit each of the remaining resource files, including six.tar.gz.
  5. Create a PyODPS 2 node.
    1. Right-click the workflow that you created and choose Create > MaxCompute > PyODPS 2.
    2. In the Create Node dialog box, specify Node Name and click Commit.
    3. On the tab of the PyODPS 2 node, enter the code of the node in the code editor.
      Sample code:
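      def test(x):
          # Import inside the function so that the packages are loaded from the
          # uploaded resources when the function runs in MaxCompute.
          from sklearn import datasets, svm
          from scipy import misc
          import numpy as np
          iris = datasets.load_iris()
          assert iris.data.shape == (150, 4)
          assert np.array_equal(np.unique(iris.target), [0, 1, 2])
          clf = svm.LinearSVC()
          clf.fit(iris.data, iris.target)  # Train the classifier before predicting.
          pred = clf.predict([[5.0, 3.6, 1.3, 0.25]])
          assert pred[0] == 0
          assert misc.face().shape is not None
          return x

      from odps import options

      hints = {
          'odps.isolation.session.enable': True
      }
      # Replace each empty string with the name of a resource that you committed in step 4.
      libraries = ['', '', 'six.tar.gz', '', '', '']
      # o is the MaxCompute entry object that the PyODPS node provides.
      iris = o.get_table('pyodps_iris').to_df()
      print(iris[:1].sepallength.map(test).execute(hints=hints, libraries=libraries))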
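Because every entry in the libraries list must name a committed resource, it can help to check the list before the node runs. The following is a minimal sketch of such a check. The helper check_libraries is hypothetical and is not part of PyODPS; it only detects empty or duplicated entries and does not verify that the resources exist in MaxCompute.

```python
def check_libraries(libraries):
    """Return a list of problems found in a PyODPS `libraries` list."""
    problems = []
    seen = set()
    for index, name in enumerate(libraries):
        if not name.strip():
            # An empty name cannot match any committed resource.
            problems.append("entry %d is empty" % index)
        elif name in seen:
            problems.append("entry %d duplicates %r" % (index, name))
        seen.add(name)
    return problems
```

An empty result means that no problem was found. If the result is not empty, correct the list before you pass it to execute.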
  6. Click the Run icon.
  7. View the running result of the PyODPS 2 node on the Run Log tab.
    Sql compiled:
    CREATE TABLE tmp_pyodps_a3172c30_a0d7_4c88_bc39_434168263897 LIFECYCLE 1 AS
    SELECT pyodps_udf_1576485276_94d9d978_af66_4e27_a874_e787022dfb3d(t1.`sepallength`) AS `sepallength`
    FROM WB_BestPractice_dev.`pyodps_iris` t1
    LIMIT 1
    Instance ID: 20191216083438175gcv6n4pr2
      Log view:
    0          5.1
    Note: For more information about best practices, see Use a PyODPS node to segment Chinese text based on Jieba.