捕獲和轉換Web的工具

使用PHP從網站捕獲HTML表

PHP API

有多種轉換HTML表的方法 into使用JSON,CSV或Excel電子表格 GrabzIt的PHP API,這裡詳細介紹了一些最有用的技術。 但是,在開始之前,請記住 URLToTable, HTMLToTable or 文件到表格 方法 Save or SaveTo 必須調用方法來捕獲表。 如果您想快速查看此服務是否適合您,可以嘗試 捕獲HTML表的實時演示 從URL。

基本選項

以下代碼示例自動轉換在指定網頁中發現的第一個HTML表 intoa CSV文件。

$grabzIt->URLToTable("https://www.tesla.com");
//Then call the Save or SaveTo method
$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>");
//Then call the Save or SaveTo method
$grabzIt->FileToTable("tables.html");
//Then call the Save or SaveTo method

默認情況下,它將轉換它標識的第一個表 intoa表。 但是,可以通過將2傳遞給網頁中的第二個表來進行轉換 setTableNumberToInclude 方法。

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTableNumberToInclude(2);

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

您也可以使用 setTargetElement 確保只轉換指定元素ID中的表的方法。

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->HTMLToTable("<html><body><table id='stocks_table'><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setTargetElement("stocks_table");

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.csv");

另外,您也可以通過將true傳遞給 setIncludeAllTables 方法,但是僅適用於XLSX和JSON格式。 此選項會將每個表放在生成的電子表格工作簿中的新表中。

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat('xlsx');
$options->setIncludeAllTables(true);

$grabzIt->FileToTable("tables.html", $options);
//Then call the Save or SaveTo method
$grabzIt->SaveTo("result.xlsx");

將HTML表轉換為JSON

有時需要以編程方式讀取HTML表格,GrabzIt使您可以通過轉換在線HTML表格來使用PHP來執行此操作 into JSON。 為此,請指定 json 作為格式參數。 例如,在下面的示例中,我們正在轉換HTML表 同步地 然後使用內置 json_decode PHP解析JSON的方法 string into我們可以輕鬆使用的對象。

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->URLToTable("https://www.tesla.com", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");
    
$options = new \GrabzIt\GrabzItTableOptions();
$options->setFormat("json");
$options->setTableNumberToInclude(1);

$grabzIt->FileToTable("tables.html", $options);

$json = $grabzIt->SaveTo();
if ($json != null)
{
    $tableObj = json_decode($json);
}

自訂識別碼

您可以將自定義標識符傳遞給 方法,如下所示,然後將該值返回到您的GrabzIt PHP處理程序。 例如,此自定義標識符可以是數據庫標識符,從而允許將提取的表與特定的數據庫記錄相關聯。

$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->URLToTable("https://www.tesla.com", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");
$grabzIt = new \GrabzIt\GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret");

$options = new \GrabzIt\GrabzItTableOptions();
$options->setCustomId(123456);

$grabzIt->FileToTable("example.html", $options);
//Then call the Save method
$grabzIt->Save("http://www.example.com/handler.php");