ویب پر گرفت اور تبدیل کرنے کے اوزار

Capture HTML Tables from Websites with Pythonازگر API

ایچ ٹی ایم ایل ٹیبلز کو تبدیل کرنے کے متعدد طریقے ہیں into CSV's and Excel spreadsheets using GrabzIt کا ازگر API، کچھ مفید تکنیک یہاں تفصیلی ہیں۔ تاہم ، اس سے پہلے کہ آپ فون کریں ، یاد رکھیں URLToTable, HTMLToTable or فائلٹوٹوبل طریقوں Save or SaveTo میز پر قبضہ کرنے کے ل method طریقہ کو بلایا جانا چاہئے۔ اگر آپ جلدی دیکھنا چاہتے ہیں کہ آیا یہ خدمت آپ کے لئے ٹھیک ہے یا نہیں ، آپ کوشش کر سکتے ہیں HTML ٹیبلوں پر قبضہ کرنے کا براہ راست ڈیمو ایک URL سے۔

بنیادی اختیارات

The below code snippet automatically converts the first HTML table in a specified webpage into a CSV document that can then be downloaded or parsed.

grabzIt.URLToTable("https://www.tesla.com")
# Then call the Save or SaveTo method
grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>")
# Then call the Save or SaveTo method
grabzIt.FileToTable("tables.html")
# Then call the Save or SaveTo method

پہلے سے طے شدہ طور پر یہ پہلا جدول جس میں اس کی شناخت ہوتی ہے اسے تبدیل کردے گا intOA میز. تاہم ویب پیج میں موجود دوسری ٹیبل کو 2 پاس کرکے تبدیل کیا جاسکتا ہے tableNumberToInclude وصف.

from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.tableNumberToInclude = 2

grabzIt.URLToTable("https://www.tesla.com", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.tableNumberToInclude = 2

grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.tableNumberToInclude = 2

grabzIt.FileToTable("tables.html", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")

آپ یہ بھی بیان کرسکتے ہیں targetElement attribute that will ensure only tables within the specified element id will be converted.

from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.targetElement = "stocks_table"

grabzIt.URLToTable("https://www.tesla.com", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.targetElement = "stocks_table"

grabzIt.HTMLToTable("<html><body><table id='stocks_table'><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.targetElement = "stocks_table"

grabzIt.FileToTable("tables.html", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.csv")

متبادل کے طور پر آپ ویب پیج پر سارے ٹیبلز پر درست ہو کر قبضہ کرسکتے ہیں includeAllTables attribute, however this will only work with the XLSX and JSON formats. This option will put each table in a new sheet within the generated spreadsheet workbook.

from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = 'xlsx'
options.includeAllTables = True

grabzIt.URLToTable("https://www.tesla.com", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.xlsx")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = 'xlsx'
options.includeAllTables = True

grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.xlsx")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = 'xlsx'
options.includeAllTables = True

grabzIt.FileToTable("tables.html", options)
# Then call the Save or SaveTo method
grabzIt.SaveTo("result.xlsx")

HTML میزیں JSON میں تبدیل کریں

Using Python and GrabzIt's HTML table conversion service enables you to convert HTML tables into JSON. The first step as shown below is to specify json in the format parameter. We then get the JSON string ہم وقت سازی کے ساتھ کے ساتھ SaveTo method, you can then use your favourite JSON parser for Python to convert the JSON string into a object.

from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = "json"
options.tableNumberToInclude = 1

grabzIt.URLToTable("https://www.tesla.com", options)

json = grabzIt.SaveTo()
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = "json"
options.tableNumberToInclude = 1

grabzIt.HTMLToTable("<html><body><table><tr><th>Name</th><th>Age</th></tr>
    <tr><td>Tom</td><td>23</td></tr><tr><td>Nicola</td><td>26</td></tr>
    </table></body></html>", options)

json = grabzIt.SaveTo()
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.format = "json"
options.tableNumberToInclude = 1

grabzIt.FileToTable("tables.html", options)

json = grabzIt.SaveTo()

کسٹم شناختی

آپ کو ایک کسٹم شناخت کنندہ پاس کرسکتے ہیں ٹیبل جیسا کہ ذیل میں دکھایا گیا ہے ، اس کی قیمت آپ کے GrabzIt ازگر ہینڈلر کو واپس کردی جاتی ہے۔ مثال کے طور پر یہ کسٹم شناخت کنندہ ایک ڈیٹا بیس شناخت کنندہ ہوسکتا ہے ، جس سے اسکرین شاٹ کو کسی خاص ڈیٹا بیس ریکارڈ کے ساتھ وابستہ کیا جاسکتا ہے۔

from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.customId = "123456"

grabzIt.URLToTable("https://www.tesla.com", options)
# Then call the Save method
grabzIt.Save("http://www.example.com/handler.py")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.customId = "123456"

grabzIt.HTMLToTable("<html><body><h1>Hello World!</h1></body></html>", options)
# Then call the Save method
grabzIt.Save("http://www.example.com/handler.py")
from GrabzIt import GrabzItTableOptions
from GrabzIt import GrabzItClient

grabzIt = GrabzItClient.GrabzItClient("Sign in to view your Application Key", "Sign in to view your Application Secret")

options = GrabzItTableOptions.GrabzItTableOptions()
options.customId = "123456"

grabzIt.FileToTable("example.html", options)
# Then call the Save method
grabzIt.Save("http://www.example.com/handler.py")