Export tables in web pages to downloadable CSV or JSON files
In short, Mr. Table is a tool that extracts data from tables (e.g. <table>...</table>) in web pages and saves the extracted data in either CSV or JSON format.
We often need to collect data from the Internet for work or study, but data in web pages is often not in the format we want. For example, most data in web pages is presented using HTML tags such as <table></table> or <div></div>, whereas we want data that can be processed by our programs or tools (e.g. Excel).
With Mr. Table, data can be converted from what you see in a web page into a format that you can actually use.
Data is often presented using the HTML table and related tags, as follows:
- <table> for the table
- <thead> for the table column names
- <tr> for the table header row
- <th> for the table header cell
- <tbody> for the actual data
- <tr> for a data row
- <td> for a data cell
For such data, you can simply use the default settings, with their preset table selector, column selector, cell selector, etc.
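To make the idea concrete, here is a minimal sketch in Python of how data in a standard <table> structure can be turned into CSV or JSON. It is not Mr. Table's actual implementation. The names TableExtractor, extract_table, to_csv and to_json, and the SAMPLE markup, are assumptions made up for this illustration. The sketch also assumes a single table whose first row is the header row.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the <tr> currently open
        self._cell = None  # text fragments of the <th>/<td> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html):
    """Return (header, body); assumes the first row holds the column names."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return [], []
    header, *body = parser.rows
    return header, body


def to_csv(header, body):
    """Serialize the header and data rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return buf.getvalue()


def to_json(header, body):
    """Serialize the data rows as a JSON array of objects keyed by column name."""
    return json.dumps([dict(zip(header, row)) for row in body])


# Hypothetical sample markup, used only for this illustration.
SAMPLE = """
<table>
  <thead><tr><th>Name</th><th>Age</th></tr></thead>
  <tbody>
    <tr><td>John Doe</td><td>23</td></tr>
    <tr><td>John Smith</td><td>37</td></tr>
  </tbody>
</table>
"""

header, body = extract_table(SAMPLE)
print(to_csv(header, body))
print(to_json(header, body))
```

Because <th> and <td> always sit inside a <tr>, no configuration is needed here, which is why default selectors are enough for this kind of markup.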
Data is also often laid out with CSS: it is grouped in <div> tags and styled with CSS classes, for example:
<ol>
  <!-- Column header is the first item of list -->
  <li>
    <div>#</div>
    <div>
      <div>ID</div>
      <div>Name</div>
      <div>Age</div>
    </div>
    ...
  </li>
  <li>
    <div>1</div>
    <div>
      <div>John Doe</div>
      <div>23</div>
    </div>
    ...
  </li>
  <li>
    <div>1</div>
    <div>
      <div>John Smith</div>
      <div>37</div>
    </div>
    ...
  </li>
  ...
</ol>
Unfortunately, such tables can only be extracted by specifying the selectors manually.
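The sketch below illustrates in Python why manual selectors are needed for such markup: the extractor has to be told which element is a row and which element is a cell. It is only an illustration under stated assumptions, not Mr. Table's settings or code. The names DivTableExtractor and extract_rows and the parameters row_tag and cell_tag are hypothetical stand-ins for the real selectors. The sample is a simplified version of the list above, and a cell is assumed to be any cell_tag element that contains no nested cell_tag element.

```python
from html.parser import HTMLParser


class DivTableExtractor(HTMLParser):
    """Groups the text of leaf cell_tag elements by their enclosing row_tag."""

    def __init__(self, row_tag="li", cell_tag="div"):
        super().__init__()
        self.row_tag = row_tag    # hypothetical "row selector"
        self.cell_tag = cell_tag  # hypothetical "cell selector"
        self.rows = []
        self._row = None
        # One [text_fragments, has_nested_cell] entry per open cell_tag element.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        if tag == self.row_tag:
            self._row = []
            self._stack = []
        elif tag == self.cell_tag and self._row is not None:
            if self._stack:
                # The parent element is a wrapper, not a cell.
                self._stack[-1][1] = True
            self._stack.append([[], False])

    def handle_data(self, data):
        if self._stack:
            self._stack[-1][0].append(data)

    def handle_endtag(self, tag):
        if tag == self.cell_tag and self._stack:
            fragments, has_nested = self._stack.pop()
            if not has_nested:
                self._row.append("".join(fragments).strip())
        elif tag == self.row_tag and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._stack = []


def extract_rows(html, row_tag="li", cell_tag="div"):
    """Return every row as a list of cell strings."""
    parser = DivTableExtractor(row_tag, cell_tag)
    parser.feed(html)
    parser.close()
    return parser.rows


# Simplified, hypothetical sample modeled on the list-based layout above.
SAMPLE = """
<ol>
  <li><div>#</div><div><div>Name</div><div>Age</div></div></li>
  <li><div>1</div><div><div>John Doe</div><div>23</div></div></li>
  <li><div>2</div><div><div>John Smith</div><div>37</div></div></li>
</ol>
"""

for row in extract_rows(SAMPLE, row_tag="li", cell_tag="div"):
    print(row)
```

Nothing in the markup itself says that <li> is a row or that the innermost <div> is a cell, so a different page would need different values for both parameters.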
We can work on a smarter way to extract data from such tables if we get more support.