Capture web pages to a local device or a backend server for future retrieval, organization, annotation, and editing.
WebScrapBook is a browser extension that captures web pages faithfully in various archive formats with customizable configurations, for future retrieval, organization, annotation, and editing. This project inherits from the legacy Firefox add-on ScrapBook X.
Features:
1. Capture faithfully: A web page shown in the browser can be captured without losing any subtle detail. Metadata such as source URL and timestamp are also recorded.
2. Customizable capture: WebScrapBook can save a selected area of a page, save the source page (before it is processed by scripts), or save a page as a bookmark. How images, audio, video, fonts, frames, styles, scripts, etc. are captured is also customizable. A web page can be saved as a folder, a ZIP-based archive file (HTZ or MAFF), or a single HTML file.
3. Page editing: A web page can be highlighted, annotated, or edited before or after a capture.
4. Organizable collections: Captured pages can be organized in the browser sidebar using one or more scrapbooks, and each scrapbook holds a hierarchical tree structure to organize data items. Notes in HTML or Markdown format can also be created and managed. (*)
5. Full-text searching: Each scrapbook can be further indexed for a feature-rich search (using title, full text, comment, source URL, create time, modify time, etc.). (*)
6. Remote access: Captured data can be hosted on a central backend server and be read or edited from other devices. Alternatively, a scrapbook can generate a static site index and be distributed as a static web site. (*)
7. Mobile support: WebScrapBook supports mobile browsers such as Firefox for Android and Kiwi Browser. You can capture and edit web pages from a mobile phone or tablet.
8. Legacy ScrapBook support: Scrapbooks created with legacy ScrapBook or ScrapBook X can be converted into a WebScrapBook-compliant format for use. (*)
* Some or all functionality of a starred (*) feature above requires a running backend server, which can be easily set up using PyWebScrapBook.
* An HTZ or MAFF archive file can be viewed using the built-in archive page viewer, with PyWebScrapBook or other assistant tools, or by opening the index page after unzipping.
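The last option in the note above (unzipping and opening the index page) can be sketched in Python. This is a minimal illustration, not part of WebScrapBook or PyWebScrapBook: the function name `find_index_pages` and the sample file names are hypothetical, and the layout it assumes (an HTZ archive with `index.html` at its root, a MAFF archive with one top-level folder per page holding an `index.*` file) should be checked against the archives you actually have.

```python
import tempfile
import zipfile
from pathlib import Path

# Assumed index page extensions; adjust if your archives differ.
PAGE_SUFFIXES = {".html", ".htm", ".xhtml"}


def find_index_pages(archive_path, dest_dir):
    """Unzip an HTZ/MAFF archive and return the index pages found in it."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
    # Assumed HTZ layout: index.html sits at the archive root.
    root_index = dest / "index.html"
    if root_index.is_file():
        return [root_index]
    # Assumed MAFF layout: one top-level folder per captured page.
    return sorted(
        p for p in dest.glob("*/index.*")
        if p.is_file() and p.suffix.lower() in PAGE_SUFFIXES
    )


def demo():
    """Build a tiny HTZ-like archive and locate its index page."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "sample.htz"  # hypothetical file name
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("index.html", "<p>captured page</p>")
            zf.writestr("image.png", b"")
        pages = find_index_pages(archive, Path(tmp) / "out")
        return [p.name for p in pages]


if __name__ == "__main__":
    print(demo())
```

Each returned path can then be opened in a browser, for example with `webbrowser.open(path.as_uri())`.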
See Also:
* For further information and frequently asked questions, visit the documentation wiki: https://github.com/danny0838/webscrapbook/wiki/Intro
* To report a problem or discuss it in detail, please open an issue in the source repository: https://github.com/danny0838/webscrapbook/issues
* Donate to support us if you find this tool helpful: https://www.paypal.me/danny0838/5usd
Latest reviews
- (2023-06-04) ru ve: I was hoping to save a complete webpage for offline use. I enabled the extension and just tried the default "Capture Page" feature. I got a new dialog box to save every single resource on the page, like every image and CSS file. How absurd. The page I wanted to save contained a very simple gallery of thumbnails linked to larger images. The extension saved only the thumbnails, and they were still linking to the large images online. I find this extension completely useless, and very user-unfriendly.
- (2023-03-22) Notnilc: I have very limited programming experience so there might be some Dunning-Kruger at play here, but this is much, much, much, MUCH better than HTTrack or Cyotek WebCopy. The documentation is great and all, but I feel like most of it could be made redundant with a simple video tutorial.
- (2022-11-07) design Source: It keeps showing that the download failed, but it works fine in Firefox.
- (2022-08-19) 张明浩: 1. Clips and annotates HTML perfectly. 2. Provides a Python backend, so you can save on one device and view on many (see the docker-PyWebScrapBook GitHub repository). 3. Free; thanks to the author for this labor of love. Great extension 💖💖💖
- (2021-12-03) Алексей kabelsis: Terrible! It opens a million save dialogs for images and files when saving a whole page.
- (2021-11-29) Полат Османов: A very useful extension! I recommend it.
- (2021-10-13) F Y: Does not support modifying web page content.
- (2021-06-22) EDEN EDEN: Good!
- (2021-06-12) Li Su: Agree with Clarence Domesticus. Wonderful extension. As a ScrapBook X user for years, I think this extension is able to do almost the same as ScrapBook X, and the additional features of the PyWebScrapBook backend make it more useful. I can eventually stop using the now very, very slow old version of Firefox. Thanks a lot.
- (2021-05-28) null_404: A fantastic tool for annotating web pages.
- (2021-04-18) behrouz 40: Doesn't work.
- (2021-03-15) Clement Maloney: I am sorry to have to give this promising extension 1 star, however I have spent 5 hours trying to save a page and the pages linked from that page. You can play around with depth and filters (God knows what "Each following line is a full URL (with chars following a “#” or space stripped) or a regular expression (e.g. “/^http://example\.com//”). " in the options is supposed to mean). This is by far the most frustrating and time-wasting Chrome extension that I have ever installed. It simply does not work! Lastly, I tried both the Chrome and Firefox versions and only ever ended up with the original page being saved, i.e. no subpages.
- (2021-02-22) Алексей: It seems to save pages, but the extension's "view captured pages" link is inactive, so there is no way to see the list of saved pages.
- (2021-01-23) Дмитро Доденко: When saving a tab, it asks for confirmation for every file. And what if there are hundreds of them...
- (2020-11-10) Reng: Worked perfectly with my old scrapbook files, thank you!! Can import and handle thousands of entries with no problem. The search function is infinitely better than the old scrapbook. Once again thank you so much for this extension.
- (2020-11-07) 叭噗バプ: When capturing, I have to save the files manually, one by one, which is very tedious. How do I configure this?
- (2020-10-16) yun kong: Amazing!
- (2020-10-07) Clarence Domesticus: This is my first experience using any such scrapbook extension & setting up the backend server (PyWebScrapBook, as noted in the extension overview). Overall, I'm very satisfied. It does fail on some sites, but after spending some time tweaking the settings I've been able to use it successfully on most sites I've needed it for. There are plenty of options & features, definitely more than enough to handle nearly any task for which I've needed this extension. I like that it's an ongoing project that's still being updated regularly. I see there are a number of negative reviews from individuals who are upset that this doesn't work like some other scrapbooks they've used in the past (EVEN SOME IN ALL CAPS). This seems to be a common thing reviewers like to do. In my opinion these comparisons are somewhat of an unfair basis for 1-star & 2-star ratings. If there are other options that some users enjoy using more, they should use those extensions/scrapbooks instead. I think it would be absurd, for example, if I were to go out of my way to purchase a pack of markers specifically by BrandX and then left negative reviews because they're not BrandY, rather than evaluating BrandX based on its own qualities. Some 1-star reviews are clearly written by users who couldn't bother to follow basic configuration instructions or even take a simple look around the menus/folders. I wish ppl would stop doing this s*** not just here but in all their reviews. I just checked GitHub, and it looks like there are commits from as recently as 3 hours ago; like I said, this is an ongoing project, and I'm excited to see where the contributors take it. My only criticism as of right now is that, after updating my Chrome, Firefox, and Edge extensions, and updating to the newest server version, Chrome extension v 78.2 (the most recent version available in the store) is giving error 'Server app requires extension version >= 0.79.0'. But the FF extension is working perfectly fine. Again, I'm seeing GitHub activity as recently as today, so I anticipate that this will be resolved very soon.
- (2020-09-05) פרטי: Doesn't work at all.
- (2020-08-23) David Morales Molina: Very bad; it did not serve its purpose.
- (2020-08-05) arshdeep singh gill: Works flawlessly for me
- (2020-07-06) Kang Chen: A very handy extension.
- (2020-04-09) Mehdi Deilami: Seems like the extension doesn't use the cache and re-grabs the resources which is kind of inefficient
- (2020-04-06) Nima Fariba: Not working. I set Address: http://localhost:8080/ but get an error: Backend initialization error: Unable to connect to backend server.
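- (2019-12-27) Jiahuang Zhang: Local HTML files can be edited and saved again with the formatting perfectly preserved. It's simply a fantastic tool for taking notes in HTML.
- (2019-12-20) 00 “啊啊啊” 啊啊啊: I have tried all kinds of page-saving extensions, including the web clippers of various cloud note services, and this is almost the only one that can effectively save WeChat official account articles as HTML. Keep it up, author.
- (2019-10-27) 謝昀佑: Is it possible to capture directly from URLs? Capturing all tabs is handy, but I need to open so many tabs that the browser crashes. I hope this feature can be added.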
- (2019-09-08) Funny Domination: I love it! It has many useful options
- (2019-08-29) I K: It saves the pages faithfully
- (2019-08-27) Harris121 Channel: NOT THE SAME AS ORIGINAL SCRAPBOOK AT ALL. THE ONLY THING THAT IS THE SAME IS THE FRIGGIN' ICON! CONFUSING, POPUP BOXES, NO WAY TO "VIEW" THE SCRAPBOOK, NOTHING. I'VE TRIED TO FIGURE THIS OUT FOR THREE HOURS....WHERE ARE THE SAVED PAGES?...GEEZ. NO MARKUPS, HIGHLIGHTING, NOTHING....THIS IS (NOT) THE ORIGINAL FIREFOX EXTENSION....NOT EVEN A "CLOSE" COPY OF IT. (EXCEPT THE FRIGGIN' ICON). SAD. ****IF ANYONE KNOWS OF A GREAT ALTERNATIVE TO THE ORIGINAL FF SCRAPBOOK...PLZ POST!!!!! THANKS!
- (2019-08-02) Arcadiy Tpr: Saves pages nicely. Has a lot of features. I highly recommend it for a reliable offline websites archive.
- (2019-07-19) Dima: Great extension! Please continue development.
- (2019-06-29) Dmitry Kislitsyn: This extension works great! There are still some sites that it fails on, but very few and this is work in progress (check out developer's github), so the extension gets better! Excellent job and keep it up please!
- (2019-06-20) Север Петров: The extension is good, but Mozilla Archive Format (for Firefox), of which it is the successor, saves pages into smaller MAFF files. Here is an example: https://en.wikipedia.org/wiki/Mozilla_Archive_Format was saved in Firefox to a 57 KB file, while the Chrome version saved it to a 206 KB file.
- (2019-05-03) Michael Johnson: I tried this app to save webpages completely and accurately. It works perfectly on some pages like ghacks.net with scripted single HTML. On other pages like nytimes.com it captures the page out of sync even though all of the content seems to be there (large gap spaces, enlarged photos, etc.). Save Page WE has the same issue. On Washingtonpost.com WebScrapBook was almost perfect, but there is a bug that will add incorrect characters if there is an apostrophe in the text (which in a news article there will undoubtedly be). I used the scripted single HTML option on this also. I do have specific scripts for the Times and WPost running, but they are not the issue, since Mozilla Archive Format and SingleFile always work perfectly on the same sites with the same scripts running. But since MAF doesn't work for current browsers and SingleFile works somewhat inconsistently (it stalls a lot), I was hoping WebScrapBook would work, but no go. Also, I haven't seen an option to save the original page URL either in the title or in the .html file for reference like MAF, SingleFile, or Save Page WE can. This app might be able to save websites, but if it can't do it accurately, what's the point of using it?
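- (2019-03-05) Cesar Andrés Vacca Devia: A very good extension for downloading web pages; an export option is missing.
- (2019-02-04) 雷雨: I used the Firefox version before and it was a very handy extension, but on Chrome it is a pile of crap. A saved HTZ file cannot be opened in Chrome at all; opening it just triggers an automatic download and then nothing happens. I wonder whether the author even tested it. And I have no idea what that index-building feature is supposed to do; there is a wall of text, yet I cannot make sense of any of it.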
- (2019-01-09) Matt Cooper: Seems to work well. One thing it seems to lack that the old Firefox add in did, is drill down beyond the current tab. I'd like to copy a web page and the linked files as well, but it does not go beyond, even though I can click on the linked files and capture them individually. Is this a supported option I don't see or is it planned to be supported in the future?
- (2018-10-17) Crihy Chu: Doesn't work.
- (2018-04-25) The option "Save captured data in ScrapBook" is ignored and all data is saved to the Downloads folder. This problem does not occur in Firefox, but the new Firefox (57+) is crap.
- (2018-04-09) Budi Susilo: Though this is NOT the same as Firefox's ScrapBook, this extension can READ and WRITE the now deprecated Firefox .maff format (I have hundreds of them, and I think MAFF is a great format; see the discussion on the WebScrapBook GitHub). The limitation is that this extension is currently limited to single-tab .maff files. Based on the discussion on the WebScrapBook GitHub, this might change in the near future. Thank you to the developer for creating such a nice extension.
- (2018-03-03) Avi Schwartz: Crashes when trying to import from the original Scrapbook X.
- (2018-01-12) zech xu: Does not work the same way as Firefox ScrapBook at all.
- (2017-12-12) Дмитрий Горбачёв: Everything was fine, but after the update of 06.10.17, it stopped working (does not save script files and HTML in "Folder" mode). Please fix the problem, because I really liked the extension.
- (2017-12-08) Mirek eS: How can I change the default name of a saved file?
- (2017-11-22) G. Ivan: Respect!
- (2017-11-10) Nils Andrey Telleria Martinez: Just miss the organizer like in Firefox's old version. But is just great to have single-html and maff features. Thanks!
- (2017-11-09) Bill Gates: It's buggy in Vivaldi. And, also, not very functional. It's not fair to call it ScrapBook. Some alternatives are closer to Firefox ScrapBook, with file system access.
- (2017-10-21) Darren Bardsley: Just saves the page which you can do anyway. I need something like Firefox's Scrapbook.
- (2017-09-15) Юрий: I can't believe it: it's the legendary ScrapBook from Firefox!? All functions are working well so far; only the list of all saved pages is missing.