GitHub – Florents-Tselai/WarcDB: WarcDB: Web crawl data as SQLite databases.
WarcDB: Web crawl data as SQLite databases. WarcDB is an SQLite-based file format that makes web crawl data easier to share and query. It is based on the standardized Web ARChive (WARC) format used by web archivers. Usage # Load the `archive.warcdb` file with data. warcdb import archive.warcdb ./test…
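Because a `.warcdb` file is described as SQLite-based, it should be possible to inspect one with any SQLite client. The following is a minimal sketch using Python's standard `sqlite3` module, not WarcDB's own API. The helper names `list_tables` and `make_demo_db` and the `demo_record` table are hypothetical, introduced only so the example runs without a real archive; the actual WarcDB schema is not shown in the text above and is not assumed here.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def make_demo_db(db_path):
    """Create a stand-in database with a hypothetical table.

    This is NOT the real WarcDB schema; it only gives list_tables
    something to read when no real archive is at hand.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE demo_record (id INTEGER PRIMARY KEY, url TEXT)")
        conn.execute("INSERT INTO demo_record (url) VALUES ('https://example.com/')")
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.warcdb")
        make_demo_db(path)
        print(list_tables(path))
```

Pointing `list_tables` at a file produced by `warcdb import` would, under the same assumption, show which tables that archive actually contains before any queries are written against it.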