
Site Content Protocol (SCP)

A collection-based protocol that cuts wasted bandwidth, processing power, and energy by serving pre-generated snapshots and deltas.

The Problem

Web crawlers (search engines, AI bots, aggregators) consume massive bandwidth and server resources by parsing web pages designed for human viewing. With the explosion of AI crawlers, this traffic has become a significant cost for websites and a strain on internet infrastructure.

The Solution

SCP enables websites to serve pre-generated collections of their content in a compressed format from a CDN or cloud object storage; a minimal client sketch follows the goals list below.

Target Goals:

  • 50-60% bandwidth reduction for initial snapshots vs. compressed HTML
  • 90-95% bandwidth reduction with delta updates (after the initial download)
  • 90% faster parsing than HTML/CSS/JS processing
  • 90% fewer requests - one download fetches an entire site section
  • Zero impact on user experience (human visitors keep using the regular site)
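
The README does not pin down a wire format, so the sketch below shows one plausible client flow: a crawler reads a small manifest, then downloads either the full compressed snapshot (first visit) or only a delta (subsequent visits). Every URL, JSON field name, and file layout here is a hypothetical assumption for illustration, not the published protocol.

```python
# A minimal SCP client sketch. MANIFEST_URL, the manifest fields, and the
# gzipped-JSON file layout are all assumptions, not a published spec.

import gzip
import json
import urllib.request
from typing import Optional

MANIFEST_URL = "https://example.com/.well-known/scp/manifest.json"  # assumed location

def fetch(url: str) -> bytes:
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def sync(local_version: Optional[str]) -> tuple:
    """Return (latest_version, content).

    A first crawl downloads the full compressed snapshot; later crawls
    fetch only a delta when the site publishes one for our local version.
    """
    manifest = json.loads(fetch(MANIFEST_URL))
    latest = manifest["latest_version"]
    if local_version == latest:
        return latest, {}  # already up to date: no page-by-page crawling

    # Prefer a delta covering local_version -> latest, if one is listed.
    for delta in manifest.get("deltas", []):
        if delta["from"] == local_version and delta["to"] == latest:
            patch = json.loads(gzip.decompress(fetch(delta["url"])))
            return latest, patch  # caller merges changed/removed entries

    # Otherwise fall back to the full pre-generated snapshot: one request
    # replaces crawling every page in the section.
    snapshot = json.loads(gzip.decompress(fetch(manifest["snapshot_url"])))
    return latest, snapshot

if __name__ == "__main__":
    version, content = sync(local_version=None)  # first crawl
    print(f"synced to version {version}: {len(content)} entries")
```

Because the manifest, snapshots, and deltas are all static files, a CDN or object store can absorb crawler traffic without the origin server rendering a single page, which is what the bandwidth and request-count goals above rely on.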

Resources

Contact

Vasiliy Kiryanov
