We Sort.

Removing assets from a repo

For most Statamic projects I work on, I leave the site’s images in the Git repository. This is simple, centralised and allows GitHub to be a full and effective backup. However, this won’t suit a site with hundreds or thousands of images.

Below is an approach I’ve used on a few projects to reduce the size of a repo – either at the beginning, or retrospectively.

The approach involves:

  • keeping image assets locally, but gitignored

  • using DigitalOcean Spaces for backup and inter-site syncing

  • using s3cmd to move files between the server and Spaces

  • using a lifecycle.xml policy to auto-expire date-stamped backups

  • using git-filter-repo to rewrite history (if required)

Last updated: July 2024


Caveats and disclaimers

  • YMMV

  • Assets in a Statamic site consist of:

    • The asset files themselves (eg: example.jpg)

    • Asset metadata (eg: width, height, alt text, credits, etc.)

    • Cached versions of the asset files created with Glide (we aren’t backing these up)

  • Losing or changing the location of images in a Statamic project will break the content.

  • I use Spaces because I already use Droplets, so it makes sense to keep everything with the same provider.

    • This approach might well work with AWS S3, but I don’t know the exact steps.

    • You could even connect your site directly to a remote filesystem as Laravel accommodates that (see Statamic Assets Drivers)

  • I use Laravel Forge to provision servers and to manage things like cron jobs on them.

  • My projects are in the UK so I make some choices based on that (ie: data centre region)

  • I’ll use example as shorthand for a client project’s name (ie: example-prod-01)

  • I’ll use main as the name for the Statamic asset container

  • Architecturally, Spaces are “flat”: like S3, they have no real directory structure. Although they appear to have folders (DO even has a “create a folder” button), they are not hierarchical. A “folder” is really just a key prefix that contains / characters.


    Create a DigitalOcean Space

    1. Log in to, or create, a DigitalOcean account or team
      - $200 credit for first 60 days with this referral link: m.do.co/c/c677cf2cc36b

    2. Create a Space

      1. datacenter region: AMS3 (Amsterdam)

      2. “Enable CDN”: your choice, but I don’t use this feature.

      3. Spaces bucket name: example-assets-main-01

    Install & configure s3cmd

    • This is a command line tool for syncing with DigitalOcean Spaces or Amazon S3.

    • You’ll want to include this approach in your docs in some way.

    • See this Gist
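The Gist is not reproduced here, so as a rough sketch of what a sync could look like: the script below assumes the AMS3 endpoint and the example bucket name from the step above, that `s3cmd --configure` has already stored your Spaces keys, and that assets live under a `main/` prefix in the Space (my assumption, not necessarily the Gist's layout). The `run` and `s3` helpers and the `DRY_RUN` variable are hypothetical conveniences, not part of s3cmd; they print each command by default so you can check it before running with `DRY_RUN=0`.

```shell
#!/bin/sh
# Sketch: sync the local asset container with a DigitalOcean Space.
# Assumes s3cmd is installed and `s3cmd --configure` has saved the access keys.
set -eu

BUCKET="example-assets-main-01"        # Space name from the step above
HOST="ams3.digitaloceanspaces.com"     # endpoint for the AMS3 region
LOCAL_DIR="public/main/"               # trailing slash: sync the contents, not the folder itself
REMOTE="s3://${BUCKET}/main/"          # "main/" is only a key prefix (assumed layout)

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper that points s3cmd at Spaces instead of AWS.
s3() {
  run s3cmd --host="${HOST}" --host-bucket="%(bucket)s.${HOST}" "$@"
}

# Upload: local -> Space
s3 sync "${LOCAL_DIR}" "${REMOTE}"

# Download: Space -> local (eg: on another server or a fresh checkout)
s3 sync "${REMOTE}" "${LOCAL_DIR}"
```

s3cmd also has its own `--dry-run` flag for `sync`, which lists what would be transferred without touching anything.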

    Expiry Lifecycle

    • A backup script creates an isolated, timestamped copy of the assets as some protection against a ‘bad sync’.

    • The script creates a top-level folder with a date-stamp in its name

    • A lifecycle policy auto-deletes these backups if their names match a pattern (eg: start with `backup-`)

    • See this Gist
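As a minimal sketch of the idea (again, not the Gist itself): the script below writes a `lifecycle.xml` that expires anything whose key starts with `backup-`, applies it with `s3cmd setlifecycle`, and then takes a date-stamped backup. The 30-day retention, the rule ID, backing up from the local directory rather than copying within the Space, and the `run`/`s3` helpers are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch: date-stamped backups plus a lifecycle rule that expires them.
set -eu

BUCKET="example-assets-main-01"
HOST="ams3.digitaloceanspaces.com"
RETENTION_DAYS=30                      # assumed retention period
STAMP="backup-$(date +%Y-%m-%d)"       # top-level "folder", eg: backup-2024-07-01

# Expire every object whose key starts with "backup-" after RETENTION_DAYS.
cat > lifecycle.xml <<EOF
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>expire-dated-backups</ID>
    <Prefix>backup-</Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>${RETENTION_DAYS}</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
EOF

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper that points s3cmd at Spaces instead of AWS.
s3() {
  run s3cmd --host="${HOST}" --host-bucket="%(bucket)s.${HOST}" "$@"
}

# One-off: attach the policy to the Space.
s3 setlifecycle lifecycle.xml "s3://${BUCKET}"

# Scheduled (eg: daily via cron): copy the assets under today's prefix.
s3 sync public/main/ "s3://${BUCKET}/${STAMP}/main/"
```

Because the live assets sit under a different prefix, the rule only ever touches the dated backups.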

    Remove from Git

    • We now need to tell Git to stop tracking the assets directory

    • Add the asset container’s path to .gitignore

      • For a container called main, this would be /public/main

    • Remove the directory from Git’s index, but do not delete the files:

      • git rm -r --cached public/main
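The two steps above can be tried out safely with something like the sketch below. It builds a throwaway repository (the temp directory, the fake image and the "Demo" committer are purely illustrative) and then runs the same commands you would run from a real project root.

```shell
#!/bin/sh
# Sketch: stop tracking public/main without deleting the files on disk.
# A scratch repo is used so this is safe to run; in a real project,
# only the "actual steps" section applies.
set -eu

# --- scratch setup (illustration only) ---
DEMO_REPO="$(mktemp -d)"
cd "${DEMO_REPO}"
git init -q
mkdir -p public/main
echo "fake image data" > public/main/example.jpg
git add public/main
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Add assets"

# --- actual steps ---
echo "/public/main" >> .gitignore     # 1. ignore the container from now on
git rm -r -q --cached public/main     # 2. untrack it; --cached leaves the files on disk
git add .gitignore
git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "Stop tracking public/main"

# --- check ---
git ls-files public/main              # prints nothing: no longer tracked
ls public/main                        # prints example.jpg: still on disk
```

One thing to watch on servers: when another checkout pulls the commit that untracks the files, Git will delete them from that working tree, so make sure the assets are in the Space before deploying it.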

    Once this change is committed, Git no longer tracks the images.

    However, the images are still in the .git/ directory, and in the remote repo, as objects within its history. The next step, using git-filter-repo, resolves this.

    Git-Filter-Repo

    • This is a command line utility recommended by GitHub to remove large or sensitive files

    • By its nature it rewrites Git history, so every affected commit gets a new hash (teams beware!)

    • See this Gist
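For orientation, a rough sketch of the kind of commands involved (not the contents of the Gist). It assumes git-filter-repo is installed separately, and the repository URL, the `example-clean` directory name and the `run` helper are hypothetical. Commands are only printed unless you set `DRY_RUN=0`, since the last two steps force-push rewritten history.

```shell
#!/bin/sh
# Sketch: strip public/main from the entire history with git-filter-repo.
set -eu

REPO_URL="git@github.com:example/example.git"   # hypothetical remote
WORKDIR="example-clean"                         # hypothetical fresh-clone directory
ASSET_PATH="public/main"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Start from a fresh clone; filter-repo refuses to run otherwise unless forced.
run git clone "${REPO_URL}" "${WORKDIR}"

# 2. Keep everything EXCEPT the asset path (--invert-paths flips the selection).
run git -C "${WORKDIR}" filter-repo --path "${ASSET_PATH}" --invert-paths

# 3. filter-repo removes the "origin" remote as a safety measure, so add it back.
run git -C "${WORKDIR}" remote add origin "${REPO_URL}"

# 4. Overwrite the remote's branches and tags with the rewritten history.
run git -C "${WORKDIR}" push --force --all origin
run git -C "${WORKDIR}" push --force --tags origin
```

After the force-push, everyone else on the project should re-clone rather than pull, otherwise the old objects can be pushed straight back.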