Removing assets from a repo
For most Statamic projects I work on, I leave the site’s images in the Git repository. This is simple, centralised and allows GitHub to be a full and effective backup. However, this won’t suit a site with hundreds or thousands of images.
Below is an approach I’ve used on a few projects to reduce the size of a repo – either at the beginning, or retrospectively.
The approach involves:
have image assets locally, but gitignored
use DigitalOcean Spaces for backup and inter-site syncing
use s3cmd to move files between server and Spaces
use a `lifecycle.xml` policy to auto-expire date-stamped backups
use git-filter-repo to rewrite history (if required)
Last updated: July 2024
Caveats and disclaimers
Assets in a Statamic site consist of:
The asset files themselves (eg: `example.jpg`)
Asset metadata (eg: width, height, alt text, credits, etc.)
Cached versions of the asset files created with Glide (we aren’t backing these up)
Losing or changing the location of images in a Statamic project will break the content.
I use Spaces as I use Droplets so it makes sense to keep them on the same provider.
This approach might well work with AWS S3, but I don’t know the exact steps.
You could even connect your site directly to a remote filesystem as Laravel accommodates that (see Statamic Assets Drivers)
I use Laravel Forge to provision servers, and manage them with things like cronjobs.
My projects are in the UK so I make some choices based on that (ie: data centre region)
I’ll use `example` as shorthand for a client project’s name (ie: `example-prod-01`)
I’ll use `main` as the name for the Statamic asset container
Architecturally, Spaces are “flat” in that they do not have directory structures (as per S3). Although they appear to have folders (DO even has a “create a folder” button), they are not hierarchical. A folder is actually a prefix that has some `/`s in it.
Create a DigitalOcean Space
Log in to, or create, a DigitalOcean account or team
- $200 credit for first 60 days with this referral link: m.do.co/c/c677cf2cc36b
Create a Space
datacenter region: AMS3 (Amsterdam)
“Enable CDN”: your choice, but I don’t use this feature.
Spaces bucket name: `example-assets-main-01`
Install & configure s3cmd
This is a command line tool for syncing with DigitalOcean Spaces or Amazon S3.
You’ll want to include this approach in your docs in some way.
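
As a rough sketch of the configuration step, the script below writes an s3cmd config pointing at a Space in AMS3. The file path `./s3cfg.example`, the placeholder keys and the `main/` prefix inside the bucket are assumptions for illustration. s3cmd normally reads `~/.s3cfg`, which `s3cmd --configure` can also create interactively.

```shell
#!/bin/sh
# Sketch: write an s3cmd config for a DigitalOcean Space in AMS3.
# The keys below are placeholders; generate real ones in the DigitalOcean
# control panel and keep them out of Git.
set -eu

# Hypothetical path for this example; s3cmd's default is ~/.s3cfg
CONFIG_FILE="${CONFIG_FILE:-./s3cfg.example}"

cat > "$CONFIG_FILE" <<'EOF'
[default]
access_key = YOUR_SPACES_ACCESS_KEY
secret_key = YOUR_SPACES_SECRET_KEY
host_base = ams3.digitaloceanspaces.com
host_bucket = %(bucket)s.ams3.digitaloceanspaces.com
use_https = True
EOF

# The file will hold credentials, so restrict it to the current user
chmod 600 "$CONFIG_FILE"

# With real keys in place, typical calls would look like this
# (the "main/" prefix inside the bucket is an assumption):
#   s3cmd -c "$CONFIG_FILE" ls s3://example-assets-main-01
#   s3cmd -c "$CONFIG_FILE" sync public/main/ s3://example-assets-main-01/main/
#   s3cmd -c "$CONFIG_FILE" sync s3://example-assets-main-01/main/ public/main/
echo "Wrote $CONFIG_FILE"
```

The first `sync` pushes local assets up to the Space and the second pulls them down to another server. Trailing slashes matter to `s3cmd sync`, so it is worth trying it with `--dry-run` first.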
Expiry Lifecycle
This creates an isolated, timestamped backup as some protection against a ‘bad sync’.
The script creates a top-level folder with a date-stamp
These backups are auto-deleted by the lifecycle policy if they match its pattern (eg: start with `backup-`)
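
The date-stamped prefix and the expiry rule could be sketched as below. The 30-day retention, the rule ID and the `backup-YYYY-MM-DD/main/` layout are assumptions, not the values from my projects. The s3cmd calls are printed rather than run so the sketch works without credentials.

```shell
#!/bin/sh
# Sketch: a date-stamped backup prefix plus a lifecycle rule that expires it.
set -eu

BUCKET="example-assets-main-01"
STAMP="$(date +%Y-%m-%d)"
BACKUP_PREFIX="backup-${STAMP}"   # top-level "folder", e.g. backup-2024-07-01
EXPIRY_DAYS=30                    # assumption: choose your own retention

# Lifecycle rule: expire every object whose key starts with "backup-"
cat > lifecycle.xml <<EOF
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>expire-dated-backups</ID>
    <Prefix>backup-</Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>${EXPIRY_DAYS}</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
EOF

# Commands you would run once s3cmd is configured (printed here, not executed)
echo "s3cmd sync public/main/ s3://${BUCKET}/${BACKUP_PREFIX}/main/"
echo "s3cmd setlifecycle lifecycle.xml s3://${BUCKET}"
echo "s3cmd getlifecycle s3://${BUCKET}"
```

Because the rule matches on the `backup-` prefix only, the live `main/` objects are never touched by the expiry.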
Remove from Git
We now need to tell Git to stop tracking the assets directory.
Add the asset container’s path to `.gitignore`
For a container called `main`, this would be `/public/main`
Remove the directory from Git without deleting the files:
git rm -r --cached public/main
Once this is done Git no longer actively knows about the images.
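
If you want to see the effect before touching a real project, here is a small sketch that runs the same two steps in a scratch repository. The directory name `git-rm-demo`, the fake image and the demo commit identity are all made up for the example.

```shell
#!/bin/sh
# Sketch: untrack an asset directory in a throwaway repo and confirm the
# files stay on disk.
set -eu

DEMO_DIR="${DEMO_DIR:-./git-rm-demo}"   # hypothetical scratch directory
rm -rf "$DEMO_DIR"
mkdir -p "$DEMO_DIR"

(
  cd "$DEMO_DIR"
  git init -q .

  # A tracked "image" inside the asset container
  mkdir -p public/main
  echo "fake image bytes" > public/main/example.jpg
  git add .
  git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add asset"

  # Step 1: ignore the container path
  echo "/public/main" >> .gitignore

  # Step 2: drop it from the index only (--cached leaves the working tree alone)
  git rm -r -q --cached public/main
  git add .gitignore
  git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Stop tracking assets"

  git ls-files      # prints: .gitignore
  ls public/main    # prints: example.jpg
)
```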
However, the images are still in the `.git/` directory and repo as objects within its history. The next step with git-filter-repo resolves this.
Git-Filter-Repo
This is a command line utility recommended by GitHub to remove large or sensitive files.
By its nature it rewrites Git history, changing commit hashes much as a `rebase` does (so teams beware!)
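
A minimal sketch of the history rewrite, again in a scratch repository so nothing real is at risk. The directory name `filter-repo-demo` and the sample files are assumptions. If git-filter-repo is not installed, the script only prints the command it would have run.

```shell
#!/bin/sh
# Sketch: strip public/main from every commit in a throwaway repo.
set -eu

DEMO_DIR="${DEMO_DIR:-./filter-repo-demo}"   # hypothetical scratch directory
rm -rf "$DEMO_DIR"
mkdir -p "$DEMO_DIR"

(
  cd "$DEMO_DIR"
  git init -q .

  # History that contains both content and an asset
  mkdir -p public/main content
  echo "fake image bytes" > public/main/example.jpg
  echo "# Home" > content/home.md
  git add .
  git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Add content and asset"

  # Stop tracking the asset first, as in the previous section
  echo "/public/main" >> .gitignore
  git rm -r -q --cached public/main
  git add .gitignore
  git -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q -m "Stop tracking assets"

  if command -v git-filter-repo >/dev/null 2>&1; then
    # --invert-paths keeps everything EXCEPT the given path.
    # --force is needed only because this repo is not a fresh clone.
    git filter-repo --path public/main --invert-paths --force
    # Prints nothing: no remaining commit touches the path
    git log --oneline --all -- public/main
  else
    echo "git-filter-repo not installed; would run:"
    echo "git filter-repo --path public/main --invert-paths"
  fi
)
```

On a real project I would run this on a fresh clone, with the assets already safe in Spaces. git-filter-repo removes the `origin` remote after it runs, so you re-add it and force-push, and everyone else on the team needs to re-clone.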