
From SSH to GitOps


The rsync Era

The first deployment system I worked with in production was SSH + rsync. A GitHub Actions workflow would SSH into the server, rsync the code, run npm install, execute Prisma migrations, and restart the service with PM2. For each service. In sequence. One at a time.

# The old way (simplified, obfuscated)
deploy:
  steps:
    - name: Deploy service
      run: |
        rsync -avz ./src $SERVER_HOST:/app/service/
        ssh $SERVER_HOST "cd /app/service && npm install && npx prisma migrate deploy && pm2 restart service"

It worked. Deployments took a few minutes. If something broke, you’d SSH in and run pm2 restart manually. Rollback meant reverting the commit and running the pipeline again (if you were lucky) or SSHing in and checking out the previous version with git (if you were desperate).

The problem wasn’t that it was bad — it was that it was fragile. One failed SSH connection and the deployment was half-done. One missed migration and the app crashed. One developer running a quick fix directly on the server and the deployed code diverged from Git.

The Docker Compose Chapter

The next evolution was Docker-based deployments. Instead of syncing files, the pipeline built a Docker image, pushed it to a registry, and ran docker compose up on the server.

deploy:
  needs: test
  steps:
    - name: Tear down
      run: docker compose -f compose.prod.yaml down
    - name: Build and deploy
      run: docker compose -f compose.prod.yaml up --build -d

Better. The image was immutable — what you tested is what you deployed. No more “it works on the server but not in CI” because the environment was the same Docker image everywhere. Rollback was still “deploy the previous version,” but at least the previous version was a known, tested image.

We also started testing inside Docker containers — a test Dockerfile ran the test suite, and if it exited non-zero, the pipeline stopped. No deployment without passing tests.
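A minimal sketch of that gate, assuming a separate test Dockerfile (here called Dockerfile.test, an illustrative name) whose default command runs the suite:

```yaml
# Hypothetical job: the container's exit code gates the deploy job.
test:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Run tests in a container
      run: |
        docker build -f Dockerfile.test -t service-test .
        docker run --rm service-test  # non-zero exit fails the step, blocking deploy
```

Because `deploy` declares `needs: test`, a failing container exit code stops the workflow before anything ships.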

But this was still a single machine. One server, one Docker daemon. And the deployment had a brief downtime window — docker compose down followed by docker compose up.

The Matrix Evolution

As services multiplied, we started using GitHub Actions matrix builds to build them in parallel:

strategy:
  matrix:
    service: [api-service, auth-service, scheduler, worker]
  fail-fast: false

fail-fast: false was a deliberate choice — if the auth-service build fails, you don’t want to cancel the api-service build. They’re independent. This cut total build time significantly because services built simultaneously instead of sequentially.
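For context, here is roughly how the matrix key feeds the build step — a sketch, with the registry name as a placeholder:

```yaml
build:
  strategy:
    matrix:
      service: [api-service, auth-service, scheduler, worker]
    fail-fast: false
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Build and push ${{ matrix.service }}
      run: |
        # registry.example.com is illustrative, not our actual registry
        docker build -t registry.example.com/${{ matrix.service }}:${GITHUB_SHA} ./${{ matrix.service }}
        docker push registry.example.com/${{ matrix.service }}:${GITHUB_SHA}
```

GitHub Actions expands this into four independent jobs, one per service, running in parallel.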

Self-Hosted Runners
#

We moved to self-hosted GitHub Actions runners for two reasons:

  1. Cost — cloud runner minutes add up, especially for Windows container builds
  2. Access — self-hosted runners can talk directly to the Docker daemon and internal registry without exposing credentials externally

Some of our services need Windows containers (legacy .NET workloads). Self-hosted Windows runners handle those builds while Linux runners handle everything else. Same pipeline, different runner labels.
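The routing is just a matter of `runs-on` labels — a simplified sketch with hypothetical job and image names:

```yaml
# Each job lands on a self-hosted runner matching its labels.
jobs:
  build-linux:
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t api-service .
  build-windows:
    runs-on: [self-hosted, windows]
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t legacy-dotnet-service .
```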

The Tagging Strategy

This was a surprisingly important decision. We needed to distinguish between QA and production builds without ambiguity:

| Environment | Tag Pattern                | Trigger             |
|-------------|----------------------------|---------------------|
| QA          | qa-v1.0.X (auto-increment) | PR merge to develop |
| Production  | v1.3.0 (explicit semver)   | Manual git tag      |

QA tags auto-increment: the pipeline reads the latest qa-v1.0.* tag, bumps the patch, and tags the build. This means every merge to develop automatically deploys to QA — no human in the loop.
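The bump logic amounts to a few lines of shell — a sketch of the idea, assuming it runs inside the repo checkout:

```shell
#!/bin/sh
# Compute the next QA tag from the newest existing qa-v1.0.* tag.
# Falls back to qa-v1.0.1 when no QA tag exists yet.
latest=$(git tag -l 'qa-v1.0.*' --sort=-v:refname | head -n 1)
patch=${latest##*.}                    # "qa-v1.0.47" -> "47"; empty if no tags
next="qa-v1.0.$(( ${patch:-0} + 1 ))"
echo "$next"
# The pipeline would then run: git tag "$next" && git push origin "$next"
```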

Production requires someone to explicitly create a semantic version tag. This is intentional — deploying to production should be a deliberate, auditable action. The ArgoCD Image Updater filters tags with regexp:^v[0-9]+\.[0-9]+\.[0-9]+$, so only proper semver tags trigger production deployments. QA tags (qa-v1.0.47) are invisible to production.
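In ArgoCD Image Updater terms, that filter lives in annotations on the Application — a sketch with an illustrative image name and alias:

```yaml
# Hypothetical Application annotations; "app" is the image alias,
# registry.example.com/api-service is a placeholder image.
metadata:
  annotations:
    argocd-image-updater.argoproj.io/image-list: app=registry.example.com/api-service
    argocd-image-updater.argoproj.io/app.update-strategy: semver
    argocd-image-updater.argoproj.io/app.allow-tags: regexp:^v[0-9]+\.[0-9]+\.[0-9]+$
```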

The ArgoCD Chapter
#

The final evolution was GitOps with ArgoCD. The pipeline no longer deploys anything — it builds an image, pushes it, and walks away. ArgoCD watches the registry, detects the new tag, and syncs the cluster.

Before: Pipeline → SSH → server → restart service
Middle: Pipeline → Docker build → push → docker compose up
Now:    Pipeline → Docker build → push → done
        ArgoCD → detects new image → syncs cluster → deployed

Nobody runs kubectl apply. Nobody SSHes into anything. The cluster state is defined in Git and enforced by ArgoCD. If someone manually modifies the cluster, ArgoCD reverts it.
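The self-healing behavior comes from the Application's sync policy. A minimal sketch, with repo URL, paths, and names as placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-service            # illustrative name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/deploy-manifests  # hypothetical repo
    targetRevision: main
    path: api-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made directly to the cluster
```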

Deployment Strategies I Haven’t Used (But Should Know)
#

Our current setup uses Kubernetes rolling updates — the default. Pods replace one at a time, zero downtime, works fine. But there are more sophisticated strategies:
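The rolling-update knobs live on the Deployment spec — a fragment showing a zero-capacity-loss configuration (the specific values here are illustrative, not our production settings):

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take a pod down before its replacement is ready
      maxSurge: 1         # allow one extra pod during the rollout
```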

Blue-Green: two identical environments. Deploy to the inactive one, test it, switch traffic instantly. Rollback = switch back. The cost: double the infrastructure during deployments.

Canary: route 5% of traffic to the new version. Monitor. If it’s stable, increase to 20%, then 50%, then 100%. The smallest blast radius, but you need traffic splitting and good monitoring to detect issues in that 5%.
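For a sense of what that looks like in practice, Argo Rollouts expresses the ramp as declarative steps — a hypothetical fragment, not something we run:

```yaml
# Argo Rollouts strategy fragment (sketch); durations are illustrative.
spec:
  strategy:
    canary:
      steps:
        - setWeight: 5          # send 5% of traffic to the new version
        - pause: {duration: 10m}
        - setWeight: 20
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 10m}
```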

Feature Flags: deploy the code but hide it behind a flag. Enable for specific users or percentages. Decouple deployment (code is live) from release (feature is enabled). Kill switch if something breaks.

We use rolling updates because they’re simple and sufficient for our scale. If I were starting fresh with higher traffic and stricter SLAs, I’d look at canary deployments with Prometheus-based automated rollback.

The Pattern

Looking back, each evolution solved a specific pain point:

| Era            | Pain                             | Solution                                 |
|----------------|----------------------------------|------------------------------------------|
| SSH + rsync    | Fragile, manual, divergent state | Docker (immutable images)                |
| Docker Compose | Single machine, brief downtime   | Kubernetes (multi-node, rolling updates) |
| kubectl apply  | Drift, no audit trail, manual    | ArgoCD (Git-driven, self-healing)        |
| Manual tagging | Human error in deploys           | Auto-tagging QA, semver filtering prod   |

Each step removed a manual intervention. The pipeline got shorter (from the CI/CD perspective) while the delivery system got more sophisticated. The current state: a developer merges a PR, and 5 minutes later, it’s running in QA. No human touched the deployment.
