Maintenance and updates for your On-Premise AI infrastructure are planned and executed to minimise downtime and keep the system stable. Monthly maintenance covers regular software updates, security patches and proactive maintenance.

What’s included in the monthly maintenance?

Regular software updates

AI models:

  • Updates to newer model versions
  • Performance improvements
  • Bug fixes

Platform software:

  • Operating system updates
  • Docker/Kubernetes updates
  • API gateway updates (LiteLLM)
  • Monitoring tool updates

Security updates:

  • Security patches (critical patches are applied immediately)
  • Security-related bug fixes
  • Fixes for known vulnerabilities

Proactive maintenance

Monitoring:

  • 24/7 system monitoring
  • Performance metrics tracking
  • Early problem detection

Optimisations:

  • Performance optimisations
  • Configuration adjustments
  • Resource optimisations

Backup and disaster recovery:

  • Regular backups of configurations and data
  • Disaster recovery tests

Update process

1. Planning and coordination

Before each update:

  • Analysis of update requirements
  • Risk assessment
  • Coordination with your team
  • Maintenance window planning

Communication:

  • Advance notice (usually 1–2 weeks before)
  • Clear information on changes
  • Expected downtime (usually minimal)

2. Update execution

Standard updates:

  • Usually during maintenance windows
  • Coordinated with your team
  • Minimal downtime

Critical updates:

  • Critical security patches are installed immediately
  • Coordinated but prioritised

Zero-downtime updates:

  • Possible with Kubernetes clusters
  • Rolling updates without downtime
  • Automatic rollback on problems

3. Testing and validation

After each update:

  • Functional tests
  • Performance tests
  • Integration tests
  • Final validation of overall functionality

Maintenance windows

Planned maintenance windows

Typical maintenance windows:

  • Weekly: Small updates (usually without downtime)
  • Monthly: Larger updates (coordinated)
  • Quarterly: Major updates (planned)

Scheduling:

  • Usually outside business hours
  • Coordinated with your team
  • Minimal downtime
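
As an illustration of the scheduling rule above, here is a minimal Python sketch that checks whether a proposed maintenance slot falls outside business hours. The hours (08:00 to 18:00, Monday to Friday) and the function name are assumptions made for this example. Actual windows are agreed with your team.

```python
from datetime import datetime, time

# Assumed business hours; the text above only says "outside business hours".
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def is_outside_business_hours(moment: datetime) -> bool:
    """Return True if `moment` is on a weekend or outside 08:00-18:00."""
    if moment.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (BUSINESS_START <= moment.time() < BUSINESS_END)
```

A slot on a weekday at 22:00 passes this check, while one at 10:00 on the same day does not.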

Emergency updates

Critical security patches:

  • Immediate installation required
  • Coordinated but prioritised
  • Minimal downtime

Critical bug fixes:

  • Quick resolution required
  • Coordinated with your team

Update strategies

1. Rolling updates (Kubernetes)

For Kubernetes clusters:

  • Updates without downtime
  • Incremental rollout, a few instances at a time
  • Automatic rollback on problems

Advantage: Zero-downtime updates possible
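
The idea behind a rolling update can be sketched in a few lines of Python. This is a simplified model, not the Kubernetes implementation: replicas are updated one at a time, and the first failed health check restores the previous state. The function name and the `is_healthy` callback are hypothetical.

```python
from typing import Callable, List


def rolling_update(
    replicas: List[str],
    new_version: str,
    is_healthy: Callable[[int, str], bool],
) -> List[str]:
    """Update replicas one by one; roll everything back on the first failure."""
    previous = list(replicas)  # snapshot used for the rollback
    current = list(replicas)
    for index in range(len(current)):
        # Only this replica changes; the others keep serving traffic.
        current[index] = new_version
        if not is_healthy(index, new_version):
            return previous  # automatic rollback to the old versions
    return current
```

Because only one replica changes at a time, the remaining replicas keep handling requests, which is what makes updates without downtime possible.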

2. Blue-green deployment

For critical systems:

  • Parallel systems during updates
  • Seamless switchover
  • Immediate rollback possible

Advantage: Maximum availability
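
A blue-green switchover can be modelled as follows. This is a minimal sketch under stated assumptions: the class, the environment names and the `smoke_test` callback are illustrative and do not describe a specific tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    """Tracks which of two parallel environments receives live traffic."""

    live: str = "blue"
    idle: str = "green"

    def switch(self) -> None:
        """Swap the environments; calling it again is the rollback."""
        self.live, self.idle = self.idle, self.live


def deploy(router: BlueGreenRouter, smoke_test: Callable[[str], bool]) -> bool:
    """Switch traffic to the idle environment only if it passes the smoke test."""
    if not smoke_test(router.idle):
        return False  # the live environment was never touched
    router.switch()
    return True
```

The old environment stays in place after the switchover, so a rollback is a single further call to `switch()`.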

3. Canary deployments

For larger updates:

  • Gradual rollout
  • Testing with small user group
  • Full rollout after validation

Advantage: Risk minimisation
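
One common way to pick the small user group for a canary release is to hash each user ID into a fixed bucket. The sketch below assumes string user IDs and a percentage-based rollout. The function name is hypothetical.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user belongs to the first `percent` of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable value in 0..99
    return bucket < percent
```

The assignment is deterministic, so a user stays in the canary group as the percentage is raised from, say, 5 to 50 and finally to 100 for the full rollout.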

Backup strategy

Regular backups

What is backed up:

  • Configurations
  • Models (if custom)
  • Data (if stored locally)
  • System states

Backup frequency:

  • Daily: Automatic backups
  • Before updates: Additional backups
  • Monthly: Full backups
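
The backup frequency above can be expressed as a small decision function. This sketch assumes the monthly full backup runs on the first day of the month. The source does not specify the day, and the labels are illustrative.

```python
from datetime import date
from typing import List


def backups_due(today: date, update_planned: bool = False) -> List[str]:
    """List the backup types that should run on the given day."""
    due = ["daily"]  # automatic backup every day
    if today.day == 1:  # assumption: full backup on the 1st of the month
        due.append("monthly-full")
    if update_planned:  # additional backup before any update
        due.append("pre-update")
    return due
```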

Disaster recovery

Recovery tests:

  • Regular tests of backup restoration
  • Validation of recovery times
  • Documentation of recovery processes
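
Validating recovery times can be as simple as timing a restore run against an agreed target. The following sketch assumes the restore procedure is available as a Python callable. Both `restore` and `target_seconds` are placeholders for whatever is agreed in practice.

```python
import time
from typing import Callable, Dict, Union


def run_recovery_test(
    restore: Callable[[], None], target_seconds: float
) -> Dict[str, Union[float, bool]]:
    """Time one restore run and compare it with the recovery-time target."""
    start = time.monotonic()  # monotonic clock is unaffected by clock changes
    restore()
    elapsed = time.monotonic() - start
    return {
        "elapsed_seconds": elapsed,
        "within_target": elapsed <= target_seconds,
    }
```

The returned dictionary can be stored with the recovery documentation so that results from successive tests are comparable.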

Monitoring during updates

Real-time monitoring

During updates:

  • Live monitoring of system performance
  • Automatic alerts on problems
  • Immediate response to problems

After updates:

  • Validation of functionality
  • Performance comparison
  • Error detection
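
A performance comparison after an update could, for example, compare median response times before and after. This is a minimal sketch: the 10 % tolerance and the function name are assumptions, and real monitoring would typically look at more metrics than latency.

```python
from statistics import median
from typing import Sequence


def latency_regressed(
    before_ms: Sequence[float],
    after_ms: Sequence[float],
    tolerance: float = 0.10,
) -> bool:
    """Return True if the median latency grew by more than `tolerance`."""
    return median(after_ms) > median(before_ms) * (1 + tolerance)
```

A `True` result would be a signal to investigate or roll back. Small fluctuations within the tolerance are ignored.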

Frequently asked questions

How often are updates carried out?

Standard updates:

  • Weekly: Small updates (usually automatic and without downtime)
  • Monthly: Larger updates (coordinated)
  • As needed: Security patches (immediate)

Can updates be rolled back?

Yes:

  • Automatic rollback on problems (Kubernetes)
  • Manual rollback possible
  • Backup restoration as fallback

Is downtime communicated?

Yes:

  • Advance notice (usually 1–2 weeks before)
  • Clear information on expected downtime
  • Usually minimal or no downtime

Can updates be postponed?

Yes:

  • Non-critical updates can be postponed
  • Coordination with your team possible
  • Critical security updates have priority
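
The postponement rule can be sketched as a small policy in Python. The `Update` type, its fields and the exact ordering are assumptions for illustration. Treating every critical update as non-postponable is one possible reading of the rule above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Update:
    name: str
    critical: bool = False
    security: bool = False


def can_postpone(update: Update) -> bool:
    """Assumed policy: only non-critical updates may be deferred."""
    return not update.critical


def apply_order(updates: List[Update]) -> List[Update]:
    """Critical security updates first, other critical ones next, then the rest."""

    def rank(update: Update) -> int:
        if update.critical and update.security:
            return 0
        if update.critical:
            return 1
        return 2

    return sorted(updates, key=rank)  # sorted() is stable: ties keep input order
```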

Next steps

Would you like to know more about maintenance and updates?

  • Contact us – Get advice on maintenance processes
