Network Configuration and Change Management Best Practices: From Manual Mess to CI/CD-Driven Automation

Over the last decade, network engineering has gone from logging into devices and typing commands to a world of software-defined, API-driven, and fully automated infrastructure. Whether you're a network engineer in the trenches configuring devices every day or a tech leader pushing for digital transformation, the challenge is the same: how do you make network changes reliable, repeatable, and as low-risk as possible?

This is where solid Network Configuration and Change Management (NCCM) practices - powered by network automation, CI/CD pipelines, and dedicated lab testing environments - become absolutely critical for modern network operations.

New to CloudMyLab? It's a cloud-based network lab platform that lets you focus on building, testing, and automating networks instead of managing hardware. Access professional-grade network emulators like EVE-NG, GNS3, and Cisco CML 2.0, plus pre-configured automation environments with Ansible, GitLab, and NetDevOps tools, all instantly from your browser. Contact us to discuss your lab requirements or start a free trial.

Why Traditional Network Management Approaches Fall Short

For years, network engineers logged into routers, switches, and firewalls via SSH, copy-pasted some CLI commands from a text file, and just hoped for the best. That manual approach might have worked in small, static networks, but it just doesn't scale for modern companies or data centers where:

  • Hundreds or thousands of devices need consistent configuration.
  • Compliance and security standards demand strict audit trails.
  • A single network outage from human error can cost the business millions.
I've personally seen organizations lose hours of critical business operations because a single line of config was mistyped during a late-night change window.

The Critical Need for Network Automation

Network automation addresses three critical pain points in traditional configuration management:

  • Consistency: Stop "configuration drift" across your devices by using standard, version-controlled templates.
  • Speed: Roll out complex changes to 500 devices in minutes, not days. This speeds up your ability to deliver new services and respond to what the business needs.
  • Auditability: Make sure every single change is tracked, logged, and version-controlled in Git. This gives you a complete, reviewable audit trail for compliance and troubleshooting.
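
To make the consistency point concrete, here is a minimal sketch of drift detection: it diffs a device's running config against a version-controlled template. The function name `config_drift` and both sample configs are hypothetical; in practice the golden text would come from Git and the running text from the device.

```python
import difflib


def config_drift(golden: str, running: str) -> list[str]:
    """Return unified-diff lines between a golden template and a running config."""
    golden_lines = [line.rstrip() for line in golden.strip().splitlines()]
    running_lines = [line.rstrip() for line in running.strip().splitlines()]
    return list(
        difflib.unified_diff(
            golden_lines,
            running_lines,
            fromfile="golden",
            tofile="running",
            lineterm="",
        )
    )


# Hypothetical sample data: the NTP server on the device differs from the template.
GOLDEN = """
vlan 20
 name HR_VLAN
ntp server 10.0.0.1
"""
RUNNING = """
vlan 20
 name HR_VLAN
ntp server 10.0.0.99
"""

drift = config_drift(GOLDEN, RUNNING)
for line in drift:
    print(line)
```

An empty result means the device matches the template; any `-` or `+` line is drift that should be reviewed or corrected.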

Practical Network Automation Examples

Ansible Network Automation - VLAN Deployment

Instead of manually typing CLI commands on each switch, this Ansible playbook ensures all target switches have VLAN 20 configured consistently, eliminating configuration drift.

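- name: Configure VLANs on switches
  hosts: campus_switches
  gather_facts: false
  tasks:
    - name: Create VLAN 20 for HR
      # Newer releases of the cisco.ios collection also provide ios_vlans
      # for VLAN management.
      ios_vlan:
        vlan_id: 20
        name: HR_VLAN
        state: present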


Python Network Configuration Management - Backup and Rollback

A Python script using libraries like Netmiko or NAPALM can automatically back up a device's running config before it applies any changes. If the change fails validation, the same script can automatically restore the last-known-good config, preventing network downtime.

from netmiko import ConnectHandler
import datetime

# Connection details (replace the placeholder credentials before use)
device = {
    "device_type": "cisco_ios",
    "ip": "10.1.1.1",
    "username": "admin",
    "password": "password",
}

conn = ConnectHandler(**device)

# Back up the running config to a dated file
config = conn.send_command("show running-config")
filename = f"backup_{device['ip']}_{datetime.date.today()}.txt"

with open(filename, "w") as f:
    f.write(config)

# Rollback example: push the last-known-good config back to the device
conn.send_config_from_file("last_known_good_config.txt")
conn.disconnect()


Ansible BGP Configuration Management

Managing complex BGP peerings and route policies manually is a recipe for disaster. An Ansible playbook can enforce consistent BGP configs, peerings, and route-maps across all your routers, ensuring predictable and stable routing.

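- name: Configure BGP
  hosts: routers
  tasks:
    - name: Set BGP neighbors
      ios_config:
        # parents places the neighbor line under the BGP process,
        # so the module compares it against the right config section
        parents: router bgp 65001
        lines:
          - neighbor 10.10.10.2 remote-as 65002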
Why CI/CD Pipelines Are Essential for Network Operations

Just having network automation isn't enough. Without proper testing and governance, you risk deploying bad configurations at machine speed. That's where CI/CD pipelines for network configuration management come in. Borrowed from DevOps practices, they bring discipline to network automation:

  • Continuous Integration: Every network configuration change is version-controlled in Git and validated automatically.
  • Continuous Delivery/Deployment: Approved configurations are staged, tested in lab environments, and deployed safely.

This GitOps workflow makes sure that only thoroughly tested, validated, and peer-reviewed changes ever touch your production network.
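
As an illustration of the "validated automatically" step, the sketch below shows the kind of check a CI job could run on every commit. It assumes the change is expressed as a list of VLAN dictionaries; that schema and the function name `validate_vlans` are hypothetical, not part of any specific tool.

```python
def validate_vlans(vlans: list[dict]) -> list[str]:
    """Return human-readable errors; an empty list means the change passes."""
    errors = []
    seen_ids = set()
    for vlan in vlans:
        vlan_id = vlan.get("vlan_id")
        name = vlan.get("name", "")
        # 802.1Q allows VLAN IDs 1-4094 (0 and 4095 are reserved)
        if not isinstance(vlan_id, int) or not 1 <= vlan_id <= 4094:
            errors.append(f"invalid vlan_id: {vlan_id!r}")
        elif vlan_id in seen_ids:
            errors.append(f"duplicate vlan_id: {vlan_id}")
        else:
            seen_ids.add(vlan_id)
        if not name or " " in name:
            errors.append(f"invalid name for vlan {vlan_id!r}: {name!r}")
    return errors


# Hypothetical candidate change with one duplicate ID, one bad ID, and one bad name
candidate = [
    {"vlan_id": 20, "name": "HR_VLAN"},
    {"vlan_id": 20, "name": "FINANCE"},
    {"vlan_id": 5000, "name": "BAD VLAN"},
]
errors = validate_vlans(candidate)
for error in errors:
    print(error)

# A CI job would typically exit non-zero when this is 1, blocking the merge.
exit_code = 1 if errors else 0
```

Checks like this catch malformed input before anything is rendered into device configuration, which is much cheaper than catching it in the lab or in production.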

The Critical Role of Network Testing Labs

Here's the question where most companies get stuck: Where can you safely test these network automation workflows and complex device configurations before deploying them to production?

This is where CloudMyLab's EVE-NG and GNS3-powered network simulation labs change the game for network configuration management:

  • A Safe Place to Test: Your engineers can replicate your enterprise-scale network topologies, complete with routers, switches, firewalls, and controllers, without ever touching a production device.
  • Full Automation Validation: You can run your Ansible playbooks, Python scripts, and full CI/CD jobs against the lab devices to check their behavior before a live rollout.
  • Confidence for Leadership: As a tech leader, you get the assurance that every single change has been rigorously tested in a controlled, production-like environment. This dramatically reduces business risk.
  • Scalable Training: Your network teams can upskill and experiment in labs that accurately mimic your real-world data center, campus, or WAN topologies, fostering innovation without the fear of causing an outage.

I often call CloudMyLab the "flight simulator" for network engineers. Before you fly a real aircraft (your production network), you train, test, and validate every move in a high-fidelity simulator (your CloudMyLab environment).

Network Configuration Management Workflow with Lab Integration

A practical, enterprise-grade workflow for your existing (brownfield) network should look something like this:

  • Engineer writes configuration change → Commit to Git repository
  • CI pipeline triggers automated tests → This includes syntax validation, linting (`ansible-lint`), and maybe even automated network validation tests with frameworks like pyATS.
  • Deploy to CloudMyLab lab environment (EVE-NG/GNS3 topology replicating production)
  • Automated tests run inside the lab:
    • Validate routing adjacencies and network connectivity
    • Test firewall rules and security policies
    • Ensure automation scripts run end-to-end without configuration drift.
  • Peer review and approval process → Senior engineers or architects review lab test results
  • Controlled production rollout → Stage changes on a small device subset first
  • Monitor and rollback if needed → Pre-tested rollback scripts ensure you have a quick and reliable way to recover if any unexpected issues pop up.
This lab-first approach dramatically cuts risk, speeds up your solution-building process, and ensures a smoother, more successful adoption of network automation, even in complex legacy environments.
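
The gating behavior of the workflow above can be sketched in a few lines of Python: stages run in order, and the first failure stops everything after it. The stage names and the placeholder lambdas are assumptions for illustration; a real pipeline would be defined in your CI system and each check would call an actual job.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]


def run_pipeline(stages: list[Stage]) -> list[str]:
    """Run stages in order; stop at the first failure and return the stages that passed."""
    passed = []
    for name, check in stages:
        if not check():
            print(f"FAILED at '{name}'; later stages skipped")
            break
        passed.append(name)
    return passed


# Placeholder checks: each lambda stands in for a real job (lint, lab deploy, tests).
stages = [
    ("lint", lambda: True),
    ("deploy to lab", lambda: True),
    ("lab tests", lambda: False),  # simulate a failed adjacency check in the lab
    ("peer review", lambda: True),
    ("pilot rollout", lambda: True),
]
passed = run_pipeline(stages)
print(passed)
```

Because the simulated lab test fails, the change never reaches peer review or the pilot rollout, which is the point of testing in the lab first.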

Network Configuration Rollback Planning - Three-Tier Rollback Strategy

No matter how fancy your network automation is, things can still go wrong. A comprehensive, well-tested rollback strategy is absolutely essential for keeping the network up and ensuring business continuity.

Implement a three-tier rollback approach that provides multiple safety nets:

Tier 1: Automated Rollback Scripts

Automated scripts tested extensively in lab environments provide the fastest recovery path. Here is a Python example.

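import subprocess

def rollback_config(device, backup_file):
    """Copy a backup over the device's startup config, then reload the device."""
    print(f"Rolling back {device} using {backup_file}")
    # Argument lists avoid shell injection; check=True raises if a step fails
    subprocess.run(["scp", backup_file, f"{device}:/config/startup.cfg"], check=True)
    subprocess.run(["ssh", device, "reload"], check=True)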

Before this rollback script is used against production, it can be run repeatedly in CloudMyLab’s labs so engineers know exactly how it behaves.

Tier 2: Configuration Backup Management

Always take comprehensive backups before any change:

  • Pre-change snapshots of running configurations
  • Timestamped backups with clear naming conventions
  • Automated backup validation to ensure file integrity
  • Multiple backup locations for redundancy
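
A minimal sketch of these four practices using only the Python standard library is shown below. The file naming scheme, the `.sha256` sidecar file, and the function names are assumptions for illustration, not a prescribed layout.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_backup(config: str, device: str, locations: list[Path]) -> str:
    """Write a timestamped backup plus its SHA-256 digest to every location."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    data = config.encode()
    digest = hashlib.sha256(data).hexdigest()
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        (location / f"{device}_{stamp}.cfg").write_bytes(data)
        (location / f"{device}_{stamp}.sha256").write_text(digest)
    return digest


def verify_backup(cfg_file: Path) -> bool:
    """Recompute the digest of a backup and compare it with the stored one."""
    stored = cfg_file.with_suffix(".sha256").read_text().strip()
    return hashlib.sha256(cfg_file.read_bytes()).hexdigest() == stored


# Demo in a temporary directory; real locations might be a NAS and object storage.
with tempfile.TemporaryDirectory() as tmp:
    locations = [Path(tmp) / "primary", Path(tmp) / "offsite"]
    save_backup("hostname core-sw1\n", "core-sw1", locations)
    results = [verify_backup(f) for f in sorted(Path(tmp).rglob("*.cfg"))]
    print(results)
```

Verifying the digest before a rollback guards against restoring a truncated or corrupted file at the worst possible moment.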

Tier 3: Golden Configuration Templates

Maintain golden configurations stored in Git repositories:

  • Version-controlled templates for each device type
  • Standardized configurations that serve as known-good baselines
  • Environment-specific variations (production, staging, development)
  • Rollback to last-known-good configurations when other methods fail
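
One way to picture a golden template with environment-specific variations is the sketch below, which uses the standard library's `string.Template`. The template lines, addresses, and names are hypothetical; many teams use Jinja2 for the same job.

```python
from string import Template

# Hypothetical golden baseline shared by every device of this type
GOLDEN_TEMPLATE = Template(
    "hostname $hostname\n"
    "ntp server $ntp_server\n"
    "logging host $syslog_host\n"
)

# Per-environment values that would live next to the template in Git
ENVIRONMENTS = {
    "production": {"ntp_server": "10.0.0.1", "syslog_host": "10.0.0.50"},
    "staging": {"ntp_server": "10.8.0.1", "syslog_host": "10.8.0.50"},
}


def render_golden(hostname: str, environment: str) -> str:
    """Fill the golden template; substitute() raises KeyError if a variable is missing."""
    return GOLDEN_TEMPLATE.substitute(hostname=hostname, **ENVIRONMENTS[environment])


print(render_golden("edge-rtr1", "staging"))
```

Keeping the template and the per-environment values in the same repository means a rollback to a known-good baseline is a matter of rendering an earlier commit.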

Staged Rollout Strategy

Never push changes globally at once. Use a phased approach:

  • Lab validation: Test all changes in CloudMyLab simulation environments
  • Pilot deployment: Deploy to 5-10% of devices first
  • Monitoring phase: Observe performance and connectivity for a defined period
  • Gradual expansion: Roll out to device groups in 25% increments
  • Full deployment: Complete rollout only after successful validation
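
The phased approach can be expressed as a small batching helper. This is a sketch under the assumption of a 10% pilot followed by 25% increments; the function name `rollout_batches` and the device names are illustrative.

```python
import math


def rollout_batches(
    devices: list[str], pilot_pct: int = 10, step_pct: int = 25
) -> list[list[str]]:
    """Split devices into a pilot batch followed by batches of step_pct of the fleet."""
    total = len(devices)
    if total == 0:
        return []
    pilot_size = max(1, math.ceil(total * pilot_pct / 100))
    step_size = max(1, math.ceil(total * step_pct / 100))
    batches = [devices[:pilot_size]]
    for start in range(pilot_size, total, step_size):
        batches.append(devices[start:start + step_size])
    return batches


# Hypothetical fleet of 40 switches
fleet = [f"sw{n:02d}" for n in range(1, 41)]
batches = rollout_batches(fleet)
print([len(batch) for batch in batches])
```

A deployment script would apply the change to one batch, wait through the monitoring period, and continue only if validation passes.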

Essential Network Automation Tools and Frameworks

Picking the right automation tools is critical for a successful NCCM implementation. Here’s a TLDR guide to the most effective frameworks for network engineers:

Ansible - Declarative, scalable config automation

Ansible's declarative approach and extensive network modules make it the go-to choice for most organizations. Its agentless architecture and YAML-based playbooks provide consistency across multi-vendor environments.

Deep dive into Ansible Execution Environment for network automation or learn about Ansible Navigator, a modern CLI for running and debugging playbooks.

Python + NAPALM/Netmiko - Flexible scripting for custom workflows

Python offers unmatched flexibility for complex automation scenarios. NAPALM provides vendor-agnostic APIs, while Netmiko excels at device-specific operations.

pyATS/Genie - Automated testing, validation, and health checks

Cisco's pyATS framework provides comprehensive network testing capabilities, from device health checks to complex validation workflows.

GitLab/GitHub Actions - CI/CD pipelines with reviews and approvals

Modern network operations require the same discipline as software development. Git-based workflows ensure every change is tracked, reviewed, and deployed safely.

Multi-Vendor Environment Considerations

NCCM works well in multi-vendor environments (Cisco, Juniper, Arista), provided the tooling is vendor-agnostic:

  • Use NAPALM for unified API access across vendors
  • Maintain separate Ansible playbooks for vendor-specific modules
  • Leverage CloudMyLab's multi-vendor lab topologies to test interoperability before deployment
  • Implement standardized configuration templates that can be adapted per vendor
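
To illustrate the last point, here is a minimal sketch of one vendor-neutral intent (a VLAN ID and name) rendered per platform. The platform keys follow NAPALM's driver names (`ios`, `eos`, `junos`); the function name and the inventory are hypothetical, and the rendered lines should be checked against your platform versions in a lab.

```python
def render_vlan(platform: str, vlan_id: int, name: str) -> list[str]:
    """Render one vendor-neutral VLAN intent as platform-specific config lines."""
    if platform in ("ios", "eos"):
        # Cisco IOS and Arista EOS share the same VLAN syntax
        return [f"vlan {vlan_id}", f" name {name}"]
    if platform == "junos":
        return [f"set vlans {name} vlan-id {vlan_id}"]
    raise ValueError(f"unsupported platform: {platform}")


# Hypothetical inventory mapping hostnames to platforms
inventory = {"campus-sw1": "ios", "dc-leaf1": "eos", "edge-sw1": "junos"}
rendered = {
    host: render_vlan(platform, 20, "HR_VLAN")
    for host, platform in inventory.items()
}
for host, lines in rendered.items():
    print(host, lines)
```

Raising an error for unknown platforms is deliberate: silently skipping a device would reintroduce the inconsistency that templating is meant to remove.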

CloudMyLab Lab Environment Integration - Labs for safe, repeatable, and realistic validation

All these tools can be tested and validated in CloudMyLab's EVE-NG and GNS3-powered simulation labs before production deployment, ensuring your automation workflows perform exactly as expected across different vendor platforms.

Business Value of Lab-Driven Network Change Management

For technology leaders, it is all about business resilience, operational efficiency, and quantifiable ROI.

  • Reduce outages: Labs prevent bad configs from ever reaching production.
  • Accelerate innovation: Faster time-to-market by validating automation in labs.
  • Empower engineers: Network engineers can practice automation workflows without fear of breaking production.
  • Improve governance: Every lab test is documented and can be part of the change approval process.

Network Configuration Management: Next Steps

Network automation and CI/CD pipelines are incredibly powerful, but without a comprehensive lab testing strategy, they can leave your company exposed to config drift, security vulnerabilities, and unexpected, costly downtime. By integrating CloudMyLab's EVE-NG and GNS3 network simulation labs into your NCCM workflow, both your engineers and your leadership team gain confidence. Your engineers can test, refine, and validate their automation scripts in a safe, realistic environment. Your leaders know that every change is safe, scalable, and aligned with your security and compliance needs.

The move from manual network operations to CI/CD-driven automation requires the right mix of tools, processes, and safe testing environments where your network teams can validate every single change before it impacts your business.

Want to explore how CloudMyLab can support next-gen network testing and automation? Check out our hosted platforms and start building with confidence.

Frequently Asked Questions

What is the biggest challenge in implementing NCCM in brownfield environments?

The main challenge is configuration drift across your existing devices. Legacy networks often have inconsistent configs that have evolved over years of manual, one-off changes. The best approach is to start with a comprehensive audit using tools like NAPALM to get a baseline. Then, you can gradually standardize your configs through automated templates while keeping detailed, well-tested rollback procedures for every step.

How do I safely test network automation scripts before production deployment?

You must use dedicated network simulation labs, like those powered by EVE-NG or GNS3, to replicate your production topology. CloudMyLab provides pre-built, on-demand enterprise topologies where you can safely test your Ansible playbooks, Python scripts, and config changes in a risk-free environment. Always run your automation workflows from end to end in the lab before they ever touch your production devices.

What's the difference between configuration management and change management in networking?

Configuration management is about maintaining a consistent, desired state for your device settings and preventing unauthorized or inconsistent config drift over time.
Change management is the formal process of planning, approving, testing, and deploying any modifications to your network.

What's the minimum team size needed to implement NCCM effectively?

A single, dedicated network engineer can start with basic automation using tools like Ansible and leverage a platform like CloudMyLab for all their testing needs. For larger enterprise setups, a good starting point is a small team with one dedicated network automation engineer, one lab environment manager (if self-hosted), and one or two senior engineers to lead the change approval process.

How long does it typically take to implement NCCM from scratch?

  • Phase 1 (basic automation, version control, and lab setup): 2-3 months
  • Phase 2 (CI/CD pipeline integration and more advanced workflows): 3-4 months
  • Phase 3 (full brownfield migration and standardization): 6-12 months
Success depends heavily on your network's complexity, your team's existing experience, and your company's commitment to change management. Starting with a platform like CloudMyLab can significantly speed up your timeline by getting rid of the infrastructure setup phase.