Codeview Digital
IT Operations - 14 min read

Setting Up IT Operations for a Growing Company: What You Actually Need

TL;DR

Most growing companies hit an IT inflection point between 20 and 100 employees - the point where ad-hoc IT support, shared passwords, and 'just ask Dave' stop working. You do not need a full ITIL implementation or an enterprise ServiceNow instance yet. You need the right foundations: incident management, basic change control, a service catalog, and documentation. Getting these right early prevents the operational chaos that derails companies during their next growth phase.

The IT Inflection Point

Every growing company hits a moment where the way they have been doing IT simply stops scaling. It usually happens somewhere between 20 and 100 employees, and the symptoms are predictable. One person - let us call them Dave - has been handling IT since the company was five people. Dave knows where everything is, how the VPN works, why the staging server needs to be restarted every Thursday, and which Slack channel to post in when email goes down.

The problem is that Dave is also a developer. Or a product manager. Or a founder. IT is not Dave's job - it is something Dave does on the side because someone has to. And now you have 30 employees, three offices, a production environment that actually matters, and Dave is a single point of failure for everything operational.

This is the IT inflection point. It is not about technology - it is about the moment when tribal knowledge, ad-hoc processes, and informal support structures can no longer keep up with the demands of a growing organisation. If you recognise these signs, you are there.

  • More than two people regularly answer IT questions, but nobody officially owns IT support
  • Outages happen and nobody fully understands why or how to prevent them from recurring
  • Onboarding a new employee takes a week or longer because nothing is documented - account setup, tool access, environment configuration all live in someone's head
  • The same problems keep recurring because fixes are applied without root cause analysis or documentation
  • Your developers are spending significant time on ops work instead of building product
  • Shared passwords exist in Slack messages, sticky notes, or a Google Doc titled 'passwords'
  • Nobody knows the full inventory of SaaS tools the company is paying for
  • Security incidents get handled reactively with no defined process or escalation path
  • Vendor relationships are managed by whoever set up the account originally, with no central tracking

If you checked three or more of those boxes, you are past the inflection point. The good news is that you do not need to hire a 10-person IT department or implement enterprise frameworks to fix this. You need the right foundations for your current size, with a clear path to scale as you grow.

The Minimum Viable IT Operations Stack

The biggest mistake growing companies make is either doing nothing (hoping the problem solves itself) or over-investing (buying enterprise tools they will not use for three years). The right approach is to match your IT operations maturity to your actual size and complexity. Here is what that looks like at each stage.

20 to 50 employees: The Foundation

At this size, you need just enough structure to eliminate single points of failure and give people a clear way to get help. You are not building an IT department yet. You are putting guardrails on what has been an informal process.

  • A ticketing system for IT requests - Jira Service Management, Freshservice, or even a dedicated Slack channel with a bot that creates tickets. The tool matters less than having a single place where requests go.
  • An identity provider with SSO - Google Workspace or Microsoft 365 with enforced multi-factor authentication. Stop managing individual accounts on every SaaS tool.
  • A password manager - 1Password Business or Bitwarden. Eliminate shared passwords entirely.
  • A basic asset inventory - a spreadsheet is fine at this stage. Track every laptop, monitor, and license. Know what you have and who has it.
  • A documented onboarding checklist - step-by-step account setup, tool access, and environment configuration. This is the single highest-value document you can create.
  • Uptime monitoring - Uptime Robot, Better Stack, or Pingdom. Know when your critical services go down before your customers tell you.
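If the asset inventory lives in a spreadsheet, a CSV export is enough to answer "who has what". The sketch below is only an illustration: the column names (asset_tag, type, assigned_to) and the sample rows are assumptions, not a required schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export of the inventory spreadsheet. Column names and rows
# are illustrative assumptions, not a required schema.
INVENTORY_CSV = """asset_tag,type,assigned_to
LT-001,laptop,alice
LT-002,laptop,bob
MN-001,monitor,alice
LT-003,laptop,
"""

def assets_by_owner(csv_text):
    """Group asset tags by the person they are assigned to."""
    owners = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # An empty assigned_to cell is treated as unassigned stock.
        owner = row["assigned_to"].strip() or "UNASSIGNED"
        owners[owner].append(row["asset_tag"])
    return dict(owners)

inventory = assets_by_owner(INVENTORY_CSV)
```

The same grouping is useful at offboarding time: looking up a departing employee's name gives the list of devices to recover.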

50 to 100 employees: The Structure

At this size, you need actual processes - not just tools. You probably need a dedicated IT person (or at least a fractional one), and your tooling needs to support more than just ticket tracking.

  • A proper ITSM platform - Jira Service Management, Freshservice, or Zendesk with a service catalog that employees can browse. Self-service reduces ticket volume significantly.
  • Incident management process - a documented triage flow, severity levels, escalation paths, and post-incident reviews for anything that causes downtime.
  • Basic change management - not a formal Change Advisory Board, but a process for tracking and approving changes to production systems. A shared calendar and a Slack approval workflow work at this stage.
  • Endpoint management - Jamf for Mac, Intune for Windows. Remote wipe capability, standard configurations, automated patching.
  • A knowledge base - Confluence, Notion, or GitBook. Document common issues, workarounds, and procedures. Make it searchable.
  • Infrastructure monitoring - basic dashboards showing CPU, memory, disk, and network for your production environment. Datadog, Grafana Cloud, or New Relic at the free or starter tier.
  • Vendor management tracking - who are your vendors, what do the contracts say, when do they renew, who is the internal owner.

100+ employees: The Maturity

At this point, you are building a real IT operations function. You likely need a small team, formal processes, and enterprise-grade tooling. This is where frameworks like ITIL start to become relevant - not as a rigid methodology, but as a reference for the processes you need.

  • Full ITSM suite with incident, problem, change, and request management workflows
  • A dedicated IT operations team - at minimum a manager, a support analyst, and a systems administrator
  • Formal change management with a CAB or lightweight approval process for production changes
  • Compliance and audit capabilities - SOC 2 preparation, access reviews, security policy enforcement
  • Advanced monitoring and observability - APM, log management, synthetic monitoring, alerting workflows
  • Disaster recovery and business continuity planning - documented and tested recovery procedures
  • IT budgeting and financial management - tracking spend by department, forecasting, and cost optimisation

Incident Management Without the Overhead

Incident management is the single most important IT operations process to get right, and it is also the one most companies overcomplicate. At its core, incident management answers three questions: what is broken, who is fixing it, and when will it be fixed. Everything else is refinement.

For a growing company, you do not need PagerDuty, a formal on-call rotation, or a 15-page incident management policy. You need a process that people will actually follow. Here is the minimum viable incident management process that works for teams of 20 to 100.

The Simple Triage Flow

Step one: something breaks. The report comes in through your ticketing system, a Slack message, or an automated alert. Someone needs to acknowledge it within a defined timeframe. At a small company, this can be as simple as reacting to the Slack message with an eyes emoji - it means 'I see this and I am looking at it.'

Step two: triage. Is this affecting production? Is it affecting one person or many? Is there a workaround? This determines severity. Keep it simple - three levels are enough. Critical means production is down or a significant number of users are affected. High means something important is broken but there is a workaround. Normal means everything else.
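The three severity levels can be written down as a small decision function so that everyone triages the same way. This is a minimal sketch, not a definitive rule set: the parameter names are hypothetical, and the handling of an important breakage with no workaround (escalated to critical here) is an assumption, since the definitions above do not cover that case explicitly.

```python
from enum import Enum

class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    NORMAL = "normal"

def triage(production_down, many_users_affected, important_broken, has_workaround):
    """Map the triage questions to one of three severity levels."""
    # Critical: production is down or a significant number of users are affected.
    if production_down or many_users_affected:
        return Severity.CRITICAL
    if important_broken:
        # High: something important is broken but there is a workaround.
        # Assumption: with no workaround, this sketch escalates to critical.
        return Severity.HIGH if has_workaround else Severity.CRITICAL
    # Normal: everything else.
    return Severity.NORMAL
```

For example, `triage(False, False, True, True)` returns `Severity.HIGH`, while a single user's non-urgent request falls through to `Severity.NORMAL`.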

Step three: assign and communicate. Someone owns the fix. That person posts updates at a regular interval (every 30 minutes for critical, every few hours for high, daily for normal) until the issue is resolved. This is the part most teams skip, and it is the part that matters most. Nothing erodes trust faster than an outage with no communication.
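The update cadence is easy to encode so that a bot or a simple script can nudge the incident owner. A minimal sketch follows; the function names are hypothetical, and "every few hours" is taken as four hours purely as an assumption.

```python
from datetime import datetime, timedelta

# Update intervals per severity. The 4-hour value for "high" is an
# assumption standing in for "every few hours".
UPDATE_INTERVALS = {
    "critical": timedelta(minutes=30),
    "high": timedelta(hours=4),
    "normal": timedelta(days=1),
}

def next_update_due(severity, last_update):
    """Return the time by which the owner should post the next update."""
    return last_update + UPDATE_INTERVALS[severity]

def is_update_overdue(severity, last_update, now):
    """True if the owner has gone past the update interval for this severity."""
    return now > next_update_due(severity, last_update)
```

A reminder job could call `is_update_overdue` every few minutes for each open incident and post in the incident channel when it returns true.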

Step four: resolve and learn. Fix the issue, document what happened, and decide if a post-incident review is warranted. For critical incidents, always do a review. Keep it blameless and focused on what you can change to prevent recurrence. A 15-minute meeting with notes in a shared document is enough.

The goal of incident management at a growing company is not zero incidents - it is consistent, predictable handling of incidents when they occur. Getting from chaos to consistency is a bigger improvement than getting from good to perfect.

Do You Need ServiceNow?

This is one of the most common questions growing companies ask once they realise they need real IT operations. The short answer: probably not yet. ServiceNow is a powerful enterprise ITSM platform, but it is designed for organisations with hundreds or thousands of employees, dedicated IT teams, and complex compliance requirements. For a company with 30 to 80 employees, ServiceNow is like buying a commercial kitchen to make sandwiches.

The licensing costs alone will run you $50,000 to $100,000 per year, depending on modules and user count. Implementation typically costs another $50,000 to $200,000 if you want it configured properly. And you will need someone who knows the platform to maintain it - either an internal admin or an ongoing support contract.
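Adding the two ranges above gives a rough first-year figure. The sketch below only does that arithmetic; it excludes the ongoing admin or support contract, and the 40-employee headcount in the usage note is an illustrative assumption.

```python
# Ranges quoted above, as (low, high) in dollars.
LICENSING_PER_YEAR = (50_000, 100_000)
IMPLEMENTATION_ONE_TIME = (50_000, 200_000)

def first_year_cost_range():
    """Licensing plus implementation, excluding platform admin costs."""
    low = LICENSING_PER_YEAR[0] + IMPLEMENTATION_ONE_TIME[0]
    high = LICENSING_PER_YEAR[1] + IMPLEMENTATION_ONE_TIME[1]
    return low, high

def per_employee(cost_range, employees):
    """Spread a (low, high) cost range across a headcount."""
    low, high = cost_range
    return low / employees, high / employees
```

Under these numbers the first year lands between $100,000 and $300,000. For a hypothetical 40-person company, `per_employee(first_year_cost_range(), 40)` works out to $2,500 to $7,500 per employee before anyone is paid to run the platform.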

That said, there are situations where ServiceNow makes sense earlier than you might expect. Here are the thresholds to watch for.

  • You have more than 100 employees and a dedicated IT support team of three or more people
  • You have compliance requirements (SOC 2, ISO 27001, FedRAMP) that mandate auditable change management and access controls
  • You need complex workflow automation that spans multiple departments - IT, HR, facilities, security
  • You have multiple IT support tiers and need sophisticated routing and escalation logic
  • You are a government contractor or subcontractor and your clients require ITSM tooling that meets specific standards

If none of those apply to you, start with Jira Service Management, Freshservice, or Zendesk. All three offer solid ITSM capabilities at a fraction of the cost. You can always migrate to ServiceNow later when your complexity justifies it. The processes you build now will transfer regardless of the tool.

One thing to be wary of: consultants who recommend ServiceNow for every engagement regardless of company size. If someone is pushing you toward a $200,000 ITSM implementation when you have 40 employees, they are selling you their expertise, not solving your problem. A good consultant - a practitioner, not just a consultant - will right-size the recommendation to your actual needs and budget.

Monitoring and Observability for Growing Teams

Monitoring follows the same principle as everything else in this guide: match your investment to your complexity. A 20-person company does not need a $3,000/month Datadog contract. But a 100-person company with a production SaaS product and uptime SLAs absolutely needs comprehensive monitoring. The trick is knowing where you are on that spectrum and investing accordingly.

Tier 1: Uptime and Availability

This is where every company should start, regardless of size. You need to know when your critical services go down. Set up external uptime monitoring for your website, API endpoints, and any customer-facing services. Tools like Uptime Robot (free tier covers most small companies), Better Stack, or Pingdom will check your endpoints on a fixed interval (typically every one to five minutes, depending on the plan) and alert you via Slack, email, or SMS when something goes down.

At this tier, you should also set up basic health checks inside your application. A simple /health endpoint that returns 200 when the app is running and can connect to its database is surprisingly useful. Pair it with your uptime monitor and you have basic availability covered.
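A health endpoint of this kind can be very small. The sketch below uses only the Python standard library, with SQLite standing in for the real database. The DB_PATH value, the JSON body shape, and the choice of 503 for a failed database check are all assumptions; a real application would use its own framework and database driver.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = ":memory:"  # assumption: stand-in for the application's real database

def database_reachable(db_path=DB_PATH):
    """Return True if a trivial query against the database succeeds."""
    try:
        conn = sqlite3.connect(db_path, timeout=2)
        try:
            conn.execute("SELECT 1")
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        return False

def health_response(db_ok):
    """Build the status code and JSON body for /health."""
    # 200 when the database check passes; 503 otherwise (an assumption).
    status = 200 if db_ok else 503
    body = json.dumps({"status": "ok" if db_ok else "degraded", "database": db_ok})
    return status, body

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_response(database_reachable())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# To try it locally (port 8000 is arbitrary):
# HTTPServer(("127.0.0.1", 8000), HealthHandler).serve_forever()
```

Pointing the uptime monitor at /health rather than the home page means an alert fires when the app is up but cannot reach its database, which a plain page check would miss.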

Tier 2: Error Tracking and Application Performance

Once you have uptime monitoring, the next layer is understanding what is happening inside your applications. Error tracking tools like Sentry, Bugsnag, or Rollbar capture exceptions in real time and group them into actionable issues. This is the difference between 'the site is slow' and 'there is an unhandled exception in the checkout flow that is affecting 12% of users.'

Application performance monitoring (APM) comes in at this tier as well. New Relic, Datadog, or Elastic APM can trace requests through your application stack, identify slow database queries, and pinpoint bottlenecks. Most of these tools offer free or affordable tiers that cover a single application with reasonable data retention.

Tier 3: Infrastructure and Observability

This tier is for companies running their own infrastructure (or managing significant cloud resources) with production workloads that require deeper visibility. You are looking at infrastructure metrics (CPU, memory, disk, network), log aggregation, distributed tracing, and custom dashboards.

At this level, tools like Grafana Cloud (which bundles metrics, logs, and traces), Datadog, or New Relic become worth the investment. If you are cost-conscious and have engineering capacity, the open-source stack of Prometheus for metrics, Grafana for dashboards, and Loki for logs is a powerful combination - but it requires someone to maintain it.

A common mistake is buying an enterprise observability platform and then only using it for uptime monitoring because nobody has time to configure it properly. Start with what you will actually use and expand as your team and needs grow. Unused monitoring is wasted money.

Documentation That Actually Gets Used

Most companies know they should document their IT operations. Most companies also have a graveyard of outdated Confluence pages that nobody reads. The problem is not documentation itself - it is creating documentation that people actually maintain and use.

The key insight is that less documentation, maintained well, is far more valuable than comprehensive documentation that is out of date. Focus on the documents that prevent the highest-impact problems: knowledge loss when someone leaves, slow onboarding, repeated troubleshooting of the same issues, and confusion during incidents.

Here is the minimum set of documents every growing company needs. If you have these and they are current, you are ahead of 90% of companies your size.

  • Employee onboarding checklist - every account, tool, and access right a new hire needs, step by step, with links to each tool's admin panel
  • Employee offboarding checklist - the reverse of onboarding. Every account to disable, device to recover, and access to revoke. This is also a security document.
  • Runbooks for critical systems - step-by-step procedures for common operational tasks: restarting services, rolling back deployments, clearing caches, rotating credentials
  • Incident response procedures - who to contact, how to triage, where to communicate, how to escalate. This is the document people reach for at 2 AM.
  • Architecture overview - a single diagram showing your major systems, how they connect, and where they are hosted. Does not need to be detailed - a box-and-arrow diagram is fine.
  • Vendor and license registry - every SaaS tool, cloud service, and hardware vendor. Include contract details, renewal dates, costs, and the internal owner.
  • Network and access documentation - VPN configuration, firewall rules, IP ranges, DNS records. The things that are impossible to reconstruct if the person who set them up leaves.
  • Disaster recovery procedures - what to do if your primary environment goes down. Even a basic plan (restore from backup, failover to secondary region) documented and tested once is better than no plan.
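The vendor and license registry pays for itself when it flags renewals before they auto-renew. Here is a minimal sketch of that check; the field names, the 60-day window, and the sample vendors are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class VendorContract:
    vendor: str
    owner: str          # internal owner of the relationship
    annual_cost: float
    renewal_date: date

def renewing_soon(contracts, today, within_days=60):
    """Return contracts renewing within the window, soonest first."""
    cutoff = today + timedelta(days=within_days)
    due = [c for c in contracts if today <= c.renewal_date <= cutoff]
    return sorted(due, key=lambda c: c.renewal_date)

# Illustrative data only; the vendors and figures are made up.
registry = [
    VendorContract("ExampleCRM", "alice", 12_000, date(2025, 3, 1)),
    VendorContract("ExampleCloud", "bob", 48_000, date(2025, 9, 15)),
    VendorContract("ExampleChat", "carol", 6_000, date(2025, 2, 10)),
]
upcoming = renewing_soon(registry, today=date(2025, 1, 15))
```

With this sample data, `upcoming` contains ExampleChat and then ExampleCRM. The same list could be sent to each internal owner as part of the quarterly review.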

Two practical tips help keep documentation alive. First, make documentation part of the process, not an afterthought. If you change a procedure, update the document in the same ticket. Second, assign owners. Every document should have a named person responsible for keeping it current. Review quarterly at minimum.

AI-enhanced delivery can accelerate documentation significantly. Tools that generate initial drafts from existing configurations, code comments, and runbook templates can cut documentation time by 60% or more. The practitioner's job shifts from writing everything from scratch to reviewing and refining AI-generated drafts for accuracy. This is one area where AI delivers immediate, practical value without requiring a major readiness assessment.

When to Hire vs When to Contract

The hire vs contract decision is one of the most consequential choices a growing company makes about IT. Get it wrong and you either spend too much too early or underinvest and pay for it later in outages and technical debt. There are three main options, and each fits a different situation.

Full-time IT Manager or Director

This makes sense when IT operations is a daily, ongoing responsibility that requires consistent attention and institutional knowledge. Typically, this is the right move at 75 to 100+ employees, or earlier if your product is heavily technology-dependent (SaaS companies, for example). A good IT manager costs $90,000 to $140,000 per year in Canada, plus benefits. The advantage is dedicated focus and deep knowledge of your specific environment. The risk is that a single hire has limited breadth of experience - they know what they know, and you are dependent on that one person's expertise.

Fractional CTO or IT Operations Consultant

A fractional arrangement gives you senior-level expertise at a fraction of the cost of a full-time executive hire. This is ideal for companies in the 20 to 75 employee range that need strategic IT leadership but do not have enough work (or budget) for a full-time senior role. A fractional CTO typically works one to two days per week, sets direction, makes architectural decisions, and mentors your internal team.

The key advantage of this model is that you get boutique attention from someone who has seen dozens of companies at your stage and knows which investments pay off and which are premature. A good fractional CTO will tell you what not to buy as often as what to buy. Look for practitioners who build what your team can maintain after they leave - not consultants who create dependency. For more on this option, see our guide on fractional CTOs for growing companies.

Managed Service Provider (MSP)

An MSP handles day-to-day IT support - help desk, device management, network management, patching, and monitoring. This is a good option for companies that need consistent IT support but are not large enough to justify a full internal IT team. MSPs typically charge per user per month, ranging from $75 to $200 per user depending on the scope of services.
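Per-user pricing is easier to compare against other options as an annual figure. The sketch below just multiplies out the range quoted above; the 40-user headcount in the usage note is an illustrative assumption.

```python
# Per-user monthly rate range quoted above, as (low, high) in dollars.
MSP_RATE_PER_USER_MONTH = (75, 200)

def msp_annual_cost_range(users):
    """Annual MSP cost range for a given number of users."""
    low, high = MSP_RATE_PER_USER_MONTH
    return low * users * 12, high * users * 12
```

For 40 users, `msp_annual_cost_range(40)` gives $36,000 to $96,000 per year, with the final figure depending on the scope of services.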

The trade-off with MSPs is that they manage your operations but rarely improve them. An MSP keeps the lights on. If you need strategic improvement - better processes, architecture decisions, tool selection, compliance readiness - you will still need a consultant or fractional CTO for that layer. Many growing companies find the best model is an MSP for day-to-day support combined with a fractional CTO or consultant for strategic direction.

The worst outcome is doing nothing. Companies that delay the hire or contract decision inevitably end up in a crisis - a major outage, a security incident, a failed audit - that forces an expensive, rushed decision. It is always cheaper and less painful to build IT operations proactively than to rebuild them after something breaks.

What to Look For in an IT Operations Consultant

If you decide to bring in outside help to set up or improve your IT operations, here is how to evaluate whether a consultant is the right fit. Use this as a screening checklist - you want someone who meets at least seven of these ten criteria.

  • They have personally scaled IT operations at companies your size before - not just advised on it, but actually built the processes and selected the tools. Ask for specific examples.
  • They right-size their recommendations to your budget and complexity. If their first recommendation for a 30-person company is a $200,000 ITSM platform implementation, walk away.
  • They build systems your internal team can maintain after the engagement ends. The goal is independence, not ongoing dependency on the consultant.
  • They ask about your business goals before your technology stack. IT operations should support what the company is trying to achieve, not exist for its own sake.
  • They can explain trade-offs in plain language. If every answer is 'it depends' without a clear framework for making the decision, they are not adding value.
  • They have experience with your industry's compliance and regulatory requirements. A SaaS company, a healthcare company, and a financial services company all have different IT operations needs driven by their regulatory environment.
  • They provide a clear scope, timeline, and deliverable list upfront. Vague engagements with open-ended timelines are a red flag.
  • They focus on outcomes, not activities. The deliverable should be 'incident response time improved from 4 hours to 30 minutes' - not 'delivered 47 pages of documentation.'
  • They involve your team throughout the engagement. Knowledge transfer should happen continuously, not in a handoff session at the end.
  • They are honest about what you do not need yet. The best consultants save you money by recommending against premature investments. A practitioner, not just a consultant, knows the difference between what is necessary now and what can wait.

One final note: be cautious of firms that only offer staff augmentation. Placing a body in a seat is not the same as improving your operations. If you need someone to build and improve your IT operations, you need a consulting engagement with defined outcomes and accountability - not a contractor who does what they are told.

Frequently Asked Questions

We are a tech company - do we still need IT operations?

Yes, and arguably more so. Tech companies often assume that because their engineering team is technically skilled, IT operations will take care of itself. What actually happens is that engineers end up spending 20 to 30 percent of their time on operational work - managing cloud infrastructure, handling employee laptop issues, troubleshooting VPN problems, onboarding new hires - instead of building product. Formalising IT operations does not mean creating bureaucracy. It means giving your engineers their time back by putting the right processes and support structures in place so that operational work is handled efficiently by the right people.

How much should we budget for IT operations?

A reasonable rule of thumb for a growing company (20 to 100 employees) is 4 to 7 percent of revenue, including all IT spend - tools, infrastructure, support, and personnel. For a company with $5 million in revenue, that is $200,000 to $350,000 per year. This covers your SaaS subscriptions, cloud hosting, devices, an MSP or internal IT support, and periodic consulting for strategic projects. Companies that underinvest in IT operations typically spend more in the long run through inefficiency, security incidents, and reactive crisis management. The cheapest approach is almost never the most cost-effective one.
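The rule of thumb is simple enough to put in a planning spreadsheet or a few lines of code. A minimal sketch follows; the function name and the default percentages as parameters are illustrative choices.

```python
def it_budget_range(annual_revenue, low_pct=4, high_pct=7):
    """Return the (low, high) annual IT budget for a given revenue."""
    return annual_revenue * low_pct / 100, annual_revenue * high_pct / 100

low, high = it_budget_range(5_000_000)
```

For $5 million in revenue this reproduces the $200,000 to $350,000 range above; changing the percentages makes it easy to see how sensitive the budget is to where you sit in the range.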

Should we use a managed service provider?

An MSP is a solid choice for day-to-day IT support if you are not ready to hire a full internal IT team. The key is understanding what an MSP does and does not do. A good MSP handles help desk tickets, device management, patching, basic monitoring, and network management. They keep the lights on. What an MSP typically does not do is make strategic decisions about your IT architecture, select and implement new tools, improve your processes, or align IT with your business strategy. Many companies find that the best model is an MSP for operational support combined with a fractional CTO or consultant for the strategic layer. This gives you both consistent daily support and senior-level expertise for the decisions that shape your IT direction.

What is the first thing we should set up?

Start with identity and access management - specifically, an identity provider with single sign-on and enforced multi-factor authentication. This is the foundation everything else builds on. With proper IAM in place, onboarding and offboarding become straightforward, you eliminate shared passwords, and you have a single source of truth for who has access to what. Google Workspace or Microsoft 365 both work well as a foundation. Add a password manager (1Password Business or Bitwarden) for any tools that do not support SSO. This combination solves the most immediate security and operational risks and takes days, not months, to implement.

How do we know when we have outgrown our current setup?

The clearest signal is when the same problems keep happening and nobody has time to fix the root cause. If your team is stuck in a cycle of reactive firefighting - fixing the same outage every month, answering the same support questions, manually doing things that should be automated - your current setup is no longer adequate. Other signals include: new employee onboarding takes more than three days, you have had a security incident caused by a process gap (like a departed employee still having access), your team cannot answer basic questions about your infrastructure without asking a specific person, or your compliance requirements have changed and your current tooling cannot support them. When you see these signals, it is time to invest in the next level of IT operations maturity.

Can we just use AI tools to handle IT operations instead of building processes?

AI tools can accelerate IT operations significantly, but they cannot replace the foundational processes. AI-enhanced delivery works best when it automates well-defined processes - auto-categorising tickets, generating runbook drafts, summarising incident timelines, detecting anomalies in monitoring data. But if your processes are undefined or chaotic, AI just automates the chaos faster. The right approach is to establish your core processes first (incident management, change control, documentation) and then layer AI tools on top to make those processes more efficient. Companies that try to skip the fundamentals and jump straight to AI-powered operations end up with expensive tools and no improvement in outcomes.

About the Author

Corey Derouin is the founder and principal consultant at Codeview Digital. With extensive experience in federal government IT operations, ServiceNow platform delivery, and digital transformation, Corey brings a practitioner's perspective to every engagement - not a slide deck, but hands-on delivery from someone who has done the work inside government.
