Cedexis Uses LogicMonitor for DevOps Measurement

May 20, 2013 – 10:58 pm

Continuous cloud server and application monitoring is essential to Cedexis, a SaaS firm that manages hosts in more than 50 countries to provide its industry-leading Cloud Benchmarking and Cloud Load Balancing solutions.

Cedexis prides itself on looking at performance data in unique ways to get a grasp of the technical media quicker than anyone, providing increased value to their customers. The use of Puppet & LogicMonitor is critical to their deployment success.

With its Ops structure managed geographically, Cedexis manages dynamic host deployments in four regions: western US, eastern US, Europe and Asia-Pac. Cedexis configures all new machines with Puppet, ensuring the machine is prepared with a “blueprint” to take the Cedexis code. Cedexis engineers also include LogicMonitor settings in the config package so they can receive instant feedback on each machine’s performance.

Deployment Assurance


When Cedexis fires up a new machine, they utilize their DNS naming structure combined with LogicMonitor’s autodiscovery feature to ensure that all newly deployed devices are identified within the appropriate device or logical grouping and are using the proper datasource – all automatically.
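As a rough illustration of how a DNS naming structure can drive automatic grouping, here is a minimal Python sketch. It is not Cedexis' or LogicMonitor's actual mechanism: the `<role><index>.<region>.<domain>` hostname scheme, the region codes, and the `group_for_host` helper are all hypothetical, chosen only to show the idea of deriving a group from a name.

```python
import re

# Hypothetical region codes embedded in hostnames; the real Cedexis
# naming scheme is not described in the post.
REGION_GROUPS = {
    "usw": "US West",
    "use": "US East",
    "eu": "Europe",
    "apac": "Asia-Pac",
}


def group_for_host(fqdn):
    """Map a hostname like 'web01.eu.example.net' to a group path.

    Assumes the (hypothetical) scheme <role><index>.<region>.<domain>.
    Anything that does not fit the scheme lands in 'Unclassified'.
    """
    parts = fqdn.lower().split(".")
    if len(parts) < 2:
        return "Unclassified"
    # Strip the trailing instance number to get the role, e.g. web01 -> web.
    role = re.sub(r"\d+$", "", parts[0])
    region = REGION_GROUPS.get(parts[1])
    if region is None or not role:
        return "Unclassified"
    return f"{region}/{role}"


if __name__ == "__main__":
    for host in ("web01.eu.example.net", "db3.apac.example.net", "localhost"):
        print(host, "->", group_for_host(host))
```

Sending unrecognized names to an explicit "Unclassified" group, rather than guessing, keeps a misnamed machine visible instead of silently filing it in the wrong place.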

LogicMonitor provides the operational insight that each remote machine is fully deployed and functional. Cedexis’ Senior Operations Engineer, Josh Cody, stated “LogicMonitor’s active discovery and SNMP polling checks are hugely important to our deployment operations. We install the collector agent and almost immediately can say ‘Whoa! Why are there extra ports open on that device!’ Then we can dive in and tune things to ensure a fully secure deployment.”

Once the new machine is no longer required, Puppet removes the device from LogicMonitor as part of the take-down procedure. Again, fully automated.

“Both LogicMonitor and Puppet are critical to our operations. Without them we could not do what we do. As an Ops guy I don’t want to work anywhere that doesn’t deploy these tools.”

 


LogicMonitor London User Group 2013

May 20, 2013 – 3:53 pm

 

LogicMonitor is coming to London for our Annual UK User Group this Wednesday, May 22nd, from 12-3PM. We will be meeting at Smith’s Bar & Grill, which is located near Paddington Station. We have the place reserved until 3, but you’re invited to hang out for however long you’d like!

Our CEO Kevin will talk through some of our latest product releases and share LogicMonitor’s product road map over lunch (we recommend the Fish & Chips; they’re amazing)! Immediately after lunch, our Support Engineers will lead a two-hour working session. You can get an insight into how others are using LogicMonitor, as well as get help with your own application.

You can RSVP here or e-mail natasha.sidhu@logicmonitor.com.

Hope to see you all there!

Cheers,

LogicMonitor


Monitoring as an acceptance-test for configuration management tools

April 12, 2013 – 1:06 pm

As Devops Borat says: “To make error is human. To propagate error to all server in automatic way is #devops.”

The safety catch to this is good monitoring. We demonstrated this to ourselves this morning.  We did a software release on some of our servers last night. This particular release involved quite a few changes to various components, including to various configuration files.  As we have an excellent tech ops team, they automate all our server configurations with Puppet, so that everything is scalable, repeatable, and manageable over a growing fleet of infrastructure.

So last night, we upgraded a portion of our servers – the new puppet configuration that matched the new software deployment was run; configuration files modified; the software upgraded, and everything was happy.

Until this morning, when we got an alert about one of the upgraded servers: it was no longer submitting requests to the sitemonitor service, which checks websites for performance, availability and reachability from various places on the Internet, external to the customer’s infrastructure.  The quickly identified cause, apparent in a log file at the time the alert was triggered, was that the server, running the new code, was suddenly trying to talk to the sitemonitor service using a configuration that only worked with the old code.

(more…)


LogicMonitor User Group in Los Angeles March 7

February 28, 2013 – 4:11 pm

One week from today, we’ll be in downtown LA for our first LogicMonitor User Group on March 7 (rsvp online).

Our fearless founder Steve will be presenting our latest releases and talking through our APIs, new functionality such as NetFlow, and some roadmap ideas.  You’ll also get the chance to rub elbows with other LogicMonitor customers to swap best practices on infrastructure monitoring, server monitoring, performance tuning, virtualization, etc.


We’ll meet at Pitfire Pizza in downtown LA and we’ll be supplying the pizza and beer.  Nothing says good pizza like fake tattooed fingers.


We’d love to see you there – sign up here.

 


A tale of two metrics: Windows CPU or vCenter VM CPU

February 25, 2013 – 9:52 am

A not uncommon question from our customers, or even from our own support people, is “Why does monitoring a Windows system running on VMware report different CPU data than monitoring the virtual machine from the ESXi host? The ESX monitoring must be wrong!”

For example, here is LogicMonitor graphing the CPU load of a Windows system running as a Virtual Machine on ESXi. In this case, the CPU is gathered from WMI, by querying the Windows OS:
[Graph: CPU load of a Windows system running as a virtual machine on ESXi]

Here is the same machine at the same time, but this is how ESXi sees the load: (more…)


Puppet monitoring: how to monitor the success or failure of Puppet runs

February 20, 2013 – 2:58 pm

This post, written by LogicMonitor’s Director of Tech Ops, Jesse Aukeman, originally appeared on HighScalability.com on February 19, 2013

If you are like us, you are running some type of Linux configuration management tool. The value of centralized configuration and deployment is well known and hard to overstate. Puppet is our tool of choice. It is powerful and works well for us, except when things don’t go as planned. Failures of Puppet can be innocuous and cosmetic, or they can cause production issues, for example when crucial updates do not get properly propagated.

Why?

In the most innocuous cases, the puppet agent craps out (we run puppet agent via cron). As nice as puppet is, we still need to goose it from time to time to get past some sort of network or host resource issue. A more dangerous case is when an administrator temporarily disables puppet runs on a host in order to perform some test or administrative task and then forgets to reenable it. In either case it’s easy to see how a host may stop receiving new puppet updates. The danger here is that this may not be noticed until that crucial update doesn’t get pushed, production is impacted, and it’s the client who notices.

How to implement monitoring?

Monitoring is clearly necessary in order to keep on top of this. Rather than just monitoring the status of the puppet server (a necessary, but not sufficient, state), we would like to monitor the success or failure of actual puppet runs on the end nodes themselves. For that purpose, puppet has a built-in feature to export status info (more…)
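To make the idea of checking actual runs on the end nodes concrete, here is a minimal Python sketch. It assumes the run status has already been parsed into a dictionary shaped like the summary file the Puppet agent writes after each run (`last_run_summary.yaml`, with `time`, `resources`, and `events` sections); the excerpt above is cut off before it names the feature, so this may not be the exact mechanism the author goes on to describe. The `check_puppet_run` helper and the one-hour staleness threshold are illustrative assumptions.

```python
import time


def check_puppet_run(summary, max_age_seconds=3600, now=None):
    """Return a list of problems found in a parsed Puppet run summary.

    `summary` is assumed to look like the agent's last_run_summary.yaml
    after parsing, e.g.:
        {"time": {"last_run": 1361314680},
         "resources": {"failed": 0},
         "events": {"failure": 0}}
    An empty list means the run looks healthy.
    """
    now = time.time() if now is None else now
    problems = []

    # A stale timestamp catches both a dead cron job and an agent that
    # someone disabled and forgot to re-enable.
    last_run = summary.get("time", {}).get("last_run")
    if last_run is None:
        problems.append("no last_run timestamp")
    elif now - last_run > max_age_seconds:
        problems.append(f"last run is {int(now - last_run)}s old")

    # A recent run can still have failed to apply some resources.
    failed = summary.get("resources", {}).get("failed", 0)
    if failed:
        problems.append(f"{failed} resources failed")

    failures = summary.get("events", {}).get("failure", 0)
    if failures:
        problems.append(f"{failures} failure events")

    return problems


if __name__ == "__main__":
    sample = {"time": {"last_run": time.time() - 7200},
              "resources": {"failed": 1},
              "events": {"failure": 1}}
    for problem in check_puppet_run(sample):
        print("ALERT:", problem)
```

Checking the age of the last run alongside the failure counts covers both cases described above: the agent that stopped running and the run that completed but did not apply everything.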


MongoDB and GridFS for Inter and Intra Datacenter Data Replication

January 31, 2013 – 10:23 am

This post originally appeared on HighScalability.com on January 14, 2013

—————————————————————————————————————–

Monday, January 14, 2013 at 9:30AM

This is a guest post by Jeff Behl, VP Ops @ LogicMonitor. Jeff has been a bit herder for the last 20 years, architecting and overseeing the infrastructure for a number of SaaS-based companies.

Data Replication for Disaster Recovery

An inevitable part of disaster recovery planning is making sure customer data exists in multiple locations.  In the case of LogicMonitor, a SaaS-based monitoring solution for physical, virtual, and cloud environments, we wanted copies of customer data files both within a data center and outside of it.  The former was to protect against the loss of individual servers within a facility, and the latter for recovery in the event of the complete loss of a data center.

(more…)


Simple ways to be confident in automated server and application deployments.

January 7, 2013 – 10:41 am

Sample SAT question: xUnit is to Continuous Integration as what is to automated server deployments?

We’ve been going through lots of growth here at LogicMonitor. Part of growth means firing up new servers to deal with more customers, but we also have been adding a variety of new services: proxies that allow our customers to route around Internet issues that BGP doesn’t catch; servers that test performance and reachability of customers’ sites from various locations; and so on.  All of which means spinning up new servers, sometimes many times over, in QA, staging and development environments.

As old hands in running datacenter operations, we have long adhered to the tenet of not trusting people – including ourselves. People make mistakes, and can’t remember things they did to make things work. So all our servers and applications are deployed by automated tools. We happen to use Puppet, but collectively we’ve worked with cfengine, chef, and even Rightscripts.

So, for us to bring up a new server – no problem. It’s scripted, repeatable, takes no time. But how about splitting the functions of what was one server into several? And how do we know that the servers being deployed are set up correctly, if there are changes and updates? (more…)


Cisco’s Meraki purchase indicates value of ‘Cloud Managed’ approach to AP configuration and monitoring

December 20, 2012 – 2:01 pm

‘Meraki’ may not be the best known name in networking, but their technology is going to touch you soon if it hasn’t already.  Meraki was just acquired by Cisco in November for a cool $1.2 billion to incorporate into their new Cloud Networking Group.

Cisco is predicting explosive growth in cloud computing, the practice of running applications and storing data on remote servers accessed over the internet instead of running apps and storing data on your local computer. And increasingly, these cloud services will be accessed with mobile devices over wireless networks.

What Meraki brings to the table is their cloud-managed wireless network infrastructure hardware.  The Access Point (AP) is the critical bridge from the wired to the wireless world. The unique feature of the Meraki APs is that you plug them into your wired network, the AP connects to the mother ship at Meraki, and you go to meraki.com to configure and manage them via a web UI.

This is a stellar leap from the typically clumsy and slow embedded web interfaces found on most APs, and the emphasis is on managing your wireless network as a whole, not a bunch of individual APs. The web UI is clean and easy to use, the network can be managed from anywhere, and the APs are kept up to date by Meraki with automatic firmware and security updates.

(more…)


Is an RMM tool the only thing MSPs need for monitoring?

December 5, 2012 – 9:25 am

If you’re an MSP providing IT services for desktop environments where basic up/down server monitoring will do, an RMM tool is more than adequate. But for MSPs offering fixed-fee monthly service packages, or cloud services, important advantages can be gained by complementing an RMM tool with an advanced performance monitoring solution.

Monitoring customers on fixed-fee contracts
When you offer an “all you can eat” package where you are on the hook for anything that breaks in your customer’s infrastructure, the ability to be proactive and fix small problems before they become big problems becomes paramount.

An advanced performance monitoring solution will monitor your customer’s high-end systems (network, servers, virtualization, storage, etc…) in such depth that you will know about problems well before your customer. And when there is an issue, these tools provide historical trending graphs that enable your front line help desk technicians to solve many problems without having to bring in the help of more costly engineers.

(more…)
