OpsRamp Summer 2019 Release

The Summer 2019 Release delivers new AIOps innovations that let incident responders safely test the accuracy of machine learning predictions, reduce alert noise by learning and suppressing repetitive alerts, and offer the right context to troubleshoot issues for IT resources that are not natively monitored by OpsRamp. The release also introduces topology context capabilities for AWS public cloud services and cross-site network connections. The May release also announces new features for Kubernetes dashboards, enterprise application monitoring, and management of modern workloads like Azure Stack and Mesosphere.

Service-Centric AIOps: Reduce Network Downtime and Drive Real-Time Performance Analysis

OpsRamp OpsQ helps IT operations teams resolve high-impact issues with a modern, scalable and collaborative incident management solution. OpsQ processes numerous IT events from different monitoring tools through open ingestion APIs, analyzes and prioritizes events with alert inference models, escalates critical incidents to available teams using on-call schedules, and remediates incidents without human intervention. Some of the innovative service-centric AIOps capabilities announced in the Summer 2019 release include:

OpsQ Observed Mode: The 2019 State of AIOps report shows 67% of IT leaders have concerns about the relevance and reliability of the insights delivered by AIOps tools. OpsQ Observed Mode ensures greater transparency into machine learning models for performance analysis by letting IT teams access the power of AIOps recommendations in shadow mode. OpsQ Observed Mode builds trust and confidence in how machine learning algorithms surface relevant, actionable insights for recognizing and fixing problems.

OpsQ Observed Mode delivers shadow inferences for assessing the accuracy of AIOps recommendations.

Learning-Based Auto-Alert Suppression: The 2019 State of AIOps report found 64% of IT practitioners are looking to replace manual tasks with automated processes. Alert fatigue is a real problem for IT teams because a large volume of the alerts generated is part of standard operating procedures. OpsQ automatically suppresses known and expected alerts using first-response policies and reduces alert noise with time-based and attribute-based pattern matching.

Auto-alert suppression techniques act as a first-response mechanism for repetitive alerts.
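The time-based and attribute-based matching described above can be illustrated with a minimal sketch. It is not OpsQ's implementation: the `Alert` and `SuppressionRule` types, their field names, and the sample data are all hypothetical, and the rule is written by hand rather than learned from alert history.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape, for illustration only.
    resource: str
    metric: str
    raised_at: datetime


@dataclass(frozen=True)
class SuppressionRule:
    """Hypothetical rule: match on alert attributes and a daily time window."""
    resource: str
    metric: str
    window_start: time
    window_end: time

    def matches(self, alert: Alert) -> bool:
        same_attributes = (alert.resource, alert.metric) == (self.resource, self.metric)
        in_window = self.window_start <= alert.raised_at.time() <= self.window_end
        return same_attributes and in_window


def split_alerts(alerts, rules):
    """Return (kept, suppressed); an alert is suppressed if any rule matches it."""
    kept, suppressed = [], []
    for alert in alerts:
        target = suppressed if any(rule.matches(alert) for rule in rules) else kept
        target.append(alert)
    return kept, suppressed


rules = [SuppressionRule("backup-01", "cpu.util", time(1, 0), time(3, 0))]
alerts = [
    Alert("backup-01", "cpu.util", datetime(2019, 5, 6, 2, 15)),   # expected nightly spike
    Alert("backup-01", "cpu.util", datetime(2019, 5, 6, 14, 30)),  # same alert, outside the window
    Alert("web-01", "cpu.util", datetime(2019, 5, 6, 2, 20)),      # different resource
]
kept, suppressed = split_alerts(alerts, rules)
print(len(kept), len(suppressed))  # 2 1
```

Only the alert that matches both the attributes and the time window is suppressed; the other two still reach the responder.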

Automatic Resource Creation from Third-Party Events: For resources not natively managed by OpsRamp’s monitoring engine, OpsQ now creates new managed resources (if the resource does not already exist in OpsRamp) from third-party alerts. Automatic resource creation ensures faster issue identification and rapid root cause(s) analysis with event context.

Auto-extract resource metadata and deliver contextual visibility for resources not managed by OpsRamp.
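The "create the resource only if it does not already exist" behavior can be sketched as follows. This is an illustrative assumption about the logic, not OpsRamp's API: the `ensure_resource` function, the in-memory `inventory` dictionary, and the alert field names are all hypothetical.

```python
def ensure_resource(inventory: dict, alert: dict):
    """Return (resource, created). A resource is created only if it is unknown."""
    name = alert["resource_name"]
    if name in inventory:
        return inventory[name], False
    resource = {
        "name": name,
        "source": alert["source"],
        # Carry over whatever metadata the third-party event supplied.
        "metadata": dict(alert.get("metadata", {})),
    }
    inventory[name] = resource
    return resource, True


inventory = {}
alert = {
    "resource_name": "fw-edge-01",
    "source": "third-party-tool",
    "metadata": {"ip": "10.0.0.5"},
}
_, created_first = ensure_resource(inventory, alert)
_, created_second = ensure_resource(inventory, alert)
print(created_first, created_second)  # True False
```

The second alert for the same resource reuses the existing record, so repeated third-party events do not produce duplicates.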

Continuous Learning for Alert Escalation: OpsQ alert escalation policies for auto-incident creation and routing now get smarter with machine learning models that get refreshed every week with live event data. Continuous learning in OpsQ adjusts and optimizes alert assignment and priority across dynamic IT environments and ensures timely action for outstanding alerts.

Auto-create and assign incidents with alert escalation policies that use live event data.

Impact Visibility and Service Context: Dynamic Dependency Mapping for Hybrid Infrastructure

The 2018 Gartner Market Guide for AIOps Platforms shares how dynamic relationship context is critical for IT event and performance analysis: “For the patterns AIOps detects to be relevant and actionable, a context must be placed around the data ingested. That context is topology.” OpsRamp’s business service maps and dynamic network maps deliver real-time application to infrastructure dependency views that establish the right context for incident response teams. The latest enhancements on the Impact Visibility front in the Summer release include:

Cloud Topology for Amazon Web Services (AWS): OpsRamp now delivers resource dependency visibility for the different moving parts of AWS public cloud services. DevOps and site reliability engineering (SRE) teams can visualize the topology context for AWS resources like EC2, VPC, RDS, or ELB and troubleshoot issues with context-aware confidence.

Access real-time dependency information for AWS public cloud workloads.

Cross-Site Connection Topology: OpsRamp now supports WAN discovery protocols (BGP/OSPF) so that IT teams can keep track of network connections across multiple enterprise deployments. Network admins can leverage cross-site topology to understand the connections between different datacenter sites or from a datacenter site to a public cloud environment.

Understand the connections between different datacenter sites using cross-site topology.

Cloud Native Discovery and Monitoring: End-to-End Visibility for Modern Infrastructure Services

Cloud native applications embrace microservices architectures built on ephemeral infrastructure workloads like Docker containers. How do enterprises ensure highly available and reliable cloud native apps while deploying at scale? The Summer 2019 release features new dashboards for Kubernetes monitoring, support for open source application stacks, and integrations for new-age infrastructure workloads like Azure Stack and Mesosphere.

Out-of-the-Box Kubernetes Dashboards: IT teams can track the performance of cloud native services running on containerized deployments with resource utilization metrics for Kubernetes cluster health. OpsRamp delivers granular insights for Kubernetes clusters and underlying containers, pods, and nodes with default Kubernetes dashboards across both on-prem and public cloud environments (AKS, EKS, and GKE).

Scale and optimize container infrastructure with out-of-the-box Kubernetes dashboards.

Expanded Application Monitoring: OpsRamp’s application adaptors monitor the availability of popular open source apps with the right performance indicators. Application owners can deliver compelling customer experiences with proactive agentless monitoring and optimize the health of business-critical apps like ActiveMQ, RabbitMQ, Apache Spark, Apache Solr, Elasticsearch, CockroachDB, Couchbase, Fluentd, and Neo4j with relevant metrics.

Manage popular apps used in cloud and cloud native stacks with agentless monitoring.

Integrations for Azure Stack and Mesosphere: OpsRamp’s Azure Stack integration ensures dynamic discovery, comprehensive visibility, and consolidated control of Azure Stack deployments through API-based data collection and agent-based virtual machine monitoring. The Mesosphere DC/OS integration automatically discovers and tracks metrics for Mesos master and agent nodes so that IT teams can scale the performance of modern enterprise apps.

Discover and monitor Microsoft Azure Stack instances with robust integrations.

Other Platform Updates

The OpsRamp Summer 2019 Release also introduces new platform capabilities for synthetic monitoring, service map enhancements, bulk export of operational data for data mining, and automatic notifications for failures of existing tool integrations.

OpsRamp Summer Release, May 2019

OpsRamp Winter Release, January 2019

Impact Visibility and Service Context: Greater Service Centricity For Faster Resolution.

OpsRamp helps modern IT teams manage end-to-end services across multiple business units, geographies, and distributed stakeholders with actionable service performance insights using business service maps and dynamic topology maps. Impact visibility lets DevOps teams effectively manage the hybrid and cloud-native infrastructure involved in supporting enterprise-level digital services. Service context lets IT teams maintain desired service levels and drive critical business outcomes with the right levels of transparency and visibility:

Application Topology: OpsRamp enables dynamic discovery and topology mapping for forty popular enterprise applications like Apache, Cassandra, Couchbase, Docker, Hadoop, Kafka, Mesos, MongoDB, MySQL, Redis, Solr, and Zookeeper. Application topology not only discovers application clusters, hosts, processes, and services but also establishes relationships between application components and infrastructure.

Hypervisor Topology: OpsRamp now helps tame virtualization sprawl by discovering and visualizing relationships across virtual machines, hypervisor servers and clusters in VMware vSphere and KVM environments.

Enhanced Service Maps: OpsRamp’s service maps have a new user interface that makes it easy to deliver highly available IT services with visual indicators for service health and performance of underlying application and infrastructure resources.

AIOps for Proactive IT Operations: Data-Driven Insights For Modern Hybrid Infrastructure Management.

OpsRamp OpsQ, the intelligent event management engine for service-centric AIOps, helps incident response teams drive accurate problem diagnosis and improve collaboration with reduced alert volumes, contextual correlation, intelligent alerting, and automated remediation. New features in the Winter release include:

Auto-Incident Creation and Routing: Machine learning-based alert escalation capabilities in OpsRamp’s OpsQ drive faster incident creation, assignment, and routing for rapid problem resolution. IT teams no longer have to manually provide incident assignment information with OpsRamp’s ability to automatically create and dispatch incidents to the right teams. Alert escalation policies can auto-assign incidents using prior alert, incident, and notification data.

Augmented Training for Inference Models: OpsRamp’s machine learning-based inference models correlate alerts linked by a common cause using historical alert data. OpsRamp’s OpsQ now allows users to augment alert co-occurrence models with additional user-provided training data for improved accuracy and better predictability.

Frequency-Driven Alert Escalation: OpsQ now supports policies to escalate alerts for monitored resources that change alert state frequently (also known as alert flapping). Frequency-based alerting lets IT teams safely ignore alerts that flap only occasionally and pay attention to alerts that flap repeatedly.
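One way to picture frequency-based escalation is a sliding-window counter of state changes. The sketch below is an assumption about the general technique, not OpsQ's policy engine; the `FlapDetector` class, its parameters, and the sample timestamps are hypothetical.

```python
from collections import deque


class FlapDetector:
    """Counts state changes for one resource inside a sliding time window."""

    def __init__(self, window_seconds: float, max_changes: int):
        self.window_seconds = window_seconds
        self.max_changes = max_changes
        self._changes = deque()  # timestamps of recent state changes
        self._last_state = None

    def observe(self, timestamp: float, state: str) -> bool:
        """Record a state; return True when the alert should be escalated."""
        if self._last_state is not None and state != self._last_state:
            self._changes.append(timestamp)
        self._last_state = state
        # Drop state changes that have fallen out of the window.
        while self._changes and timestamp - self._changes[0] > self.window_seconds:
            self._changes.popleft()
        return len(self._changes) >= self.max_changes


# Three state changes within ten minutes: escalate on the third change.
frequent = FlapDetector(window_seconds=600, max_changes=3)
frequent_results = [
    frequent.observe(t, s)
    for t, s in [(0, "ok"), (60, "critical"), (120, "ok"), (180, "critical")]
]

# The same changes spread far apart never fill the window.
occasional = FlapDetector(window_seconds=600, max_changes=3)
occasional_results = [
    occasional.observe(t, s)
    for t, s in [(0, "ok"), (1000, "critical"), (2000, "ok"), (3000, "critical")]
]

print(frequent_results)    # [False, False, False, True]
print(occasional_results)  # [False, False, False, False]
```

The window length and change threshold are the two knobs: a short window with a low threshold escalates aggressively, while a long window with a high threshold tolerates more flapping.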

Cloud Native Monitoring and Event Management: Access Real-Time Analytics for Better Performance Visibility.

OpsRamp introduces new capabilities for supporting cloud native infrastructure along with enhanced features for AWS infrastructure and platform event correlation and analysis:

Cloud Native Monitoring: 451 Research analysis shows that the adoption of cloud-enabling technologies is accelerating, with 50 percent of enterprises already using or planning to use containers. The January 2019 release supports discovery and monitoring of Kubernetes environments for both on-prem and managed Kubernetes as a service environments. OpsRamp’s instrumentation and dashboards for cloud native services help DevOps teams track the different servers, pods, containers, and Kubernetes services and ensure that there is enough capacity to support the availability and health of container workloads.

Cloud Event Monitoring: Most enterprises work with a large number of AWS infrastructure and platform services to host their digital services. OpsRamp now offers the ability to process, analyze, and centrally access daily events from AWS Health, Database Migration Services, EBS, ECS, ELB, Redshift, and CloudWatch. Site reliability engineers can view and respond to AWS events across multiple cloud accounts and better manage the health and performance of their public cloud services using OpsRamp.


Introducing OpsRamp OpsQ

OpsRamp OpsQ offers three different inference models that you can apply to your IT application and infrastructure stack. Inference models offer the ability to set filter criteria and apply an analytical model to a particular type of IT resource. OpsQ’s inference models allow you to easily configure and analyze your incoming alert streams to reduce the noise and maximize productivity. The three Inference Models today are:

  • Topology. Understand the relationships between IT services and underlying infrastructure. Identify the root cause alerts for an incident with the right situational context and impact analysis.
  • Clustering. Cluster events based on their attributes by analyzing similarities and correlating different alerts into one inference alert.
  • Co-occurrence. Analyze alert sequence patterns for existing alerts to correlate alerts and identify the root cause(s) for an incident.
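As a rough illustration of the clustering model described above, alerts can be grouped by how many attributes they share. The sketch uses Jaccard similarity with a greedy single pass; both are stand-in choices made for this example and are not confirmed to be what OpsQ uses. The attribute names and the 0.5 threshold are hypothetical.

```python
def jaccard(a: dict, b: dict) -> float:
    """Share of attribute key/value pairs that two alerts have in common."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


def cluster_alerts(alerts, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed alert is similar enough."""
    clusters = []
    for alert in alerts:
        for cluster in clusters:
            if jaccard(alert, cluster[0]) >= threshold:
                cluster.append(alert)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([alert])
    return clusters


alerts = [
    {"site": "dc-east", "type": "switch", "metric": "link.down"},
    {"site": "dc-east", "type": "switch", "metric": "packet.loss"},
    {"site": "dc-west", "type": "database", "metric": "disk.full"},
]
clusters = cluster_alerts(alerts)
print([len(c) for c in clusters])  # [2, 1]
```

Each resulting cluster would correspond to one inference alert: the two switch alerts from the same site collapse into one, while the unrelated database alert stays separate.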

October 2018 Update | Product Webinar | Blog Post Overview


OpsRamp Fall 2018 Release

Topology Explorer: Track Network Performance To Prevent Unforeseen Surprises.

How do you monitor changes across your IT environment while being able to pinpoint root cause when there is a network failure? Topology Explorer delivers dynamic network insights and real-time dependencies for your application and infrastructure layers. Embrace service-oriented operations management with:

Network Mapping: Automate infrastructure discovery and resource mapping for faster impact analysis:

  • Visualize upstream and downstream dependencies for hybrid infrastructure by understanding your network topology interconnections
  • Accurately profile your network with detailed resource-level information (OS, make, model, device type, alerts, incidents, patches, and uptime) for any device in your network
  • Deliver contextual troubleshooting for incident management with a holistic view of your IT environment
Network Dependency Mapping for enhanced diagnosis of performance anomalies

Application Mapping: Discover and visualize critical dependencies across applications, server, and network components so that you can:

  • Deliver business-service context by better understanding how your applications interact with each other
  • Remove blind spots and gain end-to-end operational visibility for your application services
  • Visualize your entire application stack with dynamic discovery for over 40 applications
Application Dependency Mapping for better end-user and business outcomes

Enhanced Service Maps: Drive Application Availability And Business Resilience.

Manage IT outages better by viewing the relationships between business services and the underlying infrastructure in a Service Map. Drive better context with situational awareness and restore IT services faster with inline visualization of alerts for impacted IT resources. You'll reduce the pain of coordinating incident response across different teams with enriched alert information in our improved Service Maps.

Service Maps for improved performance visibility and customer experiences

Multi-Cloud Database Monitoring: Deep-Dive Performance Insights For Your Cloud Databases.

Gain proactive monitoring for Amazon Relational Database Service (Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server), Microsoft Azure (SQL Database, Azure Database for PostgreSQL, Azure Database for MySQL), and Google Cloud Platform (Cloud SQL) with OpsRamp. OpsRamp already has access to RDS metrics through our CloudWatch API integration; our agentless monitoring adds deeper database-engine-level health metrics so that you can:

  • Identify performance bottlenecks for your production databases with query-level performance insights for transaction and query throughput, query execution performance, connection errors, and buffer pool usage
  • Drive database performance tuning and troubleshooting with smart alerts and instant notifications for any issues in your cloud databases
Multi-Cloud Database Monitoring for quicker detection of database failures

Improved Alert Management: Focus On The Incidents That Demand Urgent Action.

Reduce alert fatigue by pausing all alerts during scheduled maintenance work. Drive alert prioritization with policies that notify you of changes in alert state, so that you can resolve issues in a single place. Alert filters now include text-based search which, when combined with the AIOps-powered machine learning platform, enables faster incident triage and response.

Alert Management for gaining control of alert floods

Comprehensive Reporting: Manage Service-Level Performance With The Right Metrics.

Stop building operational spreadsheets and manage your IT performance with our easy-to-use reports. The Custom Alerts report optimizes incident management processes with detailed analytics on the IT resources that generate the most alerts and helps you track alert volume trends over time. The Cloud Cost report lets you scrutinize multi-cloud spending trends across your enterprise and analyze the ROI of your cloud consumption in a single place.

Comprehensive Reporting for data-driven operational insights

Summer 2018: OpsRamp 5.0

Multi-Cloud Visibility Dashboard: Comprehensive Control For A Cloud-Native World

How do you efficiently handle the complexity of managing multi-cloud services while still keeping a handle on your ever-expanding cloud budgets? Our Multi-Cloud Visibility Dashboard offers much-needed clarity on the different cloud services that you're consuming and helps you better manage cloud budgets across business units. With OpsRamp, IT teams can configure budget policies to receive alerts when cloud billing exceeds budgeted amounts. Enterprise IT teams have access to three powerful new widgets in the multi-cloud dashboard that enable them to:

  • Locate Global IT Assets. The Global Assets widget displays a geographical distribution of hybrid IT assets across datacenter and cloud.
  • Analyze Cloud Spend. The Cloud Cost Insights widget provides a quick snapshot of public cloud consumption by cloud account, custom attributes, and other criteria.
  • Uncover Usage Patterns. The Cloud Cost Trend widget delivers trend analysis for multi-cloud services by resource type, custom attributes, and other criteria.
Multi-Cloud Visibility Dashboard for optimal cloud management
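The budget policies mentioned above, which alert when cloud billing exceeds a budgeted amount, reduce to a simple comparison. The sketch below is only an illustration of that check; the `find_budget_breaches` function, the business-unit names, and the amounts are hypothetical and do not reflect OpsRamp's policy configuration.

```python
def find_budget_breaches(spend_by_unit, budget_by_unit):
    """Return {business_unit: overage} for every unit whose spend exceeds its budget."""
    breaches = {}
    for unit, spend in spend_by_unit.items():
        budget = budget_by_unit.get(unit)
        # Units without a configured budget are skipped rather than flagged.
        if budget is not None and spend > budget:
            breaches[unit] = spend - budget
    return breaches


monthly_spend = {"marketing": 1250, "engineering": 4000, "finance": 300}
monthly_budget = {"marketing": 1000, "engineering": 5000}
breaches = find_budget_breaches(monthly_spend, monthly_budget)
print(breaches)  # {'marketing': 250}
```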

Learn more about Unified Service Intelligence, the hybrid visibility solution for service-oriented management.

AIOps Inference Engine: Extract Signal From Noise

Our big data platform for IT Operations just got smarter with machine learning capabilities for intelligent event correlation. The AIOps Inference Engine groups similar alerts together to reduce unnecessary noise so that IT teams can focus their attention on the incidents that truly matter.

Reduce alert noise with Topology-based Correlation (correlate alerts based on logical topology dependencies) and Clustering-based correlation (correlate alerts that share similar properties). With Inference Engine, you’ll gain faster situational awareness and drive quicker remediation and restoration for critical incidents.

AIOps Inference Engine for intelligent correlation
451 Research

"The OpsRamp 5.0 platform can help effectively manage cross-platform cloud services, avoid vendor lock-in or the dreaded underutilization of multi-cloud resources."

– William Fellows, Founder & Research Vice President, 451 Research

Custom Reports: Drill-Down Into The Operational Metrics That Matter

Our new Custom Reports feature lets you design your own spreadsheet-style reports for IT infrastructure management data. Gain the right operational insights for your IT management with three types of reports (Inventory, Inventory Breakdown, and Metrics).

Custom Reports for the right operational insights

Redesigned Service Maps: Optimize Service and Process Performance

Our fully redesigned Service Maps let you deliver predictable service outcomes by understanding key infrastructure dependencies for service performance. See all relevant information about a service’s availability in one place and map hybrid dependencies for optimal service delivery. Service group relationships drive impact analysis and root cause remediation for business-critical IT services.

Service maps for reliable service performance

Learn more about Unified Service Discovery, the real-time discovery and dependency mapping solution for hybrid environments.

Expanded Integrations: Manage Your Hybrid Estate In A Single Platform

Given the increasing adoption of public cloud, we now offer 90 different integrations for commonly used cloud services from Amazon Web Services, Microsoft Azure, and Google Cloud Platform. With the 5.0 release, we’ve also announced integrations with Google Stackdriver for cloud monitoring and management, ManageEngine ServiceDesk Plus for real-time incident management, and Micro Focus Operations Manager i for IT teams looking for a modern alternative to legacy event management.