SME – Observability, ELK Stack & Monitoring Engineer
ASCENDING
- Fairfax, VA
- Permanent
- Full-time
- Maintain and deploy monitoring and alerting systems within the ELK Stack.
- Design, configure, and maintain our large-scale log aggregation solution using Elasticsearch and Logstash.
- Set up and manage data ingestion pipelines and transformations using tools like Filebeat, Logstash, and/or Fluentd/Fluent Bit.
- Embrace the mindset of "automate any task" to improve efficiency.
- Build and maintain robust monitoring systems using Elasticsearch, Kibana, and Beats to proactively detect potential issues and trigger timely alerts.
- Maintain the documentation needed to meet our audit and certification requirements.
- Participate in troubleshooting, capacity planning, and performance analysis activities related to the ELK Stack.
- Research new observability requirements and, in many cases, write code to implement them.
- Possess strong expertise in setting up monitoring policies, rules, and templates, and writing scripts to accomplish observability requirements.
- BS/MS in CS/Engineering or equivalent, OR 5+ years of experience.
- 4+ years of experience working directly with the
- Expert-level knowledge of the ELK Stack (on-prem and cloud), including best practices related to performance, security, and component setup (Elasticsearch, Logstash, Kibana, Beats).
- Fluent in writing scripts to automate tasks in languages like Python and either Bash or PowerShell.
- Experience with Terraform and Ansible, including syntax, best practices, and managing complex configurations to build and manage infrastructure and applications.
- Very good working knowledge of the Linux operating system.
- Highly self-motivated and directed.
- Good analytical and problem-solving/troubleshooting abilities.